---
type: theoretical
backlinks:
  - "[[Overview#Multiprogramming]]"
  - "[[Overview#Multitasking/Timesharing]]"
---
## Process
A program in execution. Consists of:
* The program code - **text section**
* Current activity - **PC**, registers
* **Stack** -> function parameters, return addresses, local variables
* Data section
* **Heap** -> memory dynamically allocated at run time

The difference between a process and a program is that the program is the executable file stored on disk, while the process **is running** (shocker).

### Creation
Four events can cause a process to be created:
1. System initialization - daemons
2. A user "running a program"
3. A running process requesting the creation of a new process
4. Initiation of a batch[^1] job

### `fork()`
A Linux [system call](Overview.md#System%20calls). It creates a child process that is a copy of the caller; `fork()` returns 0 in the child and the child's PID in the parent.

```mermaid
graph LR;
    A["`fork()`"] --> |parent| B["wait"]
    A --> |child|C["`exec()`"]
    C --> D["`exit()`"]
    D --> B
    B --> E["Resumes"]
```

### Hierarchy
Linux creates a parent-child relationship between processes, Windows doesn't.

Linux:

```mermaid
graph TD;
    init["init pid = 1"]
    login["login pid = 8415"]
    kthreadd["kthreadd pid = 2"]
    sshd["sshd pid=3028"]
    bash["bash pid=8416"]
    ps["ps pid=9298"]
    emacs["emacs pid=9204"]
    khelper["khelper pid=6"]
    pdflush["pdflush pid=200"]
    init --> login
    init --> kthreadd
    init --> sshd
    login --> bash
    bash --> ps
    bash --> emacs
    kthreadd --> khelper
    kthreadd --> pdflush
```

### Termination
1. Normal
   The process returns an exit code to its parent. A terminated child whose exit code has not yet been collected by the parent sticks around as a **zombie process**. If the parent dies before the child, the child is called an **orphan**. Absolutely fucking crazy naming. Every Linux process should have a parent process [source: unicef](https://unicef.org), so an orphan is typically adopted by `init`.
2. Error - just a special return code
3. Fatal error, involuntary - division by zero, invalid opcode; the process is immediately terminated by the system
4. Killed (by another process)

### States
As a state machine
1. Running
2. Ready
3. Blocked (blocking == waiting)

```mermaid
graph TD;
    A["Running"]
    B["Ready"]
    C["Blocked"]
    A --> |1| C
    A --> |2| B
    B --> |3| A
    C --> |4| B
```

Transitions (edge labels in the diagram):
- 1: the process blocks, waiting for an event such as I/O
- 2: the scheduler picks another process
- 3: the scheduler picks this process
- 4: the event the process was waiting for occurs

#### Ready State
- In this state the process is not waiting for a resource
- Can be executed
- Put in a queue (ready queue)

#### I/O queue
- Each I/O device has its own queue
- Multiple queues are maintained by the OS

## Timesharing: In-depth
from [Multitasking/Timesharing](Overview.md#Multitasking/Timesharing)

The output of a running program should not change when we stop it and switch back to it later on.

### Context switching
Switching means storing the values of the current process's registers, flags, PC, etc., and loading the saved values of the next process. Then execution continues.

### The process control block - PCB
The OS needs a place to store the status of each process. The PCB is that data structure.

### Process table
A list of PCBs (one per process)

![Figure: PS](Pasted%20image%2020250419141856.png)

* Timer (`ISR`[^2]) generates multiple interrupts per second
* The status of the interrupted process is stored in its PCB

## Threads
A thread is a basic unit of CPU utilization, consisting of a thread ID, a program counter, a stack, and a set of registers. A light-weight process.
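
A minimal sketch of that definition, assuming Python's standard `threading` module: each thread runs its own call stack (so its local variables are private), while all threads live inside the same process. The names `worker`, `run_threads`, and `worker-N` are hypothetical and chosen for illustration; the thread name stands in for the thread ID here.

```python
import os
import threading


def worker(results, index):
    # local_value is a local variable, so it lives on this thread's own stack.
    local_value = index * 10
    # results is a single list shared by every thread of the process.
    results[index] = (os.getpid(), threading.current_thread().name, local_value)


def run_threads(count=3):
    results = [None] * count
    threads = [
        threading.Thread(target=worker, args=(results, i), name=f"worker-{i}")
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every thread has finished
    return results


if __name__ == "__main__":
    for pid, name, value in run_threads():
        print(pid, name, value)
```

Every row reports the same process ID but a different thread name and a different local value.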
![](Pasted%20image%2020250419143713.png)

### Processes vs threads

| Process                                   | Thread                                           |
| ----------------------------------------- | ------------------------------------------------ |
| Heavyweight                               | Lighter                                          |
| Each process has its own memory           | Threads use memory of the process they belong to |
| Inter-Process Communication (IPC) is slow | Way faster inter-thread communication            |
| Context switching is more expensive       | Less expensive                                   |
| Do not share memory                       | Do share memory                                  |

### Multithreading
- Traditional processes have a single thread of control[^3]
- If a process has multiple threads of control, it can perform more than one task at a time

### Ways to Implement Threads
* Kernel-Level Threads (KLT)
    * Managed by the OS kernel
    * Each thread is a separate scheduling entity
    * `pthread`, `thread`
* User-Level Threads (ULT)
    * Managed by user-space libraries, OS is unaware
    * Faster context switching
    * Green threads

### User Threads and Kernel Threads
- **User threads**
    - Implemented by a thread library at the user level
    - Thread creation and scheduling are done in user space
- **Kernel Threads**
    - Managed by the OS

### Relationship models
#### Many-to-one
* Many user-level threads are mapped to one kernel thread
* Management is done by the thread library in user space
* The entire process blocks whenever a thread makes a blocking syscall
* Only **one** thread can access the kernel at a time (you can't run multiple threads in parallel on multiprocessors)

#### One-to-one
Each user thread is mapped to a kernel thread
- Provides more concurrency

Unfortunately:
- Creating a user thread requires creating the corresponding kernel thread
- The overhead of creating kernel threads restricts the number of threads

#### Many-to-many
Multiplexes many user threads onto an equal or smaller number of kernel threads.
- Allows creation of however many threads the user wants
- The kernel can schedule another thread for execution whenever a thread performs a blocking system call

#### Fork-join
The parent forks (creates) child threads and then waits for them to terminate, joining with them, at which point it can retrieve and combine their results. This is also called **synchronous threading**. The parent **cannot** continue until the work has been completed.

[^1]: A batch job is a scheduled task or a set of commands that are executed without manual intervention - **cron**
[^2]: Interrupt service routine - like in LC3
[^3]: Sequence of programmed instructions that can be managed independently by a scheduler within a computer program

##### Parallelism
![](Pasted%20image%2020250421222538.png)

## Thread pool
Issues with threads:
- Overhead when creating them
- Exhausting system resources

Solution: thread pools - create a number of threads at startup and place them into a pool where they sit and wait for work. This addresses both issues: threads are created once and reused, and the pool size caps how many threads exist.

Sharing threads:
- If a thread is blocked (e.g., waiting for I/O), the tasks queued behind it need not wait with it; other threads can take them over
- Each thread has its own task queue
- Whenever a thread finishes its tasks it looks through the other threads' queues and "steals" tasks.
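
The pool idea can be sketched with Python's standard `concurrent.futures.ThreadPoolExecutor`. This is only an illustration under stated assumptions: `square` and `run_pool` are hypothetical names, and this executor differs from the description above in two ways. It starts its workers on demand (up to `max_workers`) rather than all at startup, and it feeds them from one shared queue instead of per-thread queues with stealing.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def square(n):
    # Return the result together with the name of the worker that computed it.
    return n * n, threading.current_thread().name


def run_pool(values, workers=3):
    # At most `workers` threads exist, no matter how many tasks are submitted.
    with ThreadPoolExecutor(max_workers=workers, thread_name_prefix="pool") as pool:
        outcomes = list(pool.map(square, values))  # results come back in input order
    results = [result for result, _ in outcomes]
    used_threads = {name for _, name in outcomes}
    return results, used_threads


if __name__ == "__main__":
    results, used_threads = run_pool(range(10))
    print(results)
    print(sorted(used_threads))
```

Ten tasks are handled by no more than three threads, which is the point of the pool: the cost of creating a thread is paid a few times instead of once per task.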