Processes and Threads

A "processor" can run only one unit of execution (a process or thread) at a time. The processor is switched among multiple applications (a context switch) so that all of them appear to make progress, albeit potentially at reduced speed. This also lets the processor and I/O devices be used efficiently: when an application performs I/O, the processor can run a different application.

What is a process?

An abstraction of a running program.

The Process Model

A process has a program, input, output, and state (data).
A process is an instance of an executing program and includes
- Variables (memory)
- Code
- Program counter (really a hardware resource)
- Registers
- ...

Process: a running program

A process includes:

Address space
Process table entries (state, registers): Open files, thread(s) state, resources field

A process tree:

A created two child processes, B and C
B created three child processes, D, E and F

     A
   /   \
   B    C
 / | \
 D E F

Address Space

Defines where sections of data and code are located in a 32- or 64-bit address space.
Defines the protection of such sections: ReadOnly, ReadWrite, Execute
A confined, "private" addressing concept: requires some form of address virtualization

Process Creation

System initialization
- At boot time
- Foreground
- Background (daemons)
Execution of a process creation system call by a running process
A user request
A batch job
Created by OS to provide a service
Interactive login

Process Termination

Normal exit (voluntary)
Error exit (voluntary)
Fatal error (involuntary)
Killed by another process (involuntary)

Implementation of Processes

The OS maintains a process table: `Process procs[];`
An array (or a hash table) of structures
One entry per process (the pid is the unique ID)

Implementation of Processes: Process Control Block (PCB)

Contains the process elements
It is possible to interrupt a running process and later resume execution as if the interrupt had not occurred → this requires saved state
Created and managed by the operating system
Key tool that allows support for multiple processes

Includes: Identifier, state, priority, program counter, memory pointers, context data, I/O status information and accounting information.

Fork

Creation of a new process by fork(). Executing a program in that new process. Signal notifications.

At boot, the kernel creates ONE user process by hand (the init process, which has pid=1 on Unix-like systems, not 0); all other user processes are created by fork().

`fork()`

```c
#include <stdio.h>
#include <unistd.h>

int main(void) {
  pid_t pid = fork(); // syscall that creates a new PCB and duplicates the address space
  if (pid == 0) {
    // child process: fork() returned 0
  } else if (pid > 0) {
    // parent process: fork() returned the child's pid
  } else {
    // fork failed: no child was created
    perror("fork");
    return 1;
  }
  return 0;
}
```

`execv()`

The exec() family of functions replaces the current process image with a new process image.

Process State Model: Five-State Model

Using Queues to Manage Processes

Multiprogramming

One CPU and several processes
CPU switches from process to process quickly

Running the same program several times will not result in the same execution times due to:

interrupts
multiprogramming

Concurrency vs. Parallelism

Concurrency is when two or more tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant. For example, multitasking on a single-core machine.
Parallelism is when tasks literally run at the same time, e.g., on a multicore processor.

Threads

Multiple threads of control within a process, each with its own flow of execution
All threads of a process share the same address space and resources; each thread has its own stack and register state

Why Threads?

For some applications many activities can happen at once:
- With threads, programming becomes easier
  - Otherwise application needs to actively manage different logical executions in the process
  - This requires significant state management
- Benefit applications with I/O and processing that can overlap
- Lighter weight than processes
- Can be used to implement concurrency
- Faster to create and restore: we really just need a stack and an execution unit, and don't have to create a new address space etc.

Processes vs. Threads

Process groups resources: Address Space, files
Threads are entities scheduled for execution on CPU
Threads can be in any of several states: running, blocked, ready, and terminated (remember the process state model?)
No protections among threads (unlike processes) [Why?] → this is important
The unit of dispatching is referred to as a thread or lightweight process (lwp)
The unit of resource ownership is referred to as a process or task (unfortunately, in Linux `struct task_struct` represents both a process and a thread)
Multithreading: The ability of an OS to support multiple, concurrent paths of execution within a single process
Process is the unit for resource allocation and a unit of protection.
Process has its own (one) address space.
A thread has:
- an execution state (Running, Ready, etc.)
- saved thread context when not running
- an execution stack
- some per-thread static storage for local variables
- access to the memory and resources of its process (all threads of a process share this)

Kernel-Level Threads (KLTs)

Thread management is done by the kernel. No thread management is done by the application.

Advantages:

The kernel can simultaneously schedule multiple threads from the same process on multiple processors
If one thread in a process is blocked, the kernel can schedule another thread of the same process
Kernel routines can be multithreaded

Disadvantages:

The transfer of control from one thread to another within the same process requires a mode switch to the kernel

Implementing Threads in Kernel Space

Kernel knows about and manages the threads
No runtime is needed in each process
Creating a thread, destroying it, or performing any other thread-related operation involves a system call

Advantages:

When a thread blocks (due to page fault or blocking system calls) the OS can execute another thread from the same process

Disadvantages:

Scalability (operating systems had limited memory dedicated to them)
~~Cost of system call is very high~~ (Disagree: if you want to implement preemption for thread scheduling in user space, you have to use a signal such as SIGVTALRM, which is much more expensive.)

User-Level Threads (ULTs)

All thread management is done by the application
Initially developed to run on kernels that are not multithreading capable
The kernel is not aware of the existence of threads

Implementing Threads in User Space

Threads are implemented by a library
Kernel knows nothing about threads
Each process needs its own private thread table in userspace
Thread table is managed by the runtime system

Advantages

Thread switch does not require kernel-mode
Scheduling (of threads) can be application specific
Can run on any OS
Scales better

Disadvantages

A blocking system call by one thread can block all threads of that process
Page fault blocks the whole process
In pure ULT, multithreading cannot take advantage of multiprocessing

PCB vs. TCB

Process Control Block handles global process resources. Thread Control Block handles thread execution resources.

| Per-process items | Per-thread items |
| --- | --- |
| Address space | Program counter |
| Global variables | Registers |
| Open files | Stack |
| Child processes | State |
| Pending alarms | |
| Signals and signal handlers | |
| Accounting information | |

1:1, M:1, M:N

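Thread models are also known by the general ratio of user threads to kernel threads

1:1: each user thread maps to its own kernel thread
M:1: user-level thread model (many user threads on one kernel thread)
M:N: hybrid model (M user threads multiplexed over N kernel threads)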

Context Switch

Scenarios:

Current process (or thread) blocks OR
Preemption

Operations to be done:

Must release CPU resources (registers)
Requires storing "all" non-privileged registers to the PCB or TCB save area
Tricky, as you need registers to do this
All written in assembly
Typically an architecture has a few privileged registers so the kernel can accomplish this
