Understanding Operating Systems: Processes, Threads, and System Calls

OS Kernel

Definition: The software layer that sits between applications and hardware.

The kernel provides:

  • Protection via preemption, interposition, and privilege
  • Resource management via virtualization and scheduling

Key Concepts:

  • One of the major tasks of an operating system is the protection and isolation of different processes from each other and from the system’s kernel.
  • Stable OS interfaces allow us to write code that, once compiled, runs on different machines with the same or a compatible instruction set architecture (ISA) and can interface with its environment.
  • FALSE: Dual-mode operation refers to the permission system resulting from having unprivileged users (e.g., your account on rlogin) and privileged admin users, i.e., admin mode vs. user mode. OS permissions are separate from hardware modes.
  • It is possible for two processes to use the same addresses for their variables without conflicting with each other.
  • Because the OS kernel is usually a program systems designers trust to be correct, defending against vulnerabilities in kernel code is substantially harder than defending against vulnerabilities in user mode.
  • Operating systems provide abstractions that can be accessed through interfaces.
  • FALSE: Examples of abstractions provided by an OS include CPU control registers (e.g., %cr3), memory-mapped I/O devices, and devices such as timer chips/circuitry. These are examples of details abstractions are designed to hide.
  • FALSE: The OS kernel is the boot program that allows a computer to start up, at which point it will finish and pass control to application programs. The kernel is not just a boot program but provides services to applications and manages resources while the machine is running.
  • Dual-mode operation provides a line of defense against bugs in untrusted user code.
  • When a process terminates, the OS will close all of its low-level file descriptors automatically.

Processes

Definition: An instance of a program that is being executed.

The OS provides the following for each process:

  • Logical flows of control
  • Private and protected address spaces
  • Abstract resources (file descriptors, reading/writing)

Process States:

  • RUNNING: The process is currently being executed by the CPU.
  • READY: The process is waiting to be assigned to a CPU for execution.
  • BLOCKED: The process is waiting for an event, such as I/O completion or a signal.

Process State Transitions:

  • RUNNING to BLOCKED: The process is running but encounters a situation requiring it to wait (e.g., waiting for input, exclusive access to a resource, a signal from another thread/process, for time to pass, or a child process to terminate).
  • BLOCKED to READY: The process was waiting for an event, and the event has occurred, making the process ready for execution.
  • READY to RUNNING: The scheduler chooses the process from the ready queue and assigns it to a CPU for execution.
  • RUNNING to READY: The running process gets descheduled, either because its time slice has expired or to give another ready process a turn.

Common Misconceptions:

  • FALSE: To help prevent attacks, on most OS, the number of processes that are currently in the RUNNING state is a closely guarded secret accessible only to system administrators. The number of running processes is typically easily accessible, even on shared machines. For example, run the command ‘uptime’ on rlogin or ‘cat /proc/loadavg’.
  • FALSE: On a well-balanced machine with n CPUs (or cores), a user controlling the machine would typically find n processes in the READY and n processes in the RUNNING state. A machine with additional READY processes could keep additional CPUs busy. It would not be well-balanced but running under a higher load.
  • FALSE: A laptop’s battery life is heavily related to how much blocking is encountered on the machine, that is, the time-averaged number of BLOCKED processes in the system. BLOCKED processes don’t consume any resources (besides some memory), so any impact on battery life is minimal.

Events Causing a Process to Enter the BLOCKED State

  • The process issues the sleep() system call with an argument of 10 seconds (yes)
  • The process enters an infinite loop (no)
  • The scheduler switches out the process to run a different process (time sharing) (no)
  • The process performs the getpid() system call (no)
  • The process attempts to lock a (process-shared) mutex that is held by another process (yes)
  • The process performs a read on a file descriptor corresponding to a file whose data has not yet been read from an I/O device (yes)
  • The process uses C11 atomics to loop on a flag (e.g., “while (!done) …”) (no)

Routines Likely Requiring a System Call

  • Routine that performs a string comparison (no)
  • Routine that spawns a thread that can simultaneously execute on another CPU (yes)
  • Routine that computes a cryptographic signature over a piece of data (no)
  • Routine that writes data to an I/O device (yes)
  • Routine that sets up a communication channel between two processes (yes)
  • Routine that spawns multiple lightweight cooperatively scheduled (non-preemptive) tasks (no)
  • Routine that computes a thumbnail of a JPG image (no)

Context Switching and Time-Sharing

Context Switching: The process of loading multiple processes into memory and switching to another process when the current process is blocked.

Time-Sharing: Switching to another process periodically to ensure each process makes progress.

Dual Mode

Kernel Mode to User Mode

  • Special privileged instruction (specific CPU instruction – Intel: iret)
  • Return from interrupt or careful action to resume user program execution (after handling syscall, carefully switch back)

User Mode to Kernel Mode

  • When a user needs to perform an action that requires higher privileges (hardware stuff), it switches to kernel mode by calling a syscall to a known, safe entry point in the kernel.
  • External reasons (hardware): Asynchronous, can happen anytime no matter what the CPU is doing, hardware problems
  • Internal reasons (software): Synchronous, exceptions, errors while executing the program

POSIX SPAWN

int posix_spawn(pid_t *restrict pid, const char *restrict path, const posix_spawn_file_actions_t *file_actions, const posix_spawnattr_t *restrict attrp, char *const argv[restrict], char *const envp[restrict]);

posix_spawn creates a new child process, executing the program specified by path, with pid storing the child’s process ID, file_actions defining file descriptor operations to be performed before execution, and envp providing the environment for the new process.

Shell Commands

a) Run the GCC compiler with the -v flag on file bad.c, then search what it outputs to standard output or standard error for the word”erro”. Redirect all lines containing this word into a file called errors.txt.

gcc -v bad.c 2>&1 | grep error > errors.txt

b) Start Nginx in the background, redirecting its standard output stream to /tmp/nginx.log.

nginx > /tmp/nginx.log &

c) Bring background job #3 into the foreground, then attempt to terminate it. (Assume your shell currently has a job using job ID #3 running.)

fg 3; ^C (This is csh syntax, bash would be fg %3 instead)

Piping

pipe2(): This system call creates a pair of file descriptors pointing to a pipe inode, which can be used for inter-process communication (IPC). The O_CLOEXEC flag indicates that the file descriptors should be closed on execution of an exec type system call, which replaces the current process image with a new one.

fork(): A system call used to create a new process. The new process is called the child process, and it runs concurrently with the process that made the fork() call (the parent process).

clone(): Creates a new process or thread. The clone system call is a more flexible version of fork(), allowing for specific details of the new process’s behavior to be defined.

join(): A function used with threads to wait for the thread to terminate. When one thread calls join() on another thread, it is put into a waiting state until the specified thread finishes execution. It is used to cleanly exit a thread. When the thread’s task is completed, the resources it was using can be released, and the thread’s status can be collected.

dup2(): A system call that duplicates a given file descriptor, making the new file descriptor point to the same open file entry as the old one, which can be used to redirect input or output streams.

execve(): A system call that replaces the current process image with a new process image, typically used to run a new program, with the specified environment and argument list provided by the caller.

exec(): This function replaces the current process in memory with a new process specified by the executable file argument, essentially running a new program within the same process.

setpgid(): This system call sets the process group ID of a specified process, enabling job control and the grouping of related processes for signaling in Unix-like systems.

close(): This system call closes a file descriptor so that it no longer refers to any file and may be reused. Any record locks held on the file it was associated with are released. It is important to close file descriptors when they are no longer needed to prevent resource leaks, which can lead to the process running out of file descriptors and thus being unable to open new files or sockets.

wait4(): This system call suspends execution of the calling process until a child specified by the pid argument has changed state. It is used by a parent process to wait for the termination of its child processes. The WIFEXITED macro (used in the condition) determines if the child terminated normally, and WEXITSTATUS obtains the exit status of the child, which is the code the child passed to _exit() or exit(), or the value the child process returned from main().

Closing file descriptors: Important to prevent descriptor table overflow, which can lead to errors and unstable behavior in programs. It’s a way of telling the operating system that you are done using the resource, and it can be cleaned up.

Waiting for child processes: With wait or wait4 is necessary for proper process synchronization. When a child process ends, it becomes a zombie process until the parent process retrieves the child’s exit status, at which point the operating system can clean up the remaining information about the child process. Not waiting for child processes can lead to a resource drain, as the system maintains information for these zombie processes.

Multithreading

Threads: Separate logical flows of control within a process. They share the address space and file descriptors but do not share stacks and registers.

pthread_t x;

pthread_create(&x, NULL, thread_function, &info);

pthread_join(x, (void**) &status);

Data race: Occurs when two or more threads in a concurrent application access a shared variable simultaneously, with at least one thread writing, leading to unpredictable results due to changes in the timing and order of execution. This condition can cause serious errors in programs, as the outcome depends on the non-deterministic scheduling and interleaving of threads by the CPU.

Semaphores: A synchronization mechanism that uses a counter to control access to a shared resource by multiple processes or threads in a concurrent system, allowing a specific number of threads to hold the semaphore and access the resource simultaneously. (sem_init(&x, 0, init_val), sem_wait(&x), sem_post(&x), sem_t x); Look at the initial value: if 1, the semaphore represents a mutex; if 0, the semaphore is used for ordering.

  • wait(): If the counter > 0, decrement; otherwise, block until it > 0, then decrement.
  • post(): Increment the counter and ensure any threads blocked in wait are unblocked.

Condition variable: A synchronization primitive that allows threads to wait for certain conditions to be met before proceeding, typically used to coordinate the order of thread execution and to efficiently pause and resume threads when resources become available or certain states are achieved. MUST ALWAYS BE CHECKED IN A LOOP

  • wait(): Add the current thread to the queue and enter the BLOCKED state.
  • signal(): If any threads are on the queue, take the first one and make it READY.
  • broadcast(): Remove all threads from the queue and make them READY.

The Fork-Join model: A parallel programming approach where a process (the “fork”) splits into multiple parallel threads or tasks to perform work concurrently and then synchronizes back together (the “join”) to aggregate results or continue execution as a single thread.

Monitor: A synchronization construct that allows threads to have both mutual exclusion (using an internal mutex) for accessing shared variables and the ability to wait for certain conditions to be met (using condition variables), enabling safe interaction and coordination between threads on shared resources.

Proper way of locking and using a conditional variable:

Improper way

pthread_mutex_lock(&pool->mutex);

while (!condition) { // 'condition' should be some state that is checked
   pthread_cond_wait(&pool->cond, &pool->mutex);
}
// ... code to handle the condition ...
pthread_mutex_unlock(&pool->mutex);

pthread_mutex_lock(&pool->mutex);

pthread_cond_wait(&pool->cond, &pool->mutex);

pthread_mutex_unlock(&pool->mutex);

This code always runs the risk of the signal that would otherwise unblock the thread from the condition wait call having already been sent (and missed) before the call to condition wait is made. If this was the last signal, the caller will remain deadlocked. Proper use of the monitor pattern demands that a thread acquire the lock, then check whether it must wait (or whether what it would be waiting for has already occurred).

Locking

Locking: Necessary in concurrent programming when multiple threads or processes need to access shared resources to prevent them from modifying the data simultaneously, which can cause data races and undefined behavior.

When to lock:

  • When entering a critical section where shared variables are read or written.
  • When performing operations that must be atomic to maintain data integrity.

When to unlock:

  • After the critical section has been executed to allow other threads or processes to enter their critical sections.
  • When a waiting condition is met, and another thread needs to be allowed to proceed.
  • If an error occurs and resources need to be cleaned up to prevent deadlocks.

Deadlock: Occurs when two or more processes are unable to proceed because each is waiting for the other to release a resource they need, creating a cycle of dependencies with no process able to proceed.

Starvation: Happens when one or more processes are perpetually denied the resources they need to proceed, often because the system’s scheduling or resource allocation algorithm continuously favors other processes.

Linking and Loading

C8OzebzSSjFxqxyREcUAc1WDLELQC9PSnfRKcmPijqbDb22RUFIfMWvfeVJypT9M5s0-0sR_7R4fMQkzn5tuZepR9cV3_wVA4YFHFWrnQimECfTVLvQHeT0-eBoFqrpP8A-ar5i3L7eKsYPUpGwNw2A

Compilation Toolchain

(i) The -c flag tells GCC to run the preprocessor, compiler, and assembler to create an object file but not to run the linker.

The preprocessor handles directives like #include and #define, the compiler translates the preprocessed C code into assembly, and the assembler then turns the assembly code into object code, which is machine code that the linker can use but not yet executable.

Generated Filename

(ii) The -c flag generates an object file with the same base name as the source file, so prog.c becomes prog.o.

Symbols for Linker

(iii) The nm tool would list the symbols found in prog.o:

  • U add_10 indicates an unresolved symbol for the add_10 function. “U” stands for “undefined”, meaning this function is expected to exist in another object file or library.
  • T main suggests that the symbol main is in the text section (the part of the object file that contains executable code), and “T” indicates that main is defined in this object file.
  • D x denotes that x is a global integer variable initialized with a value (hence in the data segment), and “D” means it’s defined.
  • d y shows that y is a static (local to this file) pointer variable, also initialized, and its scope is limited to this object file.

GCC Command: gcc prog.c

Compilation Toolchain (i): Omitting the -c flag means the full compilation process runs, which includes preprocessing, compiling, assembling, and linking. The linker tries to resolve all external references to make an executable.

Expected Shell Output (ii): Since the function add_10 is declared but not defined within any file provided to the compiler, and there is no indication that the function is defined in a separate module or a library, the linker will not find the necessary code to resolve the reference to add_10. Therefore, it will produce an error message indicating that there is an undefined reference to add_10, which prevents the creation of an executable.

C Code for Writing and Reading

9RSQC5w2GBdcQ5u5yrWZii_Au_zFwErgXETTAQ8FcXRubdfyJQzQBLAs14gl7ZUHLSVER9rzsXz1zK58UO7yrmrYpmhYH5E4ViW16JKGoF09B9zLAsvzTjmEkyqb3zBbugtNnm7-DDoj7H38IFfKTCQ

Java Code for Writing and Reading

RXfeiNRQQ0EDhV8hNWGMy_F87n7OjlGZqWfE9E_dq8sSZWVl2VEcD6KSHxgxaz6xWi5mvOQiBoKRN7c5Emlmf0zokDOo3-NNU1SoviigPrcBAD14ixT23CaEvFvLG4oyItORuEZw_FTyEa6cV5ytq_U

Another Example

iDhR60k_VcMvoMGIC3S_RWcnB1jJt61A7nfNgYTbM6Boszc5AF7HUYjpj1bMCJCI1fbngA2ceJCj1-lRP9jKAkWFyxXauhFrW3xXrJlAkonNHEEyKqaD45OypG4J1GHmfN9ZvrmhCAo9U8W_mOT7K5w