Understanding Processes, System Calls, and Pipes in Unix-like Systems

32-bit (ILP32): int is 4 bytes, long is 4 bytes, and a pointer is 4 bytes
64-bit (LP64, typical on Unix-like systems): int is 4 bytes, long is 8 bytes, and a pointer is 8 bytes

File descriptor: a small non-negative integer that is a reference to an open file
Use open() to gain access to a file
Use read(fd, buf, size) to read bytes from a file
Use write(fd, buf, size) to write bytes to a file
Before compilation, C source files are preprocessed (e.g., translated)
The C Preprocessor (cpp) handles:
Include files: #include <foo.h>
Macro expansion: #define NMAX 3, #define MAX_COUNT 3, #define ADD(x,y) ((x) + (y)) (parenthesize the arguments and the body so the macro expands safely inside larger expressions)
Conditional compilation: #ifdef, #ifndef, #if, #else, #endif
Dereference a pointer with “*” in expressions: x = *p; *p = 1;
Use address-of “&” to get the address of a variable: p = &x; foo_ptr = &foo;
All data pointers are the same size on a given platform: 32 bits (4 bytes) on a 32-bit system, 64 bits (8 bytes) on a 64-bit system

Process Definitions


A process is a program in execution: an instance of a program running on a computer. A process has:
User-level state: memory + register values. Memory: stack, heap, code.
Kernel-level state: stack, process info, memory maps, file descriptors, network connections, etc.

Process Execution


The kernel provides a mechanism to create new processes: fork() in UNIX. On creation, memory is allocated for the process, code is loaded into memory, and the stack is initialized. The process begins at a predetermined location: main() in a C program.

The Stack


The stack is a special region of memory used to support program execution. It is primarily used for function calls: passing parameters, holding local variables, and saving the return address. Compilers also use it to “spill” registers to memory. Most programming languages use a stack to support program execution.

Process execution needs


Stack pointer (sp), Instruction pointer (ip or pc), General registers (r1, r2, eax, ebx, etc.)

UNIX Concepts:


Processes, System calls, File descriptors, File and directories, Pipes

Overview of System Calls:


The OS kernel is an extended machine. The kernel provides convenient abstractions for system resources: processes, memory, files, network. User-mode processes access kernel abstractions using system calls. In C, a system call looks like a function call: pid = getpid(). However, unlike a normal function, a system call transfers control to the kernel.

System Calls vs Library Functions: A system call is executed in the kernel. A library function is executed in user space. Some library functions are implemented with system calls: printf() formats its output in user space and then uses the write() system call to emit it. Programs use both system calls and library functions.

UNIX System Calls:


Files and I/O, Processes, Pipes, Signals, Time, Network I/O with sockets

UNIX Files and I/O:


Steps for accessing a file (or device): open a file/device (existing or new); open() returns a file descriptor (fd); use the fd to access the file/device (read/write/etc.); close(fd) when finished. It is possible to have more than one file open at a time. You should always close a file when you are finished with it.

In user space:


Code (C functions), Data (static and dynamic), Stack, Registers

In kernel space:


Process Control Block, Priority, File descriptors, Memory map, Others

UNIX Processes:


Processes create other processes with the fork() system call. fork() creates a nearly identical copy of the parent process. We say the parent has cloned itself to create a child. We can tell the two processes apart using the return value of fork(). In the parent, fork() returns the PID of the new child. In the child, fork() returns 0. If fork() fails, it returns -1 and no child is created.

Process Creation Details:


fork() creates a copy of the current process. In traditional UNIX it is the only way to create a new process. What is copied on fork(): memory of the process (code, data, stack), registers (stack pointer, program counter, etc.), file descriptors, command-line arguments, environment. Only the return value of fork() (and the process ID) is different.

File Descriptor Table (FD Table):


Each process has a file descriptor table in the kernel (in the Process Control Block). fork() duplicates the file descriptor table. Picture two file descriptor tables, one for the parent and one for the child, each with entries for stdin (0), stdout (1), and stderr (2).

Modifying the FD Table:


In order to redirect input/output, we need to modify the file descriptor table. New system call: int dup(int oldfd). It makes a copy of the given fd in the first available (lowest-numbered) fd table slot. We can use close() to open up slots.

Interprocess Communication (IPC):


There are two basic types of IPC:

Explicit mechanisms: a system call interface such as pipes, sockets, message passing, and remote procedure call (RPC).
Shared memory: using virtual memory, processes are allowed to share a portion of memory. Processes communicate by accessing shared variables.

The UNIX pipe is a form of local IPC. Using pipes, processes that are on the same machine can communicate with each other. UNIX Sockets allow for IPC among processes that do not necessarily reside on the same machine.

UNIX Pipes:


A UNIX pipe is an IPC mechanism. A pipe allows for the exchange of data between processes. A pipe carries a stream of bytes (character data). Create a pipe using the pipe() system call. At the user level, a pipe is just a pair of file descriptors: one for reading (array index 0) and one for writing (array index 1). Use the read() and write() system calls to receive and send data. Order: FIFO (first in, first out). Implementation: a circular buffer that resides in the kernel.

Uses for Pipes

Pipes are a general mechanism that allows related processes running on the same machine to communicate. Note: we need something else for inter-machine communication (i.e., sockets). Pipes are used by the shell to “connect” programs:
ls | wc gives the number of files in the current directory (the line count in the output of wc)
who | wc gives the number of users logged in
ls | sort -r | head gives the last ten files in a directory in alphabetical order (listed in reverse)
Note that shell pipes are unidirectional. System pipes can be bidirectional on some systems.

The Pipe System Call:


int pipe(int fildes[2])
Check the return value for possible errors (0 on success, -1 on failure). int fildes[2] is just an array of two ints: fildes[0] is for reading (the “read end”) and fildes[1] is for writing (the “write end”). Remember: the read end is 0 (like stdin) and the write end is 1 (like stdout). Usage: int fildes[2]; pipe(fildes);

Connecting Processes:


Consider a parent that wants to send data to a child.
In the parent: create a pipe with pipe(fildes); fork() a child; close(fildes[0]) (don’t leave open ends); write to the child with write(fildes[1], …); close(fildes[1]) on completion (important: this is how the child sees end-of-file).
In the child: close(fildes[1]) (don’t leave open ends); optionally redirect stdin and use execl(); read from the parent with read(fildes[0], …); close(fildes[0]) when done.

Notes on Pipes:


Many configurations can be set up: parent writes to a child (1 pipe); child writes to a parent (1 pipe); parent writes to a child and child writes to the parent (2 pipes); parent connects to child (1 pipe, like the shell); parent connects two children (1 pipe). Only processes with a common parent (or other common ancestor) can communicate with a standard pipe. There is also a “named” pipe (FIFO) that can exist in the filesystem. You need to close the unused ends of the pipe to ensure correct termination. Use dup() to redirect stdin or stdout to a pipe.