Linux System Fundamentals: Files, Processes, and Hierarchy
Linux File System Architecture
The Linux file system is a hierarchical structure that organizes data on storage devices. Unlike Windows, which uses drive letters (C:, D:), Linux uses a single, unified structure starting from the root directory (/).
1. Files, Inodes, and Structure
A. Linux Files: Everything is a File
In Linux, everything is treated as a file, which simplifies system interaction. This includes:
- Regular Files: Text files, executable programs, images, documents.
- Directories: Special files that contain lists of other files and directories.
- Device Files: Interfaces to hardware (e.g., /dev/sda for a hard drive, /dev/tty1 for a console).
- Links: Pointers to other files (Hard Links and Symbolic/Soft Links).
B. Inodes and File Structure
The core concept that defines a file in Linux is the inode (index node).
- Inode Definition: This is a data structure on the disk that stores all information about a file except its name and its actual data.
- Inode Contents (Metadata): The file’s type, size, permissions (read/write/execute), owner ID, group ID, hard-link count, timestamps (last access, last modification, and last inode change; some file systems such as ext4 also record a creation time), and, crucially, pointers (addresses) to the disk blocks where the file’s data is stored.
- File Name Mapping: The file name is stored in a directory entry, not in the inode. Each directory entry simply maps a human-readable name to its corresponding inode number, and a full path is resolved by following these entries one directory at a time, starting from the root.
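The name-to-inode mapping can be observed directly from the shell. The following is a minimal sketch, assuming GNU coreutils (for stat -c) and a scratch directory created with mktemp; the file names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Scratch directory and file names are illustrative only.
workdir=$(mktemp -d)
echo "hello" > "$workdir/original.txt"

ln "$workdir/original.txt" "$workdir/hardlink.txt"   # second directory entry, same inode
ln -s original.txt "$workdir/symlink.txt"            # new inode whose data is a path

# %i = inode number, %h = hard-link count.
# GNU stat does not follow symbolic links unless -L is given.
inode_original=$(stat -c %i "$workdir/original.txt")
inode_hardlink=$(stat -c %i "$workdir/hardlink.txt")
inode_symlink=$(stat -c %i "$workdir/symlink.txt")
link_count=$(stat -c %h "$workdir/original.txt")

echo "original=$inode_original hardlink=$inode_hardlink symlink=$inode_symlink links=$link_count"

# Removing one name only deletes a directory entry; the inode and its data
# survive as long as at least one hard link remains.
rm "$workdir/original.txt"
remaining_content=$(cat "$workdir/hardlink.txt")
echo "content via hardlink after rm: $remaining_content"

rm -rf "$workdir"
```

The two hard-linked names report the same inode number, while the symbolic link gets an inode of its own. After the original name is removed, the symbolic link is left dangling, but the hard link still reaches the data.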
2. File System Components
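A traditional Unix-style file system on a partition generally consists of four main areas (modern file systems such as ext4 spread these structures across block groups, but the roles are the same):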
- Boot Block: Contains the bootloader and instructions needed to start the operating system.
- Superblock: Contains vital information about the entire file system, such as total size, number of free inodes, number of free data blocks, and the location of the inode table. It is crucial for mounting the file system.
- Inode Table (or Inode List): The section where all the inodes reside. Each file or directory on the partition has a unique entry (inode number) here.
- Data Blocks: This is where the actual file content is stored. The addresses of these blocks are listed in the file’s corresponding inode.
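Some of the counters kept in the superblock (block size, free blocks, free inodes) can be inspected without special privileges. The sketch below assumes GNU coreutils, whose stat -f reports file system status for the file system holding a path; fs_summary is a hypothetical helper name. File systems that allocate inodes dynamically may report 0 for the inode counts.

```shell
#!/usr/bin/env bash
# Summarize the file system that holds a given path (default: /).
fs_summary() {
  # GNU stat -f format sequences:
  #   %T type, %S fundamental block size, %b total blocks, %f free blocks,
  #   %c total inodes, %d free inodes
  stat -f -c 'type=%T block_size=%S total_blocks=%b free_blocks=%f total_inodes=%c free_inodes=%d' "${1:-/}"
}

fs_summary /
```

The same helper accepts any path, for example fs_summary /home, and reports on whichever file system that path lives on.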
3. Standard Linux File System Hierarchy (FHS)
The Filesystem Hierarchy Standard (FHS) defines the standard directory structure in Linux. Everything branches out from the root directory (/).
| Directory | Purpose | Example Contents |
|---|---|---|
| / | Root Directory: The highest level of the file system hierarchy. | Contains all other directories. |
| /bin | Binaries: Essential user command binaries. | ls, cat, date |
| /etc | Etcetera: System-wide configuration files (e.g., network settings, user account definitions). | /etc/passwd, /etc/fstab |
| /home | Home Directories: Personal files and settings for regular users. | /home/alice, /home/bob |
| /usr | Second major hierarchy (often expanded as "Unix System Resources"): shareable, typically read-only data, utilities, and applications. | /usr/bin, /usr/lib |
| /var | Variable Data: Files that change frequently, such as logs, mail spools, caches, and print queues. | /var/log, /var/mail |
| /tmp | Temporary: Scratch files that are not guaranteed to persist; many systems clear this directory at boot. | Temporary application files. |
| /dev | Devices: Contains files that represent hardware devices. | /dev/sda, /dev/null |
| /proc | Processes: A virtual filesystem providing information about running processes and kernel status. | /proc/cpuinfo, /proc/meminfo |
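Because /proc is a virtual filesystem, its entries can be read like ordinary files. The following sketch assumes a Linux system with /proc mounted and reads a few well-known entries; the variable names are only for the example.

```shell
#!/usr/bin/env bash
# Every running process has a directory /proc/<PID>; $$ is this shell's PID.
shell_name=$(cat "/proc/$$/comm")
shell_pid=$(awk '/^Pid:/ {print $2}' "/proc/$$/status")
echo "PID $shell_pid is running: $shell_name"

# Kernel-wide information lives directly under /proc.
kernel_release=$(cat /proc/sys/kernel/osrelease)
echo "kernel release: $kernel_release"
```

No disk blocks back these files: the kernel generates their contents at the moment they are read.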
4. Common Linux File System Types
Linux supports various file system types, which define how data is stored and retrieved on a partition.
- ext2/ext3/ext4 (Extended File System):
  - ext2: The long-time standard, with no journaling.
  - ext3: Introduced journaling, which improves reliability and speeds up recovery after a system crash by logging changes before they are made.
  - ext4: The modern default, offering better performance, support for larger files and volumes, and extents (contiguous block ranges that replace long per-block pointer lists).
- XFS: A high-performance, highly scalable 64-bit journaling file system developed by SGI, popular for large file systems and high-throughput environments.
- Btrfs (B-tree file system): A modern, copy-on-write (CoW) file system focusing on fault tolerance, repair capabilities, and easy administration (supports snapshots, RAID, and volume management).
- FAT/NTFS: Used primarily for interoperability with Windows or older systems (e.g., USB drives).
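To see which of these types are relevant on a given machine, the kernel's own tables can be queried. This is a small sketch, assuming /proc is mounted; fs_type_of_mount is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# File system types the running kernel can currently handle
# (the last field of each line in /proc/filesystems).
supported_types=$(awk '{print $NF}' /proc/filesystems)
echo "kernel supports:" $supported_types

# Type of the file system mounted at a given mount point, per /proc/mounts
# (fields: device, mount point, type, options, ...). If several entries
# share the mount point, the last one listed wins.
fs_type_of_mount() {
  awk -v mp="$1" '$2 == mp {t = $3} END {print t}' /proc/mounts
}

echo "root file system type: $(fs_type_of_mount /)"
```

On a typical desktop installation the root type is likely to be ext4, xfs, or btrfs, while containers often report an overlay file system.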
Linux Process Management and Control
A process in Linux is an instance of a running program. Each external command you execute runs as a process (shell built-ins such as cd run inside the shell itself), and every process is identified by a unique Process ID (PID).
1. Initialization Processes (PID 1)
The system is bootstrapped by the first process, which has a PID of 1. This process is responsible for managing all other processes and services on the system.
- init (Historical): The traditional initialization system in older Linux distributions.
- systemd (Modern): The current standard in most major distributions (like Ubuntu, Fedora, Debian). systemd is a complex system and service manager that handles system startup, manages services, and maintains all other processes. It remains active until the system is shut down.
2. Mechanism of Process Creation: Fork and Exec
New processes are created in a two-step sequence using system calls:
- fork(): The running process (the parent) calls fork(). This creates a near-exact copy of itself (the child process). The child process receives a copy of the parent’s memory (shared copy-on-write in practice), file descriptors, and environment variables. The main differences are the child’s new PID and the return value of fork(): 0 in the child, the child’s PID in the parent.
- exec(): Typically right after forking, the child process calls one of the exec family of functions (e.g., execve). This function replaces the child process’s memory space and code with the code of the new program that is to be executed. The PID does not change; the child process then starts running the new program.
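The shell does not expose fork() directly, but the same two steps can be observed with a subshell and the exec built-in. The sketch below assumes bash (BASHPID is bash-specific); fork_exec_demo is a hypothetical function name.

```shell
#!/usr/bin/env bash
fork_exec_demo() {
  (
    # Inside ( ... ) we are in a forked child. BASHPID is the child's real
    # PID; $$ would still report the PID of the original shell.
    echo "child_before_exec=$BASHPID"
    # exec replaces the child's program image with sh; the PID is unchanged.
    exec sh -c 'echo "child_after_exec=$$"'
  )
  # The parent resumes here once the child has exited.
  echo "parent=$$ child_exit_status=$?"
}

fork_exec_demo
```

The two child lines print the same PID, showing that exec swapped the program without creating another process, while the parent line shows a different PID.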
3. Starting and Stopping Processes
A. Starting Processes (Foreground vs. Background)
- Foreground: By default, a command runs in the foreground. The terminal is locked until the process finishes.
  - Example: vi my_file.txt
- Background: To run a process in the background and immediately regain control of the terminal, append an ampersand (&) to the command.
  - Example: long_running_script.sh &
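In a script, a background process is usually paired with $! (the PID of the most recent background command) and wait. A minimal sketch, using sleep as a stand-in for a long-running task:

```shell
#!/usr/bin/env bash
sleep 1 &                  # runs in the background; the shell continues immediately
bg_pid=$!                  # $! holds the PID of the most recent background command
echo "started background process with PID $bg_pid"

# Signal 0 sends nothing; it only checks that the process exists.
if kill -0 "$bg_pid" 2>/dev/null; then
  echo "still running while the shell does other work"
fi

wait "$bg_pid"             # block until the background process finishes
bg_status=$?               # wait returns the exit status of that process
echo "background process exited with status $bg_status"
```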
B. Stopping and Managing Processes (Signals)
Processes are managed using signals, which are software interrupts sent to a process. The kill command is the standard way to send signals.
| Command | Signal Name | Signal Number | Action | Description |
|---|---|---|---|---|
| kill <PID> | SIGTERM | 15 | Graceful Termination | Requests the process to shut down cleanly (allows cleanup). This is the default signal. |
| kill -9 <PID> | SIGKILL | 9 | Forceful Termination | Terminates the process immediately. The signal cannot be caught or ignored, so no cleanup runs. Use as a last resort. |
| kill -15 <PID> | SIGTERM | 15 | Graceful Termination | Explicitly sends the graceful termination signal. |
| killall <name> | SIGTERM (or other) | 15 (or other) | Termination by Name | Signals all processes with the specified name. Example: killall firefox |
| pkill <name> | SIGTERM (or other) | 15 (or other) | Termination by Pattern | Signals processes selected by name pattern and other criteria (more flexible than killall). Example: pkill -u user_name |
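The difference between SIGTERM and SIGKILL shows up in the exit status that the parent observes. The sketch below assumes bash; start_worker and the worker loop are hypothetical stand-ins for a real long-running service that installs a SIGTERM handler.

```shell
#!/usr/bin/env bash
# Start a worker that installs a SIGTERM handler, then wait until it reports
# that the handler is in place. Sets worker_pid.
start_worker() {
  ready_file=$(mktemp)
  bash -c 'trap "echo worker: cleaning up; exit 0" TERM
           echo ready > "$1"
           while :; do sleep 0.1; done' _ "$ready_file" &
  worker_pid=$!
  while [ ! -s "$ready_file" ]; do sleep 0.05; done
  rm -f "$ready_file"
}

start_worker
kill "$worker_pid"         # default signal: SIGTERM (15); the handler runs
wait "$worker_pid"
graceful_status=$?
echo "after SIGTERM: exit status $graceful_status"

start_worker
kill -9 "$worker_pid"      # SIGKILL cannot be caught; the handler never runs
wait "$worker_pid"
forced_status=$?           # 128 + signal number = 137
echo "after SIGKILL: exit status $forced_status"
```

The first worker prints its cleanup message and exits with status 0. The second is removed by the kernel without running any handler, which the shell reports as status 137.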
4. Job Control in Linux Shells
Job control refers to managing multiple processes (jobs) within a single shell session, often utilizing background and foreground switching.
| Command / Key | Function | Description |
|---|---|---|
| jobs | Lists all running or stopped jobs in the current shell. | Shows the job number (e.g., [1], [2]). |
| fg %<job_id> | Brings a background or stopped job to the foreground. | Example: fg %1 |
| bg %<job_id> | Resumes a stopped job in the background. | Example: bg %2 |
| Ctrl + Z | Sends SIGTSTP, which suspends (stops) the current foreground job and returns control to the shell. | You can then use bg or fg to resume it. |
| Ctrl + C | Sends the SIGINT signal (Interrupt), which usually terminates the foreground process. | A program can catch SIGINT to clean up before exiting, or ignore it. |
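The fg, bg, and Ctrl + Z controls need an interactive terminal, but the stop and continue signals underneath them can be tried from a script. This sketch assumes Linux with /proc mounted and uses SIGSTOP (the uncatchable counterpart of the SIGTSTP that Ctrl + Z sends) together with SIGCONT, which bg and fg use to resume a job; proc_state is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Read the one-letter process state from /proc
# (T = stopped, S = sleeping, R = running).
proc_state() { awk '/^State:/ {print $2}' "/proc/$1/status"; }

sleep 30 &                 # a background process to experiment on
job_pid=$!
sleep 0.2                  # let it start

kill -STOP "$job_pid"      # suspend it, much as Ctrl + Z suspends a foreground job
sleep 0.2
stopped_state=$(proc_state "$job_pid")

kill -CONT "$job_pid"      # resume it, as bg or fg would
sleep 0.2
resumed_state=$(proc_state "$job_pid")

kill "$job_pid"            # clean up with the default SIGTERM
wait "$job_pid" 2>/dev/null

echo "after SIGSTOP: $stopped_state"
echo "after SIGCONT: $resumed_state"
```

A stopped job keeps its memory and open files; it simply receives no CPU time until it is continued.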
5. Scheduling Commands in Linux
Linux offers several tools to execute commands at a specific time or on a recurring schedule.
A. Single Execution Scheduling
| Tool | Function | Description | Example |
|---|---|---|---|
| at | Executes a command once at a specified future time. | Commands are typed directly or piped to at. | echo "backup.sh" \| at 23:00 |
| batch | Executes commands once when the system load average drops below a threshold (1.5 by default). | Commands are typed directly or piped to batch. | batch (then type commands, ending with Ctrl + D) |
B. Recurring Scheduling
| Tool | Function | Description | Key Command |
|---|---|---|---|
| cron | Executes commands periodically (hourly, daily, monthly, etc.). | Jobs are defined in a file called a crontab, using a five-field time/date syntax (minute, hour, day of month, month, day of week) followed by the command. | crontab -e (to edit the user’s schedule) |
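The five time fields of a crontab entry are easier to read once split out. The sketch below writes a few hypothetical entries to a temporary file and labels each field with awk; the script paths are placeholders, and nothing is installed into the real crontab.

```shell
#!/usr/bin/env bash
# Hypothetical crontab entries; the script paths are placeholders.
cron_file=$(mktemp)
trap 'rm -f "$cron_file"' EXIT
cat > "$cron_file" <<'EOF'
# minute hour day-of-month month day-of-week command
30 2 * * * /usr/local/bin/backup.sh
*/15 * * * * /usr/local/bin/check_disk.sh
0 9 * * 1 /usr/local/bin/weekly_report.sh
EOF

# Label the five schedule fields and the command of each non-comment entry.
describe_entries() {
  awk '!/^#/ && NF >= 6 {
    cmd = $6
    for (i = 7; i <= NF; i++) cmd = cmd " " $i
    printf "min=%s hour=%s dom=%s mon=%s dow=%s cmd=%s\n", $1, $2, $3, $4, $5, cmd
  }' "$1"
}

describe_entries "$cron_file"
# To install such a file for real: crontab "$cron_file"
# (this replaces the current user's existing crontab).
```

Read left to right, the three entries run daily at 02:30, every 15 minutes, and at 09:00 every Monday.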
C. Time Measurement
| Tool | Function | Description | Example |
|---|---|---|---|
| time | Measures the execution time of a command. | Reports real (wall clock), user (CPU time spent in user mode), and sys (CPU time spent in kernel mode) time. | time find / -name "*.conf" |
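The gap between real time and CPU time becomes clear when a waiting command is compared with a busy one. This sketch assumes bash, whose built-in time keyword honours the TIMEFORMAT variable; the loop count is arbitrary.

```shell
#!/usr/bin/env bash
# TIMEFORMAT controls the output of bash's time keyword:
# %R = real (wall clock), %U = user CPU, %S = system CPU, all in seconds.
TIMEFORMAT='real=%R user=%U sys=%S'

# sleep waits without using the CPU, so real is far larger than user + sys.
# time writes its report to stderr, hence the 2>&1 around the group.
sleep_report=$( { time sleep 0.5; } 2>&1 )
echo "sleep: $sleep_report"

# A busy loop keeps the CPU occupied, so user time accounts for most of real.
loop_report=$( { time (i=0; while [ "$i" -lt 200000 ]; do i=$((i + 1)); done); } 2>&1 )
echo "loop:  $loop_report"
```

A large gap between real and user + sys usually means the command spent its time waiting (on I/O, the network, or a timer) rather than computing.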
