Operating System Fundamentals: Core Concepts & Components

Understanding Operating Systems

An operating system acts as an intermediary between a user of a computer and the computer hardware.

Scheduling Challenges

Scheduling involves managing machine time. For example, a user might sign up for an hour but finish the job in 45 minutes, leaving the computer idle and wasting the remaining machine time. Conversely, a user might run into the opposite problem of not finishing the job in the allotted time.

Operating System Types

  1. Serial Processing
  2. Batch Processing
  3. Multitasking or Time-Sharing System
  4. Multiprocessing
  5. Multiprogramming
  6. Distributed System
  7. Real-Time Operating System

Core Functions of an OS

  1. Memory Management
  2. Process Management
  3. Device Management
  4. File Management
  5. Other Important Activities

The Operating System Kernel

A kernel is a central component of an operating system. It acts as an interface between user applications and the hardware. The sole aim of the kernel is to manage the communication between the software and the hardware.

Kernel Architectures

  1. Monolithic Kernels: Monolithic Kernels are those in which user services and kernel services are implemented in the same memory space; that is, separate memory spaces are not used for user services and kernel services.
  2. Microkernels: This architecture primarily addresses the problem of the ever-growing size of kernel code, which is difficult to control in the monolithic approach. This architecture allows some basic services like device driver management, protocol stack, and file system to run in user space.
  3. Hybrid Kernel: A Hybrid Kernel is a combination of both Monolithic Kernel and Microkernel. It leverages the speed of Monolithic Kernels and the modularity of Microkernels.
  4. Nanokernel: In a Nanokernel, as the name suggests, the entire code of the kernel is very small; i.e., the code executing in the privileged mode of the hardware is very small.
  5. Exokernel: Exokernel is an Operating System kernel developed by the MIT Parallel and Distributed Operating Systems group.

Kernel Functions

  1. CPU Management
  2. Memory (RAM) Management
  3. Device Management

Operating System Architectures

  1. Monolithic Architecture
  2. Layered Architecture
  3. Virtual Machine Architecture
  4. Client/Server Architecture

Understanding the Shell

A shell is a program that provides the traditional text-only user interface for Linux and other Unix-like operating systems. Its primary function is to read commands typed into a console or terminal window and then execute them.

Types of Shells

  1. The C Shell: It incorporated features such as aliases and command history. It includes helpful programming features like built-in arithmetic and C-like expression syntax.
  2. The Bourne Shell: It is the original UNIX shell. It is fast and widely available, but it lacks features for interactive use, such as the ability to recall previous commands.
  3. The Korn Shell: It includes features like built-in arithmetic and C-like arrays, functions, and string-manipulation facilities. It is faster than the C shell and is backward compatible with scripts written for the Bourne shell.
  4. GNU Bourne-Again Shell (Bash): It is compatible with the Bourne shell and includes features from both Korn and Bourne shells.

Process Concepts

A process is a program in execution. A process is not the same as the program code; it is much more than that. A process is an ‘active’ entity, as opposed to a program, which is a ‘passive’ entity. Attributes held by a process include its hardware state (CPU registers), memory, CPU time, and so on.

Process Control Block (PCB)

A Process Control Block (PCB) is a data structure used for storing information about processes. This information is used by the CPU at runtime. The Process Control Block is maintained by the Operating System for every process. The PCB is identified by an integer Process ID (PID).

Process Operations

  1. Process Creation
  2. Process Termination
  3. Process Hierarchies
  4. Process Implementation
  5. Cooperating Processes

Scheduling Algorithms

Round Robin (RR) Scheduling

One of the oldest, simplest, fairest, and most widely used algorithms is Round Robin (RR). In Round Robin scheduling, processes are dispatched in a FIFO (First-In, First-Out) manner but are given a limited amount of CPU time called a time-slice or a quantum. If a process does not complete before its CPU time expires, the CPU is preempted and given to the next process waiting in a queue. The preempted process is then placed at the back of the ready list.
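As a sketch, the preempt-and-requeue behaviour described above can be simulated in a few lines of Python (the function name and the `(name, burst)` representation are this example's own, not part of any standard API):

```python
from collections import deque

def round_robin(processes, quantum):
    """Simulate Round Robin scheduling.

    processes: list of (name, burst_time) pairs, already in FIFO order.
    Returns the order in which the processes finish.
    """
    ready = deque(processes)              # the FIFO ready queue
    finished = []
    while ready:
        name, remaining = ready.popleft()
        if remaining > quantum:
            # Quantum expires: preempt and move to the back of the queue.
            ready.append((name, remaining - quantum))
        else:
            # Process completes within its time-slice.
            finished.append(name)
    return finished

# Three jobs with bursts 5, 2 and 8 and a quantum of 3:
print(round_robin([("P1", 5), ("P2", 2), ("P3", 8)], quantum=3))
# → ['P2', 'P1', 'P3']
```

Note how P1 and P3 are each preempted and placed at the back of the ready list, so the short job P2 finishes first.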

Understanding Threads

A thread is a flow of execution through the process code, with its own program counter that keeps track of which instruction to execute next, system registers which hold its current working variables, and a stack which contains the execution history.
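A minimal Python sketch of a separate flow of execution through the process code (the `worker` function and `results` dictionary are illustrative names):

```python
import threading

results = {}

def worker(n):
    # Each thread has its own flow of control and its own stack; this
    # function's local variables live on the worker thread's stack.
    total = sum(range(n))
    results[n] = total

t = threading.Thread(target=worker, args=(10,))
t.start()            # begin a separate flow of execution
t.join()             # wait for the thread to finish
print(results[10])   # → 45
```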

Types of Threads

  1. User-Level Threads: The thread management kernel is not aware of the existence of threads. The thread library contains code for creating and destroying threads, for passing messages and data between threads, for scheduling thread execution, and for saving and restoring thread contexts.
  2. Kernel-Level Threads: Thread management is done by the kernel. There is no thread management code in the application area. Kernel threads are supported directly by the operating system.

Multithreading Models

Some operating systems provide a combined user-level thread and kernel-level thread facility. Solaris is a good example of this combined approach. In a combined system, multiple threads within the same application can run in parallel on multiple processors, and a blocking system call need not block the entire process.

Common Multithreading Models

  1. Many-to-Many Model: The many-to-many model multiplexes any number of user threads onto an equal or smaller number of kernel threads.
  2. Many-to-One Model: The many-to-one model maps many user-level threads to one kernel-level thread. Thread management is done in user space by the thread library.
  3. One-to-One Model: There is a one-to-one relationship of user-level threads to kernel-level threads. This model provides more concurrency than the many-to-one model. It also allows another thread to run when a thread makes a blocking system call. It supports multiple threads to execute in parallel on multiple processors.

Synchronization & IPC

Inter-Process Communication (IPC) is used for exchanging data between multiple threads in one or more processes or programs. The processes may be running on a single computer or on multiple computers connected by a network.

Race Conditions

A race condition occurs when two threads access a shared variable at the same time. The first thread reads the variable, and the second thread reads the same value from the variable. Each thread then performs its operation on that value, and the two threads ‘race’ to write the result back; whichever thread writes last overwrites the other’s update, so one update is silently lost.
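The lost-update interleaving described above can be acted out step by step; this sketch interleaves the reads and writes by hand rather than relying on real thread timing, so the outcome is deterministic:

```python
# Simulate the lost-update race: each "thread" performs its
# read -> compute -> write as separate steps, interleaved by hand.
shared = 0

a = shared      # thread 1 reads 0
b = shared      # thread 2 reads the same 0
a += 1          # thread 1 computes 1
b += 1          # thread 2 computes 1
shared = a      # thread 1 writes 1
shared = b      # thread 2 writes 1 -- thread 1's update is lost

print(shared)   # → 1, not the expected 2
```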

Critical Sections

A critical section or region is a code segment that accesses shared variables and has to be executed as an atomic action. It means that in a group of cooperating processes, at a given point in time, only one process must be executing its critical section.
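A minimal Python sketch of making a critical section atomic with a mutual-exclusion lock (the thread count and iteration count are arbitrary choices for the example):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # The read-modify-write below is the critical section; the lock
        # ensures only one thread executes it at any point in time.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # → 400000
```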

Serializability

Serializability identifies data transactions as occurring serially, independent of one another, even though they may have occurred concurrently. A schedule or list of transactions is deemed to be correct if they are serialized; otherwise, they may contain errors that can lead to duplication or overlap.

Process Scheduling

When a computer is multi-programmed, it frequently has multiple processes competing for the CPU at the same time. This situation occurs whenever two or more processes are simultaneously in the ready state.

Types of Scheduling

  1. Preemptive Scheduling: A preemptive scheduling algorithm picks a job/process and lets it run for a maximum of some fixed time. If it is still running at the end of the time interval, it is suspended, and the scheduler picks another process to run (if one is available).
  2. Non-Preemptive Scheduling: A non-preemptive scheduling algorithm picks a process/job to run and then lets it run until it blocks (either for input/output or waiting for another process) or until it voluntarily releases the CPU.
  3. Batch Scheduling: In a batch processing system, there are no users waiting at their terminals for a quick response. So, non-preemptive algorithms, or preemptive algorithms with long time periods, are acceptable. This reduces process switches and thus improves performance.
  4. Interactive Scheduling: Interactive scheduling policies assign a time-slice to each process. Once the time-slice is over, the process is swapped even if not yet terminated. It can also be said that scheduling of this kind is preemptive.
  5. Real-Time Scheduling: A real-time scheduling system is composed of the scheduler, clock, and processing hardware elements. In a real-time system, a process or task has schedulability; tasks are accepted by a real-time system and completed as specified by the task deadline, depending on the characteristic of the scheduling algorithm.

Deadlock Conditions

Deadlock is a situation which involves the interaction of more than one resource and processes with each other. We can visualize the occurrence of deadlock as a situation where there are two people on a staircase. One is ascending the staircase while the other is descending. The staircase is so narrow that it can only fit one person at a time.

Ostrich Algorithm

In computer science, the Ostrich Algorithm is a strategy of ignoring potential problems on the basis that they may be exceedingly rare. It is named for the Ostrich Effect, which is defined as “to stick one’s head in the sand and pretend there is no problem.” It is used when it is more cost-effective to allow the problem to occur than to attempt its prevention.

Deadlock Handling Methods

  1. Deadlock Prevention or Avoidance: Do not allow the system to get into a deadlocked state.
  2. Deadlock Detection and Recovery: Abort a process or preempt some resources when deadlocks are detected.
  3. Ignore The Problem Altogether: If deadlocks only occur once a year or so, it may be better to simply let them happen and reboot as necessary than to incur the constant overhead and system performance penalties associated with deadlock prevention or detection.
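One common way to perform deadlock detection, sketched below under the assumption that the system can build a wait-for graph (an edge from each process to the process it is waiting on), is to search that graph for a cycle; the process names here are hypothetical:

```python
def has_cycle(graph):
    """Detect a cycle in a wait-for graph with depth-first search.
    graph: dict mapping a process to the processes it waits on."""
    visited, in_stack = set(), set()

    def dfs(node):
        visited.add(node)
        in_stack.add(node)
        for nxt in graph.get(node, []):
            if nxt in in_stack:          # back edge: a cycle, i.e. deadlock
                return True
            if nxt not in visited and dfs(nxt):
                return True
        in_stack.discard(node)
        return False

    return any(dfs(n) for n in graph if n not in visited)

# P1 waits on P2, P2 waits on P3, P3 waits on P1: a deadlock.
print(has_cycle({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))   # → True
print(has_cycle({"P1": ["P2"], "P2": ["P3"]}))                 # → False
```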

Deadlock Prevention Strategies

Havender, in his pioneering work, showed that since all four of the necessary conditions for deadlock (mutual exclusion, hold and wait, no preemption, and circular wait) must hold simultaneously, deadlock can be prevented by denying any one of them.

Starvation in OS

Starvation is closely related to deadlock. Under certain conditions, some processes never get the resources they need even though they are not deadlocked; such processes experience starvation, remaining blocked indefinitely or eventually being terminated.

Memory Hierarchy

Every programmer desires private, infinitely large, and fast memory that is also non-volatile and inexpensive. Unfortunately, technology does not provide such memories at present. Over the years, the concept of memory hierarchy was developed to address this.

Memory Fragmentation

Fragmentation occurs in a dynamic memory allocation system when most of the free blocks are too small to satisfy any request. It is generally referred to as the inability to use the available memory.

Contiguous Memory Allocation

In contiguous memory allocation, each process is contained in a single contiguous block of memory. Memory is divided into several fixed-size partitions. Each partition contains exactly one process. When a partition is free, a process is selected from the input queue and loaded into it. The free blocks of memory are known as holes.
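A sketch of selecting a hole under a simple first-fit policy (the `(start, size)` representation of holes is this example's own):

```python
def first_fit(holes, request):
    """Pick the first free block (hole) large enough for the request.
    holes: list of (start, size) pairs; returns the chosen start or None.
    """
    for start, size in holes:
        if size >= request:
            return start
    return None     # no hole is large enough

holes = [(0, 100), (300, 50), (600, 200)]
print(first_fit(holes, 150))   # → 600, the first hole of at least 150 units
print(first_fit(holes, 40))    # → 0
```

Best-fit and worst-fit differ only in which qualifying hole they pick (the smallest or the largest, respectively).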

Non-Contiguous Memory Allocation

Fragmentation is a major problem in contiguous memory allocation. One method to resolve it is compaction, but because compaction involves moving large amounts of data in memory, it reduces system efficiency. A better way to overcome the fragmentation problem is to make the logical address space non-contiguous.

Virtual Memory Concepts

Virtual Memory is a space where large programs can store themselves in the form of pages during their execution, and only the required pages or portions of processes are loaded into the main memory.

Paging in OS

A solution to the fragmentation problem is paging. Paging is a memory management mechanism that allows the physical address space of a process to be non-contiguous. Here, physical memory is divided into fixed-size blocks called frames, and logical memory is divided into blocks of the same size called pages (the size is a power of 2, typically between 512 bytes and 8,192 bytes). The size of a process is measured in the number of pages it occupies.
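The address translation this implies can be sketched as follows, assuming a hypothetical page table that maps page numbers to frame numbers:

```python
PAGE_SIZE = 4096   # a common page size; always a power of two

def translate(logical_addr, page_table):
    """Split a logical address into (page, offset) and map the page to a
    frame via the page table; the frame numbers below are made up."""
    page = logical_addr // PAGE_SIZE
    offset = logical_addr % PAGE_SIZE
    frame = page_table[page]
    return frame * PAGE_SIZE + offset

page_table = {0: 5, 1: 2, 2: 7}       # page -> frame (hypothetical)
print(translate(4100, page_table))    # page 1, offset 4 -> 2*4096 + 4 = 8196
```

Because the page size is a power of two, real hardware extracts the page number and offset with a shift and a mask rather than division.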

Understanding Thrashing

The set of pages that a process is currently using is called its working set. If the entire working set is in memory, the process will run without causing many page faults until it moves into another execution phase. If, however, the available memory is too small to hold the entire working set, the process will cause page faults continuously and spend more time paging than executing; this condition is called thrashing.

Memory Segmentation

Segmentation is a memory management technique in which each job is divided into several segments of different sizes, one for each module that contains pieces that perform related functions.
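The segment-table lookup this implies can be sketched as follows (the base and limit values, and the segment roles, are hypothetical):

```python
# Each segment has a base (where it starts in physical memory) and a
# limit (its length); an offset beyond the limit is an addressing error.
segment_table = {
    0: {"base": 1000, "limit": 400},   # e.g. a code segment
    1: {"base": 5000, "limit": 100},   # e.g. a stack segment
}

def translate_segment(segment, offset):
    entry = segment_table[segment]
    if offset >= entry["limit"]:
        raise ValueError("offset beyond segment limit")
    return entry["base"] + offset

print(translate_segment(0, 42))   # → 1042
```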

Segmentation with Paging

Both paging and segmentation have their advantages and disadvantages, so it is better to combine the two schemes and take the best of each. The combined scheme is known as segmentation with paging (paged segmentation), in which each segment is itself divided into pages.

Device Controllers & Drivers

Device drivers are software modules that can be plugged into an Operating System to handle a particular device. The Operating System uses device drivers to handle all input/output devices.

Direct Memory Access (DMA)

Slow devices like keyboards will generate an interrupt to the main CPU after each byte is transferred. If a fast device such as a disk generated an interrupt for each byte, the operating system would spend most of its time handling these interrupts. So, a typical computer uses Direct Memory Access (DMA) hardware to reduce this overhead.

Polling Input/Output

Polling is the simplest way for an input/output device to communicate with the processor. The process of periodically checking the status of the device to see if it is time for the next input/output operation is called polling.

Interrupt-Driven I/O

An alternative scheme for dealing with input/output is the interrupt-driven method. An interrupt is a signal to the microprocessor from a device that requires attention.

Operating System Security

Security refers to providing a protection system to computer system resources such as CPU, memory, disk, software programs, and most importantly, data/information stored in the computer system.

Understanding Intruders

An intruder is a person who attempts to gain unauthorized access to a system, to damage that system, or to disturb data on that system. In summary, this person attempts to violate security by interfering with system availability, data integrity, or data confidentiality.

User Authentication

The primary goal of authentication is to allow access to legitimate system users and deny access to unauthorized users. It is the responsibility of the Operating System to create a protection system which ensures that a user who is running a particular program is authentic. Operating Systems generally identify/authenticate users in three ways: by something the user knows (a username and password), something the user has (a card or key), or something the user is (a biometric attribute such as a fingerprint or retina pattern).

Password Security

The most widely used method to prevent unauthorized access is to use passwords. A password is a string of characters used to authenticate a user to access a system. The password needs to be kept secret and is only intended for the specific user.
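A sketch of how a system might store and check passwords without keeping them in the clear, using a salted key-derivation hash; the PBKDF2 parameters shown are one reasonable choice, not a prescription:

```python
import hashlib
import os

def store(password):
    """Derive a salted hash; the system stores only (salt, digest),
    never the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password, salt, digest):
    """Re-derive the hash from the candidate password and compare."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000) == digest

salt, digest = store("s3cret")
print(verify("s3cret", salt, digest))   # → True
print(verify("guess", salt, digest))    # → False
```

The salt ensures that two users with the same password produce different stored digests, defeating precomputed lookup tables.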

Phishing & Password Vulnerability

This type of attack leads victims to believe they are accessing legitimate content, usually email or websites, when in fact they are accessing fake content produced by the attackers. The fake content typically prompts victims to enter the login and password they use on legitimate sites or services, such as Google or Facebook; once entered, the attacker stores the credentials before redirecting the victims to the legitimate site.

Social Engineering Threats

Social engineering is somewhat similar to phishing and is a widespread spying method aimed at gaining access to confidential data. To extract confidential information, scammers frequently exploit people's good faith and helpfulness, as well as their insecurity. Whether over the phone, while pretending to be someone else, or over the Internet, they will do almost anything to gain access to personal data.

User Authorization

Authorization is the process of giving someone permission to do or have something. In multi-user computer systems, a system administrator defines for the system which users are allowed access to the system and what privileges they have.
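A minimal sketch of authorization as an access-control-list check (the resources, users, and privileges shown are hypothetical):

```python
# Access-control list: for each resource, which users hold which privileges.
acl = {
    "report.txt": {"alice": {"read", "write"}, "bob": {"read"}},
}

def is_authorized(user, resource, action):
    """Grant the action only if the ACL lists it for this user and resource."""
    return action in acl.get(resource, {}).get(user, set())

print(is_authorized("alice", "report.txt", "write"))   # → True
print(is_authorized("bob", "report.txt", "write"))     # → False
```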

Cryptography Basics

Cryptography is a method of transforming (encrypting) information so that it cannot be understood by anyone except the intended recipient, who possesses the means to reverse the transformation (decrypt it). Encryption modifies data so that it becomes useless, unidentifiable, or unreadable to anyone else, including intruders, while only the intended receiver can recover the actual data.
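As a toy illustration only (XOR with a repeating key is not a secure cipher; it merely demonstrates the encrypt/decrypt idea), the round trip with a shared key can be sketched as:

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the repeating key; applying the same key twice
    restores the original data, so the one function both encrypts and
    decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = b"attack at dawn"
key = b"k3y"
ciphertext = xor_cipher(message, key)      # unreadable without the key
recovered = xor_cipher(ciphertext, key)    # same key reverses the transform
print(ciphertext != message, recovered == message)   # → True True
```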

System Resource Model

We know that computer systems are full of resources that can be used by only one process at a time, such as printers, tape drives, and slots in the system’s internal tables. Having two processes simultaneously reading from and writing to such a resource creates a problem. For many applications, a process needs exclusive access not just to one resource, but to several.

Types of System Resources

  1. Preemptable: A preemptable resource is one which can be allocated to a given process for a period of time, then be allocated to another process, and then be reallocated to the first process without any ill effects.
  2. Non-Preemptable: A non-preemptable resource cannot be taken from one process and given to another without side effects. One obvious example is a printer: certainly, we would not want to take the printer away from one process and give it to another in the middle of a print job.