Compiler Design: Analysis and Synthesis Phases
Compiler Phases: Analysis, Optimization, and Synthesis
A compiler translates a program written in a high-level language (such as C or Java) into machine code that the hardware can execute.
The compilation process is divided into multiple phases, each with a specific role. These phases work together to convert source code into an efficient executable program.
The phases are generally grouped into:
- Front End – Analysis Phases
- Middle End – Optimization Phase
- Back End – Synthesis Phases
Below is the complete flow:
Source Program → Lexical Analysis → Syntax Analysis → Semantic Analysis → Intermediate Code Generation → Code Optimization → Code Generation → Target Machine Code
1. Lexical Analysis (The Scanner)
Purpose
Breaks the source program into tokens.
What is a Token?
A token is a meaningful unit:
Identifiers (x, sum), Keywords (if, while), Operators (+, =), Literals (10, 3.14), Punctuation (;, ,)
Tasks Performed
- Removes whitespace and comments
- Groups characters into valid tokens
- Reports lexical errors (invalid characters)
- Maintains symbol table entries for identifiers
Example
sum = a + 20;
Tokens → sum, =, a, +, 20, ;
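The scanning step above can be sketched with regular expressions. This is a minimal illustration, not a full lexer: the token patterns, the keyword list, and the name `tokenize` are assumptions chosen for a small C-like language.

```python
import re

# Token categories from the list above. The patterns are illustrative
# assumptions for a small C-like language, not a complete specification.
TOKEN_SPEC = [
    ("COMMENT",     r"//[^\n]*|/\*.*?\*/"),
    ("WHITESPACE",  r"\s+"),
    ("LITERAL",     r"\d+\.\d+|\d+"),
    ("KEYWORD",     r"\b(?:if|else|while|int|float|return)\b"),
    ("IDENTIFIER",  r"[A-Za-z_]\w*"),
    ("OPERATOR",    r"==|[+\-*/=<>]"),
    ("PUNCTUATION", r"[;,(){}]"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Return (category, lexeme) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # No pattern applies here: report a lexical error.
            raise SyntaxError(
                f"lexical error: invalid character {source[pos]!r} at position {pos}"
            )
        if match.lastgroup not in ("COMMENT", "WHITESPACE"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("sum = a + 20;"))
```

Keywords are listed before identifiers so that `if` is classified as a keyword rather than as a name; a production scanner would usually be generated from a specification or written as an explicit state machine.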
2. Syntax Analysis (The Parser)
Purpose
Checks whether tokens follow the grammar of the language.
Builds a parse tree / syntax tree.
Tasks Performed
- Verifies structure using Context-Free Grammar (CFG)
- Detects syntax errors
- Constructs parse tree
Example
Expression a + b * c
The parser ensures that * has higher precedence than +, so the syntax tree groups the expression as a + (b * c).
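One way to see how precedence ends up in the tree is a small recursive-descent parser. The sketch below assumes a hypothetical grammar with three levels (`expr` for + and -, `term` for * and /, `factor` for names, numbers, and parentheses) and represents tree nodes as nested tuples; both choices are assumptions made for illustration.

```python
import re

def parse_expression(text):
    """Parse an arithmetic expression into a nested-tuple syntax tree."""
    # Simplified tokenization; a real parser would receive tokens from the scanner.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in ("+", "-", "*", "/", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok  # identifier or number

    def term():
        # term handles * and /, so these bind tighter than + and -
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_expression("a + b * c"))  # ('+', 'a', ('*', 'b', 'c'))
```

Because `expr` calls `term` for each operand, the multiplication is reduced first and becomes a subtree of the addition. The loops build left-associative trees, so `a - b - c` is grouped as `(a - b) - c`.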
3. Semantic Analysis
Purpose
Ensures that the parse tree follows semantic rules of the language.
Tasks Performed
- Type checking
  Example: int a; a = "hello"; → type mismatch error
- Function argument checks
- Variable declaration checks
- Scope resolution
- Inserts/updates information in the symbol table
Example
int x;
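x = 3.5;
The semantic analyzer detects that a floating-point value is being assigned to an int. A strictly typed language such as Java rejects this assignment as an error, whereas C applies an implicit conversion that truncates the value to 3 (a compiler may issue a warning).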
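A declaration and type check of this kind can be sketched as follows. The statement encoding (tuples such as `("declare", "int", "x")`), the `TYPE_OF` mapping, and the strict no-conversion rule are all assumptions for illustration; real languages define their own conversion rules.

```python
# Maps the Python type of a literal to a source-language type name (assumed).
TYPE_OF = {int: "int", float: "float", str: "string"}

def check(statements):
    """Return a list of semantic error messages for a flat statement list."""
    symbols = {}   # name -> declared type
    errors = []
    for stmt in statements:
        if stmt[0] == "declare":
            _, type_name, name = stmt
            if name in symbols:
                errors.append(f"redeclaration of '{name}'")
            else:
                symbols[name] = type_name
        elif stmt[0] == "assign":
            _, name, value = stmt
            if name not in symbols:
                errors.append(f"'{name}' used before declaration")
                continue
            value_type = TYPE_OF[type(value)]
            # Strict rule assumed here: no implicit conversions are allowed.
            if value_type != symbols[name]:
                errors.append(
                    f"type mismatch: cannot assign {value_type} to {symbols[name]} '{name}'"
                )
    return errors

program = [("declare", "int", "x"), ("assign", "x", 3.5)]
print(check(program))
```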
4. Intermediate Code Generation (ICG)
Purpose
Generates machine-independent intermediate code.
A common form is Three-Address Code (TAC).
TAC Example
For expression: a + b * c
Intermediate code:
t1 = b * c
t2 = a + t1
Benefits
- Easy to optimize
- Independent of machine architecture
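The TAC above can be produced by a post-order walk over the syntax tree. The sketch below assumes the tree is given as nested tuples of the form `(operator, left, right)` with strings as leaves; that representation and the name `generate_tac` are assumptions, not a fixed convention.

```python
def generate_tac(tree):
    """Emit three-address code for a nested-tuple expression tree."""
    code = []
    counter = 0

    def visit(node):
        nonlocal counter
        if isinstance(node, str):
            return node                 # a leaf is its own address
        op, left, right = node
        left_addr = visit(left)         # operands first (post-order)
        right_addr = visit(right)
        counter += 1
        temp = f"t{counter}"            # fresh temporary for this result
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    visit(tree)
    return code

# a + b * c, with the multiplication nested below the addition
for line in generate_tac(("+", "a", ("*", "b", "c"))):
    print(line)
```

Each interior node yields exactly one instruction with at most three addresses, which is what keeps the form simple to analyze in the later phases.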
5. Code Optimization
Purpose
Improves the intermediate code to make the final program faster and more efficient without changing meaning.
Types of Optimizations
Local Optimization
Within a basic block
Example:
x = y * 2
z = y * 2 → eliminate this, reuse previous result
Global Optimization
Across basic blocks
Example: moving invariant computations out of loops.
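Two of the simpler rewrites in this section, reusing a repeated computation inside a basic block and folding constant expressions, can be sketched on a list of TAC instructions. The tuple encoding `(dest, left, op, right)` and the function name `optimize_block` are assumptions for illustration; loop-invariant code motion needs control-flow information and is not covered by this sketch.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def optimize_block(block):
    """Apply constant folding and local common-subexpression reuse."""
    available = {}   # (left, op, right) -> variable currently holding that value
    out = []
    for dest, left, op, right in block:
        key = (left, op, right)
        computed = None
        if isinstance(left, int) and isinstance(right, int):
            # Constant folding: evaluate at compile time.
            out.append(f"{dest} = {OPS[op](left, right)}")
        elif key in available:
            # Same expression already computed in this block: reuse it.
            out.append(f"{dest} = {available[key]}")
        else:
            out.append(f"{dest} = {left} {op} {right}")
            computed = key
        # dest has a new value, so drop facts that read dest or are stored in it.
        available = {
            k: v for k, v in available.items()
            if v != dest and dest not in (k[0], k[2])
        }
        if computed is not None and dest not in (left, right):
            available[computed] = dest
    return out

block = [("x", "y", "*", 2), ("z", "y", "*", 2), ("a", 10, "*", 20)]
print(optimize_block(block))  # ['x = y * 2', 'z = x', 'a = 200']
```

The invalidation step matters for correctness: if `y` is reassigned between the two multiplications, the second `y * 2` is no longer redundant and must be kept.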
Machine-Independent Optimization
Constant folding, dead code elimination
Example:
a = 10 * 20 → replaced with a = 200
6. Code Generation (Target Code)
Purpose
Converts optimized intermediate code into machine code / assembly code.
Tasks Performed
- Selects machine instructions
- Allocates CPU registers
- Translates three-address code to machine instructions
- Performs basic low-level optimizations
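The first three tasks above can be sketched as a toy translator from TAC to a hypothetical three-operand assembly. The mnemonics, the `R1`, `R2`, ... register names, and the assumption of an unlimited register supply are all illustrative; a real back end targets a specific instruction set and has to handle a limited number of registers.

```python
# Instruction selection table for a hypothetical three-operand machine.
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate_assembly(tac_lines):
    """Translate lines like 't1 = b * c' into 'MUL R1, b, c'."""
    registers = {}   # TAC destination -> register name
    asm = []
    for line in tac_lines:
        dest, expr = [part.strip() for part in line.split("=", 1)]
        left, op, right = expr.split()
        # Operands already held in a register are read from it;
        # anything else is treated as a named memory operand.
        operands = [registers.get(name, name) for name in (left, right)]
        # Naive allocation: one fresh register per new destination.
        if dest not in registers:
            registers[dest] = f"R{len(registers) + 1}"
        asm.append(f"{MNEMONICS[op]} {registers[dest]}, {operands[0]}, {operands[1]}")
    return asm

for instruction in generate_assembly(["t1 = b * c", "t2 = a + t1"]):
    print(instruction)
```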
Example
TAC:
t1 = b * c
t2 = a + t1
Possible assembly (example):
MUL R1, b, c
ADD R2, a, R1
7. Symbol Table Management
Purpose
Stores information about identifiers:
| Identifier | Type | Scope | Memory Location |
|---|---|---|---|
| x | int | local | stack offset |
| sum | int | global | data segment |
Used by:
- Lexical analyzer
- Semantic analyzer
- Code generator
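A symbol table with nested scopes is often implemented as a stack of dictionaries. The sketch below is one such arrangement; the class name `SymbolTable`, its method names, and the stored fields are assumptions that mirror the columns of the table above.

```python
class SymbolTable:
    """Stack of scopes; the innermost scope is searched first."""

    def __init__(self):
        self.scopes = [{}]          # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name, location):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"'{name}' is already declared in this scope")
        kind = "global" if len(self.scopes) == 1 else "local"
        scope[name] = {"type": type_name, "scope": kind, "location": location}

    def lookup(self, name):
        # Walk outward from the innermost scope.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None                 # undeclared identifier

table = SymbolTable()
table.declare("sum", "int", "data segment")
table.enter_scope()
table.declare("x", "int", "stack offset")
print(table.lookup("x"))
print(table.lookup("sum"))
table.exit_scope()
print(table.lookup("x"))            # None: x went out of scope
```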
8. Error Handling
Types of Errors
- Lexical errors → invalid tokens
- Syntax errors → grammar violation
- Semantic errors → type/scope violations
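- Runtime errors → e.g. division by zero; these appear during execution and are generally not caught at compile time
- Logical errors → wrong program logic; the compiler cannot detect these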
The compiler tries to recover and continue analysis using:
- Panic mode
- Phrase-level recovery
- Error productions
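Of the strategies above, panic mode is the simplest to illustrate: on a syntax error the parser discards tokens until it reaches a synchronizing token and then resumes. The sketch below assumes a toy grammar in which every statement has the form `name = operand ;` and uses `;` as the synchronizing token; both are assumptions for illustration.

```python
def parse_statements(tokens):
    """Parse 'name = operand ;' statements with panic-mode recovery."""
    accepted = []
    errors = []
    pos = 0
    while pos < len(tokens):
        stmt = tokens[pos:pos + 4]
        well_formed = (
            len(stmt) == 4
            and stmt[0].isidentifier()
            and stmt[1] == "="
            and (stmt[2].isidentifier() or stmt[2].isdigit())
            and stmt[3] == ";"
        )
        if well_formed:
            accepted.append(" ".join(stmt))
            pos += 4
        else:
            errors.append(f"syntax error near token {pos}")
            # Panic mode: skip ahead to the synchronizing token ';'.
            while pos < len(tokens) and tokens[pos] != ";":
                pos += 1
            pos += 1                # consume the ';' and resume parsing
    return accepted, errors

tokens = ["x", "=", "1", ";", "y", "=", "=", "2", ";", "z", "=", "x", ";"]
print(parse_statements(tokens))
```

The malformed second statement is reported once and skipped, and the third statement is still checked, which is the point of recovery: one mistake should not hide the errors that follow it.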
Compiler Phases Flow Diagram
Source Program
↓
Lexical Analysis → Tokens
↓
Syntax Analysis → Parse Tree
↓
Semantic Analysis → Annotated Tree
↓
Intermediate Code Generation → TAC
↓
Code Optimization → Optimized TAC
↓
Code Generation → Machine Code
↓
Target Program / Executable
Conclusion and Core Components
The compiler works step-by-step from source code to machine code. Its eight main components are the six phases plus the symbol table and error handling, which support every phase:
- Lexical Analysis
- Syntax Analysis
- Semantic Analysis
- Intermediate Code Generation
- Code Optimization
- Code Generation
- Symbol Table
- Error Handling
Understanding these phases clearly is essential because almost every question in Compiler Design (CS3501) is built around these core concepts.
