Compiler Fundamentals: Translators and Lexical Analysis
1. What is a translator?
A program that receives source code as input and produces output in another language.
2. Examples of Translators
Assemblers and compilers.
3. What is an Assembler?
A program that translates assembly language to machine language.
4. Programming Language Definition
Defined by its syntax and semantics.
5. Compiler Structure
Source code → Preprocessor → Compiler → Assembly code → Assembler → Object code → Linker → Executable code.
6. Lexical Analysis Stage
Reads characters in the source program and groups them into strings representing lexical components.
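As a minimal sketch of this grouping step, the following Python function reads a string of characters and groups them into lexical components. The token names (NUMBER, IDENT, OP) and the small pattern set are assumptions chosen for illustration and do not come from these notes.

```python
import re

# Hypothetical token categories, for illustration only.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Group the characters of `source` into (type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling tokenize("total = count + 42") yields five pairs: two identifiers, two operators, and one number.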
7. Syntactic Analysis Phase
Groups lexical components into grammatical phrases that the compiler uses to synthesize its output.
8. Semantic Analysis Stage
Detects statements that are syntactically correct but have no valid meaning, such as an operation applied to operands of incompatible types.
9. Intermediate Code Phase
Generates an explicit intermediate representation of the source program after analysis.
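One common intermediate representation is three-address code. The sketch below assumes that form, and also assumes the syntax tree arrives as nested tuples of the shape (operator, left, right) with plain strings as operands. Neither choice is stated in these notes.

```python
from itertools import count


def three_address_code(tree):
    """Flatten an expression tree into three-address instructions.

    `tree` is either a string operand or a tuple (op, left, right).
    Returns (instructions, name_holding_the_result).
    """
    code = []
    temps = count(1)  # generates t1, t2, ... for intermediate results

    def emit(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(tree)
    return code, result
```

For the tree ("+", "a", ("*", "b", "c")) the function emits the multiplication first and then the addition that consumes its temporary.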
10. Object Code Generation
Generates machine language or assembly language code.
11. Symbol Table
A data structure containing records for each identifier, including attributes.
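A symbol table of this kind can be sketched as a dictionary keyed by identifier name. The record fields below (type_name, line_declared, attributes) are assumptions picked for the example, since the notes do not list specific attributes.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolRecord:
    """One record per identifier; the attribute names are illustrative."""
    name: str
    type_name: str = "unknown"
    line_declared: int = 0
    attributes: dict = field(default_factory=dict)


class SymbolTable:
    def __init__(self):
        self._records = {}

    def insert(self, name, **attrs):
        """Add a record for `name` if absent, and return the stored record."""
        if name not in self._records:
            self._records[name] = SymbolRecord(name, **attrs)
        return self._records[name]

    def lookup(self, name):
        """Return the record for `name`, or None if it was never inserted."""
        return self._records.get(name)
```

The lexical analyzer would typically call insert when it first sees an identifier, and later phases would call lookup to read or update its attributes.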
12. Error Handler
Manages errors centrally during compilation.
13. Lexical Analyzer and Parser
The lexical analyzer performs linear analysis, transforming the input characters into a sequence of lexical components (tokens). The parser generates a syntax tree from those tokens.
14. Token Types
Specific strings (type only, such as reserved words and operators) and nonspecific strings (type and value, such as identifiers and constants).
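The difference between the two kinds of token can be sketched with an optional value field. This example assumes reserved words are tokens that carry only a type, and that identifiers and numbers carry a type and a value. The KEYWORDS set and the classify helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    type: str
    value: Optional[str] = None  # stays None for type-only tokens


KEYWORDS = {"if", "while", "return"}  # assumed reserved words


def classify(lexeme):
    """Turn a lexeme into a type-only token or a type-and-value token."""
    if lexeme in KEYWORDS:
        return Token(lexeme.upper())   # the type alone identifies the token
    if lexeme.isdigit():
        return Token("NUMBER", lexeme)  # the value distinguishes 1 from 42
    return Token("IDENT", lexeme)       # the value distinguishes x from y
```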
15. Lexical Analyzer Creation
Creates tokens from a sequence of input characters.
16. Linear Transformation by Analyzer
Transforms input symbols into a sequence of components.
17. Phases for Constructing a Linear Analyzer
Definition of reserved words, automata construction for keywords, and transition table construction.
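The three phases can be sketched together: define the reserved words, build an automaton that recognizes them, and store that automaton as a transition table. The word list and the trie-shaped construction below are assumptions for illustration; the notes do not prescribe a particular construction.

```python
RESERVED = ["if", "int", "while"]  # phase 1: assumed reserved words


def build_keyword_automaton(words):
    """Phases 2 and 3: build a trie-shaped automaton as a transition table.

    Returns (table, accepting), where table maps (state, character) to the
    next state, state 0 is the start state, and accepting maps each
    accepting state to the keyword it recognizes.
    """
    table = {}
    accepting = {}
    next_state = 1
    for word in words:
        state = 0
        for ch in word:
            if (state, ch) not in table:
                table[(state, ch)] = next_state
                next_state += 1
            state = table[(state, ch)]
        accepting[state] = word
    return table, accepting


def is_reserved(text, table, accepting):
    """Run the automaton over `text`; True only if it ends in an accepting state."""
    state = 0
    for ch in text:
        state = table.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state in accepting
```

Because "if" and "int" share the prefix "i", they share the first transition out of the start state, which keeps the table smaller than one automaton per keyword.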
18. Lexical Analysis Process
Work done by the scanner during compilation.
19. Parser Goal
Generate a syntax tree of the source program as defined by a grammar.
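A small sketch of this goal, assuming a toy grammar that is not taken from the notes: expr -> operand (('+' | '-') operand)*, operand -> NUMBER | IDENT. Tokens are assumed to be (type, lexeme) pairs, and the tree is built from nested tuples.

```python
def parse(tokens):
    """Build a syntax tree for the assumed grammar described above."""
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] not in ("NUMBER", "IDENT"):
            raise SyntaxError(f"operand expected at token {pos}")
        node = tokens[pos]
        pos += 1
        return node

    tree = operand()
    # Each further operator makes the tree built so far its left child,
    # so the operators associate to the left.
    while pos < len(tokens) and tokens[pos] in (("OP", "+"), ("OP", "-")):
        op = tokens[pos][1]
        pos += 1
        right = operand()
        tree = (op, tree, right)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at {pos}")
    return tree
```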
20. Pattern
Represents the rule for a sequence of characters considered a lexical unit.
21. Lexeme
The actual sequence of characters in the source program that matches a pattern.
22. Token
A group of characters with a type and value.
23. Lexical Categories
Units that classify valid strings in a language.
24. Role of Lexical Analyzer
Processes input and sends results to the parser.
1. Finite State Acceptor (Finite Automata)
A mathematical model for describing token recognition.
2. Finite State Diagram
Represents transitions for accepting numbers with at least one decimal digit.
3. Nodes
Represent states in a finite state acceptor.
4. Arcs
Indicate state transitions.
5. Acceptance State
The final states in a finite state acceptor.
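Nodes, arcs, and acceptance states can be sketched as a small finite state acceptor. The example assumes the language "one or more digits, a decimal point, then one or more digits" as a reading of "numbers with at least one decimal digit". The state numbering is also an assumption.

```python
def char_class(ch):
    """Map a character to the label used on the arcs."""
    if ch in "0123456789":
        return "digit"
    if ch == ".":
        return "point"
    return "other"


# Arcs: (node, label) -> node. Node 0 is the start state.
ARCS = {
    (0, "digit"): 1,  # first digit of the integer part
    (1, "digit"): 1,  # more integer digits
    (1, "point"): 2,  # the decimal point
    (2, "digit"): 3,  # first decimal digit
    (3, "digit"): 3,  # more decimal digits
}
ACCEPTING = {3}  # reached only after at least one decimal digit


def accepts(text):
    """Follow one arc per character; accept if the walk ends in ACCEPTING."""
    state = 0
    for ch in text:
        state = ARCS.get((state, char_class(ch)))
        if state is None:  # no arc for this character: reject
            return False
    return state in ACCEPTING
```

Under this assumption "3.14" is accepted, while "42" and "3." are rejected because the walk never reaches node 3.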
6. Transition Diagram Consideration
Each state must be reached with the same set of characters at all transitions.
7. Informal Descriptions of Lexical Units
Identifier, comment, eof, and error.
8. Pattern Selectors
Identifier, comment, eof, and error.
9. Buffer Information Representation
buffer = start of the first block, buffer + n / 2 = start of the second block, buffer + n = end of the buffer; ap is the pointer to the current read position.
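A two-block buffer of this kind can be sketched as follows. The sketch assumes that n is the total buffer size, that each block holds n / 2 characters, and that a block is refilled from the input as soon as the pointer ap leaves the other block. The class name, the next_char method, and the empty-string end marker are hypothetical.

```python
import io


class TwoBlockBuffer:
    def __init__(self, stream, n=8):
        if n % 2 != 0:
            raise ValueError("n must be even so the buffer splits into two blocks")
        self.stream = stream
        self.half = n // 2
        self.buffer = [""] * n  # "" marks a slot with no input (end of input)
        self.ap = 0
        self._load(0)  # fill the first block before reading starts

    def _load(self, start):
        """Refill the block beginning at index `start` from the stream."""
        chunk = self.stream.read(self.half)
        for i in range(self.half):
            self.buffer[start + i] = chunk[i] if i < len(chunk) else ""

    def next_char(self):
        """Return the character at ap and advance, reloading blocks as needed."""
        ch = self.buffer[self.ap]
        if ch == "":
            return ""  # end of input
        self.ap += 1
        if self.ap == self.half:        # entering the second block
            self._load(self.half)
        elif self.ap == 2 * self.half:  # past the end: wrap to the first block
            self._load(0)
            self.ap = 0
        return ch
```

Reading io.StringIO("hello world!") through a buffer with n=8 returns the twelve characters in order, although at most eight are held in memory at any time.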
10. Lexical Analyzer Table
Final or acceptance states are left blank in columns where no transition occurs.
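The role of blank entries can be sketched with a table whose rows are states and whose columns are character classes, using None for a blank. The identifier pattern (a letter or underscore followed by letters, digits, or underscores) and the column layout are assumptions for the example.

```python
COLUMNS = {"letter": 0, "digit": 1, "other": 2}

# Rows are states, columns follow COLUMNS. None is a blank entry.
TABLE = [
    [1, None, None],  # state 0: start
    [1, 1, None],     # state 1: inside an identifier (acceptance state)
]
FINAL_STATES = {1}


def column_of(ch):
    if ch.isalpha() or ch == "_":
        return COLUMNS["letter"]
    if ch.isdigit():
        return COLUMNS["digit"]
    return COLUMNS["other"]


def scan_identifier(text, start=0):
    """Return the longest identifier beginning at `start`, or None."""
    state = 0
    pos = start
    while pos < len(text):
        nxt = TABLE[state][column_of(text[pos])]
        if nxt is None:  # blank entry: no transition, so stop scanning
            break
        state = nxt
        pos += 1
    return text[start:pos] if state in FINAL_STATES else None
```

Hitting a blank entry while in an acceptance state ends the token successfully; hitting one in a non-final state means no token was recognized.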
11. Lexer Functions
- inspect and advance-inspected: Return the character under inspection without incrementing ap.
- forward: Increments ap to the next symbol to read.
12. Lexical Analyzer Memory
Stores the source program content and tracks progress.
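The memory and the two lexer functions can be sketched as one small class. The pointer name ap follows the notes. The class name, the method names inspect and forward as spelled here, and the empty-string end marker are assumptions.

```python
class LexerMemory:
    """Holds the source program content and tracks progress with ap."""

    EOF = ""  # assumed marker returned once the input is exhausted

    def __init__(self, source):
        self.source = source
        self.ap = 0  # index of the next symbol to read

    def inspect(self):
        """Return the character at ap without incrementing ap."""
        if self.ap < len(self.source):
            return self.source[self.ap]
        return self.EOF

    def forward(self):
        """Increment ap to the next symbol to read (no effect at end of input)."""
        if self.ap < len(self.source):
            self.ap += 1
```

Separating inspect from forward lets the analyzer look at a character, decide whether it belongs to the current token, and only then consume it.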