Foundations of Computation and Compiler Techniques
A Turing Machine (TM) is an abstract model of computation consisting of a tape, a head that reads and writes symbols, and a finite set of states. The tape is infinite and divided into cells, each holding a symbol.
The machine reads the current symbol, changes it, moves left or right, and transitions between states based on a predefined set of rules. TMs can simulate any algorithm and are fundamental to understanding the limits of computation.
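To make the model concrete, the following Python sketch simulates a deterministic TM whose transition table maps (state, symbol) pairs to (new state, symbol to write, head movement). The table format, the blank symbol '_', and the example machine, which accepts strings consisting only of a's, are illustrative choices rather than a standard definition.

# Minimal Turing machine simulator: transitions map (state, symbol) to
# (new_state, write_symbol, move), where move is -1 (left) or +1 (right).
# The tape is a dict so it can grow in either direction; '_' is the blank.

def run_tm(transitions, accept_states, tape_input, start_state="q0", max_steps=10_000):
    tape = {i: s for i, s in enumerate(tape_input)}
    state, head = start_state, 0
    for _ in range(max_steps):                      # guard: a TM need not halt
        if state in accept_states:
            return True
        symbol = tape.get(head, "_")
        if (state, symbol) not in transitions:      # no applicable rule: reject
            return False
        state, write, move = transitions[(state, symbol)]
        tape[head] = write
        head += move
    raise RuntimeError("step limit reached (machine may not halt)")

# Example: a tiny machine that accepts strings consisting only of 'a's.
transitions = {
    ("q0", "a"): ("q0", "a", +1),           # keep scanning right over 'a'
    ("q0", "_"): ("q_accept", "_", +1),     # blank reached: accept
}
print(run_tm(transitions, {"q_accept"}, "aaa"))   # True
print(run_tm(transitions, {"q_accept"}, "aab"))   # False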
However, not all problems can be solved using a TM. An example of an undecidable problem is the Halting Problem, which asks whether a given TM will halt on a specific input.
Alan Turing proved that no algorithm can decide, for every machine-input pair, whether that machine halts on that input. Another classic undecidable problem is the Post Correspondence Problem (PCP), which asks whether strings from two lists can be arranged, using one sequence of indices for both lists, so that their concatenations are equal. These problems demonstrate that there exist well-defined questions that cannot be answered algorithmically, highlighting the limits of mechanical computation and marking the boundary between decidable and undecidable problems in theoretical computer science.
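The PCP illustrates this boundary well because a proposed solution is trivial to verify even though the existence of one is undecidable. The Python sketch below brute-forces index sequences up to an arbitrary bound; the instance and the bound are illustrative, and undecidability means no fixed bound (and no cleverer algorithm) can decide every instance.

from itertools import product

# Post Correspondence Problem: given two equal-length lists of strings,
# find an index sequence i1..ik such that A[i1]+...+A[ik] == B[i1]+...+B[ik].
# Verifying a candidate sequence is trivial; deciding whether ANY sequence
# exists is undecidable, so this search has to be cut off at some bound.

def pcp_search(A, B, max_len=8):
    for k in range(1, max_len + 1):
        for seq in product(range(len(A)), repeat=k):
            top = "".join(A[i] for i in seq)
            bottom = "".join(B[i] for i in seq)
            if top == bottom:
                return list(seq)
    return None  # no solution up to the bound; proves nothing in general

# Illustrative instance: A = ["a", "ab", "bba"], B = ["baa", "aa", "bb"]
# has the solution 2,1,2,0: "bba"+"ab"+"bba"+"a" == "bb"+"aa"+"bb"+"baa".
print(pcp_search(["a", "ab", "bba"], ["baa", "aa", "bb"]))  # [2, 1, 2, 0]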
Pushdown Automata (PDAs) and Context-Free Grammars (CFGs)
Pushdown Automata and Context-Free Grammars are equivalent in their ability to define context-free languages. A CFG consists of a set of production rules used to generate strings in a language, while a PDA is an automaton equipped with a stack that provides additional memory. This stack enables a PDA to recognize patterns such as nested parentheses or equal numbers of specific symbols, which finite automata cannot handle. For example, consider the language L = { aⁿbⁿ | n ≥ 0 }. A CFG for this language is S → aSb | ε, which recursively matches each a with a b. A corresponding PDA pushes a symbol onto the stack for every ‘a’ read and pops one for each ‘b’, accepting the input if the stack is empty at the end. The equivalence lies in the fact that for any CFG, a PDA can be constructed to recognize the same language, and for every PDA, a CFG can be constructed to generate its language. This relationship is central to parsing and compiling, where CFGs specify syntax and PDAs form the theoretical basis of parsers.
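The push-and-pop behavior just described can be sketched directly in Python. The function below is a simplified stand-in for that PDA rather than a general PDA simulator: it implements only the stack discipline needed for L = { aⁿbⁿ }.

# PDA sketch for L = { a^n b^n | n >= 0 }: push a marker for each 'a',
# pop one for each 'b', and accept only if the stack is empty at the end
# and no 'a' appears after a 'b'.

def accepts_anbn(s):
    stack = []
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:          # an 'a' after a 'b' breaks the a^n b^n shape
                return False
            stack.append("A")   # push one marker per 'a'
        elif ch == "b":
            seen_b = True
            if not stack:       # more b's than a's
                return False
            stack.pop()         # match one 'a' with this 'b'
        else:
            return False        # symbol outside the alphabet
    return not stack            # accept iff every 'a' was matched

for w in ["", "ab", "aaabbb", "aabbb", "ba"]:
    print(repr(w), accepts_anbn(w))
# '' True, 'ab' True, 'aaabbb' True, 'aabbb' False, 'ba' False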
Bottom-up parsing is a strategy in syntax analysis where the parser builds the parse tree from the leaves (input symbols) up to the root (start symbol). It applies production rules in reverse, reducing a sequence of symbols to a non-terminal and thereby recognizing the structure of the input from the bottom up. The most common bottom-up parser is the shift-reduce parser, which uses a stack and an input buffer. The parser performs four main actions: shift (push the next input symbol onto the stack), reduce (replace a stack top matching a production’s right-hand side with its left-hand side), accept (input fully parsed), and error (invalid input). For example, consider the grammar E → E + T | T and T → id. For the input id + id, the parser shifts ‘id’ onto the stack, reduces it to T and then to E, shifts ‘+’ and the next ‘id’, reduces that ‘id’ to T, and finally reduces E + T to E, completing the parse. Bottom-up parsing is robust and handles a wide range of grammar constructs, including left recursion, making it suitable for real-world programming language parsers.
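The trace above can be reproduced with a small Python sketch. A production LR parser drives its shift and reduce decisions from a parse table; here the decision is simplified to "reduce whenever a handle appears on top of the stack, trying productions in a fixed priority order", which happens to suffice for this tiny grammar.

# Shift-reduce sketch for the grammar E -> E + T | T, T -> id.
# Decisions are hard-coded by production priority instead of an LR table.

PRODUCTIONS = [                 # (left-hand side, right-hand side), in priority order
    ("E", ["E", "+", "T"]),
    ("T", ["id"]),
    ("E", ["T"]),
]

def shift_reduce_parse(tokens):
    stack, buffer = [], list(tokens)

    def reduce_all():
        changed = True
        while changed:
            changed = False
            for lhs, rhs in PRODUCTIONS:
                if stack[-len(rhs):] == rhs:          # handle on top of the stack
                    del stack[-len(rhs):]
                    stack.append(lhs)                 # replace handle by its LHS
                    print(f"reduce {lhs} -> {' '.join(rhs)}: {stack}")
                    changed = True
                    break

    while buffer:
        stack.append(buffer.pop(0))                   # shift the next input token
        print(f"shift: {stack}")
        reduce_all()
    return stack == ["E"]                             # accept iff only the start symbol remains

print(shift_reduce_parse(["id", "+", "id"]))          # True

Running the sketch prints the same sequence of shifts and reductions described in the paragraph above, ending with a single E on the stack.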
Syntax-Directed Translation (SDT)
Syntax-Directed Translation extends grammar rules with semantic actions that define how to translate or evaluate a string. These actions depend on the evaluation of attributes associated with grammar symbols. Attributes can be synthesized (computed from children) or inherited (passed down from the parent or from siblings). The order in which these attributes are evaluated depends on their dependencies. S-attributed SDTs use only synthesized attributes and can be evaluated with a post-order traversal during bottom-up parsing. L-attributed SDTs also allow inherited attributes, provided each one depends only on the parent or on siblings to its left, so they can be evaluated in a single left-to-right pass, which is compatible with top-down parsing. For instance, in an arithmetic expression grammar rule like E → E1 + T with the semantic rule E.Val = E1.Val + T.Val, the values of E1 and T must be computed before E’s value is known. Given the input 2 + 3, E1 evaluates to 2, T to 3, and E becomes 5. A proper evaluation order ensures that all attribute values are computed accurately and consistently, enabling tasks such as type checking, code generation, and interpretation within a compiler.
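The evaluation of the Val attribute for 2 + 3 can be sketched as a post-order walk over a hand-built parse tree. The Node class and the omission of the ‘+’ terminal (which carries no attribute) are simplifications for illustration, not a fixed compiler data structure.

# S-attributed evaluation sketch: each node carries a synthesized Val
# attribute computed from its children in post-order, mirroring the
# semantic rule E.Val = E1.Val + T.Val.
# Grammar: E -> E + T | T,  T -> num

class Node:
    def __init__(self, symbol, children=(), token=None):
        self.symbol, self.children, self.token = symbol, list(children), token

def evaluate(node):
    # Post-order: evaluate the children first, then apply the parent's rule.
    vals = [evaluate(child) for child in node.children]
    if node.symbol == "num":                    # leaf: Val comes from the token
        return int(node.token)
    if node.symbol == "T":                      # T -> num:    T.Val = num.Val
        return vals[0]
    if node.symbol == "E" and len(vals) == 1:   # E -> T:      E.Val = T.Val
        return vals[0]
    if node.symbol == "E":                      # E -> E + T:  E.Val = E1.Val + T.Val
        return vals[0] + vals[1]
    raise ValueError(f"unexpected node {node.symbol}")

# Parse tree for "2 + 3" ('+' terminal omitted: it carries no Val attribute):
tree = Node("E", [
    Node("E", [Node("T", [Node("num", token="2")])]),
    Node("T", [Node("num", token="3")]),
])
print(evaluate(tree))   # 5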
