CSCI 3136 Principles of Programming Languages Syntactic Analysis and Context-Free Grammars - 1 Summer 2013 Faculty of Computer Science Dalhousie University 1 / 13
Constructing a Scanner: Example (1)
Language: strings of 0s and 1s containing an even number of 0s.
Regular expression: [not preserved in the extracted text]
NFA: [figure not preserved]
2 / 13
Constructing a Scanner: Example (2)
DFA: [figure not preserved]
Minimized DFA: [figure not preserved]
3 / 13
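Since the automaton figures are not available here, the following is a minimal sketch of a two-state DFA for the language "even number of 0s". The state names EVEN and ODD and the function name accepts are illustrative assumptions, not taken from the slides.

```python
# Sketch of a DFA for strings over {0, 1} with an even number of 0s.
# State names are hypothetical; the slides' own figure is not available.
EVEN, ODD = "even", "odd"

# Transition function as a table: (state, input symbol) -> next state.
TRANSITIONS = {
    (EVEN, "0"): ODD,
    (EVEN, "1"): EVEN,
    (ODD, "0"): EVEN,
    (ODD, "1"): ODD,
}


def accepts(text: str) -> bool:
    """Run the DFA on text; accept if it ends in the EVEN state."""
    state = EVEN  # start state: zero 0s seen so far, which is even
    for ch in text:
        if (state, ch) not in TRANSITIONS:
            return False  # symbol outside the alphabet {0, 1}
        state = TRANSITIONS[(state, ch)]
    return state == EVEN
```

Reading a 1 never changes the state, and each 0 flips it, so the final state records the parity of the number of 0s.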
Extended Example of a Scanner 5 / 13
Scanner Implementation
1. From finite automaton
   • Case (switch) statements represent transitions of DFA
   • Table-based implementation: the table represents the transitions (interpreted by a driver)
2. Ad hoc
   • Written by hand when high performance is an issue, e.g., in production compilers
6 / 13
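The table-based approach can be sketched as follows. The token classes (INT, ID), the character classes, and the names TABLE, ACCEPTING and scan are all assumptions made for illustration; a real scanner would have a much larger table.

```python
# Sketch of a table-driven scanner. Token classes and state names are
# hypothetical, chosen only to illustrate the "table + driver" idea.

def char_class(ch):
    """Map a character to the column of the transition table."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha():
        return "letter"
    return "other"


# Transition table of the DFA: state -> {character class -> next state}.
TABLE = {
    "start": {"digit": "int", "letter": "id"},
    "int": {"digit": "int"},
    "id": {"letter": "id", "digit": "id"},
}

# Accepting states and the token kind each one produces.
ACCEPTING = {"int": "INT", "id": "ID"}


def scan(text):
    """Driver: interpret TABLE, always taking the longest possible token."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace separates tokens and is skipped
            continue
        state, start = "start", pos
        while pos < len(text):
            nxt = TABLE[state].get(char_class(text[pos]))
            if nxt is None:
                break  # no transition: the current token ends here
            state = nxt
            pos += 1
        if state not in ACCEPTING:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((ACCEPTING[state], text[start:pos]))
    return tokens
```

The driver loop stays the same when the language changes; only TABLE and ACCEPTING need to be regenerated. A case-based scanner would instead encode each row of the table as a branch of a switch statement.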
Phases of Compilation
Character Stream
→ Scanner (lexical analysis)
→ Token Stream
→ Parser (syntactic analysis)
→ Parse Tree
→ Semantic Analysis and Intermediate Code Generation
→ Abstract Syntax Tree or Other Intermediate Form
→ Machine-independent Code Improvement (Optional)
→ Modified Intermediate Form
→ Target Code Generation
→ Target Language (e.g., assembly)
→ Machine-specific Code Improvement (Optional)
→ Modified Target Language
Shown alongside the phases: Symbol Table
7 / 13
Scanner (DFA) recognizes a Regular Language; Regular Languages are generated by Regular Expressions.
Parser (PDA) recognizes a Context-free Language; Context-free Languages are generated by Context-free Grammars.
Context-Free Grammar (CFG) (motivation)
Set of rules or productions:
  P → N
  P → A P
  S → P V P
  A → big | green
  N → cheese | Jim
  V → ate
• non-terminals or variables: V = { P, S, A, N, V }
• terminals: Σ = { Jim, ate, cheese, big, green }
• rules or productions: P = { P → N, P → A P, S → P V P, A → big | green, N → cheese | Jim, V → ate }
• start variable or start non-terminal: S = S
A context-free grammar is a 4-tuple (V, Σ, P, S).
Are the following sentences in the language described by the above grammar?
• big Jim ate green cheese
• green Jim ate green cheese
• Jim ate cheese
• cheese ate Jim
9 / 13
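One way to check the four sentences is a small backtracking recognizer for this particular grammar. This is only a sketch: the names GRAMMAR, derives and in_language are assumptions, and the approach relies on the grammar having no left recursion and no empty rules.

```python
from functools import lru_cache

# The example grammar: non-terminal -> list of alternative right-hand sides.
GRAMMAR = {
    "S": [("P", "V", "P")],
    "P": [("N",), ("A", "P")],
    "A": [("big",), ("green",)],
    "N": [("cheese",), ("Jim",)],
    "V": [("ate",)],
}


@lru_cache(maxsize=None)
def derives(symbols, tokens):
    """Can the symbol sequence be rewritten into exactly the token sequence?"""
    if not symbols:
        return not tokens  # success only if all tokens were consumed
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:
        # head is a terminal: it must match the next input token
        return bool(tokens) and tokens[0] == head and derives(rest, tokens[1:])
    # head is a non-terminal: try each of its rules in turn
    return any(derives(rhs + rest, tokens) for rhs in GRAMMAR[head])


def in_language(sentence):
    """Check a whitespace-separated sentence against the start symbol S."""
    return derives(("S",), tuple(sentence.split()))
```

Under this sketch all four sentences are accepted, including "cheese ate Jim": the grammar constrains only the structure of a sentence, not its meaning. A string such as "ate Jim cheese" is rejected because no rule lets a sentence begin with the verb.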
Context-Free Grammar (CFG)
A context-free grammar is a 4-tuple (V, Σ, P, S), where
• V is a finite set of non-terminals or variables,
• Σ is a finite set of terminals,
• P is a finite set of rules or productions of the form N → α, where N ∈ V and α ∈ (Σ ∪ V)*,
• S ∈ V is the start variable.
10 / 13
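The 4-tuple maps directly onto a small data structure. The sketch below is one possible encoding; the class name CFG, its field names, and the is_well_formed check are assumptions. The requirement that V and Σ be disjoint is a common convention that the definition above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A context-free grammar as the 4-tuple (V, Sigma, P, S)."""
    variables: frozenset   # V: non-terminals
    terminals: frozenset   # Sigma: terminals
    productions: tuple     # P: (lhs, rhs) pairs; rhs is a tuple of symbols
    start: str             # S: start variable

    def is_well_formed(self):
        symbols = self.variables | self.terminals
        return (
            self.start in self.variables
            # assumption: V and Sigma are disjoint (not stated on the slide)
            and not (self.variables & self.terminals)
            # every rule has the form N -> alpha, N in V, alpha in (Sigma u V)*
            and all(
                lhs in self.variables and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# The example grammar from the motivation slide.
EXAMPLE = CFG(
    variables=frozenset({"P", "S", "A", "N", "V"}),
    terminals=frozenset({"Jim", "ate", "cheese", "big", "green"}),
    productions=(
        ("S", ("P", "V", "P")),
        ("P", ("N",)),
        ("P", ("A", "P")),
        ("A", ("big",)),
        ("A", ("green",)),
        ("N", ("cheese",)),
        ("N", ("Jim",)),
        ("V", ("ate",)),
    ),
    start="S",
)
```

An empty right-hand side, written as the empty tuple, is allowed by this check, which matches α ∈ (Σ ∪ V)*.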
Productions (Rules)
• examples: P → N, P → A P
• different notation: P → N | A P
• left-hand side (lhs) and right-hand side (rhs) of a rule: in P → A P, the lhs is P and the rhs is A P
• rhs may be a mixture of terminals and non-terminals: P → big green N
• empty rule (epsilon rule, epsilon production): P → ε
• unit production: P → N where P and N are non-terminals
11 / 13
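The notation above can be made concrete with a short sketch that expands the "|" shorthand into separate rules and labels epsilon and unit productions. The textual format ("->" for the arrow, "ε" for the empty string) and the function names parse_rule and classify are assumptions chosen for this illustration.

```python
def parse_rule(text):
    """Expand 'P -> N | A P' into a list of (lhs, rhs) pairs."""
    lhs, rhs_text = text.split("->")
    lhs = lhs.strip()
    productions = []
    for alternative in rhs_text.split("|"):
        # an rhs consisting only of ε becomes the empty tuple
        symbols = tuple(s for s in alternative.split() if s != "ε")
        productions.append((lhs, symbols))
    return productions


def classify(production, variables):
    """Label a production as 'epsilon', 'unit', or 'other'."""
    _, rhs = production
    if len(rhs) == 0:
        return "epsilon"
    if len(rhs) == 1 and rhs[0] in variables:
        return "unit"
    return "other"
```

For example, with variables {P, A, N}, the rule "P -> N | A P" expands to two productions, of which P → N is a unit production and P → A P is neither unit nor epsilon.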
Generating Sentences, example
Grammar:
  S → P V P
  P → N
  P → A P
  A → big | green
  N → cheese | Jim
  V → ate
• S ⇒ P V P ⇒ N V P ⇒ N V N ⇒ Jim V N ⇒ Jim ate N ⇒ Jim ate cheese
• S ⇒ P V P ⇒ A P V P ⇒ big P V P ⇒ big N V P ⇒ big Jim V P ⇒ big Jim ate P ⇒ big Jim ate A P ⇒ big Jim ate green P ⇒ big Jim ate green N ⇒ big Jim ate green cheese
12 / 13
Generating Sentences
A CFG generates sentences using a process of rewriting in the following way:
• start with S
• choose a rule S → α (α is a sequence of terminals and/or non-terminals), and replace S with α
• if α contains a non-terminal X, choose a rule X → β, and replace X with β
• continue the process until only terminals remain
This process of rewriting is known as derivation. Intermediate strings are called sentential forms.
Notation: S ⇒* Jim ate cheese
13 / 13
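The rewriting process can be sketched in a few lines. The version below always rewrites the leftmost non-terminal, which is one particular strategy; the procedure above allows any non-terminal to be chosen. The names GRAMMAR and leftmost_derivation, and the encoding of rule choices as list indices, are assumptions.

```python
# The example grammar: non-terminal -> list of alternative right-hand sides.
GRAMMAR = {
    "S": [["P", "V", "P"]],
    "P": [["N"], ["A", "P"]],
    "A": [["big"], ["green"]],
    "N": [["cheese"], ["Jim"]],
    "V": [["ate"]],
}


def leftmost_derivation(choices, start="S"):
    """Apply one rule per entry in choices; return all sentential forms.

    Each entry is the index of the rule to use for the leftmost
    non-terminal of the current sentential form.
    """
    form = [start]
    forms = [" ".join(form)]
    for choice in choices:
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        if not positions:
            raise ValueError("only terminals remain; nothing left to rewrite")
        idx = positions[0]
        # replace the non-terminal X with the chosen right-hand side beta
        form[idx:idx + 1] = GRAMMAR[form[idx]][choice]
        forms.append(" ".join(form))
    return forms
```

With choices [0, 0, 1, 0, 0, 0] this yields S, P V P, N V P, Jim V P, Jim ate P, Jim ate N, Jim ate cheese. The order of steps differs from the slide's derivation, which expands the second P before rewriting N, but both derive the same sentence.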