Compiler Construction
Lecture 6: Syntax Analysis II (LL(k) Grammars)

Thomas Noll
Lehrstuhl für Informatik 2 (Software Modeling and Verification)
noll@cs.rwth-aachen.de
http://moves.rwth-aachen.de/teaching/ss-14/cc14/
Summer Semester 2014

Outline
1. Recap: Nondeterministic Top-Down Parsing
2. Correctness of NTA(G)
3. Adding Lookahead
4. LL(k) Grammars
5. Follow Sets
6. LL(1) Grammars

Conceptual Structure of a Compiler

Source code
→ Lexical analysis (Scanner): (id, x1)(gets, )(id, y2)(plus, )(int, 1)
→ Syntax analysis (Parser): context-free grammars/pushdown automata; syntax tree with nodes Assgn, Var, Exp, Sum, Var, Const
→ Semantic analysis
→ Generation of intermediate code
→ Code optimization
→ Generation of machine code
→ Target code

Top-Down Parsing

Approach:
1. Given G ∈ CFG_Σ, construct a nondeterministic pushdown automaton (PDA) which accepts L(G) and which additionally computes corresponding leftmost derivations (similar to the proof of "L(CFG_Σ) ⊆ L(PDA_Σ)"):
   input alphabet: Σ
   pushdown alphabet: X
   output alphabet: [p]
   state set: not required
2. Remove nondeterminism by allowing lookahead on the input:
   G ∈ LL(k) iff L(G) is recognizable by a deterministic PDA with lookahead of k symbols

The Nondeterministic Top-Down Automaton

Definition (Nondeterministic top-down parsing automaton)
Let G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ. The nondeterministic top-down parsing automaton of G, NTA(G), is defined by the following components.
  Input alphabet: Σ
  Pushdown alphabet: X
  Output alphabet: [p]
  Configurations: Σ* × X* × [p]* (top of pushdown to the left)
  Transitions for w ∈ Σ*, α ∈ X*, and z ∈ [p]*:
    expansion steps: if π_i = A → β, then (w, Aα, z) ⊢ (w, βα, zi)
    matching steps: for every a ∈ Σ, (aw, aα, z) ⊢ (w, α, z)
  Initial configuration for w ∈ Σ*: (w, S, ε)
  Final configurations: {ε} × {ε} × [p]*

Remark: NTA(G) is nondeterministic iff G contains A → β | γ
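
The expansion and matching steps can be tried out with a small backtracking search. The following is only a sketch under stated assumptions: the grammar is S → aSb | ε with productions numbered 1 and 2, all symbols are single characters, and the names `PRODUCTIONS`, `NONTERMINALS` and `nta_runs` are hypothetical. The pruning rule is an addition of this sketch, not part of the definition, and the search may still fail to terminate for left-recursive grammars.

```python
# Hypothetical encoding of G: S -> aSb | eps; production indices form [p] = {1, 2}.
PRODUCTIONS = {1: ("S", "aSb"), 2: ("S", "")}
NONTERMINALS = {"S"}


def nta_runs(w, stack="S", z=()):
    """Yield every output z' with (w, stack, z) |-* (eps, eps, z')."""
    if not w and not stack:
        # Final configuration: input and pushdown are both empty.
        yield z
        return
    if not stack:
        return
    # Pruning (assumption of this sketch): every terminal on the pushdown
    # must still be matched against one remaining input symbol.
    if sum(1 for s in stack if s not in NONTERMINALS) > len(w):
        return
    top, rest = stack[0], stack[1:]
    if top in NONTERMINALS:
        # Expansion step: try every production for the topmost nonterminal.
        for i, (lhs, rhs) in PRODUCTIONS.items():
            if lhs == top:
                yield from nta_runs(w, rhs + rest, z + (i,))
    elif w and w[0] == top:
        # Matching step: consume one input symbol.
        yield from nta_runs(w[1:], rest, z)


print(list(nta_runs("aabb")))
```

Every place where the loop over `PRODUCTIONS` has more than one candidate is a nondeterministic choice of NTA(G).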

Correctness of NTA(G)

Theorem 6.1 (Correctness of NTA(G))
Let G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ and NTA(G) as before. Then, for every w ∈ Σ* and z ∈ [p]*,
  (w, S, ε) ⊢* (ε, ε, z) iff z is a leftmost analysis of w

Proof.
  ⟹ (soundness): see exercises
  ⟸ (completeness): on the board
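
To see what the right-hand side of the theorem means on a concrete grammar, a production sequence z can be replayed as a leftmost derivation. This is a minimal sketch, assuming the grammar S → aSb | ε with productions numbered 1 and 2 and single-character symbols; `replay_leftmost` is a hypothetical helper name.

```python
# Hypothetical encoding of G: S -> aSb | eps.
PRODUCTIONS = {1: ("S", "aSb"), 2: ("S", "")}
NONTERMINALS = {"S"}


def replay_leftmost(z, start="S"):
    """Apply the productions in z, always to the leftmost nonterminal.

    Returns the resulting sentential form, or None if some production
    in z does not fit the leftmost nonterminal.
    """
    form = start
    for i in z:
        lhs, rhs = PRODUCTIONS[i]
        # Position of the leftmost nonterminal, if any.
        pos = next((p for p, s in enumerate(form) if s in NONTERMINALS), None)
        if pos is None or form[pos] != lhs:
            return None
        form = form[:pos] + rhs + form[pos + 1:]
    return form


print(replay_leftmost([1, 1, 2]))
```

For a terminal word w, z is a leftmost analysis of w exactly when the replay returns w.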

Adding Lookahead

Goal: resolve nondeterminism of NTA(G) by supporting lookahead of k ∈ ℕ symbols on the input
⟹ the A-production to expand is determined by the next k symbols

Definition 6.2 (first_k set)
Let G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ, α ∈ X*, and k ∈ ℕ. Then the first_k set of α, first_k(α) ⊆ Σ*, is given by
  first_k(α) := {v ∈ Σ^k | there exists w ∈ Σ* such that α ⇒* vw} ∪ {v ∈ Σ^{<k} | α ⇒* v}

Remark: first_k(α) is effectively computable. If α ∈ Σ*, then |first_k(α)| = 1.

Example 6.3 (first_k set)
Let G: S → aSb | ε.
1. first_1(ab) = {a} = first_2(a)
2. first_3(S) = {ε, ab, aab, aaa}
3. first_3(Sa) = {a, aba, aab, aaa}
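
One way to make the computability remark concrete is a fixpoint iteration over the productions. The slides do not give an algorithm, so the following is a sketch, not the lecture's method. It assumes single-character symbols, represents ε as the empty string, and uses the hypothetical names `first_k` and `concat_k`.

```python
def first_k(alpha, k, productions, nonterminals):
    """Compute first_k(alpha) for a grammar given as (lhs, rhs) pairs."""
    # table[A] approximates first_k(A) from below; it only ever grows.
    table = {a: set() for a in nonterminals}

    def concat_k(symbols):
        # k-truncated concatenation of the current sets of all symbols.
        result = {""}
        for x in symbols:
            options = table[x] if x in nonterminals else {x}
            result = {(u + v)[:k] for u in result for v in options}
        return result

    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            new = concat_k(rhs) - table[lhs]
            if new:
                table[lhs] |= new
                changed = True
    return concat_k(alpha)


# Grammar of Example 6.3: S -> aSb | eps
G = [("S", "aSb"), ("S", "")]
print(first_k("S", 3, G, {"S"}))
print(first_k("Sa", 3, G, {"S"}))
```

The loop terminates because each table entry is a subset of the finitely many words of length at most k.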

LL(k) Grammars I

LL(k): reading of input from Left to right with k-lookahead, computing a Leftmost analysis

Definition 6.4 (LL(k) grammar)
Let G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ and k ∈ ℕ. Then G has the LL(k) property (notation: G ∈ LL(k)) if for all leftmost derivations of the form
  S ⇒*_l wAα ⇒_l wβα ⇒*_l wx
  S ⇒*_l wAα ⇒_l wγα ⇒*_l wy
such that β ≠ γ, it follows that first_k(x) ≠ first_k(y) (i.e., different productions must not yield the same lookahead).
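
As an illustration that is not taken from the slides, the definition can be checked by hand for the grammar S → aSb | ε with k = 1. Every leftmost derivation from S to a sentential form with a nonterminal has w = a^n, A = S and α = b^n for some n ≥ 0, so the two branches are, for n, m ≥ 0:

```latex
\begin{align*}
S \Rightarrow_l^* a^n S\, b^n &\Rightarrow_l a^n\, aSb\, b^n \Rightarrow_l^* a^n x,
  & x &= a^{m+1} b^{m+1} b^n, & \mathrm{first}_1(x) &= \{a\},\\
S \Rightarrow_l^* a^n S\, b^n &\Rightarrow_l a^n\, b^n = a^n y,
  & y &= b^n, & \mathrm{first}_1(y) &=
  \begin{cases} \{\varepsilon\} & n = 0 \\ \{b\} & n \geq 1 \end{cases}
\end{align*}
```

In every case first_1(x) ≠ first_1(y), so this grammar has the LL(1) property.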

LL(k) Grammars II

Remarks:
If G ∈ LL(k), then the leftmost derivation step for wAα in
  S ⇒*_l wAα ⇒_l wβα ⇒*_l wx
  S ⇒*_l wAα ⇒_l wγα ⇒*_l wy
is determined by the next k symbols following w.

Corresponding computations of NTA(G):
  (wx, S, ε) ⊢* (x, Aα, z) ⊢^(∗) (x, βα, zi) ⊢* (ε, ε, ziz′)
  (wy, S, ε) ⊢* (y, Aα, z) ⊢^(∗) (y, γα, zj) ⊢* (ε, ε, zjz″)
where π_i = A → β and π_j = A → γ

Deterministic decision in (∗) is possible if first_k(x) ≠ first_k(y).

Problem: how to determine the A-production from the lookahead (potentially infinitely many derivations βα ⇒*_l x / γα ⇒*_l y)?
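
For a single small grammar the decision in (∗) can be hard-coded, which shows what the deterministic automaton is meant to do. The sketch below assumes the grammar S → aSb | ε with k = 1. The table `TABLE` is derived by hand for this grammar only and is an assumption of the sketch; it does not answer the general problem stated above. `parse_ll1` is a hypothetical name.

```python
# Hypothetical encoding of G: S -> aSb | eps.
PRODUCTIONS = {1: ("S", "aSb"), 2: ("S", "")}
NONTERMINALS = {"S"}
# Hand-derived for this grammar only: (nonterminal, lookahead) -> production,
# where the lookahead "" stands for the end of the input.
TABLE = {("S", "a"): 1, ("S", "b"): 2, ("S", ""): 2}


def parse_ll1(w):
    """Return the leftmost analysis of w as a list, or None if w is rejected."""
    stack, z = "S", []
    while stack:
        top, stack = stack[0], stack[1:]
        if top in NONTERMINALS:
            # Expansion step, now chosen deterministically by one lookahead symbol.
            i = TABLE.get((top, w[:1]))
            if i is None:
                return None
            stack = PRODUCTIONS[i][1] + stack
            z.append(i)
        elif w[:1] == top:
            # Matching step.
            w = w[1:]
        else:
            return None
    return z if not w else None


print(parse_ll1("aabb"))
```

Compared with a backtracking simulation of NTA(G), no configuration is ever revisited: each expansion step consults the table exactly once.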