Compiler Construction Lecture 8: Syntax Analysis IV (More on LL(1) & Bottom-Up Parsing)

  1. Compiler Construction Lecture 8: Syntax Analysis IV (More on LL(1) & Bottom-Up Parsing)
     Thomas Noll, Lehrstuhl für Informatik 2 (Software Modeling and Verification)
     noll@cs.rwth-aachen.de
     http://moves.rwth-aachen.de/teaching/ss-14/cc14/
     Summer Semester 2014

  2. Outline
       1. Recap: LL(1) Parsing
       2. Transformation to LL(1)
       3. The Complexity of LL(1) Parsing
       4. Recursive-Descent Parsing
       5. Bottom-Up Parsing
       6. Nondeterministic Bottom-Up Parsing

  3. Characterization of LL(1)
     Theorem (Characterization of LL(1)): G ∈ LL(1) iff for all pairs of rules A → β | γ ∈ P (where β ≠ γ):
         la(A → β) ∩ la(A → γ) = ∅.
     Proof: on the board.
     Remark: the above theorem generally does not hold if k > 1 (cf. exercises).
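The disjointness test from the theorem can be sketched in Python as follows. This is a minimal illustration, not the lecture's own algorithm: the grammar encoding (a dict from nonterminals to lists of symbol tuples), the function names, and the use of `"$"` as an end-of-input marker (the slides write ε in that role) are all assumptions made for this example. The lookahead set is computed as the first set of β, extended by the follow set of A whenever β can derive ε.

```python
EPS, END = "", "$"  # END marks the end of input (assumption; the slides use ε here)

def first_of(seq, first, nts):
    """First set of a symbol sequence; contains EPS iff the whole sequence is nullable."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in nts else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

def lookahead_sets(grammar, start):
    """Map each rule (A, rhs) to la(A -> rhs) via a joint first/follow fixpoint."""
    nts = set(grammar)
    first = {a: set() for a in nts}
    follow = {a: set() for a in nts}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for a, alts in grammar.items():
            for rhs in alts:
                f = first_of(rhs, first, nts)
                if not f <= first[a]:
                    first[a] |= f
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym in nts:
                        rest = first_of(rhs[i + 1:], first, nts)
                        new = rest - {EPS}
                        if EPS in rest:
                            new = new | follow[a]
                        if not new <= follow[sym]:
                            follow[sym] |= new
                            changed = True
    la = {}
    for a, alts in grammar.items():
        for rhs in alts:
            f = first_of(rhs, first, nts)
            la[(a, rhs)] = (f - {EPS}) | (follow[a] if EPS in f else set())
    return la

def is_ll1(grammar, start):
    """True iff the lookahead sets of all alternatives of each nonterminal are pairwise disjoint."""
    la = lookahead_sets(grammar, start)
    for a, alts in grammar.items():
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                if la[(a, alts[i])] & la[(a, alts[j])]:
                    return False
    return True

# The left-recursive expression grammar and its right-recursive variant.
G_AE = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",), ("b",)],
}
G_AE_PRIME = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("a",), ("b",)],
}
print(is_ll1(G_AE, "E"), is_ll1(G_AE_PRIME, "E"))
```

The sketch assumes a reduced grammar, as the lecture does; for grammars with unreachable or unproductive nonterminals the computed sets may not be meaningful.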

  4. Deterministic Top-Down Parsing
     Approach: given G ∈ CFG_Σ,
       1. Verify that G ∈ LL(1) by computing the lookahead sets and checking alternatives for disjointness.
       2. Start with the nondeterministic top-down parsing automaton NTA(G).
       3. Use 1-symbol lookahead to control the choice of expanding productions:
            (aw, Aα, z) ⊢ (aw, βα, zi)   if π_i = A → β and a ∈ la(π_i)
            (ε, Aα, z) ⊢ (ε, βα, zi)     if π_i = A → β and ε ∈ la(π_i)
          [matching steps as before: (aw, aα, z) ⊢ (w, α, z)]
     ⟹ deterministic top-down parsing automaton DTA(G)
     Remarks:
       - DTA(G) is actually not a pushdown automaton (a is read but not consumed). But it can be simulated using the finite control.
       - The advantage of using lookahead is twofold:
           - Removal of nondeterminism
           - Earlier detection of syntax errors (in configurations (aw, Aα, z) where a ∉ ⋃_{A → β ∈ P} la(A → β))
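The expansion and matching steps of DTA(G) can be simulated with an explicit stack. The following Python sketch does this for the right-recursive expression grammar; the production numbering, the hand-computed lookahead sets, the function name `dta_parse`, and the `"$"` end marker (standing in for the ε lookahead of the slides) are assumptions made for illustration and are not taken from the lecture.

```python
END = "$"  # end-of-input marker (assumption; the slides use ε in this role)

# Productions π_i as (left-hand side, right-hand side, lookahead set).
# The lookahead sets below were computed by hand for this sketch.
PRODUCTIONS = {
    1: ("E",  ("T", "E'"),      {"(", "a", "b"}),
    2: ("E'", ("+", "T", "E'"), {"+"}),
    3: ("E'", (),               {")", END}),
    4: ("T",  ("F", "T'"),      {"(", "a", "b"}),
    5: ("T'", ("*", "F", "T'"), {"*"}),
    6: ("T'", (),               {"+", ")", END}),
    7: ("F",  ("(", "E", ")"),  {"("}),
    8: ("F",  ("a",),           {"a"}),
    9: ("F",  ("b",),           {"b"}),
}
NONTERMINALS = {lhs for lhs, _, _ in PRODUCTIONS.values()}

def dta_parse(word, start="E"):
    """Return the leftmost analysis z as a list of production numbers, or None on a syntax error."""
    inp = list(word) + [END]
    pos = 0
    stack = [start]  # the top of the stack is the last list element
    z = []
    while stack:
        top = stack.pop()
        a = inp[pos]  # lookahead symbol: read but not consumed
        if top in NONTERMINALS:
            choice = [i for i, (lhs, _, la) in PRODUCTIONS.items()
                      if lhs == top and a in la]
            if not choice:
                return None  # a lies in no lookahead set for top: early error
            i = choice[0]  # unique, since the lookahead sets are disjoint
            stack.extend(reversed(PRODUCTIONS[i][1]))  # expansion step
            z.append(i)
        elif top == a:
            pos += 1  # matching step consumes the input symbol
        else:
            return None
    return z if inp[pos] == END else None

print(dta_parse("a+b"))
print(dta_parse("a+"))
```

For the input `a+` the sketch stops as soon as the end marker meets the nonterminal T, which illustrates the earlier detection of syntax errors mentioned in the remarks.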

  8. Transformation to LL(1)
     Assume that G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ \ LL(1) (i.e., there exist A → β | γ ∈ P such that la(A → β) ∩ la(A → γ) ≠ ∅).
     Two heuristics for transforming G into G′ ∈ LL(1) (used in parser-generating systems such as ANTLR):
       1. Removal of left recursion
       2. Left factorization
     Remarks:
       - Transformations generally preserve the semantics (= generated language) of CFGs but not the syntactic structure of words (different syntax trees).
       - Transformations cannot always yield an LL(1) grammar (since not every context-free language is generated by an LL grammar; details later).

  11. Left Recursion I
      Definition 8.1 (Left recursion): A grammar G = ⟨N, Σ, P, S⟩ ∈ CFG_Σ is called left recursive if there exist A ∈ N and α ∈ X* such that A ⇒^+ Aα.
      Corollary 8.2: If G ∈ CFG_Σ is left recursive with A ⇒^+ Aα, then there exists β ∈ X* such that A ⇒_l^+ Aβ.
      Example 8.3: The grammar (cf. Example 5.10)
          G_AE:  E → E + T | T
                 T → T * F | F
                 F → ( E ) | a | b
      is left recursive, and in Example 7.4 it was shown that G_AE ∉ LL(1).

  15. Left Recursion II
      Lemma 8.4: If G ∈ CFG_Σ is left recursive, then G ∉ ⋃_{k ∈ ℕ} LL(k).
      Proof (for k = 1): Assume that G ∈ LL(1) is left recursive with A ⇒_l^+ Aβ. Together with the reducedness of G this implies that
          S ⇒_l^* vAα ⇒_l^+ vAβα ⇒_l^+ vw
      for some v, w ∈ Σ* and α ∈ X*. The corresponding computation of DTA(G) (Def. 7.6) starts with
          (vw, S, ε) ⊢* (w, Aα, ...) ⊢+ (w, Aβα, ...).
      But in the last state the behaviour of DTA(G) is determined by the same input (fi(w)) and stack symbol (A). Thus it enters a loop of the form
          (w, Aα, ...) ⊢+ (w, Aβα, ...) ⊢+ (w, Aββα, ...) ⊢+ ...
      and will never recognize w. Contradiction.

  18. Removing Direct Left Recursion
      Direct left recursion occurs in productions of the form
          A → Aα_1 | ... | Aα_m | β_1 | ... | β_n
      where α_i ≠ ε and β_j ≠ A... (i.e., no β_j begins with A).
      Transformation: replacement by right recursion
          A  → β_1 A′ | ... | β_n A′
          A′ → α_1 A′ | ... | α_m A′ | ε
      (with a new A′ ∈ N), which preserves L(G).
      Example 8.5:
          G_AE:  E → E + T | T
                 T → T * F | F
                 F → ( E ) | a | b
      is transformed into
          G′_AE: E  → T E′
                 E′ → + T E′ | ε
                 T  → F T′
                 T′ → * F T′ | ε
                 F  → ( E ) | a | b
      with G′_AE ∈ LL(1) (see Example 7.5).
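The replacement by right recursion is mechanical enough to sketch in a few lines of Python. The grammar encoding (a dict from nonterminals to lists of symbol tuples, with the empty tuple for ε), the function name, and the convention of naming the new nonterminal by appending a prime are assumptions made for this example.

```python
def remove_direct_left_recursion(grammar):
    """Replace A -> A α_i | β_j by A -> β_j A' and A' -> α_i A' | ε.

    Assumes every α_i is non-empty and that the primed names are not yet in use.
    """
    result = {}
    for a, alts in grammar.items():
        alphas = [rhs[1:] for rhs in alts if rhs[:1] == (a,)]  # tails of A -> A α_i
        betas = [rhs for rhs in alts if rhs[:1] != (a,)]       # alternatives β_j
        if not alphas:
            result[a] = list(alts)  # no direct left recursion: keep as is
            continue
        fresh = a + "'"  # hypothetical name for the new nonterminal A'
        result[a] = [beta + (fresh,) for beta in betas]
        result[fresh] = [alpha + (fresh,) for alpha in alphas] + [()]
    return result

G_AE = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",), ("b",)],
}
for lhs, alts in remove_direct_left_recursion(G_AE).items():
    print(lhs, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alts))
```

Applied to G_AE, the sketch yields the productions of G′_AE from Example 8.5. It handles direct left recursion only; indirect cycles through other nonterminals are left untouched.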

  20. Removing Indirect Left Recursion
      Indirect left recursion occurs in productions of the form (n ≥ 1)
          A       → A_1 α_1 | ...
          A_1     → A_2 α_2 | ...
          ...
          A_{n−1} → A_n α_n | ...
          A_n     → Aβ | ...
      Transformation: into Greibach Normal Form with productions of the form
          A → a B_1 ... B_n   (where n ∈ ℕ and each B_i ≠ S)   or   S → ε
      (cf. Formale Systeme, Automaten, Prozesse).

  23. Left Factorization
      Applies to productions of the form
          A → αβ | αγ
      which are problematic if α is "at least as long as the lookahead".
      Transformation: delaying the decision by left factorization
          A  → α A′
          A′ → β | γ
      (with a new A′ ∈ N), which preserves L(G).
      Example 8.6:
          Statement → if Condition then Statement else Statement fi
                    | if Condition then Statement fi
      is transformed into
          Statement → if Condition then Statement S′
          S′        → else Statement fi | fi
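Left factorization can likewise be sketched in Python. The version below groups the alternatives of each nonterminal by their first symbol and pulls out the longest common prefix of each group; this grouping strategy, the grammar encoding (symbol tuples), the function names, and the primed names for new nonterminals are assumptions of this sketch rather than something prescribed by the slides.

```python
def common_prefix(seqs):
    """Longest common prefix of a list of symbol tuples."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)

def left_factor(grammar):
    """Rewrite A -> αβ | αγ as A -> α A', A' -> β | γ until no two alternatives share a first symbol."""
    result = {}
    work = [(a, list(alts)) for a, alts in grammar.items()]
    while work:
        a, alts = work.pop(0)
        groups = {}
        for rhs in alts:
            groups.setdefault(rhs[:1], []).append(rhs)  # group by first symbol
        new_alts = []
        for head, group in groups.items():
            if head == () or len(group) == 1:
                new_alts.extend(group)  # nothing to factor
                continue
            alpha = common_prefix(group)
            fresh = a + "'"  # hypothetical name for the new nonterminal
            while fresh in grammar or fresh in result or any(fresh == n for n, _ in work):
                fresh += "'"
            new_alts.append(alpha + (fresh,))
            work.append((fresh, [rhs[len(alpha):] for rhs in group]))
        result[a] = new_alts
    return result

G_IF = {
    "Statement": [
        ("if", "Condition", "then", "Statement", "else", "Statement", "fi"),
        ("if", "Condition", "then", "Statement", "fi"),
    ],
}
for lhs, alts in left_factor(G_IF).items():
    print(lhs, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alts))
```

On the grammar of Example 8.6 this produces the factored productions shown there, with the new nonterminal called `Statement'` instead of S′.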
