Introduction to Bottom-Up Parsing




  1. Outline
     • Review LL parsing
     • Shift-reduce parsing
     • The LR parsing algorithm
     • Constructing LR parsing tables

     Top-Down Parsing: Review
     • Top-down parsing expands a parse tree from the start symbol to the leaves
       – Always expand the leftmost non-terminal
     • The leaves at any point form a string β A γ
       – β contains only terminals
       – The input string is β b δ
       – The prefix β matches
       – The next token is b
     • Example grammar:
         E → T + E | T
         T → (E) | int | int * T
     • Example input: int * int + int
     [Figure: partial parse tree for int * int + int, rooted at E]

  2. Top-Down Parsing: Review (cont.)
     [Figure: the parse tree for int * int + int after further expansions of the leftmost non-terminal]

     Predictive Parsing: Review
     • A predictive parser is described by a table
       – For each non-terminal A and for each token b we specify a production A → α
       – When trying to expand A we use A → α if b is the token that follows next
     • Once we have the table
       – The parsing algorithm is simple and fast
       – No backtracking is necessary

     Constructing Predictive Parsing Tables
     Consider the state S →* β A γ
       – With b the next token
       – Trying to match β b δ
     There are two possibilities:
     1. Token b belongs to an expansion of A
        • Any A → α can be used if b can start a string derived from α
        • We say that b ∈ First(α)
     Or…

  3. Constructing Predictive Parsing Tables (Cont.)
     2. Token b does not belong to an expansion of A
        – The expansion of A is empty and b belongs to an expansion of γ
        – Means that b can appear after A in a derivation of the form S →* β A b ω
        – We say that b ∈ Follow(A) in this case
        – What productions can we use in this case?
          • Any A → α can be used if α can expand to ε
          • We say that ε ∈ First(A) in this case

     Computing First Sets
     Definition
       First(X) = { b | X →* b α } ∪ { ε | X →* ε }
     Algorithm sketch
     1. First(b) = { b }
     2. ε ∈ First(X) if X → ε is a production
     3. ε ∈ First(X) if X → A1 … An and ε ∈ First(Ai) for 1 ≤ i ≤ n
     4. First(α) ⊆ First(X) if X → A1 … An α and ε ∈ First(Ai) for 1 ≤ i ≤ n

     First Sets: Example
     • Recall the grammar
         E → T X            X → + E | ε
         T → ( E ) | int Y  Y → * T | ε
     • First sets
         First( ( ) = { ( }        First( T ) = { int, ( }
         First( ) ) = { ) }        First( E ) = { int, ( }
         First( int ) = { int }    First( X ) = { +, ε }
         First( + ) = { + }        First( Y ) = { *, ε }
         First( * ) = { * }

     Computing Follow Sets
     • Definition
         Follow(X) = { b | S →* β X b δ }
     • Intuition
       – If X → A B then First(B) ⊆ Follow(A) and Follow(X) ⊆ Follow(B)
       – Also if B →* ε then Follow(X) ⊆ Follow(A)
       – If S is the start symbol then $ ∈ Follow(S)
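The four rules of the First-set algorithm sketch can be run as a fixed-point iteration. The following is a minimal Python sketch, not code from the lecture: the dictionary encoding of the grammar, the use of an empty tuple for an ε-production, and the names `GRAMMAR`, `TERMINALS` and `first_sets` are all assumptions made for illustration.

```python
EPS = "ε"

# Grammar from the First Sets example. Each production body is a tuple of
# symbols; the empty tuple stands for an ε-production (encoding is assumed).
GRAMMAR = {
    "E": [("T", "X")],
    "X": [("+", "E"), ()],
    "T": [("(", "E", ")"), ("int", "Y")],
    "Y": [("*", "T"), ()],
}
TERMINALS = {"+", "*", "(", ")", "int"}


def first_sets(grammar, terminals):
    """Apply the rules of the algorithm sketch until no set grows."""
    first = {t: {t} for t in terminals}            # rule 1: First(b) = { b }
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                before = len(first[nt])
                all_nullable = True
                for sym in body:
                    # rule 4: First of a symbol counts once everything
                    # before it in the body can derive ε
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        all_nullable = False
                        break
                if all_nullable:                   # rules 2 and 3
                    first[nt].add(EPS)
                changed |= len(first[nt]) != before
    return first


FIRST = first_sets(GRAMMAR, TERMINALS)
for symbol in ("E", "T", "X", "Y"):
    print(f"First({symbol}) =", sorted(FIRST[symbol]))
```

The loop stops once a full pass adds nothing, so the result does not depend on the order in which productions are visited.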

  4. Computing Follow Sets (Cont.)
     Algorithm sketch
     1. $ ∈ Follow(S)
     2. First(β) - { ε } ⊆ Follow(X)
        – For each production A → α X β
     3. Follow(A) ⊆ Follow(X)
        – For each production A → α X β where ε ∈ First(β)

     Follow Sets: Example
     • Recall the grammar
         E → T X            X → + E | ε
         T → ( E ) | int Y  Y → * T | ε
     • Recall the First sets
         First( T ) = { int, ( }    First( E ) = { int, ( }
         First( X ) = { +, ε }      First( Y ) = { *, ε }
     • Follow sets
         Follow( + ) = { int, ( }       Follow( * ) = { int, ( }
         Follow( ( ) = { int, ( }       Follow( E ) = { ), $ }
         Follow( X ) = { $, ) }         Follow( T ) = { +, ), $ }
         Follow( ) ) = { +, ), $ }      Follow( Y ) = { +, ), $ }
         Follow( int ) = { *, +, ), $ }

     Constructing LL(1) Parsing Tables
     • Construct a parsing table T for CFG G
     • For each production A → α in G do:
       – For each terminal b ∈ First(α) do T[A, b] = α
       – If ε ∈ First(α), for each b ∈ Follow(A) do T[A, b] = α
       – If ε ∈ First(α) and $ ∈ Follow(A) do T[A, $] = α

     Constructing LL(1) Tables: Example
     • Recall the grammar
         E → T X            X → + E | ε
         T → ( E ) | int Y  Y → * T | ε
     • Where in the row of Y do we put Y → * T ?
       – In the columns of First(*T) = { * }
     • Where in the row of Y do we put Y → ε ?
       – In the columns of Follow(Y) = { $, +, ) }

  5. Notes on LL(1) Parsing Tables
     • If any entry is multiply defined then G is not LL(1)
       – If G is ambiguous
       – If G is left recursive
       – If G is not left-factored
       – And in other cases as well
     • For some grammars there is a simple parsing strategy: Predictive parsing
     • Most programming language grammars are not LL(1)
     • Thus, we need more powerful parsing strategies

     Bottom-Up Parsing
     • Bottom-up parsing is more general than top-down parsing
       – And just as efficient
       – Builds on ideas in top-down parsing
       – Preferred method in practice
     • Also called LR parsing
       – L means that tokens are read left to right
       – R means that it constructs a rightmost derivation!

     An Introductory Example
     • LR parsers don't need left-factored grammars and can also handle left-recursive grammars
     • Consider the following grammar:
         E → E + ( E ) | int
       – Why is this not LL(1)?
     • Consider the string: int + ( int ) + ( int )
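One way to see why E → E + ( E ) | int is not LL(1) is to try filling the predictive table and watch a cell receive two productions. The Python sketch below does only that. It assumes the grammar has no ε-productions, so the First set of a production body is just the First set of its first symbol; the names `first_sets`, `table_entries` and `CONFLICTS` are hypothetical.

```python
from collections import defaultdict

# The left-recursive grammar from the introductory example.
GRAMMAR = {"E": [("E", "+", "(", "E", ")"), ("int",)]}
TERMINALS = {"+", "(", ")", "int"}


def first_sets():
    """First sets by fixed-point iteration.

    Assumption: no ε-productions, so only the first symbol of each
    body contributes.
    """
    first = {t: {t} for t in TERMINALS}
    first.update({nt: set() for nt in GRAMMAR})
    changed = True
    while changed:
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                new = first[body[0]] - first[nt]
                if new:
                    first[nt] |= new
                    changed = True
    return first


def table_entries():
    """Collect every production that would be placed in each cell T[A, b]."""
    first = first_sets()
    entries = defaultdict(list)
    for nt, bodies in GRAMMAR.items():
        for body in bodies:
            for b in first[body[0]]:
                entries[nt, b].append(body)
    return dict(entries)


ENTRIES = table_entries()
CONFLICTS = {cell: bodies for cell, bodies in ENTRIES.items() if len(bodies) > 1}
for (nt, b), bodies in CONFLICTS.items():
    print(f"T[{nt}, {b}] is multiply defined:",
          " | ".join(" ".join(body) for body in bodies))
```

Both productions can start with int, so T[E, int] is multiply defined. This is the left-recursion case from the notes on LL(1) tables.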

  6. The Idea
     • LR parsing reduces a string to the start symbol by inverting productions:
         str ← input string of terminals
         repeat
           – Identify β in str such that A → β is a production (i.e., str = α β γ)
           – Replace β by A in str (i.e., str ← α A γ)
         until str = S (the start symbol) OR all possibilities are exhausted

     A Bottom-up Parse in Detail (1) to (3)
     Grammar: E → E + ( E ) | int
         int + (int) + (int)      (1)
         E + (int) + (int)        (2)
         E + (E) + (int)          (3)
     [Figure: parse tree built upward from the leaves int + ( int ) + ( int )]
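The repeat/until loop of "The Idea" can be written as a small backtracking search over sentential forms. This is a naive Python sketch and not the LR algorithm itself: the order in which productions and positions are tried, the `seen` set that avoids revisiting forms, and the name `reduce_to_start` are all assumptions.

```python
# Productions as (left-hand side, body) pairs for E → E + ( E ) | int.
PRODUCTIONS = [("E", ("E", "+", "(", "E", ")")), ("E", ("int",))]
START = "E"


def reduce_to_start(form, seen=None):
    """Return the list of forms from `form` down to (START,), or None.

    Tries every occurrence of every production body (β) and replaces it
    by the left-hand side (A), backtracking when a choice leads nowhere.
    """
    seen = set() if seen is None else seen
    if form == (START,):
        return [form]
    if form in seen:                 # this form was already explored
        return None
    seen.add(form)
    for lhs, rhs in PRODUCTIONS:
        n = len(rhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == rhs:
                reduced = form[:i] + (lhs,) + form[i + n:]
                rest = reduce_to_start(reduced, seen)
                if rest is not None:
                    return [form] + rest
    return None                      # all possibilities are exhausted


tokens = tuple("int + ( int ) + ( int )".split())
for step in reduce_to_start(tokens):
    print(" ".join(step))
```

With this search order the printed forms coincide with the ones in the detailed parse. In general a blind search may need to backtrack, which is what the LR machinery avoids.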

  7. A Bottom-up Parse in Detail (4) to (6)
     Grammar: E → E + ( E ) | int
         int + (int) + (int)
         E + (int) + (int)
         E + (E) + (int)
         E + (int)                (4)
         E + (E)                  (5)
         E                        (6)
     [Figure: completed parse tree with root E over the leaves int + ( int ) + ( int )]

     Important Fact #1 about Bottom-up Parsing
     An LR parser traces a rightmost derivation in reverse
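Fact #1 can be checked mechanically on the example. Read the reduction sequence bottom to top; each step should then expand the rightmost non-terminal of the previous form. The Python sketch below performs that check. The helper name `is_rightmost_step` and the tuple encoding of sentential forms are assumptions for illustration.

```python
PRODUCTIONS = {"E": [("E", "+", "(", "E", ")"), ("int",)]}

# The reduction sequence from the detailed parse, in the order it was performed.
REDUCTIONS = [
    "int + ( int ) + ( int )",
    "E + ( int ) + ( int )",
    "E + ( E ) + ( int )",
    "E + ( int )",
    "E + ( E )",
    "E",
]


def is_rightmost_step(before, after):
    """True if `after` is `before` with its rightmost non-terminal expanded."""
    positions = [i for i, sym in enumerate(before) if sym in PRODUCTIONS]
    if not positions:
        return False
    i = positions[-1]                # index of the rightmost non-terminal
    return any(before[:i] + body + before[i + 1:] == after
               for body in PRODUCTIONS[before[i]])


# Reverse the reductions to obtain a derivation starting from E.
derivation = [tuple(form.split()) for form in reversed(REDUCTIONS)]
for before, after in zip(derivation, derivation[1:]):
    print(" ".join(before), "⇒", " ".join(after),
          "| rightmost:", is_rightmost_step(before, after))
```

Every step reports `rightmost: True`. Had the parse first reduced the middle int and later the first one, the reversed sequence would contain a step that expands a non-rightmost E and the check would fail.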

  8. Where Do Reductions Happen
     Fact #1 has an interesting consequence:
       – Let αβγ be a step of a bottom-up parse
       – Assume the next reduction is by using A → β
       – Then γ is a string of terminals
     Why? Because αAγ → αβγ is a step in a right-most derivation

     Notation
     • Idea: Split string into two substrings
       – Right substring is as yet unexamined by parsing (a string of terminals)
       – Left substring has terminals and non-terminals
     • The dividing point is marked by a |
       – The | is not part of the string
     • Initially, all input is unexamined: | x1 x2 . . . xn

     Shift-Reduce Parsing
     Bottom-up parsing uses only two kinds of actions:
       Shift
       Reduce

     Shift
     Shift: Move | one place to the right
       – Shifts a terminal to the left string
           E + ( | int ) ⇒ E + ( int | )
     In general:
           ABC | xyz ⇒ ABCx | yz

  9. Reduce
     Reduce: Apply an inverse production at the right end of the left string
       – If E → E + ( E ) is a production, then
           E + ( E + ( E ) | ) ⇒ E + ( E | )
     In general, given A → xy, then:
           Cbxy | ijk ⇒ CbA | ijk

     Shift-Reduce Example
     Grammar: E → E + ( E ) | int
         | int + (int) + (int)$       shift
         int | + (int) + (int)$       reduce E → int
         E | + (int) + (int)$         shift 3 times
     [Figure: parse tree under construction over the leaves int + ( int ) + ( int )]
