Bottom-Up Syntax Analysis
Wilhelm/Maurer: Compiler Design, Chapter 8
Reinhard Wilhelm, Universität des Saarlandes, wilhelm@cs.uni-sb.de
Mooly Sagiv, Tel Aviv University, sagiv@math.tau.ac.il
Subjects
◮ Functionality and Method
◮ Example Parsers
◮ Derivation of a Parser
◮ Conflicts
◮ LR(k)-Grammars
◮ LR(1)-Parser Generation
◮ Bison
Bottom-Up Syntax Analysis
Input: a stream of symbols (tokens)
Output: a syntax tree or an error
Method: until the input is consumed or an error is found, do
◮ shift the next symbol or reduce by some production
◮ decide what to do by looking one symbol ahead
Properties
◮ Constructs the syntax tree in a bottom-up manner
◮ Finds the rightmost derivation (in reverse order)
◮ Reports an error as soon as the already read part of the input is not a prefix of a program (valid prefix property)
Parsing aabb in the grammar G_ab with S → aSb | ε
Stack | Input | Action | Dead ends
$ | aabb # | shift | reduce S → ε
$ a | abb # | shift | reduce S → ε
$ aa | bb # | reduce S → ε | shift
$ aaS | bb # | shift | reduce S → ε
$ aaSb | b # | reduce S → aSb | shift, reduce S → ε
$ aS | b # | shift | reduce S → ε
$ aSb | # | reduce S → aSb | reduce S → ε
$ S | # | accept | reduce S → ε
Issues:
◮ Shift vs. Reduce
◮ Reduce A → β vs. Reduce B → αβ
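The successful run in the table above can be replayed mechanically. The following is a minimal Python sketch, not part of the slides: the helper `replay` and the action encoding (`"shift"` or `("reduce", lhs, rhs)`) are assumptions made for illustration. It only checks that each reduction finds its right side on top of the stack; it does not choose the actions itself.

```python
def replay(tokens, actions):
    """Apply shift/reduce actions; return the visited (stack, rest) configurations."""
    stack, rest = [], list(tokens)
    trace = [("".join(stack), "".join(rest))]
    for action in actions:
        if action == "shift":
            if not rest:
                raise ValueError("shift on empty input")
            stack.append(rest.pop(0))          # consume the next input symbol
        else:
            _, lhs, rhs = action               # e.g. ("reduce", "S", "aSb")
            n = len(rhs)
            if n and stack[-n:] != list(rhs):
                raise ValueError(f"right side {rhs!r} is not on top of the stack")
            if n:
                del stack[-n:]                 # remove the right side ...
            stack.append(lhs)                  # ... and replace it by the left side
        trace.append(("".join(stack), "".join(rest)))
    return trace

# The action column of the table, top to bottom (without the final accept).
GOOD = ["shift", "shift", ("reduce", "S", ""), "shift",
        ("reduce", "S", "aSb"), "shift", ("reduce", "S", "aSb")]

trace = replay("aabb", GOOD)
for stack, rest in trace:
    print(f"${stack:<5} {rest}#")
```

Starting instead with `("reduce", "S", "")` leaves an S at the bottom of the stack that can never become part of a right side a S b, which is why the table lists that reduction as a dead end.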
Parsing aa in the grammar S → AB, S → A, A → a, B → a
Stack | Input | Action | Dead ends
$ | aa # | shift |
$ a | a # | reduce A → a | reduce B → a, shift
$ A | a # | shift | reduce S → A
$ Aa | # | reduce B → a | reduce A → a
$ AB | # | reduce S → AB |
$ S | # | accept |
Issues:
◮ Shift vs. Reduce
◮ Reduce A → β vs. Reduce B → αβ
Shift-Reduce Parsers
◮ The bottom-up parser is a shift-reduce parser. Each step is either
  a shift: consuming the next input symbol, or
  a reduction: reducing a suffix of the stack contents by some production.
◮ The problem is to decide when to stop shifting and make a reduction instead.
◮ The next right side to reduce is called a "handle". Reducing too early leads to a dead end; reducing too late buries the handle.
LR-Parsers: Deterministic Shift-Reduce Parsers
The parser decides whether to shift or to reduce based on
◮ the contents of the stack and
◮ k symbols of lookahead into the rest of the input.
Property of the LR-parser: it suffices to consider the topmost state on the stack instead of the whole stack contents.
From P_G to LR-Parsers for G
◮ P_G has a non-deterministic choice of expansions,
◮ LL-parsers eliminate the non-determinism by looking ahead at expansions,
◮ LR-parsers follow all possibilities in parallel (this corresponds to the subset construction in NFA → DFA).
Derivation
1. Characteristic finite automaton of P_G, a description of P_G
2. Make deterministic
3. Interpret as control of a pushdown automaton
4. Check for "inadequate" states
Characteristic Finite Automaton of P_G
The NFA char(P_G) = (Q_c, V_c, Δ_c, q_c, F_c) is the characteristic finite automaton of P_G:
◮ Q_c = It_G (states: the items of G)
◮ V_c = V_T ∪ V_N (input alphabet: the terminal and non-terminal symbols)
◮ q_c = [S′ → .S] (start state)
◮ F_c = { [X → α.] | X → α ∈ P } (final states: the complete items)
◮ Δ_c = { ([X → α.Yβ], Y, [X → αY.β]) | X → αYβ ∈ P and Y ∈ V_N ∪ V_T } ∪ { ([X → α.Yβ], ε, [Y → .γ]) | X → αYβ ∈ P and Y → γ ∈ P }
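The definition translates almost literally into code. The following Python sketch is an illustration under assumptions: the `Item` tuple, the function name `char_nfa`, and the use of `None` as the ε label are hypothetical choices, not notation from the slides.

```python
from collections import namedtuple

# An item [lhs -> rhs[:dot] . rhs[dot:]]
Item = namedtuple("Item", "lhs rhs dot")

def char_nfa(productions, start):
    """Build (items, start_item, final_items, transitions) for a grammar.

    productions: list of (lhs, rhs) pairs, rhs a tuple of symbols.
    transitions: set of (item, label, item); label None stands for epsilon.
    """
    items = [Item(l, r, d) for l, r in productions for d in range(len(r) + 1)]
    nonterminals = {l for l, _ in productions}
    delta = set()
    for it in items:
        if it.dot == len(it.rhs):
            continue                                   # complete item: no outgoing edge
        y = it.rhs[it.dot]
        delta.add((it, y, Item(it.lhs, it.rhs, it.dot + 1)))    # reading transition under Y
        if y in nonterminals:
            for l, r in productions:
                if l == y:
                    delta.add((it, None, Item(l, r, 0)))        # epsilon transition to [Y -> .gamma]
    start_item = next(Item(l, r, 0) for l, r in productions if l == start)
    finals = {it for it in items if it.dot == len(it.rhs)}
    return items, start_item, finals, delta

# G_ab extended by S' -> S
G_AB = [("S'", ("S",)), ("S", ("a", "S", "b")), ("S", ())]
items, q0, finals, delta = char_nfa(G_AB, "S'")
print(len(items), "items,", len(delta), "transitions")
```

For G_ab this yields seven items, three of them complete, four reading transitions and four ε-transitions.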
Item PDA P_{G_ab} for G_ab: S → aSb | ε
Stack | Input | New Stack
[S′ → .S] | ε | [S′ → .S] [S → .aSb]
[S′ → .S] | ε | [S′ → .S] [S → .]
[S → .aSb] | a | [S → a.Sb]
[S → a.Sb] | ε | [S → a.Sb] [S → .aSb]
[S → a.Sb] | ε | [S → a.Sb] [S → .]
[S → aS.b] | b | [S → aSb.]
[S → a.Sb] [S → .] | ε | [S → aS.b]
[S → a.Sb] [S → aSb.] | ε | [S → aS.b]
[S′ → .S] [S → aSb.] | ε | [S′ → S.]
[S′ → .S] [S → .] | ε | [S′ → S.]
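The Characteristic NFA char(P_{G_ab})
Transitions (start state [S′ → .S]):
[S′ → .S] --S--> [S′ → S.]
[S′ → .S] --ε--> [S → .aSb]
[S′ → .S] --ε--> [S → .]
[S → .aSb] --a--> [S → a.Sb]
[S → a.Sb] --S--> [S → aS.b]
[S → aS.b] --b--> [S → aSb.]
[S → a.Sb] --ε--> [S → .aSb]
[S → a.Sb] --ε--> [S → .]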
Characteristic NFA for G_0
Grammar G_0:
S → E
E → E + T | T
T → T ∗ F | F
F → ( E ) | id
Reading transitions:
[S → .E] --E--> [S → E.]
[E → .E + T] --E--> [E → E. + T] --+--> [E → E + .T] --T--> [E → E + T.]
[E → .T] --T--> [E → T.]
[T → .T ∗ F] --T--> [T → T. ∗ F] --∗--> [T → T ∗ .F] --F--> [T → T ∗ F.]
[T → .F] --F--> [T → F.]
[F → .( E )] --(--> [F → ( .E )] --E--> [F → ( E. )] --)--> [F → ( E ).]
[F → .id] --id--> [F → id.]
ε-transitions: from every item with the dot in front of E to [E → .E + T] and [E → .T], from every item with the dot in front of T to [T → .T ∗ F] and [T → .F], and from every item with the dot in front of F to [F → .( E )] and [F → .id].
Interpreting char(P_G)
The state of char(P_G) is the current state of P_G, i.e. the state on top of P_G's stack. Actions are added to the transitions and states of char(P_G) to describe P_G:
ε-transitions: push the new state of char(P_G) onto the stack of P_G; it becomes the new current state.
reading transitions: correspond to the reading transitions of P_G; replace the current state of P_G by the shifted one.
final state: actions in P_G:
◮ pop the final state [X → α.] from the stack,
◮ do a transition from the new topmost state under X,
◮ push the new state onto the stack.
The Handle Revisited
◮ The bottom-up parser is a shift-reduce parser. Each step is either
  a shift: consuming the next input symbol, making a transition under it from the current state, and pushing the new state onto the stack, or
  a reduction: reducing a suffix of the stack contents by some production, making a transition under the left-side non-terminal from the new current state, and pushing the new state.
◮ The problem is the localization of the "handle", the next right side to reduce. Reducing too early leads to a dead end; reducing too late buries the handle.
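The two kinds of steps can be sketched as a small table-driven driver for G_ab: S → aSb | ε. Everything concrete here is an assumption for illustration: the state numbers 0 to 4, the hand-written `ACTION` and `GOTO` tables (reductions on b and #, the symbols that can follow S), and the end marker `#`. Note that the driver only ever inspects the topmost state.

```python
# Hand-written tables for G_ab (hypothetical state numbering).
# ("reduce", lhs, n): pop n states, then follow GOTO under lhs.
ACTION = {
    (0, "a"): ("shift", 2), (0, "b"): ("reduce", "S", 0), (0, "#"): ("reduce", "S", 0),
    (1, "#"): ("accept",),
    (2, "a"): ("shift", 2), (2, "b"): ("reduce", "S", 0), (2, "#"): ("reduce", "S", 0),
    (3, "b"): ("shift", 4),
    (4, "b"): ("reduce", "S", 3), (4, "#"): ("reduce", "S", 3),
}
GOTO = {(0, "S"): 1, (2, "S"): 3}

def parse(tokens):
    """Return True iff tokens is a word of G_ab."""
    stack, rest = [0], list(tokens) + ["#"]
    while True:
        act = ACTION.get((stack[-1], rest[0]))     # decision from topmost state + lookahead
        if act is None:
            return False                           # no action defined: error
        if act[0] == "accept":
            return True
        if act[0] == "shift":
            stack.append(act[1])                   # push the new state
            rest.pop(0)                            # consume the input symbol
        else:
            _, lhs, length = act
            if length:
                del stack[-length:]                # pop the states of the right side
            stack.append(GOTO[(stack[-1], lhs)])   # transition under the left side

print(parse("aabb"), parse("aab"))
```

The error for aab is reported when # is seen in state 3, where only b can be shifted.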
Handles and Viable Prefixes
Some abbreviations:
RMD: rightmost derivation
RSF: right sentential form
Let S′ ⇒*_rm βXu ⇒_rm βαu be an RMD of the cfg G.
◮ α is a handle of βαu: the part of an RSF that is next to be reduced.
◮ Each prefix of βα is a viable prefix: a prefix of an RSF stretching at most up to the end of the handle, i.e. reductions, if possible at all, then only at the end.
Examples in G_0
RSF | handle | viable prefixes | Reason
E + F | F | E, E +, E + F | S ⇒_rm E ⇒_rm E + T ⇒_rm E + F
T ∗ id | id | T, T ∗, T ∗ id | S ⇒^3_rm T ∗ F ⇒_rm T ∗ id
F ∗ id | F | F | S ⇒^4_rm T ∗ id ⇒_rm F ∗ id
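Once β and the handle α of an RSF βαu are known, the viable prefixes in the table follow directly from the definition. A minimal Python sketch, with a hypothetical helper name; like the table, it leaves out the empty prefix:

```python
def viable_prefixes(beta, alpha):
    """Non-empty prefixes of beta + alpha, where alpha is the handle of the RSF beta alpha u."""
    prefix = list(beta) + list(alpha)
    return [prefix[:i] for i in range(1, len(prefix) + 1)]

# Rows of the table: (beta, handle)
for beta, alpha in [(["E", "+"], ["F"]), (["T", "*"], ["id"]), ([], ["F"])]:
    print([" ".join(p) for p in viable_prefixes(beta, alpha)])
```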
Valid Items
[X → α.β] is valid for the viable prefix γα if there exists an RMD S′ ⇒*_rm γXw ⇒_rm γαβw.
An item valid for a viable prefix gives one interpretation of the parsing situation.
Some viable prefixes of G_0:
Viable prefix | Valid item | Reason | γ | w | X | α | β
E + | [E → E + .T] | S ⇒_rm E ⇒_rm E + T | ε | ε | E | E + | T
E + | [T → .F] | S ⇒*_rm E + T ⇒_rm E + F | E + | ε | T | ε | F
E + | [F → .id] | S ⇒*_rm E + F ⇒_rm E + id | E + | ε | F | ε | id
( E + ( | [F → ( .E )] | S ⇒*_rm ( E + F ) ⇒_rm ( E + ( E ) ) | ( E + | ) | F | ( | E )
Valid Items and Parsing Situations
Given some input string xuvw. The RMD
S′ ⇒*_rm γXw ⇒_rm γαβw ⇒*_rm γαvw ⇒*_rm γuvw ⇒*_rm xuvw
describes the following sequence of partial derivations:
1. γ ⇒*_rm x
2. α ⇒*_rm u
3. β ⇒*_rm v
4. X ⇒_rm αβ
5. S′ ⇒*_rm γXw
executed by the bottom-up parser in this order. The valid item [X → α.β] for the viable prefix γα describes the situation after partial derivation 2.
Theorems
char(P_G) = (Q_c, V_c, Δ_c, q_c, F_c)
Theorem: For each viable prefix there is at least one valid item.
(Every parsing situation is described by at least one valid item.)
Theorem: Let γ ∈ (V_T ∪ V_N)* and q ∈ Q_c. Then (q_c, γ) ⊢*_char(P_G) (q, ε) iff γ is a viable prefix and q is a valid item for γ.
(A viable prefix brings char(P_G) from its initial state to all its valid items.)
Theorem: The language of viable prefixes of a cfg is regular.
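The second theorem suggests a direct way to compute the valid items of a prefix: run char(P_G) on it and collect every state that can be reached. The following Python sketch does this for G_0 (with start symbol S, as in the G_0 examples); the names `Item`, `closure` and `valid_items` are hypothetical, and `*` stands for ∗.

```python
from collections import namedtuple

Item = namedtuple("Item", "lhs rhs dot")   # [lhs -> rhs[:dot] . rhs[dot:]]

G0 = [("S", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
      ("T", ("T", "*", "F")), ("T", ("F",)),
      ("F", ("(", "E", ")")), ("F", ("id",))]

def closure(items, productions):
    """Follow epsilon-transitions: add [Y -> .gamma] for each item with the dot before Y."""
    result, todo = set(items), list(items)
    while todo:
        it = todo.pop()
        if it.dot < len(it.rhs):
            for lhs, rhs in productions:
                new = Item(lhs, rhs, 0)
                if lhs == it.rhs[it.dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return result

def valid_items(gamma, productions, start):
    """All items char(P_G) can reach on gamma; the empty set means gamma is not viable."""
    current = closure({Item(l, r, 0) for l, r in productions if l == start}, productions)
    for symbol in gamma:
        moved = {Item(it.lhs, it.rhs, it.dot + 1) for it in current
                 if it.dot < len(it.rhs) and it.rhs[it.dot] == symbol}
        current = closure(moved, productions)
    return current

for it in sorted(valid_items(["E", "+"], G0, "S")):
    print(f"[{it.lhs} -> {' '.join(it.rhs[:it.dot] + ('.',) + it.rhs[it.dot:])}]")
```

For the prefix E + this reports five items, among them [E → E + .T], [T → .F] and [F → .id]. A prefix such as E ) yields the empty set, so it is not viable.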
Making char(P_G) deterministic
Apply NFA → DFA to char(P_G). Result: LR-DFA(G).
Example: char(P_{G_ab}), with the states [S′ → .S], [S′ → S.], [S → .aSb], [S → a.Sb], [S → aS.b], [S → aSb.], [S → .], reading transitions under S, a, S, b, and four ε-transitions.
LR-DFA(G_ab): [figure]
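The subset construction on char(P_{G_ab}) can be sketched in a few lines of Python. The function names and the state numbering are assumptions of this sketch (the numbers simply reflect discovery order); ε-transitions are handled by a closure step, reading transitions by moving the dot.

```python
from collections import namedtuple

Item = namedtuple("Item", "lhs rhs dot")   # [lhs -> rhs[:dot] . rhs[dot:]]

def closure(items, productions):
    """Epsilon closure: add [Y -> .gamma] for each item with the dot before Y."""
    result, todo = set(items), list(items)
    while todo:
        it = todo.pop()
        if it.dot < len(it.rhs):
            for lhs, rhs in productions:
                new = Item(lhs, rhs, 0)
                if lhs == it.rhs[it.dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return result

def lr_dfa(productions, start):
    """Subset construction on char(P_G): returns (states, transitions)."""
    q0 = frozenset(closure({Item(l, r, 0) for l, r in productions if l == start}, productions))
    states, trans, todo = [q0], {}, [q0]
    while todo:
        state = todo.pop()
        symbols = {it.rhs[it.dot] for it in state if it.dot < len(it.rhs)}
        for x in sorted(symbols):
            moved = {Item(it.lhs, it.rhs, it.dot + 1) for it in state
                     if it.dot < len(it.rhs) and it.rhs[it.dot] == x}
            target = frozenset(closure(moved, productions))
            if target not in states:
                states.append(target)
                todo.append(target)
            trans[(states.index(state), x)] = states.index(target)
    return states, trans

G_AB = [("S'", ("S",)), ("S", ("a", "S", "b")), ("S", ())]
states, trans = lr_dfa(G_AB, "S'")
for number, state in enumerate(states):
    print(number, sorted(state))
print(trans)
```

For G_ab the construction produces five states. The two states containing [S → .aSb] also contain the complete item [S → .], so shifting a and reducing by S → ε are both possible there without further information.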