Pushdown Automata and Parser
Sebastian Hack (based on slides by Reinhard Wilhelm and Mooly Sagiv)
http://compilers.cs.uni-saarland.de
Compiler Construction Core Course 2017, Saarland University
Pushdown Automata
[Figure: pushdown automaton with the components Input, Head, Control, and Stack.]
The stack is memory that is unboundedly extensible at one end: it grows (by push), shrinks (by pop), and can be tested for emptiness.
Example Automaton
• Accepted language: L = { aⁱbⁱ | i ≥ 0 }
• Context-free grammar: S → aSb | ε
• Pushdown automaton (TOS = top of stack) with the states:
  state 0: initial state
  state 1: reading a's
  state 2: reading b's
  state 3: error state
  state 4: final state
[Transition table: rows indexed by TOS, columns by the input symbols a, b, $; entries not recoverable.]
Pushdown Automaton (PDA)
Definition: a tuple P = (V, Q, ∆, q₀, F) where:
• V is the input alphabet
• Q is the finite set of states (which also serve as stack symbols)
• q₀ ∈ Q is the initial state
• F ⊆ Q is the set of final states
• ∆ ⊆ (Q⁺ × (V ∪ {ε})) × Q* is the transition relation
• Alternatively: δ : (Q⁺ × (V ∪ {ε})) → 2^(Q*), where δ is a partial function
The Language Accepted by a PDA
• Let P = (V, Q, ∆, q₀, F) be a PDA
• For γ ∈ Q⁺ and w ∈ V*, the pair (γ, w) is a configuration
• The binary step relation ⊢_P on configurations is defined by: (γ, aw) ⊢_P (γ′, w) if there exist γ₁, γ₂, γ₃ such that
  • γ = γ₁γ₂
  • γ′ = γ₁γ₃
  • (γ₂, a, γ₃) ∈ ∆
  Here a ∈ V ∪ {ε}, so a step may also leave the input untouched.
• ⊢*_P is the reflexive transitive closure of ⊢_P
• The language accepted by P is L(P) = { w ∈ V* | ∃ q_f ∈ F : (q₀, w) ⊢*_P (q_f, ε) }
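The step relation and the acceptance condition can be sketched in a few lines of Python. The automaton below is a hypothetical PDA for { aⁱbⁱ | i ≥ 0 } written for this illustration; it is not the transition table of the example slide. It assumes that every state is a single character, so that a stack is a string whose last character is the top, and it searches all configurations breadth-first, which terminates here only because this particular ∆ has no ε-cycles.

```python
from collections import deque

EPS = ""  # the empty word ε

# Hypothetical PDA for { a^i b^i | i >= 0 }; each state is one character.
# A transition (g2, a, g3) replaces the stack suffix g2 by g3 while reading a.
DELTA = {
    ("0", "a", "01"),  # first a: push 1
    ("1", "a", "11"),  # further a's: push another 1
    ("1", "b", "2"),   # first b: replace the top 1 by 2
    ("12", "b", "2"),  # further b's: remove one 1 below the 2
    ("02", EPS, "4"),  # all a's matched: move to the final state
    ("0", EPS, "4"),   # empty word
}
Q0, FINAL = "0", {"4"}


def steps(config, delta):
    """Yield every configuration reachable from config in one step."""
    stack, word = config
    for g2, a, g3 in delta:
        if stack.endswith(g2) and word.startswith(a):
            yield stack[: len(stack) - len(g2)] + g3, word[len(a):]


def accepts(word, delta=DELTA, q0=Q0, final=FINAL):
    """Breadth-first search over configurations starting at (q0, word)."""
    start = (q0, word)
    seen, queue = {start}, deque([start])
    while queue:
        stack, rest = queue.popleft()
        if stack in final and rest == EPS:
            return True  # reached (q_f, ε) for some final state q_f
        for nxt in steps((stack, rest), delta):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

Calling accepts("aabb") follows the run 0, 01, 011, 012, 02, 4, while accepts("aab") gets stuck with the stack 012 and no input left.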
Deterministic Pushdown Automaton
A PDA is deterministic if both of the following hold:
• For every a ∈ V: if (γ₁, a, γ₂), (γ₁′, a, γ₂′) ∈ ∆ and γ₁′ is a suffix of γ₁, then γ₁ = γ₁′ and γ₂ = γ₂′
• There are no two distinct transitions (γ₁, ε, γ₂), (γ₁′, a, γ₂′) ∈ ∆ with a ∈ V ∪ {ε} such that γ₁′ is a suffix of γ₁ or vice versa
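The two conditions can be checked mechanically for a finite ∆. The following is a minimal sketch under two assumptions that are not part of the slide: states are single characters (so stack contents are strings and "suffix" is str.endswith), and the conditions are read as ranging over pairs of distinct transitions.

```python
from itertools import combinations

EPS = ""  # the empty word ε


def suffix_related(g1, g2):
    """True if one stack pattern is a suffix of the other."""
    return g1.endswith(g2) or g2.endswith(g1)


def is_deterministic(delta):
    """Check both determinism conditions on a set of (g1, a, g2) triples."""
    for (g1, a, _), (h1, b, _) in combinations(set(delta), 2):
        # Two distinct transitions compete if they read the same symbol,
        # or if at least one of them is an ε-transition.
        competing = a == b or a == EPS or b == EPS
        if competing and suffix_related(g1, h1):
            return False
    return True


# Hypothetical example: a PDA for { a^i b^i | i >= 1 } ...
DET = {("0", "a", "01"), ("1", "a", "11"), ("1", "b", "2"),
       ("12", "b", "2"), ("02", EPS, "4")}
# ... and the same PDA with an extra ε-transition that also accepts ε.
NONDET = DET | {("0", EPS, "4")}
```

With these definitions, DET passes the check, while NONDET fails because ("0", ε, "4") competes with ("0", "a", "01").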
Theoretical Results
Theorem: For every context-free grammar G there exists a nondeterministic pushdown automaton P such that L(G) = L(P).
Proof: by construction of a PDA that emulates the grammar.
Context-Free Items
• A (context-free) item is a triple (A, α, β) where A → αβ ∈ P is a production of the grammar
• An item (A, α, β) is denoted by [A → α.β]
• Interpretation: "In an attempt to recognize a word for A, a word for α has already been recognized"
• α is the history of the item [A → α.β]
• [A → α.] is a complete item
• IT_G is the set of items of G
• hist([A₁ → α₁.β₁][A₂ → α₂.β₂] … [Aₙ → αₙ.βₙ]) = α₁α₂…αₙ
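As a small illustration, items and their histories can be represented directly as triples. The encoding below is an assumption made for this sketch: a production is a pair (A, rhs) with rhs a tuple of symbols, and the sample grammar S′ → S, S → ε, S → aSb is chosen only as an example.

```python
def items(productions):
    """All items (A, alpha, beta) such that A -> alpha beta is a production."""
    return [
        (lhs, rhs[:i], rhs[i:])
        for lhs, rhs in productions
        for i in range(len(rhs) + 1)  # one item per dot position
    ]


def show(item):
    """Render an item in the bracket notation [A → α.β]."""
    lhs, alpha, beta = item
    return f"[{lhs} → {''.join(alpha)}.{''.join(beta)}]"


def hist(stack):
    """Concatenate the histories (the alpha parts) of a sequence of items."""
    return tuple(sym for _, alpha, _ in stack for sym in alpha)


# Example grammar (an assumption for illustration).
PRODUCTIONS = [("S'", ("S",)), ("S", ()), ("S", ("a", "S", "b"))]
```

A production with a right-hand side of length n contributes n + 1 items, so this grammar has 2 + 1 + 4 = 7 items.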
The Item Pushdown Automaton
• A context-free grammar G = (V_N, V_T, P, S)
• Extended grammar: add a new nonterminal S′ and the production S′ → S
• P_G = (V_T, IT_G, δ, [S′ → .S], { [S′ → S.] })
• Control δ:

  TOS                   input   new TOS                comment
  [X → β.Yγ]            ε       [X → β.Yγ][Y → .α]     "expand", for each Y → α ∈ P
  [X → β.aγ]            a       [X → βa.γ]             "shift"
  [X → β.Yγ][Y → α.]    ε       [X → βY.γ]             "reduce"

Source of nondeterminism: the expansion transitions, since there may be several productions for Y.
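The three kinds of transitions can be generated mechanically from the productions. The sketch below assumes a hypothetical encoding: productions are pairs (A, rhs) with rhs a tuple of symbols, items are triples (A, α, β), a transition is a triple (old TOS, input, new TOS) with tuples of items on both sides, and "" stands for ε. The function name build_delta is made up for this example.

```python
def build_delta(productions, nonterminals):
    """Return the (expand, shift, reduce) transition lists of the item PDA."""
    all_items = [
        (lhs, rhs[:i], rhs[i:])
        for lhs, rhs in productions
        for i in range(len(rhs) + 1)
    ]
    expand, shift, reduce_ = [], [], []
    for item in all_items:
        lhs, alpha, beta = item
        if not beta:
            continue  # complete items only occur on top in reduce steps
        sym, rest = beta[0], beta[1:]
        advanced = (lhs, alpha + (sym,), rest)  # dot moved over sym
        if sym in nonterminals:
            for y, rhs in productions:
                if y == sym:
                    # expand: push the initial item of Y -> rhs
                    expand.append(((item,), "", (item, (y, (), rhs))))
                    # reduce: replace item and the complete Y-item by advanced
                    reduce_.append(((item, (y, rhs, ())), "", (advanced,)))
        else:
            # shift: read the terminal after the dot
            shift.append(((item,), sym, (advanced,)))
    return expand, shift, reduce_
```

For the extended grammar S′ → S, S → ε, S → aSb, this yields four expand, two shift, and four reduce transitions. Several expand entries share the same old TOS, which is exactly the nondeterminism noted above.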
Example
P = { 1: S′ → S, 2: S → ε, 3: S → aSb }

  TOS                      input   new TOS                  comment
  [S′ → .S]                ε       [S′ → .S][S → .]         e1,2
  [S′ → .S]                ε       [S′ → .S][S → .aSb]      e1,3
  [S → a.Sb]               ε       [S → a.Sb][S → .]        e2,2
  [S → a.Sb]               ε       [S → a.Sb][S → .aSb]     e2,3
  [S → .aSb]               a       [S → a.Sb]               s1
  [S → aS.b]               b       [S → aSb.]               s2
  [S′ → .S][S → .]         ε       [S′ → S.]                r1
  [S′ → .S][S → aSb.]      ε       [S′ → S.]                r2
  [S → a.Sb][S → .]        ε       [S → aS.b]               r3
  [S → a.Sb][S → aSb.]     ε       [S → aS.b]               r4
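To see the ten transitions of this example at work, they can be run as a nondeterministic automaton by exploring all reachable configurations. The sketch below is one possible encoding, not part of the slides: items are plain strings, a stack is a tuple with the top at the right end, and "" stands for ε. The search terminates for this grammar because it has no cycles of ε-transitions; a left-recursive grammar would need an additional bound.

```python
from collections import deque

# The ten transitions of the example: (old TOS, input, new TOS).
DELTA = [
    (("[S'→.S]",), "", ("[S'→.S]", "[S→.]")),         # e1,2
    (("[S'→.S]",), "", ("[S'→.S]", "[S→.aSb]")),      # e1,3
    (("[S→a.Sb]",), "", ("[S→a.Sb]", "[S→.]")),       # e2,2
    (("[S→a.Sb]",), "", ("[S→a.Sb]", "[S→.aSb]")),    # e2,3
    (("[S→.aSb]",), "a", ("[S→a.Sb]",)),              # s1
    (("[S→aS.b]",), "b", ("[S→aSb.]",)),              # s2
    (("[S'→.S]", "[S→.]"), "", ("[S'→S.]",)),         # r1
    (("[S'→.S]", "[S→aSb.]"), "", ("[S'→S.]",)),      # r2
    (("[S→a.Sb]", "[S→.]"), "", ("[S→aS.b]",)),       # r3
    (("[S→a.Sb]", "[S→aSb.]"), "", ("[S→aS.b]",)),    # r4
]
START, FINAL = ("[S'→.S]",), ("[S'→S.]",)


def accepts(word):
    """Breadth-first search for a run from (START, word) to (FINAL, ε)."""
    start = (START, word)
    seen, queue = {start}, deque([start])
    while queue:
        stack, rest = queue.popleft()
        if stack == FINAL and rest == "":
            return True
        for old, a, new in DELTA:
            # A transition applies if old is a suffix of the stack
            # and a is a prefix of the remaining input.
            if stack[-len(old):] == old and rest.startswith(a):
                nxt = (stack[: -len(old)] + new, rest[len(a):])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False
```

For the input ab, the accepting run uses e1,3, s1, e2,2, r3, s2, r2 in this order.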
Automaton for the Expression Grammar G₀

  TOS            input   new TOS
  [S → .E]       ε       [S → .E][E → .E+T]
  [S → .E]       ε       [S → .E][E → .T]
  [E → .E+T]     ε       [E → .E+T][E → .E+T]
  [E → .E+T]     ε       [E → .E+T][E → .T]
  [F → (.E)]     ε       [F → (.E)][E → .E+T]
  [F → (.E)]     ε       [F → (.E)][E → .T]
  [E → .T]       ε       [E → .T][T → .T∗F]
  [E → .T]       ε       [E → .T][T → .F]
  [T → .T∗F]     ε       [T → .T∗F][T → .T∗F]
  [T → .T∗F]     ε       [T → .T∗F][T → .F]
  [E → E+.T]     ε       [E → E+.T][T → .T∗F]
  [E → E+.T]     ε       [E → E+.T][T → .F]
  [T → .F]       ε       [T → .F][F → .(E)]
  [T → .F]       ε       [T → .F][F → .id]
  [T → T∗.F]     ε       [T → T∗.F][F → .(E)]
  [T → T∗.F]     ε       [T → T∗.F][F → .id]
  TOS                        input   new TOS
  [F → .(E)]                 (       [F → (.E)]
  [F → .id]                  id      [F → id.]
  [F → (E.)]                 )       [F → (E).]
  [E → E.+T]                 +       [E → E+.T]
  [T → T.∗F]                 ∗       [T → T∗.F]
  [T → .F][F → id.]          ε       [T → F.]
  [T → T∗.F][F → id.]        ε       [T → T∗F.]
  [T → .F][F → (E).]         ε       [T → F.]
  [T → T∗.F][F → (E).]       ε       [T → T∗F.]
  [T → .T∗F][T → F.]         ε       [T → T.∗F]
  [E → .T][T → F.]           ε       [E → T.]
  [E → E+.T][T → F.]         ε       [E → E+T.]
  [E → E+.T][T → T∗F.]       ε       [E → E+T.]
  [T → .T∗F][T → T∗F.]       ε       [T → T.∗F]
  [E → .T][T → T∗F.]         ε       [E → T.]
  [F → (.E)][E → T.]         ε       [F → (E.)]
  [F → (.E)][E → E+T.]       ε       [F → (E.)]
  [E → .E+T][E → T.]         ε       [E → E.+T]
  [E → .E+T][E → E+T.]       ε       [E → E.+T]
  [S → .E][E → T.]           ε       [S → E.]
  [S → .E][E → E+T.]         ε       [S → E.]
Stack when accepting id + id ∗ id:

  Stack                                                  Remaining input
  [S → .E]                                               id + id ∗ id
  [S → .E][E → .E+T]                                     id + id ∗ id
  [S → .E][E → .E+T][E → .T]                             id + id ∗ id
  [S → .E][E → .E+T][E → .T][T → .F]                     id + id ∗ id
  [S → .E][E → .E+T][E → .T][T → .F][F → .id]            id + id ∗ id
  [S → .E][E → .E+T][E → .T][T → .F][F → id.]            + id ∗ id
  [S → .E][E → .E+T][E → .T][T → F.]                     + id ∗ id
  [S → .E][E → .E+T][E → T.]                             + id ∗ id
  [S → .E][E → E.+T]                                     + id ∗ id
  [S → .E][E → E+.T]                                     id ∗ id
  [S → .E][E → E+.T][T → .T∗F]                           id ∗ id
  [S → .E][E → E+.T][T → .T∗F][T → .F]                   id ∗ id
  [S → .E][E → E+.T][T → .T∗F][T → .F][F → .id]          id ∗ id
  [S → .E][E → E+.T][T → .T∗F][T → .F][F → id.]          ∗ id
  [S → .E][E → E+.T][T → .T∗F][T → F.]                   ∗ id
  [S → .E][E → E+.T][T → T.∗F]                           ∗ id
  [S → .E][E → E+.T][T → T∗.F]                           id
  [S → .E][E → E+.T][T → T∗.F][F → .id]                  id
  [S → .E][E → E+.T][T → T∗.F][F → id.]
  [S → .E][E → E+.T][T → T∗F.]
  [S → .E][E → E+T.]
  [S → E.]
Correctness
Lemma: If ([S′ → .S], uv) ⊢*_{P_G} (ρ, v), then hist(ρ) ⇒*_G u.
Corollary: L(P_G) ⊆ L(G)
Lemma: Let A ∈ V_N and w ∈ V_T*. If A ⇒*_G w, then there exists A → α ∈ P such that for all ρ ∈ IT_G* and v ∈ V_T*: (ρ[A → .α], wv) ⊢*_{P_G} (ρ[A → α.], v).
Corollary: L(P_G) ⊇ L(G)
Automaton with Output
A tuple P = (V, Q, O, ∆, q₀, F) where:
• V is the input alphabet and O is the output alphabet
• Q is the finite set of states, q₀ ∈ Q the initial state, and F ⊆ Q the set of final states
• ∆ ⊆ (Q⁺ × (V ∪ {ε})) × Q* × (O ∪ {ε})
• Alternatively: δ : (Q⁺ × (V ∪ {ε})) → 2^(Q* × (O ∪ {ε})), where δ is a partial function
• Essentially like a normal PDA, but with output on steps
Left/Predictive/Top-Down Parser
P_G^l = (V_T, IT_G, P, δ_l, [S′ → .S], { [S′ → S.] }) where
δ_l([X → β.Yγ], ε) = { ([X → β.Yγ][Y → .α], Y → α) | Y → α ∈ P }
The output alphabet is the set of productions P: each expand step outputs the production it applies.
Configuration: IT_G⁺ × V_T* × P*
Step: (ρ[X → β.Yγ], w, o) ⊢_{P_G^l} (ρ[X → β.Yγ][Y → .α], w, o(Y → α))
Right/Bottom-Up Parser
P_G^r = (V_T, IT_G, P, δ_r, [S′ → .S], { [S′ → S.] }) where
δ_r([X → β.Yγ][Y → α.], ε) = { ([X → βY.γ], Y → α) }
The output alphabet is again the set of productions P: each reduce step outputs the production it completes.
Configuration: IT_G⁺ × V_T* × P*
Step: (ρ[X → β.Yγ][Y → α.], w, o) ⊢_{P_G^r} (ρ[X → βY.γ], w, o(Y → α))
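The difference between the two parsers is only the step on which a production is written to the output. The following sketch makes this concrete for the grammar S′ → S, S → ε, S → aSb, which is chosen here as an example. The encoding (items as triples, productions as pairs, a mode flag) and the exhaustive breadth-first search are assumptions of this illustration; practical parsers replace the search by lookahead.

```python
from collections import deque

PRODUCTIONS = [("S'", ("S",)), ("S", ()), ("S", ("a", "S", "b"))]
NONTERMINALS = {"S'", "S"}
START, FINAL = ("S'", (), ("S",)), ("S'", ("S",), ())


def parse(word, mode):
    """Search for an accepting run and return its output, or None.

    mode "left" emits a production on every expand step,
    mode "right" emits a production on every reduce step.
    """
    start = ((START,), tuple(word), ())
    seen, queue = {start}, deque([start])
    while queue:
        stack, rest, out = queue.popleft()
        if stack == (FINAL,) and not rest:
            return list(out)
        lhs, alpha, beta = stack[-1]
        succ = []
        if beta and beta[0] in NONTERMINALS:          # expand
            for prod in PRODUCTIONS:
                if prod[0] == beta[0]:
                    emitted = out + (prod,) if mode == "left" else out
                    new_item = (prod[0], (), prod[1])
                    succ.append((stack + (new_item,), rest, emitted))
        elif beta and rest and rest[0] == beta[0]:    # shift
            moved = (lhs, alpha + beta[:1], beta[1:])
            succ.append((stack[:-1] + (moved,), rest[1:], out))
        elif not beta and len(stack) > 1:             # reduce
            x, x_alpha, x_beta = stack[-2]
            moved = (x, x_alpha + x_beta[:1], x_beta[1:])
            emitted = out + ((lhs, alpha),) if mode == "right" else out
            succ.append((stack[:-2] + (moved,), rest, emitted))
        for nxt in succ:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None
```

For the input ab, mode "left" yields S → aSb followed by S → ε, the order of a leftmost derivation, whereas mode "right" yields S → ε followed by S → aSb, a rightmost derivation in reverse. The production S′ → S never appears in the output because its item is the initial stack content and is neither expanded into nor reduced.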