Dependently Typed Grammars
MPC 2010
Kasper Brink, Stefan Holdermans, Andres Löh
June 22, 2010

Parser Combinators

Expression Grammar

    E → E B N | N
    B → + | −
    N → 0 | 1

    pExpr, pNum :: Parser Int
    pBin        :: Parser (Int -> Int -> Int)

    pExpr =  (\e b n -> b e n) <$> pExpr <*> pBin <*> pNum
         <|> pNum
    pBin  =  (+) <$ pSymbol '+'
         <|> (-) <$ pSymbol '-'
    pNum  =  0 <$ pSymbol '0'
         <|> 1 <$ pSymbol '1'

Left Recursion → Non-termination!

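To make the non-termination concrete, here is a minimal sketch in Haskell. The list-of-successes `Parser` type and its instances are assumptions made for illustration and are not the combinator library used in the talk; `pExprLeft` and the hand-written `pExpr` are likewise hypothetical names. Running `pExprLeft` re-enters itself before any input is consumed, so it never produces a result, whereas the iterative `pExpr` terminates.

```haskell
import Control.Applicative (Alternative (..))

-- A minimal list-of-successes parser (illustrative assumption only).
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> [ (f a, r) | (a, r) <- p s ]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [ (f a, r2) | (f, r1) <- pf s, (a, r2) <- pa r1 ]

instance Alternative Parser where
  empty = Parser (const [])
  Parser p <|> Parser q = Parser $ \s -> p s ++ q s

-- Accept exactly the given character.
pSymbol :: Char -> Parser Char
pSymbol c = Parser $ \s -> case s of
  (x : xs) | x == c -> [(x, xs)]
  _                 -> []

pBin :: Parser (Int -> Int -> Int)
pBin = (+) <$ pSymbol '+' <|> (-) <$ pSymbol '-'

pNum :: Parser Int
pNum = 0 <$ pSymbol '0' <|> 1 <$ pSymbol '1'

-- Left-recursive definition as on the slide: running it calls itself
-- on the same input before consuming anything, so it does not terminate.
pExprLeft :: Parser Int
pExprLeft = (\e b n -> b e n) <$> pExprLeft <*> pBin <*> pNum <|> pNum

-- Hand-written alternative without left recursion: one number followed
-- by zero or more (operator, number) pairs, folded to the left.
pExpr :: Parser Int
pExpr = foldl (\e (b, n) -> b e n) <$> pNum <*> many ((,) <$> pBin <*> pNum)
```

For example, `[v | (v, "") <- runParser pExpr "1+1+1"]` collects the values of all parses that consume the whole input. Rewriting the parser by hand like this is exactly the step that the grammar transformation in this talk aims to automate.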
Representing grammars instead of parsers

◮ Represent a grammar as a data value
◮ Analyze and transform
◮ Generate a parser

This talk
◮ Representation in Agda
◮ Transform grammar to remove left recursion

Outline

◮ Grammar Representation
◮ Left-Corner Transform
◮ (Part of) Correctness Proof
◮ Conclusion

Grammar Representation
Symbols

    Terminal : Set
    Terminal = Char

    data Nonterminal : Set where
      E : Nonterminal
      B : Nonterminal
      N : Nonterminal

    data Symbol : Set where
      st : Terminal → Symbol
      sn : Nonterminal → Symbol

Semantic Types

◮ Parsers: every parser has a result type
◮ Grammars: every nonterminal has a semantic type

    ⟦_⟧ : Nonterminal → Set
    ⟦ E ⟧ = ℕ
    ⟦ B ⟧ = ℕ → ℕ → ℕ
    ⟦ N ⟧ = ℕ

Semantic Functions

◮ Type of semantic functions determined by ⟦_⟧

    E → E B N    λ e b n → b e n  :  ⟦ E ⟧ → ⟦ B ⟧ → ⟦ N ⟧ → ⟦ E ⟧
    E → N        id               :  ⟦ N ⟧ → ⟦ E ⟧
    N → 1        1                :  ⟦ N ⟧

◮ Compute type of semantic function: ⟦_||_⟧
◮ Production A → β has semantic function of type ⟦ β || A ⟧

    ⟦_||_⟧ : Symbols → Nonterminal → Set
    ⟦ []        || A ⟧ = ⟦ A ⟧
    ⟦ st _ :: β || A ⟧ = ⟦ β || A ⟧
    ⟦ sn B :: β || A ⟧ = ⟦ B ⟧ → ⟦ β || A ⟧

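The type computation above can be approximated in Haskell with closed type families over promoted data types. This is only a sketch under stated assumptions: the names `Sem` and `SemFun` are hypothetical stand-ins for the two semantic-type functions, `Int` replaces ℕ, and terminals are modelled as an enumeration (`Terminal`) rather than `Char` to keep the type-level code simple.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)

-- Hypothetical enumeration of terminals (the slides use Char).
data Terminal    = Plus | Minus | Zero | One
data Nonterminal = E | B | N
data Symbol      = St Terminal | Sn Nonterminal

-- Semantic type of each nonterminal.
type family Sem (a :: Nonterminal) :: Type where
  Sem 'E = Int
  Sem 'B = Int -> Int -> Int
  Sem 'N = Int

-- Type of the semantic function of a production with right-hand side
-- beta and left-hand side a: terminals contribute no argument,
-- nonterminals contribute one argument of their semantic type.
type family SemFun (beta :: [Symbol]) (a :: Nonterminal) :: Type where
  SemFun '[]             a = Sem a
  SemFun ('St t ': beta) a = SemFun beta a
  SemFun ('Sn b ': beta) a = Sem b -> SemFun beta a

-- E -> E B N
semE1 :: SemFun '[ 'Sn 'E, 'Sn 'B, 'Sn 'N ] 'E
semE1 = \e b n -> b e n

-- E -> N
semE2 :: SemFun '[ 'Sn 'N ] 'E
semE2 = id

-- N -> 1
semN1 :: SemFun '[ 'St 'One ] 'N
semN1 = 1
```

The type checker reduces each `SemFun` application to an ordinary function type, so a semantic function with the wrong arity or argument order is rejected at compile time. Unlike the Agda version, this sketch does not package the production and its semantic function into a single data value.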
Productions

    data Production : Set where
      prod : (A : Nonterminal) → (β : Symbols) → ⟦ β || A ⟧ → Production

Example:

    p₁ = prod E (sn E :: sn B :: sn N :: []) (λ e b n → b e n)
    p₂ = prod E (sn N :: []) id
    p₃ = prod N (st '1' :: []) 1

Of course it is desirable to devise a more convenient input syntax for grammars.

Generating a Parser

    generateParser : Productions → (S : Nonterminal) → Parser ⟦ S ⟧
    generateParser prods = gen
      where
        mutual
          gen : (A : Nonterminal) → Parser ⟦ A ⟧
          gen A = (foldr _<|>_ pFail ∘ map genAlt ∘ filterLHS A) prods

          genAlt : ∀ {A} → ProductionLHS A → Parser ⟦ A ⟧
          genAlt (prodlhs (prod A β sem)) = buildParser β (pSucceed sem)

          buildParser : ∀ {A} β → Parser ⟦ β || A ⟧ → Parser ⟦ A ⟧
          buildParser []          p = p
          buildParser (st b :: β) p = buildParser β (p <∗ pTerminal b)
          buildParser (sn B :: β) p = buildParser β (p <∗> gen B)

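The structure of `generateParser` can be sketched in Haskell without dependent types by dropping the semantic functions and generating a recognizer only. Everything here is an illustrative assumption: nonterminals are plain strings, `Recognizer` returns every possible remaining input, and the example grammar `rightRecursive` is a hand-written variant of the expression grammar without left recursion (the generated recognizer would loop on the left-recursive original, just as the combinator parser does).

```haskell
-- Untyped grammar representation (semantic functions omitted).
data Symbol = T Char | N String
data Prod   = Prod String [Symbol]

-- A recognizer maps an input to all possible remaining inputs.
type Recognizer = String -> [String]

generateRecognizer :: [Prod] -> String -> Recognizer
generateRecognizer prods = gen
  where
    -- Try every production whose left-hand side is the requested nonterminal.
    gen :: String -> Recognizer
    gen a s = concat [ build beta s | Prod lhs beta <- prods, lhs == a ]

    -- Recognize the symbols of a right-hand side from left to right.
    build :: [Symbol] -> Recognizer
    build []           s = [s]
    build (T c : beta) s = case s of
      (x : xs) | x == c -> build beta xs
      _                 -> []
    build (N b : beta) s = concatMap (build beta) (gen b s)

-- E -> N R,  R -> B N R | epsilon,  B -> + | -,  N -> 0 | 1
rightRecursive :: [Prod]
rightRecursive =
  [ Prod "E" [N "N", N "R"]
  , Prod "R" [N "B", N "N", N "R"]
  , Prod "R" []
  , Prod "B" [T '+'], Prod "B" [T '-']
  , Prod "N" [T '0'], Prod "N" [T '1']
  ]

-- The input is accepted if some parse consumes all of it.
accepts :: [Prod] -> String -> String -> Bool
accepts g start input = "" `elem` generateRecognizer g start input
```

Here `gen` plays the role of `gen` and `genAlt` on the slide, and `build` corresponds to `buildParser`. What the dependently typed version adds is that each production carries a semantic function whose type is checked against its right-hand side.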
Left-Corner Transform
Left Corners

◮ Left corner: X is a left corner of A if A ⇒* X β
◮ Left-corner transform introduces new nonterminals "A−X"
◮ A−X represents the part of an A that follows an X.
◮ Example:

    A ⇒* B β ⇒* a b c β ⇒* a b c d e f g

[Figure: parse tree for A deriving a b c d e f g, with the parts labelled B and A−B]

Left-corner Transform

Transformation Rules (Johnson, 1998)

    (1)  ∀ A, b :         A   → b A−b
    (2)  ∀ C, A → X β :   C−X → β C−A
    (3)  ∀ A :            A−A → ε

Example Transformation

Original:

    E → E B N
    E → N
    B → +
    B → −
    N → 0
    N → 1

Transformed:

    E → + E−+          B → + B−+          N → + N−+
    E → − E−−          B → − B−−          N → − N−−
    E → 0 E−0          B → 0 B−0          N → 0 N−0
    E → 1 E−1          B → 1 B−1          N → 1 N−1

    E−E → B N E−E      B−E → B N B−E      N−E → B N N−E
    E−N → E−E          B−N → B−E          N−N → N−E
    E−+ → E−B          B−+ → B−B          N−+ → N−B
    E−− → E−B          B−− → B−B          N−− → N−B
    E−0 → E−N          B−0 → B−N          N−0 → N−N
    E−1 → E−N          B−1 → B−N          N−1 → N−N

    E−E → ε            B−B → ε            N−N → ε

New nonterminals

(notation: Original "O...", Transformed "T...")

    data TNonterminal : Set where
      n    : ONonterminal → TNonterminal
      n_−_ : ONonterminal → OSymbol → TNonterminal

    T⟦_⟧ : TNonterminal → Set
    T⟦ n A ⟧        = O⟦ A ⟧
    T⟦ n A − st b ⟧ = O⟦ A ⟧
    T⟦ n A − sn B ⟧ = O⟦ B ⟧ → O⟦ A ⟧

[Figure: parse tree over a b c d e f g h, annotated with the semantic types ⟦ A ⟧, ⟦ B ⟧ and ⟦ B ⟧ → ⟦ A ⟧]

Transforming Grammars

Transformation Rules

    (1)  ∀ A, b :         A   → b A−b
    (2)  ∀ C, A → X β :   C−X → β C−A
    (3)  ∀ A :            A−A → ε

    lct : OProductions → TProductions
    lct ps = concatMap (λ A → map (rule1 A) (terms ps)) (nonterms ps)
          ++ concatMap (λ C → map (rule2 C) ps) (nonterms ps)
          ++ map rule3 (nonterms ps)

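The three rules can be tried out on the expression grammar with an untyped Haskell sketch. The data types and constructor names below (`OSymbol`, `TNonterminal`, `Plain`, `Minus`, and so on) are hypothetical, semantic functions are left out, and unlike `lct` on the slide this version takes the lists of nonterminals and terminals as explicit arguments instead of computing them from the productions.

```haskell
-- Original grammar: symbols and productions.
data OSymbol = OT Char | ON String deriving (Eq, Show)
data OProd   = OProd String [OSymbol] deriving (Eq, Show)

-- Transformed grammar: a nonterminal is either an original one (Plain a)
-- or a new one "a minus x" (Minus a x).
data TNonterminal = Plain String | Minus String OSymbol deriving (Eq, Show)
data TSymbol      = TT Char | TN TNonterminal deriving (Eq, Show)
data TProd        = TProd TNonterminal [TSymbol] deriving (Eq, Show)

-- Embed an original symbol into the transformed grammar.
lift :: OSymbol -> TSymbol
lift (OT c) = TT c
lift (ON a) = TN (Plain a)

lct :: [String] -> [Char] -> [OProd] -> [TProd]
lct nonterms terms ps =
     -- Rule (1): A -> b A-b
     [ TProd (Plain a) [TT b, TN (Minus a (OT b))] | a <- nonterms, b <- terms ]
     -- Rule (2): for each A -> X beta and each C:  C-X -> beta C-A
  ++ [ TProd (Minus c x) (map lift beta ++ [TN (Minus c (ON a))])
     | c <- nonterms, OProd a (x : beta) <- ps ]
     -- Rule (3): A-A -> epsilon
  ++ [ TProd (Minus a (ON a)) [] | a <- nonterms ]

exprGrammar :: [OProd]
exprGrammar =
  [ OProd "E" [ON "E", ON "B", ON "N"]
  , OProd "E" [ON "N"]
  , OProd "B" [OT '+'], OProd "B" [OT '-']
  , OProd "N" [OT '0'], OProd "N" [OT '1']
  ]

transformed :: [TProd]
transformed = lct ["E", "B", "N"] "+-01" exprGrammar
```

For the expression grammar this yields 12 productions from rule (1), 18 from rule (2) and 3 from rule (3), matching the 33 productions listed on the Example Transformation slide.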
Transforming Productions

Rule (2):   A → X β   ⟶   C−X → β C−A

    rule2 : ONonterminal → OProduction → TProduction
    rule2 C (O.prod A (X :: β) sem) =
      T.prod (n C − X)
             (lift β ++ [T.sn (n C − O.sn A)])
             (semtrans C A X β sem)

Transforming Semantics

Use semantic types as specification of semantic transformation

Semantic transformation

    production:   A → B β          ⟶   C−B → β C−A
    semantics:    ⟦ B β || A ⟧     ⟶   ⟦ β C−A || C−B ⟧

    semtrans : ∀ C A B β → O⟦ O.sn B :: β || A ⟧
                         → T⟦ lift β ++ [T.sn (n C − O.sn A)] || n C − O.sn B ⟧
    semtrans C A B β = O.foldSymbols (λ f → f)
                                     (λ f → λ g → f ∘ flip g)
                                     (λ f g → g ∘ f)
                                     β

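Haskell cannot express the fold over an arbitrary symbol list with this type directly, but the effect of `semtrans` can be sketched for fixed right-hand-side lengths. The names `semtrans0`, `semtrans1` and `semtrans2` are hypothetical, and the sketch assumes every symbol in β is a nonterminal: each step moves the left-corner argument behind the next one with `flip`, and the base case composes with the continuation that builds a C from an A.

```haskell
-- beta = []: the original function maps B to A; the transformed one
-- takes a continuation from A to C and yields a function from B to C.
semtrans0 :: (b -> a) -> (a -> c) -> (b -> c)
semtrans0 f g = g . f

-- beta = [X]: move the left-corner argument behind x, then recurse.
semtrans1 :: (b -> x -> a) -> x -> (a -> c) -> (b -> c)
semtrans1 g = semtrans0 . flip g

-- beta = [X, Y]: same step once more.
semtrans2 :: (b -> x -> y -> a) -> x -> y -> (a -> c) -> (b -> c)
semtrans2 g = semtrans1 . flip g

-- Semantic function of the original production E -> E B N.
semE :: Int -> (Int -> Int -> Int) -> Int -> Int
semE e b n = b e n

-- Semantic function of the transformed production E-E -> B N E-E.
semEminusE :: (Int -> Int -> Int) -> Int -> (Int -> Int) -> (Int -> Int)
semEminusE = semtrans2 semE
```

For instance, `semEminusE (+) 1 id 1` first supplies the operator and right operand, then the continuation for the rest of the input (here `id`, as produced by the rule E−E → ε), and finally the value of the left corner.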
Correctness
Correctness Criteria

◮ Correctness of the left-corner transform:
    ◮ Transformed grammar recognizes the same language
    ◮ No addition or removal of ambiguity (number of parse trees for each sentence is preserved)
    ◮ Left recursion is removed
◮ What we proved (weaker):
    ◮ Transformed grammar recognizes at least the original language: L(G) ⊆ L(G′)

Concepts Involved in Proof

[Diagram: the LC Transform (rule1a–1c) takes G to G′. A derivation S ⇒* w in G is related (≅, preserving w) to a derivation S′ ⇒* w in G′, which gives language inclusion. Further labels: LC Traversal, Top-Down Traversal, original productions in LC-order.]

Parse Tree Traversals

Top-down traversal: parent recognized before children
Bottom-up traversal: parent recognized after children
Left-corner traversal: parent recognized after left corner, and before other children

[Figure: tree with nodes numbered 1, 2, 3, 4, ... in traversal order]
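The three orders can be written down directly as functions on a rose tree. This is a small sketch with hypothetical names (`Tree`, `topDown`, `bottomUp`, `leftCorner`), and the example tree is an assumed parse tree for the sentence 1+1 in the expression grammar.

```haskell
-- A rose tree: a label and a list of children.
data Tree a = Node a [Tree a]

-- Parent before all children.
topDown :: Tree a -> [a]
topDown (Node x ts) = x : concatMap topDown ts

-- Parent after all children.
bottomUp :: Tree a -> [a]
bottomUp (Node x ts) = concatMap bottomUp ts ++ [x]

-- Parent after its leftmost child (the left corner), before the others.
leftCorner :: Tree a -> [a]
leftCorner (Node x [])       = [x]
leftCorner (Node x (t : ts)) = leftCorner t ++ [x] ++ concatMap leftCorner ts

-- Parse tree for "1+1": E -> E B N, E -> N, N -> 1, B -> +, N -> 1.
example :: Tree String
example =
  Node "E"
    [ Node "E" [Node "N" [Node "1" []]]
    , Node "B" [Node "+" []]
    , Node "N" [Node "1" []]
    ]
```

On this tree, `leftCorner example` visits the first terminal, then climbs to the root along the left spine, and only then descends into the remaining children, which is the order in which the transformed grammar recognizes the original productions.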