
Approximating Context-Free Grammars for Parsing and Verification

Sylvain Schmitz, LORIA, INRIA Nancy - Grand Est
October 18, 2007

Outline: Motivation, Approximations, Shift-Resolve Parsing, Ambiguity Detection, Conclusion


  1. A Syntax Issue: Parsers
     Diagram: context-free grammar → parser generator → parser; input tokens → parser → parse tree.

     Context-free grammar:

       /**
        * smlfvalbind.y
        *
        * Standard ML function declarations.
        * See _The_definition_of_standard_ML_, Milner et al., 1997,
        * ISBN 0-262-63181-4.
        */
       %token CASE "case"
       %token FUN "fun"
       %token MATCH "=>"
       %token OF "of"
       %token VID
       %start dec
       %%
       dec: "fun" fvalbind ;
       fvalbind: sfvalbind | fvalbind '|' sfvalbind ;
       sfvalbind: VID atpats '=' exp ;
       exp: VID | "case" exp "of" match ;
       match: mrule | match '|' mrule ;
       mrule: pat "=>" exp ;
       atpats: atpat | atpats atpat ;
       atpat: VID ;
       pat: VID atpat ;
       %%

     Input tokens:

       . . . | NONE => filterP(r, l) | filterP ([], l) = rev l

     Parse tree nodes shown: ⟨fvalbind⟩, ⟨fvalbind⟩, ⟨sfvalbind⟩, ⟨exp⟩, ⟨match⟩, ⟨mrule⟩, ⟨sfvalbind⟩, ⟨pat⟩, ⟨exp⟩, ⟨atpats⟩, ⟨exp⟩.

  2. Grammar Conflicts: LALR(1) Parser Generator
     ◮ GNU Bison

       state 20
           6 exp: "case" exp "of" match .
           8 match: match . '|' mrule

           '|'  shift, and go to state 24
           '|'  [reduce using rule 6 (exp)]

     ◮ Restricted grammar class

  3. Grammar Conflicts: LALR(1) Parser Generator
     ◮ GNU Bison

       state 20
           6 exp: "case" exp "of" match .
           8 match: match . '|' mrule

           '|'  shift, and go to state 24
           '|'  [reduce using rule 6 (exp)]

     ◮ Restricted grammar class
     Diagram labels: CFG, LALR(1).

  4. Grammar Conflicts: Dealing with Conflicts
     An Objective Measure [Malloy et al., 2002] on a C# Grammar
     [Plot: number of LALR(1) conflicts (y-axis, 0 to 700) against parser versions (x-axis, 2 to 20).]

  5. Grammar Conflicts: Dealing with Conflicts
     A Subjective Measure
     [Comic strip.] Courtesy of http://www.phdcomics.com.

  8. Solutions: State of the Art
     ◮ LR(k) [Knuth, 1965]
     ◮ LR-Regular [Čulik and Cohen, 1973]
     ◮ Generalized LR [Tomita, 1986]
     ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
     ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
     Diagram labels: CFG, LR(k), LALR(1).

  9. Solutions: State of the Art
     ◮ LR(k) [Knuth, 1965]
     ◮ LR-Regular [Čulik and Cohen, 1973]
     ◮ Generalized LR [Tomita, 1986]
     ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
     ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
     Diagram labels: CFG, LR-Regular, LR(k), LALR(1).

  10. Solutions: State of the Art
      ◮ LR(k) [Knuth, 1965]
      ◮ LR-Regular [Čulik and Cohen, 1973]
      ◮ Generalized LR [Tomita, 1986]
      ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
      ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
      Diagram labels: CFG.

  11. Solutions: Ambiguity
      Diagram: context-free grammar → parser generator → parser; input tokens → parser → parse forest.

      Context-free grammar: smlfvalbind.y, the Standard ML function declaration grammar of slide 1.

      Input tokens:

        case a of b => case b of c => c | d => d

      Parse forest: the parser returns two trees for this input, because the trailing "| d => d" can be attached to the ⟨match⟩ of either the inner or the outer case expression.
      Nodes of each tree: ⟨exp⟩, ⟨match⟩, ⟨match⟩, ⟨mrule⟩, ⟨mrule⟩, ⟨pat⟩, ⟨exp⟩.
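
The two parses of this sentence can be checked mechanically by counting derivation trees. The following is a minimal Python sketch, not the author's tool: `count_trees` is a hypothetical helper, and the grammar fragment assumes a simplified rule pat → vid (the slide's grammar has pat: VID atpat) so that the one-identifier patterns of the example sentence parse.

```python
from functools import lru_cache


def count_trees(grammar, start, tokens):
    """Count the derivation trees of `start` that yield exactly `tokens`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Assumes no empty rules and no cyclic unit rules.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def sym(symbol, i, j):
        # Trees rooted at `symbol` covering tokens[i:j].
        if symbol not in grammar:  # terminal symbol
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Ways to split tokens[i:j] among the symbols of `rhs`.
        if not rhs:
            return 1 if i == j else 0
        head, rest = rhs[0], rhs[1:]
        total = 0
        # Every symbol covers at least one token.
        for k in range(i + 1, j - len(rest) + 1):
            left = sym(head, i, k)
            if left:
                total += left * seq(rest, k, j)
        return total

    return sym(start, 0, len(tokens))


# Fragment of the slide's grammar; pat -> vid is an assumption (see above).
grammar = {
    "exp": [("vid",), ("case", "exp", "of", "match")],
    "match": [("mrule",), ("match", "|", "mrule")],
    "mrule": [("pat", "=>", "exp")],
    "pat": [("vid",)],
}

# case a of b => case b of c => c | d => d, with identifiers as vid tokens.
tokens = "case vid of vid => case vid of vid => vid | vid => vid".split()
print(count_trees(grammar, "exp", tokens))
```

A count above one means the sentence is ambiguous; a sentence with a single match rule, such as "case vid of vid => vid", has exactly one tree.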

  13. Solutions: State of the Art
      ◮ LR(k) [Knuth, 1965]
      ◮ LR-Regular [Čulik and Cohen, 1973]
      ◮ Generalized LR [Tomita, 1986]
      ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
      ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
      Diagram labels: CFG, UCFG.

  14. Solutions: State of the Art
      ◮ LR(k) [Knuth, 1965]
      ◮ LR-Regular [Čulik and Cohen, 1973]
      ◮ Generalized LR [Tomita, 1986]
      ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
      ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
      Diagram labels: CFG, UCFG, LR-Regular, LR(k), LALR(1); regions marked Unsafe and Safe.

  15. Solutions: State of the Art
      ◮ LR(k) [Knuth, 1965]
      ◮ LR-Regular [Čulik and Cohen, 1973]
      ◮ Generalized LR [Tomita, 1986]
      ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
      ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
      Diagram labels: CFG, HVRU, UCFG; regions marked Unsafe and Safe.

  16. Solutions: State of the Art
      ◮ LR(k) [Knuth, 1965]
      ◮ LR-Regular [Čulik and Cohen, 1973]
      ◮ Generalized LR [Tomita, 1986]
      ◮ Unambiguous CFGs [Cantor, 1962, Chomsky and Schützenberger, 1963]
      ◮ Horizontal and vertical unambiguity test [Brabrand et al., 2007]
      Diagram labels: CFG, HVRU, UCFG, LR-Regular, LR(k), LALR(1); regions marked Unsafe and Safe.

  17. Solutions: Contributions
      ◮ Noncanonical parsing methods [Szymanski and Williams, 1976, Tai, 1979]
        ◮ Noncanonical LALR(1)
        ◮ Shift-Resolve
      ◮ Noncanonical unambiguity test
      ◮ Framework for grammar approximations

  18. Solutions: Contributions
      ◮ Noncanonical parsing methods [Szymanski and Williams, 1976, Tai, 1979]
        ◮ Noncanonical LALR(1)
        ◮ Shift-Resolve
      ◮ Noncanonical unambiguity test
      ◮ Framework for grammar approximations
      Diagram labels: CFG, UCFG, LR-Regular, LR(k), NLALR(1), LALR(1).

  19. Solutions: Contributions
      ◮ Noncanonical parsing methods [Szymanski and Williams, 1976, Tai, 1979]
        ◮ Noncanonical LALR(1)
        ◮ Shift-Resolve
      ◮ Noncanonical unambiguity test
      ◮ Framework for grammar approximations
      Diagram labels: CFG, UCFG, LR-Regular, LR(k), ShRe, LALR(1).

  20. Solutions: Contributions
      ◮ Noncanonical parsing methods [Szymanski and Williams, 1976, Tai, 1979]
        ◮ Noncanonical LALR(1)
        ◮ Shift-Resolve
      ◮ Noncanonical unambiguity test
      ◮ Framework for grammar approximations
      Diagram labels: CFG, HVRU, UCFG, NU, LR-Regular, LR(k), LALR(1).

  22. Grammar Approximations: Bracketed Grammars
      G = ⟨N, T, P, S⟩, V = N ∪ T

       1  ⟨dec⟩ → fun ⟨fvalbind⟩
       2  ⟨fvalbind⟩ → ⟨sfvalbind⟩
       3  ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩
       4  ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩
       5  ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩
       6  ⟨match⟩ → ⟨mrule⟩
       7  ⟨match⟩ → ⟨match⟩ '|' ⟨mrule⟩
       8  ⟨mrule⟩ → ⟨pat⟩ => ⟨exp⟩
       9  ⟨atpats⟩ → ⟨atpat⟩
      10  ⟨atpats⟩ → ⟨atpats⟩ ⟨atpat⟩
      11  ⟨pat⟩ → vid ⟨atpat⟩
      12  ⟨atpat⟩ → vid

  23. Grammar Approximations: Bracketed Grammars
      G_b = ⟨N, T_b, P_b, S⟩, V_b = N ∪ T_b

       1  ⟨dec⟩ → d1 fun ⟨fvalbind⟩ r1
       2  ⟨fvalbind⟩ → d2 ⟨sfvalbind⟩ r2
       3  ⟨fvalbind⟩ → d3 ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ r3
       4  ⟨sfvalbind⟩ → d4 vid ⟨atpats⟩ = ⟨exp⟩ r4
       5  ⟨exp⟩ → d5 case ⟨exp⟩ of ⟨match⟩ r5
       6  ⟨match⟩ → d6 ⟨mrule⟩ r6
       7  ⟨match⟩ → d7 ⟨match⟩ '|' ⟨mrule⟩ r7
       8  ⟨mrule⟩ → d8 ⟨pat⟩ => ⟨exp⟩ r8
       9  ⟨atpats⟩ → d9 ⟨atpat⟩ r9
      10  ⟨atpats⟩ → d10 ⟨atpats⟩ ⟨atpat⟩ r10
      11  ⟨pat⟩ → d11 vid ⟨atpat⟩ r11
      12  ⟨atpat⟩ → d12 vid r12
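
The step from G to G_b is mechanical: rule number i gets an opening bracket d_i in front of its right-hand side and a closing bracket r_i behind it. A minimal Python sketch of that transformation follows; the function name `bracket` and the tuple encoding of rules are assumptions made for this illustration.

```python
def bracket(rules):
    """Turn each rule i (numbered from 1) A -> alpha into A -> d_i alpha r_i."""
    return [
        (lhs, (f"d{i}",) + tuple(rhs) + (f"r{i}",))
        for i, (lhs, rhs) in enumerate(rules, start=1)
    ]


# The twelve rules of the slide, in order.
rules = [
    ("dec", ("fun", "fvalbind")),
    ("fvalbind", ("sfvalbind",)),
    ("fvalbind", ("fvalbind", "|", "sfvalbind")),
    ("sfvalbind", ("vid", "atpats", "=", "exp")),
    ("exp", ("case", "exp", "of", "match")),
    ("match", ("mrule",)),
    ("match", ("match", "|", "mrule")),
    ("mrule", ("pat", "=>", "exp")),
    ("atpats", ("atpat",)),
    ("atpats", ("atpats", "atpat")),
    ("pat", ("vid", "atpat")),
    ("atpat", ("vid",)),
]

bracketed = bracket(rules)
for lhs, rhs in bracketed:
    print(lhs, "->", " ".join(rhs))
```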

  24. Grammar Approximations: Positions
      [Tree: ⟨fvalbind⟩ with children ⟨fvalbind⟩, '|', ⟨sfvalbind⟩; the left ⟨fvalbind⟩ has the child ⟨sfvalbind⟩; the right ⟨sfvalbind⟩ has the children vid, ⟨atpats⟩, =, ⟨exp⟩.]
      Position (marked by the dot) in the bracketed form:
        d3 d2 ⟨sfvalbind⟩ r2 '|' · d4 vid ⟨atpats⟩ = ⟨exp⟩ r4 r3

  25. Grammar Approximations: Position Graph Γ
      Left-to-right Walks in Trees
      [Same tree as on slide 24.] Transition on d4 leads to the position:
        d3 d2 ⟨sfvalbind⟩ r2 '|' d4 · vid ⟨atpats⟩ = ⟨exp⟩ r4 r3

  26. Grammar Approximations: Position Graph Γ
      Left-to-right Walks in Trees
      [Same tree as on slide 24.] Transition on ⟨sfvalbind⟩ leads to the position:
        d3 d2 ⟨sfvalbind⟩ r2 '|' d4 vid ⟨atpats⟩ = ⟨exp⟩ r4 · r3

  27. Grammar Approximations: Position Graph Γ
      Left-to-right Walks in Trees
      [Same tree as on slide 24.] Transition on r3 leads to the position:
        d3 d2 ⟨sfvalbind⟩ r2 '|' d4 vid ⟨atpats⟩ = ⟨exp⟩ r4 r3 ·

  28. Grammar Approximations: Position Graph Γ
      Left-to-right Walks in Trees
      [Figure: fragment of the position graph. Transitions are labeled d3, r3, d2, r2, d4, r4 and by the symbols ⟨fvalbind⟩, '|', ⟨sfvalbind⟩, vid, ⟨atpats⟩, =, ⟨exp⟩.]

  29. Grammar Approximations: Position Automaton Γ/≡
      Definition: Γ/≡ is the quotient of Γ by an equivalence relation ≡ between positions.
      Theorem (Language over-approximation): L(G_b) ⊆ L(Γ/≡) ∩ T_b^*

  30. Grammar Approximations: Example: item_0 Equivalence
      [Figure: position graph fragment for the rules ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ (d3, r3), ⟨fvalbind⟩ → ⟨sfvalbind⟩ (d2, r2) and ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ (d4, r4).]
      ◮ equivalence class [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ · = ⟨exp⟩]
      ◮ LR(0) items
      ◮ Γ/item_0: nondeterministic LR(0) automaton

  31. Grammar Approximations: Example: item_0 Equivalence
      Fragment of Γ/item_0; each state is an equivalence class (an LR(0) item, with the rule number on the arrow):

        [⟨fvalbind⟩ →₂ · ⟨sfvalbind⟩]                --d4-->        [⟨sfvalbind⟩ →₄ · vid ⟨atpats⟩ = ⟨exp⟩]
        [⟨fvalbind⟩ →₃ ⟨fvalbind⟩ '|' · ⟨sfvalbind⟩]  --d4-->        [⟨sfvalbind⟩ →₄ · vid ⟨atpats⟩ = ⟨exp⟩]
        [⟨sfvalbind⟩ →₄ · vid ⟨atpats⟩ = ⟨exp⟩]       --vid-->       [⟨sfvalbind⟩ →₄ vid · ⟨atpats⟩ = ⟨exp⟩]
        [⟨sfvalbind⟩ →₄ vid · ⟨atpats⟩ = ⟨exp⟩]       --⟨atpats⟩-->  [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ · = ⟨exp⟩]
        [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ · = ⟨exp⟩]       --=-->         [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ = · ⟨exp⟩]
        [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ = · ⟨exp⟩]       --⟨exp⟩-->     [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ = ⟨exp⟩ ·]
        [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ = ⟨exp⟩ ·]       --r4-->        [⟨fvalbind⟩ →₂ ⟨sfvalbind⟩ ·]
        [⟨sfvalbind⟩ →₄ vid ⟨atpats⟩ = ⟨exp⟩ ·]       --r4-->        [⟨fvalbind⟩ →₃ ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·]
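
The transitions of this fragment follow a simple pattern, which the Python sketch below spells out for the whole grammar. It is an illustration under assumptions, not the author's construction: the function name `item0_automaton` and the encoding of an LR(0) item as a pair (rule number, dot position) are made up here, and initial and final states are left out.

```python
def item0_automaton(rules):
    """Transitions of a nondeterministic LR(0)-style automaton.

    A state (i, dot) stands for rule i with the dot before position `dot`.
    Returns a set of (source, label, target) triples.
    """
    nonterminals = {lhs for lhs, _ in rules}
    transitions = set()
    for i, (_, rhs) in enumerate(rules, start=1):
        for dot, symbol in enumerate(rhs):
            # Symbol transition: move the dot over a terminal or nonterminal.
            transitions.add(((i, dot), symbol, (i, dot + 1)))
            if symbol not in nonterminals:
                continue
            for j, (lhs_j, rhs_j) in enumerate(rules, start=1):
                if lhs_j == symbol:
                    # d_j: descend into rule j; r_j: return once it is complete.
                    transitions.add(((i, dot), f"d{j}", (j, 0)))
                    transitions.add(((j, len(rhs_j)), f"r{j}", (i, dot + 1)))
    return transitions


# Rules 1 to 4 of the slides' grammar are enough for the fragment shown.
rules = [
    ("dec", ("fun", "fvalbind")),
    ("fvalbind", ("sfvalbind",)),
    ("fvalbind", ("fvalbind", "|", "sfvalbind")),
    ("sfvalbind", ("vid", "atpats", "=", "exp")),
]

transitions = item0_automaton(rules)
for source, label, target in sorted(transitions):
    print(source, label, target)
```

With this encoding, the two d4 transitions of the slide leave states (2, 0) and (3, 2), and the two r4 transitions leave state (4, 4).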

  32. Grammar Approximations: Summary
      ◮ general framework for approximations
      ◮ applications:
        ◮ parser construction
        ◮ ambiguity detection
        ◮ XML validation [Segoufin and Vianu, 2002]?
        ◮ symbolic supertagging [Boullier, 2003]?

  34. Parsing Principles: Shift-Resolve Parsing
      ◮ noncanonical
      ◮ k = 1 reduced lookahead symbol
      ◮ resolve = reduce + pushback: emulates a bounded reduced lookahead without any preset bound
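
To make "resolve = reduce + pushback" concrete, here is a small Python sketch of a single resolve step on a stack of symbols. It is a hypothetical illustration only: the function `resolve`, its parameters and the list encoding are assumptions, parser states are omitted, and placing the reduced nonterminal at the front of the remaining input (as the reduced lookahead symbol) is an assumption that the slides do not spell out.

```python
def resolve(stack, remaining, lhs, rhs_len, pushback):
    """One resolve step: push back `pushback` symbols, then reduce.

    Returns the new stack, the new remaining input and the reduced phrase.
    """
    # Pushback: move the top `pushback` stack symbols back in front of the input.
    cut = len(stack) - pushback
    pushed_back = stack[cut:]
    stack = stack[:cut]
    # Reduce: remove the right-hand side now on top of the stack.
    cut = len(stack) - rhs_len
    phrase = stack[cut:]
    stack = stack[:cut]
    # Assumption: the reduced nonterminal becomes the next input symbol.
    remaining = [lhs] + pushed_back + remaining
    return stack, remaining, phrase


# Rule 5, exp -> case exp of match, resolved after one extra symbol '|' was read.
stack = ["fun", "vid", "atpats", "=", "case", "exp", "of", "match", "|"]
remaining = ["vid", "vid", "=", "vid"]
new_stack, new_remaining, phrase = resolve(stack, remaining, "exp", 4, 1)
print(new_stack)
print(new_remaining)
print(phrase)
```

The pushback length plays the role of the lookahead: the parser may read past the end of the phrase before deciding, then returns the extra symbols to the input.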

  36. Parsing Example: Shift-Resolve Parse
      Input:

        . . . | NONE => filterP(r, l) | filterP ([], l) = rev l

      Tree nodes built so far: ⟨fvalbind⟩, ⟨fvalbind⟩, ⟨sfvalbind⟩, ⟨exp⟩, ⟨sfvalbind⟩, ⟨atpats⟩, ⟨exp⟩.

  37. Parsing Example: Shift-Resolve Parse
      Input:

        . . . | NONE => filterP(r, l) | filterP ([], l) = rev l

      Tree nodes built so far: ⟨fvalbind⟩, ⟨fvalbind⟩, ⟨sfvalbind⟩, ⟨exp⟩, ⟨match⟩, ⟨mrule⟩, ⟨sfvalbind⟩, ⟨pat⟩, ⟨exp⟩, ⟨atpats⟩, ⟨exp⟩.

  42. Parser Construction: Generating the Parser
      1. position automaton
      2. determinization by subset construction

  43. Parser Construction: Subset Construction Principle
      ◮ d_i transitions denote traditional item closures
      ◮ r_i transitions denote a phrase that should be reduced
      ◮ other transitions denote shifts
      ◮ items in the construction hold
        1. a state of the position automaton
        2. a parsing action
        3. a pushback length

  45. Parser Construction: Subset Construction Example
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0

  46. Parser Construction: Subset Construction Example
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
      Added by following r5:
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0

  47. Parser Construction: Subset Construction Example
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
      Added by following r4:
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0

  48. Parser Construction: Subset Construction Example
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
        ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
        S′ → ⟨dec⟩ · $, 5, 0

  50. Parser Construction: Subset Construction Example
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
        ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
        S′ → ⟨dec⟩ · $, 5, 0
      Transition on '|' to a new state with items:
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' · ⟨sfvalbind⟩, 5, 1
        ⟨match⟩ → ⟨match⟩ '|' · ⟨mrule⟩, 0, 0

  51. Parser Construction: Subset Construction Example
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
        ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
        S′ → ⟨dec⟩ · $, 5, 0
      Transition on '|' to a new state with items:
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' · ⟨sfvalbind⟩, 5, 1
        ⟨match⟩ → ⟨match⟩ '|' · ⟨mrule⟩, 0, 0
      Added to the new state by following d8:
        ⟨mrule⟩ → · ⟨pat⟩ => ⟨exp⟩, 0, 0

  52. Parser Construction: Subset Construction Example
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
        ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
        S′ → ⟨dec⟩ · $, 5, 0
      Transition on '|' to a new state with items:
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' · ⟨sfvalbind⟩, 5, 1
        ⟨match⟩ → ⟨match⟩ '|' · ⟨mrule⟩, 0, 0
        ⟨mrule⟩ → · ⟨pat⟩ => ⟨exp⟩, 0, 0
        ⟨pat⟩ → · vid ⟨atpat⟩, 0, 0
        ⟨sfvalbind⟩ → · vid ⟨atpats⟩ = ⟨exp⟩, 0, 0

  53. Parser Construction: Construction Failure
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
        ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
        S′ → ⟨dec⟩ · $, 5, 0

  54. Parser Construction: Construction Failure
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
        ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
        S′ → ⟨dec⟩ · $, 5, 0
      Added by following r5:
        ⟨mrule⟩ → ⟨pat⟩ => ⟨exp⟩ ·, 5, 0

  55. Parser Construction: Construction Failure
      Items (position, action, pushback length):
        ⟨exp⟩ → case ⟨exp⟩ of ⟨match⟩ ·, 0, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 0, 0
        ⟨sfvalbind⟩ → vid ⟨atpats⟩ = ⟨exp⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ '|' ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨sfvalbind⟩ ·, 5, 0
        ⟨fvalbind⟩ → ⟨fvalbind⟩ · '|' ⟨sfvalbind⟩, 5, 0
        ⟨dec⟩ → fun ⟨fvalbind⟩ ·, 5, 0
        S′ → ⟨dec⟩ · $, 5, 0
        ⟨mrule⟩ → ⟨pat⟩ => ⟨exp⟩ ·, 5, 0
        ⟨match⟩ → ⟨mrule⟩ ·, 5, 0
        ⟨match⟩ → ⟨match⟩ · '|' ⟨mrule⟩, 5, 0

  56. Parser Construction: Complexity
      ◮ |Γ/≡|: size of the position automaton
      ◮ |A|: size of the parser: O(2^{|Γ/≡| |P|})
      ◮ parsing time complexity for input w: O(|w|)

  57. Parser Construction: Complexity
      ◮ |Γ/≡|: size of the position automaton; |Γ/item_0| = O(|G|)
      ◮ |A|: size of the parser: O(2^{|Γ/≡| |P|})
      ◮ parsing time complexity for input w: O(|w|)

  58. Parser Construction: Limitations
      − incomparable with classical parsing techniques
      + subset construction mendable

  60. Parser Construction: Summary
      ◮ Shift-Resolve parsers
        1. Large class of grammars accepted
        2. Unambiguity
        3. Linear time parsing
      ◮ Two-step construction
        1. Simple
        2. Flexible

  61. Ambiguity Detection: Principles
      ◮ a bracketed sentence = a derivation tree
      ◮ ambiguity = more than one tree with the same yield

        d6 d8 d13 vid r13 => d5 case d14 vid r14 of d7 d6 d8 d13 vid r13 => d14 vid r14 r8 r6 '|' d8 d13 vid r13 => d14 vid r14 r8 r7 r5 r8 r6

        d7 d6 d8 d13 vid r13 => d5 case d14 vid r14 of d7 d8 d13 vid r13 => d14 vid r14 r8 r7 r5 r8 r6 '|' d8 d13 vid r13 => d14 vid r14 r8 r7

      ◮ construct a FSA A such that L(G_b) ⊆ L(A), and look for bracketed sentences with the same yield
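
The "same yield" check amounts to erasing the brackets d_i and r_i and comparing what is left. The Python sketch below illustrates it on two bracketed sentences written out for this example; they were constructed by hand from the rule numbers used on the slide (5, 6, 7, 8, with 13 assumed to be pat → vid and 14 assumed to be exp → vid), and the function name `h` simply mirrors the homomorphism h used later in the talk.

```python
import re

BRACKET = re.compile(r"[dr]\d+")


def h(bracketed):
    """Erase the brackets d_i and r_i, keeping only the terminal yield."""
    return [symbol for symbol in bracketed if not BRACKET.fullmatch(symbol)]


# '| vid => vid' attached to the inner case expression.
inner = (
    "d6 d8 d13 vid r13 => d5 case d14 vid r14 of "
    "d7 d6 d8 d13 vid r13 => d14 vid r14 r8 r6 | "
    "d8 d13 vid r13 => d14 vid r14 r8 r7 r5 r8 r6"
).split()

# '| vid => vid' attached to the outer match.
outer = (
    "d7 d6 d8 d13 vid r13 => d5 case d14 vid r14 of "
    "d6 d8 d13 vid r13 => d14 vid r14 r8 r6 r5 r8 r6 | "
    "d8 d13 vid r13 => d14 vid r14 r8 r7"
).split()

print(" ".join(h(inner)))
print(inner != outer and h(inner) == h(outer))
```

Two distinct bracketed sentences with equal images under h are exactly two derivation trees with the same yield.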

  64. Regular Unambiguity: RU(≡)
      ◮ G is regular unambiguous for ≡ of finite index if there do not exist w_b ≠ w'_b in L(Γ/≡) ∩ T_b^* with h(w_b) = h(w'_b)
      ◮ LR(0) ⊄ RU(item_0)
      ◮ regular approximations are too weak

  66. Noncanonical Unambiguity: Nonterminal Transitions
      ◮ SF(G_b) ⊆ L(Γ/≡)
      ◮ look for two different bracketed sentential forms in L(Γ/≡)

        d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrules⟩ r7 r5 r8 r6

        d7 d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrules⟩ r7

      ◮ a nonterminal transition represents exactly its derived context-free language

  69. Noncanonical Unambiguity: Mutual Accessibility Relations
      ◮ between pairs of states (q_1, q_2) of Γ/≡
      ◮ synchronized left-to-right walks from an initial pair (q_s, q_s)

        d6 d8 d14 vid r14 => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrules⟩ r7 r5 r8 r6

        d7 d6 d8 d14 vid r14 => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrules⟩ r7

      Step shown: epsilon: mae

  72. Noncanonical Unambiguity: Mutual Accessibility Relations
      ◮ between pairs of states (q_1, q_2) of Γ/≡
      ◮ synchronized left-to-right walks from an initial pair (q_s, q_s)

        d6 d8 d14 vid r14 => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrules⟩ r7 r5 r8 r6

        d7 d6 d8 d14 vid r14 => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrules⟩ r7

      Step shown: shift: mas

  73. Noncanonical Unambiguity: Mutual Accessibility Relations
      ◮ between pairs of states (q_1, q_2) of Γ/≡
      ◮ synchronized left-to-right walks from an initial pair (q_s, q_s)

        d6 d8 d14 vid r14 => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrules⟩ r7 r5 r8 r6

        d7 d6 d8 d14 vid r14 => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrules⟩ r7

      Step shown: nothing!

  74. Noncanonical Unambiguity: Mutual Accessibility Relations
      ◮ between pairs of states (q_1, q_2) of Γ/≡
      ◮ synchronized left-to-right walks from an initial pair (q_s, q_s)

        d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrules⟩ r7 r5 r8 r6

        d7 d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrules⟩ r7

      Step shown: shift: mas

  75. Noncanonical Unambiguity: Mutual Accessibility Relations
      ◮ between pairs of states (q_1, q_2) of Γ/≡
      ◮ synchronized left-to-right walks from an initial pair (q_s, q_s)

        d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrules⟩ r7 r5 r8 r6

        d7 d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrules⟩ r7

      Step shown: conflict: mac

  78. Noncanonical Unambiguity: Mutual Accessibility Relations
      ◮ between pairs of states (q_1, q_2) of Γ/≡
      ◮ synchronized left-to-right walks from an initial pair (q_s, q_s)

        d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of d7 ⟨match⟩ '|' ⟨mrules⟩ r7 r5 r8 r6

        d7 d6 d8 ⟨pat⟩ => d5 case ⟨exp⟩ of ⟨match⟩ r5 r8 r6 '|' ⟨mrules⟩ r7

      Step shown: shift: mas

  79. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 � pat � = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 � pat � = > d 5 case � exp � of � match � r 5 r 8 r 6 reduce: mar

  80. Motivation Approximations Shift-Resolve Parsing Ambiguity Detection Conclusion Noncanonical Unambiguity Mutual Accessibility Relations ◮ between pairs of states of Γ/ ≡ , ( q 1 , q 2 ) ◮ synchronized left-to-right walks from an initial pair ( q s , q s ) d 6 d 8 � pat � = > d 5 case � exp � of d 7 � match � ′ | ′ � mrules � r 7 r 5 r 8 r 6 ′ | ′ � mrules � r 7 d 7 d 6 d 8 � pat � = > d 5 case � exp � of � match � r 5 r 8 r 6 conflict: mac

  81. Noncanonical Unambiguity: NU($\equiv$) ◮ $\mathrm{ma} = \mathrm{mas} \cup \mathrm{mae} \cup \mathrm{mac} \cup \mathrm{mar}$ ◮ $G$ is noncanonically unambiguous if there does not exist a relation $(q_s, q_s) \mathrel{\mathrm{ma}^*} (q_f, q_f)$ that uses $\mathrm{mac}$ at some step ◮ Computable in $O(|\Gamma/{\equiv}|^2)$ space
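The NU($\equiv$) test above amounts to a reachability question over pairs of states. The following is a minimal sketch of that idea only, assuming the four relations have already been computed and are given as labelled steps between pairs; the function name `reaches_final_via_conflict`, the dictionary layout and the toy state names are hypothetical and not taken from the slides.

```python
from collections import deque

def reaches_final_via_conflict(start, final, steps):
    """Return True if some ma* walk from `start` to `final` uses a mac step.

    `steps` maps a pair of states to a list of (kind, next_pair) entries,
    where kind is one of "mas", "mae", "mac", "mar" (assumed precomputed).
    """
    # Search nodes are (pair, used_mac); at most two nodes per pair, so the
    # visited set stays quadratic in the number of states.
    first = (start, False)
    seen = {first}
    queue = deque([first])
    while queue:
        pair, used_mac = queue.popleft()
        if pair == final and used_mac:
            return True
        for kind, nxt in steps.get(pair, ()):
            node = (nxt, used_mac or kind == "mac")
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return False

# Toy relation with made-up states: a conflict step lies on a walk to the end.
toy = {
    ("qs", "qs"): [("mas", ("q1", "q1"))],
    ("q1", "q1"): [("mac", ("q2", "q3")), ("mar", ("qf", "qf"))],
    ("q2", "q3"): [("mas", ("qf", "qf"))],
}
potentially_ambiguous = reaches_final_via_conflict(("qs", "qs"), ("qf", "qf"), toy)
```

Tracking a single boolean next to each pair keeps the memory use proportional to the number of pairs, which is in line with the quadratic space bound stated on the slide.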

  82. Noncanonical Unambiguity: Comparisons ◮ Regular Unambiguity RU($\equiv$) ◮ Bounded-length detection schemes ◮ LR($k$) and LR-Regular (LR($\Pi$)) ◮ Horizontal and vertical ambiguity (HVRU($\equiv$))

  83. Noncanonical Unambiguity: Bounded-length detection [Gorn, 1963, Cheung and Uzgalis, 1995, Schröer, 2001, Jampana, 2005] ◮ generate sentences ◮ not conservative ◮ prefix $m$ prevents false positives in sentences of length $< m$ ◮ need to generate $a^{2^n+1}$ to find $G_4^n$ ambiguous, but $G_4^n \notin \mathrm{NU}(\mathrm{item}_0)$ ◮ $G_4^n$: $S \to A \mid B_n a$, $A \to Aaa \mid a$, $B_1 \to aa$, $B_2 \to B_1 B_1$, $\ldots$, $B_n \to B_{n-1} B_{n-1}$
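The cost of bounded-length detection on $G_4^n$ can be checked by brute force. The sketch below counts parse trees of $a^k$ by length, which is enough here because the alphabet has a single letter; the helper names `a_trees`, `b_trees`, `count_trees` and `shortest_ambiguous` are illustrative only and do not come from any of the cited tools.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def a_trees(k):
    """Parse trees of a^k from A, with A -> A a a | a."""
    if k == 1:
        return 1
    return a_trees(k - 2) if k >= 3 else 0

@lru_cache(maxsize=None)
def b_trees(i, k):
    """Parse trees of a^k from B_i, with B_1 -> a a and B_i -> B_{i-1} B_{i-1}."""
    if i == 1:
        return 1 if k == 2 else 0
    # Try every split of a^k between the two B_{i-1} children.
    return sum(b_trees(i - 1, j) * b_trees(i - 1, k - j) for j in range(1, k))

def count_trees(n, length):
    """Parse trees of a^length from S, with S -> A | B_n a."""
    return a_trees(length) + b_trees(n, length - 1)

def shortest_ambiguous(n):
    """Length of the shortest sentence of G_4^n with at least two parse trees."""
    for length in range(1, 2 ** n + 2):
        if count_trees(n, length) >= 2:
            return length
    return None

lengths = [shortest_ambiguous(n) for n in range(1, 5)]
```

For each $n$ tried, the first sentence with two parse trees has length $2^n + 1$, so a generator that enumerates sentences by increasing length has to go exponentially far in $n$ before it reports the ambiguity.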

  84. Noncanonical Unambiguity: LR($k$) and LR-Regular [Knuth, 1965, Hunt III et al., 1975, Čulik and Cohen, 1973, Heilbrunner, 1983] ◮ conservative tests ◮ define $\mathrm{item}_\Pi$ s.t. $\mathrm{LR}(\Pi) \subset \mathrm{NU}(\mathrm{item}_\Pi)$ ◮ need an LR($2^n$) test to prove $G_3^n$ unambiguous, but $G_3^n \in \mathrm{NU}(\mathrm{item}_0)$ ◮ $G_3^n$: $S \to A \mid B_n$, $A \to Aaa \mid a$, $B_1 \to aa$, $B_2 \to B_1 B_1$, $\ldots$, $B_n \to B_{n-1} B_{n-1}$
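That $G_3^n$ is unambiguous can be read off its rules. The short derivation below is a reader's sketch rather than part of the slides; it only uses the productions listed for $G_3^n$.

```latex
\begin{align*}
L(A)   &= \{\, a^{2k+1} \mid k \ge 0 \,\}, \\
L(B_i) &= \{\, a^{2^i} \,\} \qquad (1 \le i \le n), \\
L(A) \cap L(B_n) &= \emptyset \qquad \text{since } 2^n \text{ is even for } n \ge 1 .
\end{align*}
```

Each sentence of $L(A)$ and the single sentence of $L(B_n)$ has exactly one parse tree, and the two alternatives of $S$ never derive the same sentence, so every sentence of $G_3^n$ has a unique tree. A left-to-right parser with bounded lookahead, however, has to choose after the first $a$ between reducing by $A \to a$ and shifting towards $B_1 \to aa$; with fewer than $2^n$ lookahead symbols it sees only $a$'s in both cases, which is one way to understand the LR($2^n$) requirement stated above.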

  85. Experimental Results: Implementation ◮ For the whole SML grammar: ◮ conflicts in the LALR(1) parser: sml.y: conflicts: 223 shift/reduce, 35 reduce/reduce ◮ our tool: 89 potential ambiguities with LR(1) precision detected ◮ For the SML grammar fragment: 2 potential ambiguities with LR(0) precision detected: (match -> mrule . , match -> match . '|' mrule) and (match -> match . '|' mrule , match -> match '|' mrule .) ◮ NU($\mathrm{item}_1$) correctly identifies 87% of our unambiguous grammars, and 73% of the non-LALR(1) ones

  86. Experimental Results: Summary ◮ conservative ambiguity detection ◮ provably better than several other techniques ◮ also experimentally better

  87. Closing Comments: Conclusion ◮ Main issues in parser development: ◮ nondeterminism ◮ ambiguity in particular ◮ Deterministic parsers for larger classes of grammars ◮ Ambiguity detection algorithm

  88. Future Work: Directions for Future Work ◮ Linear time parsing for NU($\equiv$) grammars? ◮ Improved implementation ◮ Noncanonical languages ◮ Regular approximations

  89. Thanks!
