Towards efficient, typed LR parsers
François Pottier and Yann Régis-Gianas
June 2005
Outline
  Introduction
  An automaton
  An ML implementation
  Beyond ML
  Conclusion
In short

This talk illustrates how an expressive type system makes it possible to guarantee the safety of complex programs.

The programs considered here are LR parsers, and the type system is an extension of ML with generalized algebraic data types (GADTs).
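For readers unfamiliar with GADTs, here is a minimal sketch in present-day OCaml syntax, which may differ from the ML extension used in the talk. The expression type and the `eval` function are illustrative assumptions, not part of the talk; they only show how a type index lets the type checker rule out impossible cases.

```ocaml
(* A typed expression language: the index records the type of value
   that each expression evaluates to. *)
type _ expr =
  | Int : int -> int expr
  | Bool : bool -> bool expr
  | Add : int expr * int expr -> int expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr

(* The type checker knows the result type in each branch, so the
   evaluator needs no runtime tag test and no failure case. *)
let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (x, y) -> eval x + eval y
  | If (c, t, e) -> if eval c then eval t else eval e

let () = assert (eval (If (Bool true, Add (Int 1, Int 2), Int 0)) = 3)
```

An ill-typed term such as `Add (Bool true, Int 1)` is rejected at compile time, which is the kind of static guarantee the talk aims to obtain for parsers.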
LR parsers

People like to specify a parser as a context-free grammar, typically in BNF format, decorated with semantic actions.

People like to implement a parser as a deterministic pushdown automaton (DPDA).

A grammar is LR if such an implementation is possible.
LR parser generators

There are tools that generate, out of an LR grammar, a program that simulates the execution of the corresponding automaton.

Can one guarantee the safety of the generated program without requiring trust in the tool's correctness?
What do existing tools produce?

Yacc, Bison, etc. produce C programs, with no safety guarantee. They use a union to represent semantic values and do not protect against stack underflow.

ML-Yacc and Happy produce ML and Haskell programs, respectively, which are typed. Yet runtime exceptions can still arise when pattern matching fails, so safety is not quite guaranteed. Furthermore, redundant dynamic tests incur a runtime penalty.

Before showing any parser code, let's have a look at a sample grammar and automaton.
An automaton
A simple grammar

Here is a very simple LR grammar, drawn from the "Dragon Book":

  (1)  E{x} + T{y}  →  E{x + y}
  (2)  T{x}         →  E{x}
  (3)  T{x} * F{y}  →  T{x × y}
  (4)  F{x}         →  T{x}
  (5)  ( E{x} )     →  F{x}
  (6)  int{x}       →  F{x}

The terminals, or tokens, are +, *, (, ), and int. The non-terminals are E, T, and F. The first four symbols (+, *, (, and )) have no semantic value; the last four (int, E, T, and F) have an integer semantic value.
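The six rules can be written down as plain data, which makes the role of each semantic action explicit. This is only a sketch: the record type `production`, its field names, and the helper `bad_arity` are assumptions made for illustration and do not come from the talk.

```ocaml
type nonterminal = E | T | F

(* One grammar rule: the non-terminal it produces, the number of symbols
   that it consumes, and its semantic action. The action receives the
   integer values of the consumed symbols that carry one, left to right. *)
type production = {
  lhs : nonterminal;
  rhs_length : int;
  action : int list -> int;
}

(* Dynamic check: the list shape is not enforced by the types. *)
let bad_arity () = invalid_arg "semantic action: wrong number of values"

let productions : production array = [|
  (* (1) E + T -> E *)
  { lhs = E; rhs_length = 3;
    action = (function [x; y] -> x + y | _ -> bad_arity ()) };
  (* (2) T -> E *)
  { lhs = E; rhs_length = 1;
    action = (function [x] -> x | _ -> bad_arity ()) };
  (* (3) T * F -> T *)
  { lhs = T; rhs_length = 3;
    action = (function [x; y] -> x * y | _ -> bad_arity ()) };
  (* (4) F -> T *)
  { lhs = T; rhs_length = 1;
    action = (function [x] -> x | _ -> bad_arity ()) };
  (* (5) ( E ) -> F *)
  { lhs = F; rhs_length = 3;
    action = (function [x] -> x | _ -> bad_arity ()) };
  (* (6) int -> F *)
  { lhs = F; rhs_length = 1;
    action = (function [x] -> x | _ -> bad_arity ()) };
|]
```

Note that every action needs a fallback case, because an `int list` says nothing about how many values are present. This is a small instance of the dynamic tests that an untyped representation forces on the programmer.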
[Figure: transition diagram of the automaton, with states S1 to S12 and transitions labelled by the symbols +, *, (, ), Int, E, T, and F.]

Here is a pushdown automaton that recognizes the language of this grammar.
Run of the automaton on the input ( int ) $:

  Input        Stack               State   Next action
  ( int ) $    ε                   S1      shift S4
  int ) $      S1 (                S4      shift S10
  ) $          S1 ( S4 int         S10     reduce int → F, goto F
  ) $          S1 ( S4 F           -       goto F
  ) $          S1 ( S4 F           S5      reduce F → T, goto T
  ) $          S1 ( S4 T           S6      reduce T → E, goto E
  ) $          S1 ( S4 E           S11     shift S12
  $            S1 ( S4 E S11 )     S12     reduce ( E ) → F, goto F
  $            S1 F                S5      reduce F → T, goto T
  $            S1 T                S6      reduce T → E, goto E
  $            S1 E                S2      accept
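The run above can be replayed with a small table-driven simulator. This is a sketch under stated assumptions, not the implementation presented in the talk: states are plain integers 1 to 12, the transitions are reconstructed from the run and from the standard LR(0) construction for this grammar (the diagram itself is not fully legible here), and the names `value`, `goto`, `reduce`, `parse`, and `Syntax_error` are hypothetical.

```ocaml
type token = KPlus | KStar | KLeft | KRight | KEnd | KInt of int
type nonterminal = E | T | F

(* Each stack cell holds a saved state and the semantic value of the
   symbol pushed on top of it; punctuation carries no value. *)
type value = Nothing | Int of int
type stack = (int * value) list

exception Syntax_error

(* Transitions on non-terminals (assumed state numbering). *)
let goto state nt =
  match state, nt with
  | 1, E -> 2 | 4, E -> 11
  | (1 | 4), T -> 6 | 3, T -> 7
  | (1 | 3 | 4), F -> 5 | 8, F -> 9
  | _ -> assert false

(* Pop the cells consumed by production [rule]; return the remaining
   stack, the uncovered state, the produced non-terminal and its value. *)
let reduce rule (stack : stack) =
  match rule, stack with
  | 1, (_, Int y) :: (_, Nothing) :: (s, Int x) :: rest -> rest, s, E, x + y
  | 2, (s, Int x) :: rest -> rest, s, E, x
  | 3, (_, Int y) :: (_, Nothing) :: (s, Int x) :: rest -> rest, s, T, x * y
  | 4, (s, Int x) :: rest -> rest, s, T, x
  | 5, (_, Nothing) :: (_, Int x) :: (s, Nothing) :: rest -> rest, s, F, x
  | 6, (s, Int x) :: rest -> rest, s, F, x
  | _ -> assert false (* stack shape is not guaranteed by the types *)

let parse (input : token list) : int =
  let rec run state stack input =
    (* An exhausted input is treated as the end-of-input token. *)
    let tok, remaining =
      match input with [] -> KEnd, [] | t :: r -> t, r in
    let shift target v = run target ((state, v) :: stack) remaining in
    let reduce_by rule =
      let rest, s, nt, v = reduce rule stack in
      run (goto s nt) ((s, Int v) :: rest) input in
    match state, tok with
    | (1 | 3 | 4 | 8), KLeft -> shift 4 Nothing
    | (1 | 3 | 4 | 8), KInt n -> shift 10 (Int n)
    | (2 | 11), KPlus -> shift 3 Nothing
    | (6 | 7), KStar -> shift 8 Nothing
    | 11, KRight -> shift 12 Nothing
    | 6, (KPlus | KRight | KEnd) -> reduce_by 2
    | 7, (KPlus | KRight | KEnd) -> reduce_by 1
    (* These states reduce on any token; a bad token is caught later. *)
    | 5, _ -> reduce_by 4
    | 9, _ -> reduce_by 3
    | 10, _ -> reduce_by 6
    | 12, _ -> reduce_by 5
    | 2, KEnd ->
      (match stack with (_, Int v) :: _ -> v | _ -> assert false)
    | _ -> raise Syntax_error
  in
  run 1 [] input

let () = assert (parse [KLeft; KInt 3; KRight] = 3)
```

The `assert false` branches in `goto`, `reduce`, and the accepting case can never be taken if the tables are right, but the type checker cannot see that. They are the kind of redundant dynamic test, and potential runtime failure, that the talk sets out to eliminate.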
An ML implementation
Lexer interface

Tokens are made up of a tag and possibly of a semantic value:

  type token = KPlus | KStar | KLeft | KRight | KEnd | KInt of int

The lexer provides two functions, for looking up and for discarding the current token:

  val peek : unit -> token
  val discard : unit -> unit
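A minimal way to satisfy this interface, assuming the tokens have already been produced and sit in a mutable list, could look as follows. The reference `input` is a hypothetical stand-in for a real lexer's internal state; only `token`, `peek`, and `discard` come from the interface above.

```ocaml
type token = KPlus | KStar | KLeft | KRight | KEnd | KInt of int

(* Hypothetical token source: a mutable list standing in for a lexer. *)
let input : token list ref = ref []

(* Return the current token without consuming it. An exhausted input
   keeps yielding KEnd. *)
let peek () : token =
  match !input with
  | [] -> KEnd
  | t :: _ -> t

(* Drop the current token, if there is one. *)
let discard () : unit =
  match !input with
  | [] -> ()
  | _ :: rest -> input := rest

let () =
  input := [KLeft; KInt 3; KRight];
  assert (peek () = KLeft);
  discard ();
  assert (peek () = KInt 3)
```

Separating `peek` from `discard` lets the parser inspect the lookahead token when it reduces and consume it only when it shifts.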
Data structures

The type of states is easily defined:

  type state = S0 | S1 | ... | S11