Compilation 2016 Parsing Tools Aslan Askarov aslan@cs.au.dk
Today: using parsing tools to parse 3 simple languages
1. Language of arithmetic expressions
2. Straight-line commands
3. Straight-line commands with alternative syntax
Language of arithmetic expressions
Example: 1 + (2*x - 3)
CFG:
    expr -> id | num | expr op expr | ( expr )
    op   -> plus | minus | times | div
ML code:
    datatype aexp = Id of string
                  | Number of int
                  | Op of binop * aexp * aexp
    and binop = Plus | Minus | Times | Div
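To make the link between the example string and the datatype concrete, here is a minimal sketch of the AST a parser would be expected to build for 1 + (2*x - 3). The names `example`, `eval` and `result`, and the lookup-function environment, are assumptions added for illustration and are not part of the slides.

```sml
(* Datatype as given on the slide. *)
datatype aexp = Id of string
              | Number of int
              | Op of binop * aexp * aexp
and binop = Plus | Minus | Times | Div

(* AST for the slide's example: 1 + (2*x - 3). *)
val example =
  Op (Plus, Number 1,
      Op (Minus, Op (Times, Number 2, Id "x"), Number 3))

(* Hypothetical helper: evaluate an expression, looking identifiers up
   through the function env. *)
fun eval (env : string -> int) (e : aexp) : int =
  case e of
      Id x => env x
    | Number n => n
    | Op (Plus, a, b) => eval env a + eval env b
    | Op (Minus, a, b) => eval env a - eval env b
    | Op (Times, a, b) => eval env a * eval env b
    | Op (Div, a, b) => eval env a div eval env b

(* With x = 4 this computes 1 + (2*4 - 3). *)
val result = eval (fn _ => 4) example
```

The parentheses of the concrete syntax leave no trace in the AST; grouping is expressed only by how the `Op` nodes nest.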
Using ml-yacc
.grm file format:
    ML declarations to be copied verbatim into the generated parser
    %%
    parser declarations (nonterminals, terminals, precedence, etc)
    %%
    grammar rules
The parser generator takes the .grm file and produces:
• .grm.sml file that implements an LR parser; it is used together with the lexer and the rest of the compiler
• .grm.desc file that helps debug grammar conflicts (make sure to use the %verbose declaration)
Operator associativity
• Consider expression 7 - 5 - 3
• What should be the default interpretation?
    (7 - 5) - 3    say that operator MINUS has left associativity
    7 - (5 - 3)    right associativity
    error          also a possibility
• For arithmetic MINUS the correct choices are either left associativity or error, but not right associativity
• The desired choice is left associativity
Examples of associativity
Operators                              Associativity     Example
Arithmetic minus, division             Left
Arithmetic addition, multiplication    Left/Right        a+b+c = (a+b)+c = a+(b+c)
Assignment in C                        Right             x = y = 5: y is assigned the value of 5; x is assigned the updated value of y
Arithmetic exponentiation              Right             2^1^3 = 2^(1^3) = 2, not the same as (2^1)^3 = 8
Comparison operators                   Non-associative   2 < 3 < 4 is a parse error
What can parse trees tell about associativity?
Two parse trees for 7 - 5 - 3:
    left-associative parse tree      right-associative parse tree
            -                                -
           / \                              / \
          -   3                            7   -
         / \                                  / \
        7   5                                5   3
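The two trees can be written down as values of the `aexp` datatype from the earlier slide, which makes the difference in meaning easy to check. This is a minimal sketch; `leftTree`, `rightTree` and the variable-free `eval` are illustrative names, not part of the slides.

```sml
datatype aexp = Id of string
              | Number of int
              | Op of binop * aexp * aexp
and binop = Plus | Minus | Times | Div

(* (7 - 5) - 3: the tree a left-associative MINUS produces. *)
val leftTree  = Op (Minus, Op (Minus, Number 7, Number 5), Number 3)

(* 7 - (5 - 3): the tree a right-associative MINUS produces. *)
val rightTree = Op (Minus, Number 7, Op (Minus, Number 5, Number 3))

(* Hypothetical evaluator for variable-free expressions. *)
fun eval (Number n) = n
  | eval (Id _) = raise Fail "no variables in this example"
  | eval (Op (oper, a, b)) =
      let
        val (x, y) = (eval a, eval b)
      in
        case oper of
            Plus => x + y
          | Minus => x - y
          | Times => x * y
          | Div => x div y
      end

val leftValue  = eval leftTree    (* (7 - 5) - 3 *)
val rightValue = eval rightTree   (* 7 - (5 - 3) *)
```

The left-associative tree evaluates to ~1 (SML's notation for -1), the usual arithmetic answer, while the right-associative tree evaluates to 5.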
Grammar conflicts
• Shift/reduce conflicts
  • Typical fixes
    • try specifying associativity/precedence
    • otherwise rewrite grammar
  • Default action: shift
• Reduce/reduce conflicts
  • Rarely a good sign: must rewrite grammar
Example shift/reduce conflict
The conflicting parser state contains both items:
    exp : exp . MINUS exp
    exp : exp MINUS exp .    (reduce by rule 4)
Situation at the conflict:
    stack: exp MINUS exp            input string: MINUS exp
Scenario 1, shift action:
    stack: exp MINUS exp MINUS      input string: exp
Scenario 2, reduce action:
    stack: exp                      input string: MINUS exp
Example shift/reduce conflict
• Possible fix: specifying associativity using the %left or %right directive
  • %right favors shifting
  • %left favors reducing
• Q: What should we do for MINUS?
• Other fixes: specify precedence, which is given by the order of parser directives in the .grm file
• If not enough: rework the grammar!
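Since the desired interpretation of 7 - 5 - 3 is the left-associative one, MINUS would be declared with %left. The following is a sketch of what a complete .grm specification for the arithmetic language might look like, not the course's actual file. The token names, the parser name `Aexp`, and the assumption that the `aexp` constructors are already in scope are all illustrative. It inlines the operators into the `exp` rules instead of using a separate `op` nonterminal, because a rule takes its precedence from the rightmost terminal it contains.

```sml
(* User declarations: copied verbatim into the generated parser.
   Assumption: the aexp/binop constructors from the earlier slide are
   in scope here, e.g. opened from a separately compiled structure. *)
%%
%name Aexp
%term ID of string | NUM of int
    | PLUS | MINUS | TIMES | DIV
    | LPAREN | RPAREN | EOF
%nonterm exp of aexp
%pos int
%start exp
%eop EOF
%noshift EOF
%verbose

(* Declarations are listed in order of increasing precedence, so
   TIMES and DIV bind tighter than PLUS and MINUS. *)
%left PLUS MINUS
%left TIMES DIV
%%
exp : ID                 (Id ID)
    | NUM                (Number NUM)
    | exp PLUS exp       (Op (Plus, exp1, exp2))
    | exp MINUS exp      (Op (Minus, exp1, exp2))
    | exp TIMES exp      (Op (Times, exp1, exp2))
    | exp DIV exp        (Op (Div, exp1, exp2))
    | LPAREN exp RPAREN  (exp)
```

With `%left PLUS MINUS` the conflict between shifting a second MINUS and reducing `exp MINUS exp` is resolved in favor of the reduction, which yields the left-associative tree. Removing the two `%left` lines and inspecting the .grm.desc file is a quick way to see the conflicts reported.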
Language of straight-line programs
Example: y := 1 + (2*x - 3); if x then if y then z := 20 else y := 2
CFG:
    …
    cmd  -> id := aexp
          | if aexp then cmd
          | if aexp then cmd else cmd
          | ( cmds )
    cmds -> cmd | cmd ; cmds
ML code:
    …
    and cmd = Assign of id * aexp
            | If of aexp * cmd * cmd option
            | Cmds of cmd list
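The if-command in the example is the classic dangling-else case: the grammar allows the `else` to belong to either `if`. The sketch below writes both readings as values of the `cmd` datatype. The definition `type id = string` and the value names are assumptions made for illustration, since the slide elides the start of the datatype declaration.

```sml
(* Assumption: identifiers are strings. *)
type id = string

datatype aexp = Id of id
              | Number of int
              | Op of binop * aexp * aexp
and binop = Plus | Minus | Times | Div
and cmd = Assign of id * aexp
        | If of aexp * cmd * cmd option
        | Cmds of cmd list

val assignZ = Assign ("z", Number 20)   (* z := 20 *)
val assignY = Assign ("y", Number 2)    (* y := 2  *)

(* Reading 1: the else belongs to the inner if.
   if x then (if y then z := 20 else y := 2) *)
val innerElse =
  If (Id "x", If (Id "y", assignZ, SOME assignY), NONE)

(* Reading 2: the else belongs to the outer if.
   if x then (if y then z := 20) else y := 2 *)
val outerElse =
  If (Id "x", If (Id "y", assignZ, NONE), SOME assignY)
```

This ambiguity shows up in the parser as a shift/reduce conflict on `else`. The default action, shift, produces reading 1, where the `else` attaches to the nearest `if`.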
Syntactic variation of SLP
Example: if x then if y then z := 20 else y := 2 fi fi
(extra keyword fi to mark the end of IfThenElse)
CFG (new grammar):
    …
    cmd  -> id := aexp
          | if aexp then cmd fi
          | if aexp then cmd else cmd fi
          | ( cmds )
    cmds -> cmd | cmd ; cmds
ML code (same AST):
    …
    and cmd = Assign of id * aexp
            | If of aexp * cmd * cmd option
            | Cmds of cmd list
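Because the AST is unchanged, one way to see what `fi` buys is to print ASTs back in the new syntax. This is a minimal sketch: `type id = string`, the printer names, and the operator spellings (in particular "/" for Div) are assumptions for illustration.

```sml
(* Assumption: identifiers are strings. *)
type id = string

datatype aexp = Id of id
              | Number of int
              | Op of binop * aexp * aexp
and binop = Plus | Minus | Times | Div
and cmd = Assign of id * aexp
        | If of aexp * cmd * cmd option
        | Cmds of cmd list

(* Hypothetical printer for expressions; every Op is parenthesized. *)
fun aexpToString (Id x) = x
  | aexpToString (Number n) = Int.toString n
  | aexpToString (Op (oper, a, b)) =
      let
        val sym = case oper of
                      Plus => "+" | Minus => "-" | Times => "*" | Div => "/"
      in
        "(" ^ aexpToString a ^ " " ^ sym ^ " " ^ aexpToString b ^ ")"
      end

(* Hypothetical printer for commands in the fi-terminated syntax. *)
fun cmdToString (Assign (x, e)) = x ^ " := " ^ aexpToString e
  | cmdToString (If (c, t, NONE)) =
      "if " ^ aexpToString c ^ " then " ^ cmdToString t ^ " fi"
  | cmdToString (If (c, t, SOME e)) =
      "if " ^ aexpToString c ^ " then " ^ cmdToString t
      ^ " else " ^ cmdToString e ^ " fi"
  | cmdToString (Cmds cs) =
      "(" ^ String.concatWith "; " (map cmdToString cs) ^ ")"

val assignZ = Assign ("z", Number 20)
val assignY = Assign ("y", Number 2)
val innerElse = If (Id "x", If (Id "y", assignZ, SOME assignY), NONE)
val outerElse = If (Id "x", If (Id "y", assignZ, NONE), SOME assignY)

val innerText = cmdToString innerElse
val outerText = cmdToString outerElse
```

`innerText` reproduces the slide's example string, while `outerText` places a `fi` before the `else`. The two ASTs therefore have distinct concrete forms, and the `else` no longer dangles.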
Syntactic variation of SLP
Example: if x then if y then z := 20 else y := 2 fi fi
• Pros of explicit syntax
  • Avoids some grammar conflicts
  • Handy at prototyping stages when designing your own PL and the language features are not yet stable (unlike this course)
• Cons
  • Not very elegant; your programmers may be grumpy
Reporting syntax errors
• Simplest strategy: fail on the first syntax error
  • OK, but not very helpful
• Local error recovery:
  • adjust parse stack/input at the point of error
  • used in many variants of Yacc (but not ML-Yacc)
• Global error recovery:
  • adjust input stream before the point of error
  • see Burke-Fisher error repair
    • essentially tries all possible edits
  • used in ML-Yacc
    • see %change and %value directives
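As a rough illustration of the two directives named above, the fragment below sketches what they might look like in the declarations section of a .grm file for the straight-line language. The token names (ID, NUM, EQ, ASSIGN, SEMI, ELSE, FI) and the chosen repairs are hypothetical; the ML-Yacc manual is the authority on the exact syntax.

```sml
(* Default semantic values for value-carrying tokens that error
   repair may need to insert. *)
%value ID ("bogus")
%value NUM (0)

(* Multi-token repairs to try in addition to single-token edits:
   swap EQ and ASSIGN, drop a SEMI that precedes ELSE,
   or insert a missing FI. *)
%change EQ -> ASSIGN | ASSIGN -> EQ
      | SEMI ELSE -> ELSE
      |  -> FI
```

An inserted token such as ID needs a semantic value so that the actions attached to the grammar rules can still run, which is what %value supplies.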
Summary
• Use %left, %right parser declarations to control associativity
  • See ML-Yacc manual on Precedence
• Conflict reports in .desc file
  • Develop your grammar slowly to understand the source of conflicts
• Exploring design space in developing surface syntax
• ML-Yacc uses global error recovery