
Syntax (Gabriele Keller, Ron Vanderfeesten)



  1. Concepts of Program Design
     Syntax
     Gabriele Keller, Ron Vanderfeesten

  2. Overview
     • So far
       ‣ Revision of inference rules, natural (rule) induction
       ‣ Haskell
       ‣ Simple grammars specified using inference rules
     • This week
       - first-order & higher-order abstract syntax
       - static and dynamic semantics
       - embedded languages
       - assignment 1 will be released early next week
       - let me know on Thu if you are considering doing the project (no need for a final decision yet)

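  3. Concrete Syntax
     $$\frac{e_1\ \mathit{SExpr} \quad e_2\ \mathit{PExpr}}{e_1 + e_2\ \mathit{SExpr}} \qquad \frac{e\ \mathit{PExpr}}{e\ \mathit{SExpr}}$$
     $$\frac{e_1\ \mathit{PExpr} \quad e_2\ \mathit{FExpr}}{e_1 * e_2\ \mathit{PExpr}} \qquad \frac{e\ \mathit{FExpr}}{e\ \mathit{PExpr}}$$
     $$\frac{i \in \mathit{Int}}{i\ \mathit{FExpr}} \qquad \frac{e\ \mathit{SExpr}}{(e)\ \mathit{FExpr}}$$
     • the inference rules for SExpr define the concrete syntax of a simple language, including precedence and associativity
     • the concrete syntax of a language is designed with the human user in mind
     • it is not adequate for internal representation during compilation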

  4. Concrete vs abstract syntax
     • Example:
       - 1 + 2 * 3
       - 1 + (2 * 3)
       - (1) + ((2) * (3))
       - what is the problem?
     • Concrete syntax contains too much information
       - these expressions all have different derivations, but semantically, they represent the same arithmetic expression
     • After parsing, we’re just interested in three cases: an expression is either
       - an addition,
       - a multiplication, or
       - a number

  5. Concrete vs abstract syntax
     • we use Haskell-style terms of the form operator arg1 arg2 … to represent parsed programs unambiguously; e.g.,
       Plus (Num 1) (Times (Num 2) (Num 3))
     • we define the abstract grammar of arithmetic expressions as follows:
     $$\frac{i \in \mathit{Int}}{(\texttt{Num}\ i)\ \mathit{expr}} \qquad \frac{t_1\ \mathit{expr} \quad t_2\ \mathit{expr}}{(\texttt{Times}\ t_1\ t_2)\ \mathit{expr}} \qquad \frac{t_1\ \mathit{expr} \quad t_2\ \mathit{expr}}{(\texttt{Plus}\ t_1\ t_2)\ \mathit{expr}}$$
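The abstract grammar above maps directly onto a Haskell data type with one constructor per rule. This is a minimal sketch; the names `example` and `operators` are illustrative assumptions and do not appear in the slides.

```haskell
-- Abstract syntax of arithmetic expressions: one constructor per rule.
data Expr
  = Num Int          -- (Num i) expr, for any integer i
  | Plus Expr Expr   -- (Plus t1 t2) expr
  | Times Expr Expr  -- (Times t1 t2) expr
  deriving (Show, Eq)

-- The term that 1 + 2 * 3, 1 + (2 * 3) and (1) + ((2) * (3)) all stand for.
example :: Expr
example = Plus (Num 1) (Times (Num 2) (Num 3))

-- Hypothetical helper (not from the slides): number of operators in a term.
-- After parsing there are only three cases to consider.
operators :: Expr -> Int
operators (Num _)     = 0
operators (Plus a b)  = 1 + operators a + operators b
operators (Times a b) = 1 + operators a + operators b
```

Parentheses and precedence levels no longer appear anywhere: the nesting of constructors carries that information.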

  6. Concrete vs abstract syntax
     • Parsers
       - check if the program (sequence of tokens) is derivable from the rules of the concrete syntax
       - turn the derivation into an abstract syntax tree (AST)
     • Transformation rules
       - we formalise this with inference rules as a binary relation ↔:
         we write e SExpr ↔ t expr iff the (concrete grammar) expression e corresponds to the (abstract grammar) expression t.
       - usually, many different concrete expressions correspond to a single abstract expression

  7. Concrete vs abstract syntax
     • Example:
       - 1 + 2 * 3 SExpr ↔ (Plus (Num 1) (Times (Num 2) (Num 3))) expr
       - 1 + (2 * 3) SExpr ↔ (Plus (Num 1) (Times (Num 2) (Num 3))) expr
       - (1) + ((2) * (3)) SExpr ↔ (Plus (Num 1) (Times (Num 2) (Num 3))) expr

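  8. Concrete vs abstract syntax
     • Formal definition: we define a parsing relation ↔ formally as an extension of the structural rules of the concrete syntax.
     $$\frac{e_1\ \mathit{SExpr} \leftrightarrow e_1'\ \mathit{expr} \quad e_2\ \mathit{PExpr} \leftrightarrow e_2'\ \mathit{expr}}{e_1 + e_2\ \mathit{SExpr} \leftrightarrow (\texttt{Plus}\ e_1'\ e_2')\ \mathit{expr}} \qquad \frac{e\ \mathit{PExpr} \leftrightarrow e'\ \mathit{expr}}{e\ \mathit{SExpr} \leftrightarrow e'\ \mathit{expr}}$$
     $$\frac{e_1\ \mathit{PExpr} \leftrightarrow e_1'\ \mathit{expr} \quad e_2\ \mathit{FExpr} \leftrightarrow e_2'\ \mathit{expr}}{e_1 * e_2\ \mathit{PExpr} \leftrightarrow (\texttt{Times}\ e_1'\ e_2')\ \mathit{expr}} \qquad \frac{e\ \mathit{FExpr} \leftrightarrow e'\ \mathit{expr}}{e\ \mathit{PExpr} \leftrightarrow e'\ \mathit{expr}}$$
     $$\frac{i \in \mathit{Int}}{i\ \mathit{FExpr} \leftrightarrow (\texttt{Num}\ i)\ \mathit{expr}} \qquad \frac{e\ \mathit{SExpr} \leftrightarrow e'\ \mathit{expr}}{(e)\ \mathit{FExpr} \leftrightarrow e'\ \mathit{expr}}$$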

  9. The translation relation ↔
     • The binary syntax translation relation
       ‣ e ↔ e’ can be viewed as a translation function
       ‣ input is e
       ‣ output is e’
       ‣ derivations are unambiguously determined by e
         - since the grammar of the concrete syntax was unambiguous
       ‣ e’ is unambiguously determined by the derivation
         - for each concrete syntax term, there is only one rule we can apply at each step

  10. The translation relation ↔
      • Derive the abstract syntax as follows:
        (1) bottom up, decompose the concrete expression e according to the left hand side of ↔
        (2) top down, synthesise the abstract expression e’ according to the right hand side of each ↔ from the rules used in the derivation.
      • Example: derivation for 1 + 2 * 3 (we abbreviate SExpr, PExpr, FExpr with S, P, F respectively, and expr with e)
        Goal, with the abstract term still to be derived: 1 + 2 * 3 S ↔ ?

  11. The translation relation ↔
      • Example (continued): the complete derivation for 1 + 2 * 3, with S, P, F abbreviating SExpr, PExpr, FExpr and e abbreviating expr
      $$\dfrac{\dfrac{\dfrac{\dfrac{1 \in \mathit{Int}}{1\ F \leftrightarrow (\texttt{Num}\ 1)\ e}}{1\ P \leftrightarrow (\texttt{Num}\ 1)\ e}}{1\ S \leftrightarrow (\texttt{Num}\ 1)\ e} \quad \dfrac{\dfrac{\dfrac{2 \in \mathit{Int}}{2\ F \leftrightarrow (\texttt{Num}\ 2)\ e}}{2\ P \leftrightarrow (\texttt{Num}\ 2)\ e} \quad \dfrac{3 \in \mathit{Int}}{3\ F \leftrightarrow (\texttt{Num}\ 3)\ e}}{2 * 3\ P \leftrightarrow (\texttt{Times}\ (\texttt{Num}\ 2)\ (\texttt{Num}\ 3))\ e}}{1 + 2 * 3\ S \leftrightarrow (\texttt{Plus}\ (\texttt{Num}\ 1)\ (\texttt{Times}\ (\texttt{Num}\ 2)\ (\texttt{Num}\ 3)))\ e}$$

  12. Parsing and inference rules
      • The parsing problem
        Given a sequence of tokens s SExpr, find t such that s SExpr ↔ t expr
      • Requirements
        A parser should be
        ‣ total for all expressions that are correct according to the concrete syntax, that is
          - there must be a t expr for every s SExpr
        ‣ unambiguous, that is, for every t1 and t2 with
          - s SExpr ↔ t1 expr and s SExpr ↔ t2 expr
          we have t1 = t2
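The parsing relation can be sketched as a Haskell function from strings to abstract terms. This is one possible reading, not the course's parser: the `Token` type, `tokenize`, and the function names are assumptions made for illustration, and the left-recursive rules for SExpr and PExpr are implemented with a loop that folds to the left (a standard recursive-descent workaround) rather than by direct left recursion.

```haskell
import Data.Char (isDigit, isSpace)

data Expr = Num Int | Plus Expr Expr | Times Expr Expr
  deriving (Show, Eq)

data Token = TInt Int | TPlus | TStar | TLParen | TRParen
  deriving (Show, Eq)

-- Split the input into tokens; Nothing on an unknown character.
tokenize :: String -> Maybe [Token]
tokenize [] = Just []
tokenize (c:cs)
  | isSpace c = tokenize cs
  | isDigit c = let (ds, rest) = span isDigit (c:cs)
                in (TInt (read ds) :) <$> tokenize rest
  | c == '+'  = (TPlus :) <$> tokenize cs
  | c == '*'  = (TStar :) <$> tokenize cs
  | c == '('  = (TLParen :) <$> tokenize cs
  | c == ')'  = (TRParen :) <$> tokenize cs
  | otherwise = Nothing

-- SExpr: a PExpr followed by zero or more "+ PExpr", folded to the left.
sExpr :: [Token] -> Maybe (Expr, [Token])
sExpr ts = do
  (e, rest) <- pExpr ts
  sRest e rest
  where
    sRest acc (TPlus : rest) = do
      (e, rest') <- pExpr rest
      sRest (Plus acc e) rest'
    sRest acc rest = Just (acc, rest)

-- PExpr: an FExpr followed by zero or more "* FExpr", folded to the left.
pExpr :: [Token] -> Maybe (Expr, [Token])
pExpr ts = do
  (e, rest) <- fExpr ts
  pRest e rest
  where
    pRest acc (TStar : rest) = do
      (e, rest') <- fExpr rest
      pRest (Times acc e) rest'
    pRest acc rest = Just (acc, rest)

-- FExpr: an integer literal or a parenthesised SExpr.
fExpr :: [Token] -> Maybe (Expr, [Token])
fExpr (TInt i : rest) = Just (Num i, rest)
fExpr (TLParen : rest) = do
  (e, rest') <- sExpr rest
  case rest' of
    TRParen : rest'' -> Just (e, rest'')
    _                -> Nothing
fExpr _ = Nothing

-- Succeeds only if the whole input is consumed.
parse :: String -> Maybe Expr
parse s = do
  ts <- tokenize s
  (e, rest) <- sExpr ts
  if null rest then Just e else Nothing
```

Because `parse` is a function, it returns at most one term per input, which mirrors the unambiguity requirement; inputs that are not derivable as SExpr yield `Nothing`.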

  13. Parsing and pretty printing
      • The parsing problem
        Given a sequence of tokens s SExpr, find t such that s SExpr ↔ t expr
      • What about the inverse?
        - given t expr, find s SExpr
      • The inverse of parsing is unparsing
        ‣ unparsing is often ambiguous
        ‣ unparsing is often partial (not total)
      • Pretty printing
        ‣ unparsing together with appropriate formatting is called pretty printing
        ‣ due to the ambiguity of unparsing, this will usually not reproduce the original program (but a semantically equivalent one)

  14. Parsing and pretty printing
      Example
      Given the abstract syntax term
        Times (Times (Num 3) (Num 4)) (Num 5)
      pretty printing may produce the string “3 * 4 * 5” or “(3 * 4) * 5”
      ‣ it’s best to choose the simplest, most readable representation
      ‣ but usually, this requires extra effort
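The "extra effort" can be sketched as follows, assuming non-negative integer literals. `unparse` parenthesises everything, while `pretty` tracks which syntactic level (SExpr, PExpr or FExpr) it is printing into and adds parentheses only when needed. The function names and the numeric encoding of levels are illustrative assumptions.

```haskell
data Expr = Num Int | Plus Expr Expr | Times Expr Expr
  deriving (Show, Eq)

-- Fully parenthesised unparser: always parses back to the same term,
-- but is rarely pleasant to read.
unparse :: Expr -> String
unparse (Num i)     = show i
unparse (Plus a b)  = "(" ++ unparse a ++ " + " ++ unparse b ++ ")"
unparse (Times a b) = "(" ++ unparse a ++ " * " ++ unparse b ++ ")"

parensIf :: Bool -> String -> String
parensIf True  s = "(" ++ s ++ ")"
parensIf False s = s

-- pretty p e prints e for a position that expects level p, where
-- 0 = SExpr, 1 = PExpr, 2 = FExpr. Parentheses are added only when the
-- term's own level is lower than the one required by the position.
pretty :: Int -> Expr -> String
pretty _ (Num i)     = show i
pretty p (Plus a b)  = parensIf (p > 0) (pretty 0 a ++ " + " ++ pretty 1 b)
pretty p (Times a b) = parensIf (p > 1) (pretty 1 a ++ " * " ++ pretty 2 b)

prettyExpr :: Expr -> String
prettyExpr = pretty 0
```

For instance, `prettyExpr (Times (Plus (Num 1) (Num 2)) (Num 3))` gives `(1 + 2) * 3`, whereas `unparse` on the same term gives `((1 + 2) * 3)`.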

  15. Bindings
      • Local variable bindings (let)
        Let’s extend our simple expression language with
        ‣ variables and variable bindings
        ‣ let v = e1 in e2 end
      • Examples:
          let x = 3
          in x + 1
          end

          let x = 3
          in let y = x + 1
             in x + y
             end
          end
      • Concrete syntax (adding two new rules):
      $$\frac{\mathit{id}\ \mathit{Ident}}{\mathit{id}\ \mathit{FExpr}} \qquad \frac{e_1\ \mathit{SExpr} \quad e_2\ \mathit{SExpr}}{\texttt{let}\ \mathit{id} = e_1\ \texttt{in}\ e_2\ \texttt{end}\ \ \mathit{FExpr}}$$

  16. Bindings
      The end keyword is necessary for nested let-expressions:
        let x = 3 in 2 * let y = 5 in y + x
      Without end, it is unclear whether + x belongs to the body of the inner let.
      We’ll leave end out when it is not needed to disambiguate.

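  17. Bindings
      • First order abstract syntax:
      $$\frac{i \in \mathit{Int}}{(\texttt{Num}\ i)\ \mathit{expr}} \qquad \frac{t_1\ \mathit{expr} \quad t_2\ \mathit{expr}}{(\texttt{Times}\ t_1\ t_2)\ \mathit{expr}} \qquad \frac{t_1\ \mathit{expr} \quad t_2\ \mathit{expr}}{(\texttt{Plus}\ t_1\ t_2)\ \mathit{expr}}$$
      $$\frac{\mathit{id}\ \mathit{Ident}}{(\texttt{Var}\ \mathit{id})\ \mathit{expr}} \qquad \frac{t_1\ \mathit{expr} \quad t_2\ \mathit{expr}}{(\texttt{Let}\ \mathit{id}\ t_1\ t_2)\ \mathit{expr}}$$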
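In Haskell, the first-order abstract syntax might look like the sketch below. Representing identifiers as `String`, and the names `small`, `nested` and `binders`, are assumptions for illustration. The bound name is stored as ordinary data inside the `Let` node, which is what makes this representation first-order.

```haskell
type Ident = String

data Expr
  = Num Int
  | Plus Expr Expr
  | Times Expr Expr
  | Var Ident            -- (Var id) expr
  | Let Ident Expr Expr  -- (Let id t1 t2) expr
  deriving (Show, Eq)

-- let x = 3 in x + 1 end
small :: Expr
small = Let "x" (Num 3) (Plus (Var "x") (Num 1))

-- let x = 3 in let y = x + 1 in x + y end end
nested :: Expr
nested =
  Let "x" (Num 3)
    (Let "y" (Plus (Var "x") (Num 1))
      (Plus (Var "x") (Var "y")))

-- Hypothetical helper: identifiers introduced by let, outermost first.
binders :: Expr -> [Ident]
binders (Num _)       = []
binders (Var _)       = []
binders (Plus a b)    = binders a ++ binders b
binders (Times a b)   = binders a ++ binders b
binders (Let x e1 e2) = x : binders e1 ++ binders e2
```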

  18. Bindings
      • Scope
        ‣ let x = e1 in e2 end introduces (or binds) the variable x for use within its scope e2
        ‣ we call the occurrence of x in the left-hand side of the binding its binding occurrence (or defining occurrence)
        ‣ occurrences of x in e2 are usage occurrences
        ‣ finding the binding occurrence of a variable is called scope resolution
      • Two types of scope resolution
        ‣ static (or lexical) scoping: scope resolution happens at compile time
        ‣ dynamic scoping: resolution happens at run time

  19. Bindings
      Example:
        let x = y in
          let y = 2 in
            x
      ‣ scope of x: let y = 2 in x
      ‣ scope of y: x (the body of the inner let)
      Out of scope variable: the first occurrence of y is out of scope

  20. Bindings
      Example:
        let x = 5 in
          let x = 3 in
            x + x
      Shadowing: the inner binding of x is shadowing the outer binding
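Both situations, an out of scope variable and shadowing, can be observed with a small evaluator that carries an environment of the bindings currently in scope. This is only a sketch under assumptions: the association-list environment and the `eval` function are not from the slides, and in a language with only let-bindings this evaluator cannot distinguish static from dynamic scoping.

```haskell
type Ident = String

data Expr
  = Num Int
  | Plus Expr Expr
  | Times Expr Expr
  | Var Ident
  | Let Ident Expr Expr
  deriving (Show, Eq)

-- Bindings in scope, innermost first.
type Env = [(Ident, Int)]

-- Nothing signals a usage occurrence without a binding occurrence in scope.
eval :: Env -> Expr -> Maybe Int
eval _   (Num i)       = Just i
eval env (Plus a b)    = (+) <$> eval env a <*> eval env b
eval env (Times a b)   = (*) <$> eval env a <*> eval env b
eval env (Var x)       = lookup x env  -- finds the innermost binding of x
eval env (Let x e1 e2) = do
  v <- eval env e1              -- x is not yet in scope in e1
  eval ((x, v) : env) e2        -- a new binding shadows any older one

-- let x = 5 in let x = 3 in x + x
shadowing :: Expr
shadowing = Let "x" (Num 5) (Let "x" (Num 3) (Plus (Var "x") (Var "x")))

-- let x = y in let y = 2 in x
outOfScope :: Expr
outOfScope = Let "x" (Var "y") (Let "y" (Num 2) (Var "x"))
```

Here `eval [] shadowing` is `Just 6`, because both usage occurrences resolve to the inner binding, and `eval [] outOfScope` is `Nothing`, because y is used before any binding of y is in scope.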

  21. Scope
      • Where the scope starts differs in different languages:
        In C (the scope of x starts at its declaration):
          void f () {
            …
            int x = 5;
            …
            int y = x;
            …
          }
        In JavaScript (the scope of msg is the whole function body):
          function showMsg () {
            …
            console.log(msg);
            …
            var msg = "hi";
            …
          }
        In Haskell (the scope of x includes the bindings that textually precede it):
          let
            y = x
            x = 5
          in …

          …
            where
              y = x
              x = 5

  22. Bindings
      Example: what is the difference between these two expressions?
        let x = 3
        in x + 1
        end

        let y = 3
        in y + 1
        end
      α-equivalence:
      ‣ they only differ in the choice of the bound variable names
      ‣ we call them α-equivalent
      ‣ we call the process of consistently changing variable names α-renaming
      ‣ the terminology is due to a conversion rule of the λ-calculus
      ‣ we write e1 ≡α e2 if two expressions are α-equivalent
      ‣ the relation ≡α is an equivalence relation
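One way to decide α-equivalence, sketched here under the assumption of the first-order `Expr` type with `String` identifiers, is to walk both terms in parallel and remember which binders were paired up. The function `alphaEq` and its helper `match` are illustrative names, not part of the slides.

```haskell
type Ident = String

data Expr
  = Num Int
  | Plus Expr Expr
  | Times Expr Expr
  | Var Ident
  | Let Ident Expr Expr
  deriving (Show, Eq)

-- Two terms are alpha-equivalent if they have the same shape and every
-- pair of corresponding variables refers to corresponding binders
-- (or both are free and have the same name).
alphaEq :: Expr -> Expr -> Bool
alphaEq = go []
  where
    go _   (Num i)     (Num j)     = i == j
    go env (Plus a b)  (Plus c d)  = go env a c && go env b d
    go env (Times a b) (Times c d) = go env a c && go env b d
    go env (Var x)     (Var y)     = match env x y
    go env (Let x a b) (Let y c d) = go env a c && go ((x, y) : env) b d
    go _   _           _           = False

    -- Search the paired binders, innermost first.
    match [] x y = x == y                   -- both free: names must agree
    match ((a, b) : rest) x y
      | x == a || y == b = x == a && y == b -- must hit the same pair
      | otherwise        = match rest x y
```

With this definition, `let x = 3 in x + 1` and `let y = 3 in y + 1` are related, while two terms that differ in a free variable are not.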

  23. Substitution
      • Free variables
        ★ a free variable is one without a binding occurrence
        ‣ let x = 1 in x + y end
          y is free in this expression
      • Substitution: replacing all occurrences of a free variable x in an expression e by another expression e’ is called substitution
      • Example: substituting x with 2 * y in 5 * x + 7 yields 5 * (2 * y) + 7
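Free variables and substitution can be sketched as two recursive functions over the first-order abstract syntax. The names `freeVars` and `subst` are assumptions, and this `subst` is deliberately naive: it never renames bound variables, so it is only safe when no free variable of the replacement is bound in the target expression.

```haskell
import Data.List (nub)

type Ident = String

data Expr
  = Num Int
  | Plus Expr Expr
  | Times Expr Expr
  | Var Ident
  | Let Ident Expr Expr
  deriving (Show, Eq)

-- Variables with a usage occurrence that has no enclosing binding.
freeVars :: Expr -> [Ident]
freeVars (Num _)       = []
freeVars (Var x)       = [x]
freeVars (Plus a b)    = nub (freeVars a ++ freeVars b)
freeVars (Times a b)   = nub (freeVars a ++ freeVars b)
freeVars (Let x e1 e2) = nub (freeVars e1 ++ filter (/= x) (freeVars e2))

-- subst x s e replaces the free occurrences of x in e by s.
-- Naive version: no renaming of bound variables.
subst :: Ident -> Expr -> Expr -> Expr
subst _ _ (Num i) = Num i
subst x s (Var y)
  | y == x    = s
  | otherwise = Var y
subst x s (Plus a b)  = Plus (subst x s a) (subst x s b)
subst x s (Times a b) = Times (subst x s a) (subst x s b)
subst x s (Let y e1 e2)
  | y == x    = Let y (subst x s e1) e2  -- x is rebound: leave e2 alone
  | otherwise = Let y (subst x s e1) (subst x s e2)
```

Applied to the slide's example, `subst "x" (Times (Num 2) (Var "y")) (Plus (Times (Num 5) (Var "x")) (Num 7))` yields the term for 5 * (2 * y) + 7.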

  24. Substitution
      • We have to be careful when applying substitution:
        ‣ let y = 5 in y * x + 7
        ‣ let z = 5 in z * x + 7
        these two expressions are α-equivalent
        - substitute x by 2 * y in both:
          - let y = 5 in y * (2 * y) + 7
          - let z = 5 in z * (2 * y) + 7
          the results are not α-equivalent anymore!
        - the free variable y of 2 * y is captured in the first expression

  25. Substitution
      • Capture-free substitution: to substitute e’ for x in e we require the free variables in e’ to be different from the variables in e
      • We can always arrange for a substitution to be capture free
        - use α-renaming of e (the expression being substituted into)
        - change all bound variable names in e that also occur in e’
        - or use fresh variable names
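A capture-free substitution can be sketched by renaming a let-bound variable to a fresh name whenever it would capture a free variable of the replacement. The scheme for generating fresh names (appending a number) and the names `fresh` and `subst` are assumptions for illustration; other renaming strategies work equally well.

```haskell
import Data.List (nub)

type Ident = String

data Expr
  = Num Int
  | Plus Expr Expr
  | Times Expr Expr
  | Var Ident
  | Let Ident Expr Expr
  deriving (Show, Eq)

freeVars :: Expr -> [Ident]
freeVars (Num _)       = []
freeVars (Var x)       = [x]
freeVars (Plus a b)    = nub (freeVars a ++ freeVars b)
freeVars (Times a b)   = nub (freeVars a ++ freeVars b)
freeVars (Let x e1 e2) = nub (freeVars e1 ++ filter (/= x) (freeVars e2))

-- First name of the form x0, x1, ... that is not in the avoid list.
fresh :: [Ident] -> Ident -> Ident
fresh avoid x = go (0 :: Int)
  where
    go n
      | cand `elem` avoid = go (n + 1)
      | otherwise         = cand
      where cand = x ++ show n

-- subst x s e replaces the free occurrences of x in e by s, renaming
-- bound variables of e where they would capture a free variable of s.
subst :: Ident -> Expr -> Expr -> Expr
subst _ _ (Num i) = Num i
subst x s (Var y)
  | y == x    = s
  | otherwise = Var y
subst x s (Plus a b)  = Plus (subst x s a) (subst x s b)
subst x s (Times a b) = Times (subst x s a) (subst x s b)
subst x s (Let y e1 e2)
  | y == x = Let y (subst x s e1) e2     -- x is rebound in e2
  | y `elem` freeVars s =                -- y would capture: rename it first
      let y'  = fresh (x : freeVars s ++ freeVars e2) y
          e2' = subst y (Var y') e2
      in Let y' (subst x s e1) (subst x s e2')
  | otherwise = Let y (subst x s e1) (subst x s e2)
```

Substituting 2 * y for x in `let y = 5 in y * x + 7` now gives `let y0 = 5 in y0 * (2 * y) + 7`, in which the y of 2 * y stays free, just as it does when the same substitution is applied to `let z = 5 in z * x + 7`.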
