
Compiler Construction
Mayer Goldberg \ Ben-Gurion University
Tuesday 10th December, 2019

Chapter 3


  1. Expressions in Scheme: Definitions (type expr, type-constructor Def)
     ▶ Type: Def of expr * expr
     ▶ The AST node for define-expressions
     ▶ Two syntaxes for define:
       ▶ (define ⟨var⟩ ⟨expr⟩)
         ▶ Example: (define pi (* 4 (atan 1)))
       ▶ (define (⟨var⟩ . ⟨arglist⟩) . (⟨expr⟩+))
         ▶ Used to define functions without specifying the λ: This is almost always a bad idea!
         ▶ This form is macro-expanded into (define ⟨var⟩ (lambda ⟨arglist⟩ . (⟨expr⟩+)))
         ☞ Note the implicit sequences!
         ▶ Example: (define (square x) (* x x))

  2. Expressions in Scheme: Disjunctions (type expr, type-constructor Or)
     ▶ Type: Or of expr list
     ▶ ⟦(or)⟧ = ⟦#f⟧ (by definition)
     ▶ ⟦(or ⟨expr⟩)⟧ = ⟦⟨expr⟩⟧ (#f is the unit element of or)
     ▶ The real work is done here: ⟦(or ⟨expr_1⟩ ··· ⟨expr_n⟩)⟧ = Or([⟦⟨expr_1⟩⟧; ···; ⟦⟨expr_n⟩⟧])
     ▶ It is possible to macro-expand disjunctions
       ▶ We will learn the expansion later on
       ▶ The expansion results in impractically inefficient code
       ▶ We support disjunctions directly for reasons of efficiency

  3. Expressions in Scheme: Applications (type expr, type-constructor Applic)
     ▶ Type: Applic of expr * (expr list)
     ▶ The AST node separates the expression in the procedure position from the list of arguments
     ▶ The tag-parser recurses over the procedure & the list of arguments:
       ⟦(⟨expr⟩ ⟨expr⟩_1 ··· ⟨expr⟩_n)⟧ = Applic(⟦⟨expr⟩⟧, [⟦⟨expr⟩_1⟧; ···; ⟦⟨expr⟩_n⟧])

  4. Expressions in Scheme: Lambdas (type expr, type-constructors LambdaSimple, LambdaOpt)
     ▶ Types:
       ▶ LambdaSimple of string list * expr
       ▶ LambdaOpt of string list * string * expr
     ▶ Scheme has three lambda-forms, and we’re going to represent these three forms using the two AST nodes LambdaSimple & LambdaOpt.

  5. Expressions in Scheme: Lambdas (type expr, type-constructors LambdaSimple, LambdaOpt)
     ▶ The general form of lambda-expressions is (lambda ⟨arglist⟩ . (⟨expr⟩+)):
       ① If ⟨arglist⟩ is a proper list of unique variable names, then the lambda-expression is said to be simple, and we represent it using the AST node LambdaSimple

  6. Expressions in Scheme: Lambdas (type expr, type-constructors LambdaSimple, LambdaOpt)
     ▶ The general form of lambda-expressions is (lambda ⟨arglist⟩ . (⟨expr⟩+)):
       ② If ⟨arglist⟩ is the improper list (v_1 ··· v_n . vs), then the lambda-expression is said to take at least n arguments:
         ▶ The first n arguments are mandatory, and are assigned to v_1 through v_n respectively (unique variable names)
         ▶ The list of values of any additional arguments is going to be the value of the optional parameter vs
         ▶ If precisely n arguments are given, then the value of vs is going to be the empty list
         ▶ We represent lambda-expressions with optional arguments by using the AST node LambdaOpt

  7. Expressions in Scheme: Lambdas (type expr, type-constructors LambdaSimple, LambdaOpt)
     ▶ The general form of lambda-expressions is (lambda ⟨arglist⟩ . (⟨expr⟩+)):
       ③ If ⟨arglist⟩ is the symbol vs, then the lambda-expression is said to be variadic, and may be applied to any number of arguments:
         ▶ The list of values of the arguments is going to be the value of the optional parameter vs
         ▶ If no arguments are given, then the value of vs is going to be the empty list
         ▶ We represent variadic lambda-expressions using the AST node LambdaOpt (with an empty list, and the optional var)
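The three parameter-list shapes can be told apart with one recursive walk over the list. The following is a minimal sketch, not the course's reference code: the `sexpr` type is a cut-down assumption, and `params`, `classify_params`, `check_unique` and `Syntax_error` are hypothetical names introduced only for illustration.

```ocaml
(* Simplified S-expression type (assumption); the real sexpr has more constructors. *)
type sexpr =
  | Nil
  | Symbol of string
  | Pair of sexpr * sexpr

(* Hypothetical helper type: the shape of a lambda parameter list. *)
type params =
  | Simple of string list        (* proper list (v1 ... vn)  -> LambdaSimple *)
  | Opt of string list * string  (* (v1 ... vn . vs) or bare vs -> LambdaOpt *)

exception Syntax_error of string

let rec classify_params (arglist : sexpr) : params =
  match arglist with
  | Nil -> Simple []
  | Symbol vs -> Opt ([], vs)  (* variadic, or the tail of an improper list *)
  | Pair (Symbol v, rest) ->
    (match classify_params rest with
     | Simple vs -> Simple (v :: vs)
     | Opt (vs, opt) -> Opt (v :: vs, opt))
  | Pair (_, _) -> raise (Syntax_error "parameter is not a symbol")

(* Reject duplicate parameter names, including the optional one. *)
let check_unique (p : params) : params =
  let names = match p with Simple vs -> vs | Opt (vs, opt) -> opt :: vs in
  let rec has_dup = function
    | a :: (b :: _ as tl) -> a = b || has_dup tl
    | _ -> false in
  if has_dup (List.sort compare names)
  then raise (Syntax_error "duplicate parameter")
  else p
```

A variadic lambda comes out as `Opt ([], "vs")`, matching the "empty list, and the optional var" representation described above.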

  8. Expressions in Scheme: Lambda With Optional Arguments (Demonstration)
       > (define f (lambda (a b c . d) `((a ,a) (b ,b) (c ,c) (d ,d))))
       > (f 1)
       Exception: incorrect number of arguments to #<procedure f>
       Type (debug) to enter the debugger.

  9. Expressions in Scheme: Lambda With Optional Arguments (Demonstration)
       > (f 1 2)
       Exception: incorrect number of arguments to #<procedure f>
       Type (debug) to enter the debugger.
       > (f 1 2 3)
       ((a 1) (b 2) (c 3) (d ()))
       > (f 1 2 3 4 5)
       ((a 1) (b 2) (c 3) (d (4 5)))

  10. Expressions in Scheme: Variadic Lambda (Demonstration)
        > (define g (lambda s `(s ,s)))
        > (g)
        (s ())
        > (g 1 2 3)
        (s (1 2 3))

  11. Roadmap
      ▶ Expressions in Scheme
      🗹 The expr datatype
      ▶ The Tag-Parser
      ▶ Macros & special forms
      ▶ Lexical hygiene

  12. The Tag-Parser
      The tag-parser is a function, mapping from sexprs to exprs:
      ▶ Not all valid sexprs are valid exprs
      ▶ Some sexprs need to be disambiguated before they can be converted to an expr (most notably, the lambda-forms)
      ▶ Plenty of testing needs to be done to ensure that syntactically-incorrect forms are rejected

  13. The Tag-Parser (continued)
      Example: The syntax for (* 4 (atan 1))
      [Figure: two trees, side by side. Concrete syntax: nested PAIR nodes (CAR/CDR) holding SYMBOL *, INTEGER 4, and the sub-list SYMBOL atan, INTEGER 1, each list ending in NIL. Abstract syntax: APPLIC with PROC = VAR *, ARG0 = CONST (INTEGER 4), ARG1 = APPLIC with PROC = VAR atan, ARG0 = CONST (INTEGER 1).]

  14. The Tag-Parser (continued)
      Some examples of syntactically-incorrect exprs
      These are all valid sexprs and invalid expressions:
      ▶ (lambda (x) . x) (the body is not a [proper] list of exprs)
      ▶ (quote . a) (not a proper list)
      ▶ (quote he said he understood unquote) (length does not equal 2)
      ▶ (lambda (a b c a a b) (+ a b c)) (the param list contains duplicates)

  15. The Tag-Parser (continued)
      How to write a tag-parser
      ▶ A recursive function tag_parse : sexpr -> expr
      🤕 The concrete syntax for expr is the abstract syntax for sexpr
      ▶ For the core forms (we shall deal with macro-expanded forms later on):
        ① Use pattern-matching to match over the concrete syntax of various syntactic forms
          ▶ Check out the expr datatype so you know what you must support
        ② Perform any additional testing necessary
          ▶ For example, that argument-lists in lambda-expressions contain no duplicates!
        ③ Call tag_parse recursively for sub-expressions
        ④ Generate the corresponding AST
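The four steps above can be sketched for a handful of core forms. This is an illustrative skeleton under stated assumptions, not the course's reference tag-parser: both `sexpr` and `expr` are reduced versions of the real datatypes (the constructors `Const`, `Var` and `If` are assumed to exist alongside the `Or`, `Def` and `Applic` described earlier), and `list_of_sexpr` and `Syntax_error` are hypothetical helpers.

```ocaml
(* Reduced datatypes (assumptions made for this sketch). *)
type sexpr =
  | Nil
  | Bool of bool
  | Number of int
  | Symbol of string
  | Pair of sexpr * sexpr

type expr =
  | Const of sexpr
  | Var of string
  | If of expr * expr * expr
  | Or of expr list
  | Def of expr * expr
  | Applic of expr * expr list

exception Syntax_error of string

(* Convert a proper list into an OCaml list; reject improper lists. *)
let rec list_of_sexpr = function
  | Nil -> []
  | Pair (a, rest) -> a :: list_of_sexpr rest
  | _ -> raise (Syntax_error "improper list")

let rec tag_parse (s : sexpr) : expr =
  match s with
  | Bool _ | Number _ -> Const s
  | Nil -> raise (Syntax_error "empty application")
  | Symbol v -> Var v
  (* Step 1: pattern-match on the concrete syntax of each form. *)
  | Pair (Symbol "if", Pair (test, Pair (dit, Pair (dif, Nil)))) ->
    (* Steps 3 and 4: recurse on sub-expressions, then build the AST node. *)
    If (tag_parse test, tag_parse dit, tag_parse dif)
  | Pair (Symbol "or", rest) ->
    (match list_of_sexpr rest with
     | [] -> Const (Bool false)
     | [e] -> tag_parse e
     | es -> Or (List.map tag_parse es))
  | Pair (Symbol "define", Pair (Symbol v, Pair (e, Nil))) ->
    Def (Var v, tag_parse e)
  (* Step 2: anything else headed by these keywords is malformed. *)
  | Pair (Symbol ("if" | "define"), _) ->
    raise (Syntax_error "malformed special form")
  | Pair (proc, args) ->
    Applic (tag_parse proc, List.map tag_parse (list_of_sexpr args))
```

Parsing the sexpr for `(* 4 (atan 1))` yields the nested `Applic` structure from the earlier example. A fuller parser would also treat the keywords as reserved words in the `Symbol` case.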

  16. Roadmap
      ▶ Expressions in Scheme
      🗹 The expr datatype
      🗹 The Tag-Parser
      ▶ Macros & special forms
      ▶ Lexical hygiene

  17. Macros & special forms
      What are macros
      ▶ Transformers on source-code
        ▶ They take source code, and rewrite it
        ▶ They operate on the concrete syntax
      ▶ Generally execute at compile time
        ▶ Some work on run-time macro-expansion has been done in the past, but is considered esoteric
      ▶ Used to provide shallow support for syntactic forms
        ▶ Macros are syntactic sugar or notational conveniences
        ▶ They are “expanded away”, and then they are gone
        ▶ Macros are not supported deep within the compiler
        ▶ The semantic analysis or code-generation stages of the compiler pipeline know nothing about macros

  18. Macros & special forms (continued)
      Illustrating “shallow support” vs “deep support”
      ▶ When you think of the word “home”, perhaps you think of:
        ▶ stability, security, safety
        ▶ family, relations
        ▶ mortgage
        ▶ etc.
      ▶ All this is part of the meaning of “home”
      ▶ Meaning in the compiler has to do with the semantic analysis phase of the compiler
      ▶ The meaning of “home” enriches anything you say and do that pertains to your home

  19. Macros & special forms (continued)
      Illustrating “shallow support” vs “deep support” (cont)
      ▶ Suppose home for you is just shorthand for a location
      ▶ You could then translate the word “home” to USC 32.0260699N, 34.7580834E
      ▶ This translation would take place early on, so that
        ▶ “going home” would mean going to a specific coordinate
        ▶ “longing for home” would mean longing to be at a specific coordinate
        ▶ etc.
      ▶ Such sentences would carry none of the meaning, significance, feelings, associated with the word “home”
      ▶ The word “home” would be disconnected from any meaning other than a geographical location
      ▶ This would be insanity!

  20. Macros & special forms (continued)
      Illustrating “shallow support” vs “deep support” (cont)
      Back to the compiler:
      ▶ When you have a notion of “loop” in your language, the compiler can associate with it many things:
        ▶ a code fragment that gets executed over and over
        ▶ termination conditions
        ▶ branch prediction information
        ▶ etc.
      ▶ We can macro-expand a loop into a recursive function with all recursive calls in tail-position
      ▶ All intentions about the code (namely, its loop-like behaviour) are gone

  21. Macros & special forms (continued)
      Illustrating “shallow support” vs “deep support” (cont)
      ▶ Some information can be reconstructed through analysis during the semantic analysis phase
      ▶ Other information is lost
      ▶ For example, we might want our compiler to keep the index variables of the loop in registers, or generate branch-prediction hints
        ▶ These would be simple to do when considering loops
        ▶ These would be difficult, and require a great deal of analysis with functions
      ☞ Keeping around the meaning of syntactic forms would make it easier to translate them efficiently
      ☞ This meaning having been lost in macro-expansion, it is difficult to recover completely

  22. Macros & special forms (continued)
      ▶ We now consider some special forms in Scheme
        ▶ Some of these will be implemented in our compiler
        ▶ Some of these are of theoretical interest only
      ▶ We want to understand what special forms can be dispensed with
      ▶ We don’t want our compiler to be overly inefficient

  23. Scheme, Boolean values, Boolean operators
      ▶ Some languages (such as Java or ocaml) are strict about Booleans:
        ▶ Conjunctions, disjunctions, conditionals, etc. only take expressions that evaluate to Boolean values
        ▶ The distinction is between false and true
        ▶ Not false is exactly true:
          # not false;;
          - : bool = true
          # not true;;
          - : bool = false
          # 4 && 5;;
          Characters 0-1:
            4 && 5;;
            ^
          Error: This expression has type int but an expression was expected of type bool

  24. Scheme, Boolean values, Boolean operators
      ▶ Some languages (such as Java or ocaml) are strict about Booleans:
        ▶ Conjunctions, disjunctions, conditionals, etc. only take expressions that evaluate to Boolean values
        ▶ The distinction is between false and true
        ▶ Not false is exactly true:
          # 4 || 5;;
          Characters 0-1:
            4 || 5;;
            ^
          Error: This expression has type int but an expression was expected of type bool
          # if "moshe" then "then" else "else";;
          Characters 3-10:
            if "moshe" then "then" else "else";;
               ^^^^^^^
          Error: This expression has type string but an expression was expected of type bool

  25. Scheme, Boolean values, Boolean operators
      ▶ Some languages (such as C or Scheme) are lenient about Booleans:
        ▶ Conjunctions, disjunctions, conditionals, etc. can take any expression
        ▶ The distinction is between false and not false
        ▶ Not false is not the same as true:
          > (not #f)
          #t
          > (not #t)
          #f
          > (not 'moshe)
          #f
          > (not (not 'moshe))
          #t

  26. Scheme, Boolean values, Boolean operators
      ▶ Some languages (such as C or Scheme) are lenient about Booleans:
        ▶ Conjunctions, disjunctions, conditionals, etc. can take any expression
        ▶ The distinction is between false and not false
        ▶ Not false is not the same as true:
          > (and 2 3 4)
          4
          > (or 2 3 4)
          2
          > (if '() 'then 'else)
          then
          > (if #f 'then 'else)
          else
          > (if 3 'then 'else)
          then

  27. Macros & special forms (continued): and
      ▶ Conjunctions are easily expanded into nested if-expressions:
        ▶ ⟦(and)⟧ = ⟦#t⟧ (by definition)
        ▶ ⟦(and ⟨expr⟩)⟧ = ⟦⟨expr⟩⟧ (#t is the unit element of and)
        ▶ ⟦(and ⟨expr_1⟩ ⟨expr_2⟩ ··· ⟨expr_n⟩)⟧ = (if ⟦⟨expr_1⟩⟧ ⟦(and ⟨expr_2⟩ ··· ⟨expr_n⟩)⟧ ⟦#f⟧)
      ▶ The assembly code generated for and-expansions is no different from the assembly code that would have been generated had we supported and-expressions as a core syntactic form
      ☞ You should implement this macro-expansion in your tag-parser
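The three clauses map directly onto a three-way match over the list of operands. A minimal sketch, assuming a simplified `sexpr` type; `expand_and` is a hypothetical name, and the operands are assumed to have been collected into an OCaml list already:

```ocaml
(* Simplified S-expression type (assumption). *)
type sexpr =
  | Nil
  | Bool of bool
  | Number of int
  | Symbol of string
  | Pair of sexpr * sexpr

(* Rewrite the operands of (and e1 ... en) into nested if-expressions. *)
let rec expand_and (operands : sexpr list) : sexpr =
  match operands with
  | [] -> Bool true  (* (and) = #t *)
  | [e] -> e         (* (and e) = e *)
  | e :: rest ->
    (* (and e1 e2 ... en) = (if e1 (and e2 ... en) #f) *)
    Pair (Symbol "if",
          Pair (e, Pair (expand_and rest, Pair (Bool false, Nil))))
```

In a tag-parser the result would be passed back through `tag_parse`, so the nested `if` forms become ordinary conditional nodes.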

  28. Macros & special forms (continued): or
      Macro-expanding or-expressions is very different from macro-expanding and-expressions
      ▶ The first two clauses are similar to what we do with and-expressions:
        ▶ ⟦(or)⟧ = ⟦#f⟧ (by definition)
        ▶ ⟦(or ⟨expr⟩)⟧ = ⟦⟨expr⟩⟧ (because #f is the unit of or)
      ▶ For the third clause, you might consider something like:
        ⟦(or ⟨expr_1⟩ ⟨expr_2⟩ ··· ⟨expr_n⟩)⟧ = (if ⟦⟨expr_1⟩⟧ ⟦#t⟧ ⟦(or ⟨expr_2⟩ ··· ⟨expr_n⟩)⟧)
      ▶ This macro-expansion is, of course, incorrect! (think why)

  29. Macros & special forms (continued): or (continued)
      ▶ Take another look at
        ⟦(or ⟨expr_1⟩ ⟨expr_2⟩ ··· ⟨expr_n⟩)⟧ = (if ⟦⟨expr_1⟩⟧ ⟦#t⟧ ⟦(or ⟨expr_2⟩ ··· ⟨expr_n⟩)⟧)
      ▶ Suppose we implemented or-expressions in this way: What would be the value of (or 2 3)?
        😟 It would be #t
        ▶ Scheme returns 2

  30. Macros & special forms (continued): or (continued)
      Let us consider a simpler version of our problem: How to macro-expand (or ⟨expr_1⟩ ⟨expr_2⟩)
      ▶ This is fine, because or-expressions associate!
      ▶ What about this macro-expansion:
        ⟦(or ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (if ⟦⟨expr_1⟩⟧ ⟦⟨expr_1⟩⟧ ⟦⟨expr_2⟩⟧)
      ▶ This macro-expansion is, of course, incorrect! (think why)

  31. Macros & special forms (continued): or (continued)
      ▶ Take another look at
        ⟦(or ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (if ⟦⟨expr_1⟩⟧ ⟦⟨expr_1⟩⟧ ⟦⟨expr_2⟩⟧)
      ▶ Suppose we implemented or-expressions in this way: What would be the output of (or (begin (display "*\n") #t) 'moshe)
        😟 It would print * twice and return #t
        ▶ Scheme prints * once and returns #t
        ▶ We told you side-effects were tricky! 😊
      ☞ How might we make sure to evaluate ⟨expr_1⟩ only once?

  32. Macros & special forms (continued): or (continued)
      Suppose we used a let-expression to store the value of ⟨expr_1⟩:
      ▶ What about the expansion:
        ⟦(or ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟨expr_1⟩)) (if x x ⟨expr_2⟩))
      ▶ This macro-expansion is, of course, incorrect! (think why)

  33. Macros & special forms (continued): or (continued)
      ▶ Take another look at
        ⟦(or ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟨expr_1⟩)) (if x x ⟨expr_2⟩))
      ▶ Suppose we implemented or-expressions in this way: What would be the value of (let ((x 'ha-ha!)) (or #f x))
        😟 It would be #f
        ▶ Scheme returns ha-ha!
      ☞ Why would this expansion evaluate to #f?

  34. Macros & special forms (continued): or (continued)
      Look at how macro-expansion proceeds with our example:
        ⟦(let ((x 'ha-ha!)) (or #f x))⟧ = (let ((x 'ha-ha!)) (let ((x #f)) (if x x x)))
      ▶ Notice how the macro-expansion introduced a new let between the original let and the or-expression
      ▶ This new let introduced a variable binding that just so happens to use the same variable as in the outer let-binding
      ▶ This new variable binding contaminated the code: The value of x was found in the wrong lexical environment!

  35. Macros & special forms (continued): or (continued)
      Look at how macro-expansion proceeds with our example:
        ⟦(let ((x 'ha-ha!)) (or #f x))⟧ = (let ((x 'ha-ha!)) (let ((x #f)) (if x x x)))
      ▶ Notice how the macro-expansion introduced a new let between the original let and the or-expression
      ▶ This is known as a variable-name capture
      ▶ Macro-expansions that result in variable-name captures are said to be unhygienic

  36. Macros & special forms (continued): or (continued)
      A hygienic macro-expansion for or would require that no user-code may see any variables introduced by our macro-expansion
      ▶ This is often impossible to accomplish without resorting to tricks
      ▶ This often requires tricky, circuitous, counter-intuitive expansions that incur great performance penalties
      ▶ This is the case with or
      ▶ This is why we support disjunctions directly as a core form in our compiler
      ☞ You should not macro-expand or-expressions in your compiler!

  37. Macros & special forms (continued): or (continued)
      This is how to macro-expand or-expressions (if someone were to hold a gun to your head and force you):
        ⟦(or ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧) (y (lambda () ⟦⟨expr_2⟩⟧))) (if x x (y)))
                                 = ((lambda (x y) (if x x (y))) ⟦⟨expr_1⟩⟧ (lambda () ⟦⟨expr_2⟩⟧))
      ☞ Notice that ⟨expr_1⟩, ⟨expr_2⟩ cannot access the variables x, y introduced by the expansion!
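To see what this expansion costs, it helps to generate it mechanically and count the lambda-expressions it contains. The sketch below assumes a simplified `sexpr` type; `sexpr_of_list`, `expand_or` and `count_lambdas` are hypothetical names. It extends the binary expansion to n + 1 disjuncts by nesting to the right, relying on the associativity of or.

```ocaml
(* Simplified S-expression type (assumption). *)
type sexpr =
  | Nil
  | Bool of bool
  | Number of int
  | Symbol of string
  | Pair of sexpr * sexpr

(* Build a proper Scheme list from an OCaml list. *)
let rec sexpr_of_list = function
  | [] -> Nil
  | x :: xs -> Pair (x, sexpr_of_list xs)

(* Hygienic expansion: ((lambda (x y) (if x x (y))) e1 (lambda () rest)). *)
let rec expand_or (disjuncts : sexpr list) : sexpr =
  match disjuncts with
  | [] -> Bool false
  | [e] -> e
  | e :: rest ->
    let x = Symbol "x" and y = Symbol "y" in
    let body = sexpr_of_list [Symbol "if"; x; x; sexpr_of_list [y]] in
    let chooser = sexpr_of_list [Symbol "lambda"; sexpr_of_list [x; y]; body] in
    (* User code sits only in argument position, outside the scope of x and y. *)
    let thunk = sexpr_of_list [Symbol "lambda"; Nil; expand_or rest] in
    sexpr_of_list [chooser; e; thunk]

(* Count lists headed by the symbol lambda: one closure per evaluation. *)
let rec count_lambdas = function
  | Pair (Symbol "lambda", rest) -> 1 + count_lambdas rest
  | Pair (a, b) -> count_lambdas a + count_lambdas b
  | _ -> 0
```

For three disjuncts the expansion contains four lambda-expressions, i.e. two per additional disjunct.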

  38. Macros & special forms (continued): or (continued)
      The cost of the hygienic expansion of or-expressions is high:
      ▶ Two lambda-expressions, and hence the creation of two closures for each pair of two expressions
        ▶ This is the same as allocating two objects in an OOPL
      ▶ Two applications
      ☞ For an or-expression that has n + 1 disjuncts, we would need:
        ▶ To create/allocate 2n closures
        ▶ To evaluate 2n applications
      ☞ By contrast, implementing the or-expression as a core form in our compiler requires no allocation of closures and no applications, regardless of the number of disjuncts!

  39. Macros & special forms (continued): begin
      ▶ The general form: (begin ⟨expr_1⟩ ··· ⟨expr_n⟩)
      ▶ Sequences are associative, so we need only consider the binary case: (begin ⟨expr_1⟩ ⟨expr_2⟩)
      ▶ How might we possibly expand it? How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧)) ⟦⟨expr_2⟩⟧)
      ▶ This macro-expansion is, of course, incorrect! (think why)

  40. Macros & special forms (continued): begin (continued)
      ▶ Take another look at
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧)) ⟦⟨expr_2⟩⟧)
      ▶ This expansion introduces the variable x
      ▶ Notice that ⟦⟨expr_2⟩⟧ can access this variable!
      ▶ This means that this expansion is not hygienic!
      ▶ Suppose we implemented begin-expressions in this way: What would be the value of (let ((x 3)) (begin 2 x))
        😟 It would be 2
        ▶ Scheme returns 3

  41. Macros & special forms (continued): begin (continued)
      How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (if ⟦⟨expr_1⟩⟧ ⟦⟨expr_2⟩⟧ ⟦⟨expr_2⟩⟧)
      ▶ We see that first ⟦⟨expr_1⟩⟧ evaluates, and then ⟦⟨expr_2⟩⟧, so the order is correct
      ▶ No variables are introduced, so there’s no issue of lexical hygiene
      ▶ This expansion is actually correct, but bad! (think why)

  42. Macros & special forms (continued): begin (continued)
      ▶ Take another look at
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (if ⟦⟨expr_1⟩⟧ ⟦⟨expr_2⟩⟧ ⟦⟨expr_2⟩⟧)
      ▶ The text of ⟦⟨expr_2⟩⟧ actually appears twice in the expanded form!
      ▶ This means that a begin with n + 1 expressions will expand to an expression of size O(2^n)
      ▶ This is clearly not practical!
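The blow-up is easy to observe by generating the expansion and counting its atoms. A small sketch, assuming a simplified `sexpr` type; `expand_begin_with_if` and `size` are hypothetical names, and sequences are nested to the right:

```ocaml
(* Simplified S-expression type (assumption). *)
type sexpr =
  | Nil
  | Symbol of string
  | Pair of sexpr * sexpr

let rec sexpr_of_list = function
  | [] -> Nil
  | x :: xs -> Pair (x, sexpr_of_list xs)

(* (begin e1 e2 ...) = (if e1 rest rest), where rest expands (begin e2 ...). *)
let rec expand_begin_with_if (exprs : sexpr list) : sexpr =
  match exprs with
  | [] -> invalid_arg "begin needs at least one expression"
  | [e] -> e
  | e :: rest ->
    let tail = expand_begin_with_if rest in
    sexpr_of_list [Symbol "if"; e; tail; tail]  (* tail is duplicated here *)

(* Number of atoms (symbols) in an S-expression. *)
let rec size = function
  | Pair (a, b) -> size a + size b
  | Nil -> 0
  | Symbol _ -> 1
```

With atomic sub-expressions the size obeys S(1) = 1 and S(k+1) = 2·S(k) + 2, so four expressions already yield 22 atoms and five yield 46: every extra expression roughly doubles the output.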

  43. Macros & special forms (continued): begin (continued)
      How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (and (or ⟦⟨expr_1⟩⟧ ⟦#t⟧) ⟦⟨expr_2⟩⟧)
      ▶ We see that (or ⟦⟨expr_1⟩⟧ ⟦#t⟧) evaluates first, and that its value is always #t
      ▶ So the and continues on to evaluate ⟦⟨expr_2⟩⟧
      ▶ The ordering is correct!
      ▶ No variables are introduced so there’s no issue of lexical hygiene
      ▶ No expression is duplicated
      👎 This expansion actually works!

  44. Macros & special forms (continued): begin (continued)
      ▶ How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (or (and ⟦⟨expr_1⟩⟧ ⟦#f⟧) ⟦⟨expr_2⟩⟧)
      ▶ We see that first (and ⟦⟨expr_1⟩⟧ ⟦#f⟧) evaluates, and then ⟦⟨expr_2⟩⟧, so the order is correct
      ▶ No variables are introduced, so there’s no issue of lexical hygiene
      👎 This expansion actually works!

  45. Macros & special forms (continued): begin (continued)
      How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧) (y (lambda () ⟦⟨expr_2⟩⟧))) (y))
      ▶ We introduced two variables x & y:
        ▶ Neither ⟦⟨expr_1⟩⟧ nor ⟦⟨expr_2⟩⟧ can access these variables!
        ▶ The solution is hygienic!
      ▶ ⟦⟨expr_1⟩⟧ evaluates in parallel with the creation of the closure for (lambda () ⟦⟨expr_2⟩⟧)

  46. Macros & special forms (continued): begin (continued)
      How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧) (y (lambda () ⟦⟨expr_2⟩⟧))) (y))
      ▶ Evaluating (lambda () ⟦⟨expr_2⟩⟧) does not evaluate ⟦⟨expr_2⟩⟧
      ▶ ⟦⟨expr_2⟩⟧ is evaluated only when the closure is applied!
      👎 This expansion actually works!

  47. Macros & special forms (continued): begin (continued)
      How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧) (y (lambda () ⟦⟨expr_2⟩⟧))) (y))
      ▶ This expansion is actually more efficient than the previous two (which used and & or):
        ▶ and & or with more than one expression always involve a test and a conditional jump
        ▶ Sequencing does not logically require tests or conditional jumps
        ▶ So this solution is less expensive

  48. Macros & special forms (continued): begin (continued)
      How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧) (y (lambda () ⟦⟨expr_2⟩⟧))) (y))
      ▶ This expansion is actually more efficient than the previous two (which used and & or)
      ▶ It’s still pretty horrible: Two applications, and two closures created for every pair of expressions in a begin:
        ▶ Sequences of n + 1 expressions require 2n applications and the creation of 2n closures, which are then garbage-collected

  49. Macros & special forms (continued): begin (continued)
      How about
        ⟦(begin ⟨expr_1⟩ ⟨expr_2⟩)⟧ = (let ((x ⟦⟨expr_1⟩⟧) (y (lambda () ⟦⟨expr_2⟩⟧))) (y))
      ▶ This is why we support sequences natively within our compiler
      ☞ You should not implement this macro-expansion in your compilers!

  50. Macros & special forms (continued): let
      ▶ The let-expression is a way of defining any number of local variables, and assigning them initial values.
      ▶ Once the local variables have been initialized, they are accessible to an implicit sequence of expressions that are evaluated in their lexical scope.
      ▶ The syntax looks like this:
        (let ((v_1 ⟨Expr_1⟩) ··· (v_n ⟨Expr_n⟩)) ⟨expr_1⟩ ··· ⟨expr_m⟩)

  51. Macros & special forms (continued): let (continued)
      We wish to macro-expand let-expressions:
      ▶ Local variables are parameters of lambda-expressions
      ▶ Expressions that can access local variables come from the bodies of lambda-expressions
      ▶ The parameters of lambda-expressions get their values when lambda-expressions are applied to arguments
      ▶ The values of the arguments are the initial values of the parameters

  52. Macros & special forms (continued): let (continued)
      Putting it all together, we get the following macro-expansion:
        ⟦(let ((v_1 ⟨Expr_1⟩) ··· (v_n ⟨Expr_n⟩)) ⟨expr_1⟩ ··· ⟨expr_m⟩)⟧
          = ⟦((lambda (v_1 ··· v_n) ⟨expr_1⟩ ··· ⟨expr_m⟩) ⟨Expr_1⟩ ··· ⟨Expr_n⟩)⟧
      ▶ The expansion is hygienic (think why)
      ☞ You should implement this macro-expansion in your compiler!
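As a sketch of how this expansion might look in code, assume a simplified `sexpr` type and assume the ribs have already been split into (name, initializer) pairs; `expand_let` and `sexpr_of_list` are hypothetical names:

```ocaml
(* Simplified S-expression type (assumption). *)
type sexpr =
  | Nil
  | Number of int
  | Symbol of string
  | Pair of sexpr * sexpr

let rec sexpr_of_list = function
  | [] -> Nil
  | x :: xs -> Pair (x, sexpr_of_list xs)

(* (let ((v1 E1) ... (vn En)) body...) = ((lambda (v1 ... vn) body...) E1 ... En) *)
let expand_let (bindings : (string * sexpr) list) (body : sexpr list) : sexpr =
  let params = sexpr_of_list (List.map (fun (v, _) -> Symbol v) bindings) in
  let inits = List.map snd bindings in
  (* The body stays an implicit sequence inside the lambda. *)
  let lambda = Pair (Symbol "lambda", Pair (params, sexpr_of_list body)) in
  Pair (lambda, sexpr_of_list inits)
```

The initializers end up as arguments of the application, outside the lambda, so they cannot see the new parameters; this is one way to read the "think why" on hygiene.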

  53. Macros & special forms (continued): let*
      ▶ Recall that the ordering of let-bindings is undefined, and that we may not assume they take place in any particular sequence
        ▶ This follows from the fact that let-expressions expand into applications, and the order in which the arguments of an application are evaluated is unspecified
      ▶ The asterisk in the name of the let*-form is meant to suggest the Kleene-star
      ▶ A let*-expression denotes nested let-expressions. The following equations define the behaviour of the tag-parser on let*-expressions:
        ① This is the first of the two base cases:
          ⟦(let* () ⟨expr_1⟩ ··· ⟨expr_m⟩)⟧ = ⟦(let () ⟨expr_1⟩ ··· ⟨expr_m⟩)⟧

  54. Macros & special forms (continued): let* (continued)
        ② This is the second base case:
          ⟦(let* ((v ⟨Expr⟩)) ⟨expr_1⟩ ··· ⟨expr_m⟩)⟧ = ⟦(let ((v ⟨Expr⟩)) ⟨expr_1⟩ ··· ⟨expr_m⟩)⟧
      ☞ Think why two base cases are needed here!

  55. Macros & special forms (continued): let* (continued)
        ③ This is the inductive case:
          ⟦(let* ((v_1 ⟨Expr_1⟩) (v_2 ⟨Expr_2⟩) ··· (v_n ⟨Expr_n⟩)) ⟨expr_1⟩ ··· ⟨expr_m⟩)⟧
            = ⟦(let ((v_1 ⟨Expr_1⟩)) (let* ((v_2 ⟨Expr_2⟩) ··· (v_n ⟨Expr_n⟩)) ⟨expr_1⟩ ··· ⟨expr_m⟩))⟧
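The two base cases and the inductive case translate into a short recursive function. This is a sketch under the same simplifying assumptions as before (a reduced `sexpr` type, ribs already split into pairs); `make_let` and `expand_let_star` are hypothetical names. Unlike the equations, it unfolds the nested let* eagerly instead of leaving it for a later expansion pass.

```ocaml
(* Simplified S-expression type (assumption). *)
type sexpr =
  | Nil
  | Number of int
  | Symbol of string
  | Pair of sexpr * sexpr

let rec sexpr_of_list = function
  | [] -> Nil
  | x :: xs -> Pair (x, sexpr_of_list xs)

(* Build (let ((v1 E1) ...) body...). *)
let make_let (bindings : (string * sexpr) list) (body : sexpr list) : sexpr =
  let ribs =
    sexpr_of_list (List.map (fun (v, e) -> sexpr_of_list [Symbol v; e]) bindings) in
  Pair (Symbol "let", Pair (ribs, sexpr_of_list body))

let rec expand_let_star (bindings : (string * sexpr) list) (body : sexpr list) : sexpr =
  match bindings with
  | [] | [_] -> make_let bindings body  (* the two base cases *)
  | rib :: rest ->
    (* inductive case: one let per rib, wrapped around the remaining let* *)
    make_let [rib] [expand_let_star rest body]
```

Each rib gets its own let, so the initializer of a later rib is evaluated inside the scope of the earlier variables.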

  56. Macros & special forms (continued): let* (continued)
      The expansion for let*-expressions seems terribly inefficient:
      ▶ A nested let-expression for every rib in the original let*-expression
      ▶ This means, for each rib in the original let*-expression:
        ▶ One more closure created
        ▶ One application performed
        ▶ One closure garbage-collected

  57. Macros & special forms (continued): let* (continued)
      In fact, this is all an illusion because each call is performed in tail position, so:
      ▶ The call is tail-call-optimized and replaced with a branch
      ▶ The allocation/creation of a new closure is avoided through a simple analysis in the semantic analysis phase
      ▶ Rather than allocating new frames, the old frames are being overwritten with the information of the new frames
      ▶ So in fact, this code is optimized into simple assignments
      ☞ The bottom line: The macro-expansion is efficient when compiled by a reasonably clever, optimizing compiler
      ☞ You should implement this macro-expansion in your compiler!

  58. Macros & special forms (continued): letrec
      Let’s re-examine the special form let:
      ▶ We can use let to define local variables
      ▶ The value of these variables can be anything really, including functions
      ▶ Here’s how one might define local procedures using let:
          (let ((square
                  (lambda (x)
                    (* x x))))
            ;; here we can use the procedure square
            (sqrt
              (+ (square a)
                 (square b)
                 (* -2 a b (cos theta)))))

  59. Macros & special forms (continued): letrec (continued)
      Nevertheless, let has one shortcoming when it comes to defining local procedures: Recursive procedures cannot be defined “as is”
      ▶ If we were to try to define and use the factorial function using let, it might look like:
          (let ((fact
                  (lambda (n)
                    (if (zero? n)
                        1
                        (* n (fact (- n 1)))))))
            (fact 5))

  60. Macros & special forms (continued): letrec (continued)
      Which expands to
          ((lambda (fact) (fact 5))
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (- n 1))))))
      ▶ Notice the body of fact is not able to access the parameter fact of the procedure (lambda (fact) (fact 5))
      ▶ The parameter fact is only accessible in the body of this procedure
      ▶ The reference to fact within the text of the body of the factorial procedure is free, and refers to a global variable defined at the top-level.

  61. Macros & special forms (continued): letrec (continued)
      To see things more clearly, we macro-expand the let. Note the parameter fact and whence it can be accessed:
          ((lambda (fact) (fact 5))
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (- n 1))))))
      😟 This just looks like an example of recursion, but in fact it isn’t!

  62. Macros & special forms (continued): letrec (continued)
      An expansion that does work would be something like:
          (let ((fact 'whatever))
            (set! fact
              (lambda (n)
                (if (zero? n)
                    1
                    (* n (fact (- n 1))))))
            (fact 5))
      ▶ Can you see why it works?

  63. Macros & special forms (continued): letrec (continued)
      Let’s expand the let-expression and see why this expansion works:
          ((lambda (fact)
             (set! fact
               (lambda (n)
                 (if (zero? n)
                     1
                     (* n (fact (- n 1))))))
             (fact 5))
           'whatever)
      ☞ The text of the body of the factorial procedure appears within the body of the (lambda (fact) ...) procedure, which is why it may access the fact: This is recursion!

  64. Macros & special forms (continued): letrec (continued)
      The general macro-expansion implied by the last example is presented below:
        ⟦(letrec ((f_1 ⟨Expr_1⟩)
                  (f_2 ⟨Expr_2⟩)
                  ···
                  (f_n ⟨Expr_n⟩))
           ⟨expr_1⟩ ··· ⟨expr_m⟩)⟧
          = (let ((f_1 'whatever)
                  (f_2 'whatever)
                  ···
                  (f_n 'whatever))
              (set! f_1 ⟨Expr_1⟩)
              (set! f_2 ⟨Expr_2⟩)
              ···
              (set! f_n ⟨Expr_n⟩)
              ⟨expr_1⟩ ··· ⟨expr_m⟩)

  65. Macros & special forms (continued): letrec (continued)
      This expansion is almost right:
      ▶ The main problem with the expansion has to do with nested define-expressions
      ▶ It is possible to use define to define a local function within a let or lambda form
        ▶ Several of such definitions may appear at the top of the body
        ▶ Several of such definitions may be grouped together within a begin-expression
        ▶ All the nested define-expressions must appear at the top of the body even if they are grouped in different begin-expressions
        ▶ It is a syntax error to have a nested define after a non-define-expression in the body of a lambda or let

  66. Macros & special forms (continued): letrec (continued)
This means that such definitions are possible:
    (define f
      (lambda (a b c)
        (define g1 (lambda () ...))
        (begin
          (define g2 (lambda (x) ...))
          (begin
            (define g3
              (lambda (x y) ...))
            (define g4
              (lambda (z) ...))))
        ...))
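
A runnable variant of this shape is sketched below. The bodies of g1 through g4 and the final list expression are made up for illustration; only the nesting of define and begin follows the slide.

```scheme
(define f
  (lambda (a b c)
    ;; All nested defines sit at the top of the body,
    ;; some of them grouped inside begin-expressions.
    (define g1 (lambda () a))
    (begin
      (define g2 (lambda (x) (+ x b)))
      (begin
        (define g3 (lambda (x y) (* x y)))
        (define g4 (lambda (z) (- z c)))))
    ;; The first non-define expression ends the definitions.
    (list (g1) (g2 1) (g3 2 3) (g4 10))))
```

With these bodies, (f 1 2 3) returns the list (1 3 6 7).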

  67. Macros & special forms (continued): letrec (continued)
And now we have a problem:
💤 If nested define-expressions appear within the body of a letrec-expression, and we macro-expand the letrec-expression into a let-expression with assignments at the top of its body, then the nested define-expressions will appear after the set!-expressions, and this would be syntactically illegal!
▶ In fact, we shall not support nested define-expressions in our compiler
☞ You should implement this macro-expansion in your compilers!
▶ We still need to find a macro-expansion for letrec that can live with nested define-expressions…

  68. Macros & special forms (continued): letrec (continued)
This macro-expansion does the trick:
    ⟦(letrec ((f₁ ⟨Expr₁⟩)
              (f₂ ⟨Expr₂⟩)
              ···
              (fₙ ⟨Exprₙ⟩))
       ⟨expr₁⟩ ··· ⟨exprₘ⟩)⟧
    =
    ⟦(let ((f₁ 'whatever)
           (f₂ 'whatever)
           ···
           (fₙ 'whatever))
       (set! f₁ ⟨Expr₁⟩)
       (set! f₂ ⟨Expr₂⟩)
       ···
       (set! fₙ ⟨Exprₙ⟩)
       (let ()
         ⟨expr₁⟩ ··· ⟨exprₘ⟩))⟧

  69. Macros & special forms (continued): letrec (continued)
Why does this macro-expansion work?
▶ Notice that the body of the original letrec-expression is now wrapped within a let with no bindings, (let () ...), and any nested define-expressions will appear at the top of that let, which is perfectly acceptable
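
A small sketch of the corrected expansion, applied by hand to a letrec whose body begins with a nested define. The names count-down, start, and result are hypothetical.

```scheme
;; Hand expansion of:
;;   (letrec ((count-down (lambda (n acc) ...)))
;;     (define start 3)
;;     (count-down start '()))
(define result
  (let ((count-down 'whatever))
    (set! count-down
      (lambda (n acc)
        (if (zero? n)
            (reverse acc)
            (count-down (- n 1) (cons n acc)))))
    ;; The original body is wrapped in (let () ...), so the nested
    ;; define is at the top of a body rather than after a set!.
    (let ()
      (define start 3)
      (count-down start '()))))
```

Here result is the list (3 2 1). Without the inner (let () ...), the define would follow the set!-expression in the same body, which is the illegal arrangement described earlier.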

  70. Macros & special forms (continued): letrec (continued)
There is something fundamentally unsavory about the last two macro-expansions for letrec:
▶ The letrec form has to do with defining locally-recursive procedures
▶ Recursion forms a cornerstone for functional programming
▶ In both cases, the macro-expanded code contains assignments, which are side-effects
▶ Side-effects are specifically excluded in pure functional programming
☞ It seems as if there is something about recursion that requires side-effects, and this raises doubts about the entire functional programming agenda: Is functional programming not powerful enough to express one of its most basic ideas?

  71. Macros & special forms (continued): letrec (continued)
▶ The short answer is that yes, functional programming can express the idea of recursion in a way that is natural and native to functional programming, without any side-effects
▶ To understand this answer, we will need to
  ▶ Study some fixed-point theory
  ▶ Think harder about what recursion really means
▶ The full answer to this question is one of the most beautiful and exciting topics in the foundations of computer science and in programming languages theory
▶ For the time being, we move on to further topics that are necessary for you to work on your compiler projects
▶ We shall return to the topic in several weeks… Stay tuned!
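
As a small preview only, and not the fixed-point treatment the course defers to later, here is one well-known way to write factorial with no assignment at all: the procedure receives itself as an argument. The names self and fact-without-set! are hypothetical.

```scheme
;; No set! and no letrec: the inner procedure is handed a copy of
;; its own generator (self) and re-applies it for the recursive call.
(define fact-without-set!
  ((lambda (f) (f f))
   (lambda (self)
     (lambda (n)
       (if (zero? n)
           1
           (* n ((self self) (- n 1))))))))
```

With this definition, (fact-without-set! 5) evaluates to 120. The top-level define only names the result; the recursion itself uses nothing but lambda and application.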

  72. Macros & special forms (continued): cond
The cond form has the general form:
    (cond ⟨rib₁⟩ ··· ⟨ribₙ⟩)
There are 3 kinds of cond-ribs:
① The common form (⟨expr⟩ ⟨expr₁⟩ ··· ⟨exprₙ⟩), where ⟨expr⟩ is the test-expression: It is evaluated, and if not false, the rib is satisfied, all subsequent ribs are ignored, the corresponding implicit sequence is evaluated, and the value of its final expression is returned.

  73. Macros & special forms (continued): cond (continued)
② The arrow form (⟨expr⟩ => ⟨expr_f⟩), where ⟨expr⟩ is evaluated: If non-false, the rib is satisfied, and the return value is the application of ⟨expr_f⟩ to the value of ⟨expr⟩.

  74. Macros & special forms (continued): cond (continued)
③ The else-rib has the form (else ⟨expr₁⟩ ··· ⟨exprₙ⟩). It is satisfied immediately, and all subsequent ribs are ignored. The implicit sequence is evaluated, and the value of its final expression is returned.
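
The three kinds of ribs can be seen together in a short sketch. The procedure classify and its association-list argument are hypothetical and exist only for this example.

```scheme
(define (classify x table)
  (cond
    ;; Arrow rib: assv returns the matching pair or #f; when it is
    ;; not #f, cdr is applied to that pair.
    ((assv x table) => cdr)
    ;; Common rib: an implicit sequence; the value of the last
    ;; expression is returned.
    ((number? x) 'ignored 'number)
    ;; else-rib: satisfied immediately.
    (else 'unknown)))
```

For instance, (classify 2 '((1 . one) (2 . two))) returns two, (classify 7 '()) returns number, and (classify 'a '()) returns unknown.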

  75. Macros & special forms (continued): cond (continued)
The cond form macro-expands into nested if-expressions:
① The general form of the rib converts into an if-expression with a condition and an explicit sequence for the then-clause. The else-clause of the if-expression continues the expansion of the cond.

  76. Macros & special forms (continued): cond (continued)
② The arrow-form of the rib converts into a let that captures the value of the test, and if not false, passes it on to the function. For test-expression ⟨expr⟩ and function-expression ⟨expr_f⟩, the following expansion would do:
    (let ((value ⟦⟨expr⟩⟧)
          (f (lambda () ⟦⟨expr_f⟩⟧))
          (rest (lambda () ⟦⟨continue with cond-ribs⟩⟧)))
      (if value
          ((f) value)
          (rest)))

  77. Macros & special forms (continued): cond (continued)
③ The else-form of the rib converts into a begin-expression, and subsequent ribs are ignored

  78. An example of expanding cond
cond form:
    (cond ((zero? n) (f x) (g y))
          ((h? x) => (p q))
          (else (h x y) (g x))
          ((q? y) (p x) (q y)))
Expanded form:
    (if (zero? n)
        (begin (f x) (g y))
        (let ((value (h? x))
              (f (lambda () (p q)))
              (rest (lambda () (begin (h x y) (g x)))))
          (if value
              ((f) value)
              (rest))))
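
To check that the two forms agree, the free names in the example need definitions. The stubs below (f, g, h, h?, p, q, q?) and the wrappers run-cond and run-expanded are all hypothetical, invented only so the example can run.

```scheme
;; Hypothetical stubs for the free names in the slide's example.
(define (f a) (list 'f a))
(define (g a) (list 'g a))
(define (h a b) (list 'h a b))
(define (h? a) (and (symbol? a) a))    ; returns its argument or #f
(define (q a) (list 'q a))
(define (q? a) #t)
(define (p k) (lambda (v) (list 'p v))) ; (p q) yields a one-argument procedure

(define (run-cond n x y)
  (cond ((zero? n) (f x) (g y))
        ((h? x) => (p q))
        (else (h x y) (g x))
        ((q? y) (p x) (q y))))          ; never reached: it follows else

(define (run-expanded n x y)
  (if (zero? n)
      (begin (f x) (g y))
      ;; The let's local f shadows the stub f only inside the let body;
      ;; the thunks are created in the outer scope.
      (let ((value (h? x))
            (f (lambda () (p q)))
            (rest (lambda () (begin (h x y) (g x)))))
        (if value
            ((f) value)
            (rest)))))
```

With these stubs, both procedures return (g b) for arguments 0 a b, (p a) for 1 a b, and (g 5) for 1 5 b.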

  79. Macros & special forms (continued): The expansion of quasiquote-expressions
▶ Quasiquote-expressions are expanded twice:
  ▶ Once in the reader, when the forms `⟨sexpr⟩, ,⟨sexpr⟩, ,@⟨sexpr⟩, and in fact '⟨sexpr⟩ too, are converted to their list forms: (quasiquote ⟨sexpr⟩), (unquote ⟨sexpr⟩), (unquote-splicing ⟨sexpr⟩), and (quote ⟨sexpr⟩), respectively
  ▶ And a second time when quasiquote-expressions are expanded away in the tag-parser
☞ This is what we're focusing on here!
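
The reader-level conversion can be observed by quoting the abbreviated forms and inspecting the resulting lists. This is a minimal sketch; the names quasi-form and list-form are hypothetical.

```scheme
;; Quoting stops evaluation, so we see what the reader produced.
(define quasi-form '`(a ,b ,@c))
(define list-form
  '(quasiquote (a (unquote b) (unquote-splicing c))))

;; The abbreviations and the list forms are the same datum.
(define same-datum? (equal? quasi-form list-form))

;; The quote abbreviation is converted in the same way.
(define quote-head (car ''x))
```

Here same-datum? is #t and quote-head is the symbol quote.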

  80. Macros & special forms (continued): The expansion of quasiquote-expressions (cont)
▶ Since R6RS, quasiquote-expressions can be nested, which means we can quasiquote quasiquoted expressions… This is complex, and not terribly useful, so we're not going to support it: We assume ordinary quasiquote-expressions not to include quasiquote-expressions
▶ We assume we have already received the form (quasiquote ⟨sexpr⟩), and are now going to reason about ⟨sexpr⟩

  81. Macros & special forms (continued): The expansion of quasiquote-expressions (cont)
① Upon receiving the expression (unquote ⟨sexpr⟩), we return ⟨sexpr⟩
② Upon receiving the expression (unquote-splicing ⟨sexpr⟩), we generate an error message, and quit
③ Given either the empty list or a symbol, we wrap (quote ···) around it
④ Given a vector, we convert it to a list, map the quasiquote-expander over the elements of the list, and apply the procedure vector to the elements of the resulting list
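
The four rules above can be sketched as a procedure over S-expressions. This is only an illustration written in Scheme, not the course's tag-parser. The name expand-qq is hypothetical, and two choices are assumptions not stated in the rules above: pairs other than the unquote forms are rejected because no rule for them has been given yet, and all remaining atoms (numbers, strings, and so on) are returned unchanged as self-evaluating. The calls to error follow the R7RS argument order (message, then irritants).

```scheme
(define (expand-qq sexpr)
  (cond
    ;; Rule 1: (unquote <sexpr>) expands to <sexpr> itself.
    ((and (pair? sexpr) (eq? (car sexpr) 'unquote))
     (cadr sexpr))
    ;; Rule 2: a bare unquote-splicing is an error.
    ((and (pair? sexpr) (eq? (car sexpr) 'unquote-splicing))
     (error "unquote-splicing is not allowed here" sexpr))
    ;; Rule 3: the empty list and symbols get wrapped in quote.
    ((or (null? sexpr) (symbol? sexpr))
     (list 'quote sexpr))
    ;; Rule 4: expand each element, then build a call to vector.
    ((vector? sexpr)
     (cons 'vector (map expand-qq (vector->list sexpr))))
    ;; Assumption: other pairs are outside the four rules above.
    ((pair? sexpr)
     (error "pairs are not covered by this sketch" sexpr))
    ;; Assumption: remaining atoms are self-evaluating.
    (else sexpr)))
```

For example, (expand-qq 'a) returns (quote a), and expanding a vector that contains the elements a, (unquote b), and 1 returns the list (vector (quote a) b 1).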
