Closure Conversion
Michel Schinz
Advanced Compiler Construction, 2009-03-20

Higher-order functions

A higher-order function (HOF) is a function that either:
• takes another function as argument, or
• returns a function.
Many languages offer higher-order functions, but not all provide the same power...

HOFs in C

In C, it is possible to pass a function as an argument, and to return a function as a result.
However, C functions cannot be nested: they must all appear at the top level. This severely restricts their usefulness, but greatly simplifies their implementation: they can be represented as simple code pointers.

HOFs in functional languages

In functional languages (Scala, Scheme, OCaml, etc.), functions can be nested, and they can survive the scope that defined them.
This is very powerful as it permits the definition of functions that return “new” functions, e.g. functional composition.
However, as we will see, it also complicates the representation of functions, as simple code pointers are no longer sufficient.

HOFs example

To illustrate the issues related to the representation of functions in a functional language, we will use the following Scheme example:

    (define make-adder
      (lambda (x)
        (lambda (y) (+ x y))))
    (define increment (make-adder 1))
    (increment 41) ⇒ 42
    (define decrement (make-adder -1))
    (decrement 42) ⇒ 41
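As a small illustration of functions that return “new” functions, functional composition can be sketched in Scheme as follows. The names compose and increment-then-double are chosen here for illustration only; they do not come from the slides.

```scheme
;; compose returns a new function that applies g first, then f.
;; (Hypothetical helper, not part of the original slides.)
(define compose
  (lambda (f g)
    (lambda (x) (f (g x)))))

;; make-adder as defined in the slides.
(define make-adder
  (lambda (x)
    (lambda (y) (+ x y))))

;; A function built entirely from other functions.
(define increment-then-double
  (compose (lambda (n) (* 2 n)) (make-adder 1)))

(increment-then-double 20)   ; (20 + 1) * 2 = 42
```

Both compose and make-adder return functions that refer to variables of the call that created them (f and g, or x), which is exactly what a plain code pointer cannot capture.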
Representing adder functions

To represent the functions returned by make-adder, we basically have two choices:
• Keep the code pointer representation for functions. However, that implies run-time code generation, as each function returned by make-adder is different!
• Find another representation for functions, which does not depend on run-time code generation.

Closures

To adequately represent the functions returned by make-adder, their code pointer must be augmented with the value of x.
Such a combination of a code pointer and an environment giving the values of the free variable(s), here x, is called a closure.
The name refers to the fact that the pair composed of the code pointer and the environment is self-contained.
The code of a closure must be evaluated in its environment, so that x is “known”.

[Figure: the closures for (make-adder 1) and (make-adder -1). Each consists of a code pointer and an environment. Both code pointers refer to the same shared compiled code for (lambda (y) (+ x y)); the environments hold x → 1 and x → -1 respectively.]

Introducing closures

Using closures instead of function pointers to represent functions changes the way they are manipulated at run time:
• function abstraction builds and returns a closure instead of a simple code pointer,
• function application extracts the code pointer from the closure, and invokes it with the environment as an additional argument.

Representing closures

During function application, nothing is known about the closure being called: it can be any closure in the program. The code pointer must therefore be at a known and constant location so that it can be extracted.
The values contained in the environment, however, are not used during application itself: they will only be accessed by the function body. This provides some freedom to place them.
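The closure representation described above can be mimicked directly in Scheme. The following is a minimal sketch, assuming a closure is a pair of a code pointer and an environment, with the environment stored as an association list. The helpers make-closure, closure-code, closure-env, apply-closure and adder-code are hypothetical names introduced for this illustration.

```scheme
;; Assumption: a closure is the pair (code . environment).
(define (make-closure code env) (cons code env))
(define (closure-code c) (car c))
(define (closure-env c) (cdr c))

;; Shared code for (lambda (y) (+ x y)): it is closed, because x is
;; looked up in the environment received as an extra argument.
(define adder-code
  (lambda (env y)
    (+ (cdr (assq 'x env)) y)))

;; Two closures sharing the same code, with different environments.
(define increment (make-closure adder-code (list (cons 'x 1))))
(define decrement (make-closure adder-code (list (cons 'x -1))))

;; Application: extract the code pointer, which sits at a known location
;; (the car), and invoke it with the environment as additional argument.
(define (apply-closure c . args)
  (apply (closure-code c) (closure-env c) args))

(apply-closure increment 41)   ; evaluates to 42
(apply-closure decrement 42)   ; evaluates to 41
```

Only the position of the code pointer matters to apply-closure. How the environment is laid out is a private matter between the code that builds the closure and the function body that reads it.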
Flat closures

In flat (or one-block) closures, the environment is “inlined” into the closure itself, instead of being referred from it. The closure plays the role of the environment.

[Figure: the flat closure for (make-adder 1), a single block holding the code pointer followed by x → 1.]

Compiling closures

Closure conversion

In a compiler, closures can be implemented by a simplification phase, called closure conversion.
Closure conversion transforms a program in which functions can have free variables into an equivalent one containing only closed functions.
The output of closure conversion is therefore a program in which functions can be represented as code pointers!

Free variables

The free variables of a function are the variables that are used but not defined in that function, i.e. they are defined in some enclosing scope.
Global variables are never considered free, since they are available everywhere.

Free variables example

Our adder example contains two functions, corresponding to the two occurrences of the lambda keyword:

    (define make-adder
      (lambda (x)
        (lambda (y) (+ x y))))

The outer one does not have any free variable: it is a closed function, like all top-level functions. The inner one has a single free variable: x.

Closing functions

Functions are closed by adding a parameter representing the environment, and using it in the function’s body to access free variables.
Function abstraction and application must of course be adapted accordingly:
• abstraction must create and initialize the closure and its environment,
• application must extract the environment and pass it as an additional parameter.
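A flat closure and a closed function can be sketched together in a few lines of Scheme. This is an illustration under stated assumptions: the closure is a vector whose slot 0 holds the code pointer and whose slot 1 holds x, the names adder-code and call-closure are hypothetical, and make-adder itself is left as an ordinary Scheme procedure for brevity.

```scheme
;; Closed version of (lambda (y) (+ x y)): the extra first parameter is
;; the flat closure itself, which doubles as the environment.
(define adder-code
  (lambda (closure y)
    (+ (vector-ref closure 1) y)))

;; Abstraction: create and initialize the flat closure #(code-pointer x).
(define (make-adder x)
  (vector adder-code x))

;; Application: extract the code pointer from slot 0 and pass the
;; closure along as the additional parameter.
(define (call-closure closure arg)
  ((vector-ref closure 0) closure arg))

(define increment (make-adder 1))
(define decrement (make-adder -1))

(call-closure increment 41)   ; evaluates to 42
(call-closure decrement 42)   ; evaluates to 41
```

Compared with a separate environment block, the flat layout saves one allocation and one indirection per closure, at the cost of copying the free variables into every closure that needs them.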
Closing example

The original definition:

    (define make-adder
      (lambda (x)
        (lambda (y) (+ x y))))

becomes, after closure conversion:

    (define make-adder
      (vector (lambda (env1 x)                 ; closure for make-adder
                (vector (lambda (env2 y)       ; closure for anonymous adder
                          (+ (vector-ref env2 1) y))
                        x))))

Recursive closures

Recursive functions need access to their own closure. For example:

    (letrec ((f (lambda (l) … (map f l) …)))   ; letrec: recursive let
      …)

Several techniques can be used to give a closure access to itself:
1. the closure, here f, can be treated as a free variable, and put in its own environment, leading to a cyclic closure,
2. the closure can be rebuilt from scratch,
3. with flat closures, the environment is the closure, and can be reused directly.

Mutually-recursive closures

Mutually-recursive functions all need access to the closures of all the functions in the definition.
For example, in the following program, f needs access to the closure of g, and the other way around:

    (letrec ((f (lambda (l) … (compose f g) …))
             (g (lambda (l) … (compose g f) …)))
      …)

Solutions:
1. use cyclic closures, or
2. share a single closure with interior pointers (note: interior pointers make the job of the GC harder).

[Figure: cyclic closures versus shared closures. With cyclic closures, the closure for f holds code ptr. f, v1, v2 and v3, the closure for g holds code ptr. g, w1 and w2, and each refers to the other. With shared closures, a single block holds code ptr. f, code ptr. g, v1, v2, v3, w1 and w2; the closures for f and g are pointers into that block.]

Closure conversion for core minischeme

Core minischeme

Core minischeme is the version of minischeme that the compiler handles. It is as expressive as the full minischeme language, but more regular, in that:
• let forms can only bind one variable: minischeme lets with more than one binding are converted to nested lets in core minischeme,
• let and lambda forms have a single expression as body: minischeme let and lambda bodies with more than one expression are wrapped in a begin expression.
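To make the third technique for recursive closures concrete, here is a minimal sketch of a recursive function with one free variable, closed by hand using a flat closure. The example function (make-scaler, which multiplies every element of a list by k) and the name scale-code are hypothetical and chosen only for illustration. The source program is assumed to be (letrec ((f (lambda (l) (if (null? l) '() (cons (* k (car l)) (f (cdr l))))))) f).

```scheme
;; Closed code of f. The first parameter is the flat closure of f itself:
;; slot 0 holds the code pointer and slot 1 holds the free variable k.
(define scale-code
  (lambda (self l)
    (if (null? l)
        '()
        (cons (* (vector-ref self 1) (car l))
              ;; Recursive call: the closure received as environment is
              ;; reused directly, nothing has to be rebuilt or stored.
              ((vector-ref self 0) self (cdr l))))))

;; Abstraction: build the flat closure #(code-pointer k).
(define (make-scaler k)
  (vector scale-code k))

(define triple (make-scaler 3))
((vector-ref triple 0) triple '(1 2 3))   ; evaluates to (3 6 9)
```

With a separate environment block, the same program would need either a slot in the environment pointing back to the closure (technique 1) or a fresh closure allocated at each recursive call (technique 2).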
Minischeme closure conversion

As we have seen, closure conversion consists in closing functions by passing them an environment containing the values of their free variables.
We will specify the closing of core minischeme functions as a function ⟦·⟧ mapping potentially-open terms to closed ones.
For that, we first need to define a function F mapping a term to the set of its free variables.
Note: to simplify presentation, we assume in the following slides that all variables in a program have a unique name.

Minischeme free variables

    F[(lambda (v_1 … v_n) body)] = F[body] \ { v_1, …, v_n }
    F[(begin e_1 … e_n)] = F[e_1] ∪ … ∪ F[e_n]
    F[(if e_1 e_2 e_3)] = F[e_1] ∪ F[e_2] ∪ F[e_3]
    F[(and e_1 e_2)] = F[e_1] ∪ F[e_2]
    F[(or e_1 e_2)] = F[e_1] ∪ F[e_2]
    F[(e_1 … e_n)] = F[e_1] ∪ … ∪ F[e_n]
    F[v] when v is local = { v }
    F[v] when v is global or a primitive = ∅

Note: since a let form is equivalent to the application of an anonymous function, it is easy to deduce the rule to compute its free variables from the rules above. This is left as an exercise.
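The rules for F translate almost line by line into Scheme. The sketch below makes several assumptions that are not in the slides: terms are represented as quoted S-expressions, sets are lists without duplicates, and the caller supplies the list of global and primitive names in the hypothetical parameter globals. The helper names union, difference, free-vars and free-vars-all are likewise introduced here. The let rule is deliberately omitted, since the slides leave it as an exercise.

```scheme
;; Set union and difference on lists of symbols (no duplicates).
(define (union a b)
  (cond ((null? a) b)
        ((memq (car a) b) (union (cdr a) b))
        (else (cons (car a) (union (cdr a) b)))))

(define (difference a b)
  (cond ((null? a) '())
        ((memq (car a) b) (difference (cdr a) b))
        (else (cons (car a) (difference (cdr a) b)))))

;; F[e_1] U ... U F[e_n] for a list of terms.
(define (free-vars-all terms globals)
  (if (null? terms)
      '()
      (union (free-vars (car terms) globals)
             (free-vars-all (cdr terms) globals))))

;; F[term]; globals lists every global and primitive name (assumption).
(define (free-vars term globals)
  (cond ((symbol? term)                        ; F[v]
         (if (memq term globals) '() (list term)))
        ((not (pair? term)) '())               ; numbers and other literals
        ((eq? (car term) 'lambda)              ; F[body] \ {v_1, ..., v_n}
         (difference (free-vars (caddr term) globals) (cadr term)))
        ((memq (car term) '(begin if and or))  ; union over the sub-terms
         (free-vars-all (cdr term) globals))
        (else (free-vars-all term globals))))  ; application

(free-vars '(lambda (y) (+ x y)) '(+))               ; evaluates to (x)
(free-vars '(lambda (x) (lambda (y) (+ x y))) '(+))  ; evaluates to ()
```

The two calls reproduce the free variables example: the inner lambda of make-adder has the single free variable x, and the outer one is closed.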
Closing minischeme functions

Closing core minischeme constructs that do not deal with functions or variables is trivial:

    ⟦(define name value)⟧ = (define name ⟦value⟧)
    ⟦(let ((v e)) body)⟧ = (let ((v ⟦e⟧)) ⟦body⟧)
    ⟦(begin e_1 … e_n)⟧ = (begin ⟦e_1⟧ … ⟦e_n⟧)
    ⟦(if e_1 e_2 e_3)⟧ = (if ⟦e_1⟧ ⟦e_2⟧ ⟦e_3⟧)
    ⟦(and e_1 e_2)⟧ = (and ⟦e_1⟧ ⟦e_2⟧)
    ⟦(or e_1 e_2)⟧ = (or ⟦e_1⟧ ⟦e_2⟧)
    ⟦x⟧ where x is a number or identifier = x

Abstraction is closed by creating and returning the closure, represented as a vector:

    ⟦(lambda (v_1 … v_n) body)⟧ =
      (vector (lambda (env v_1 … v_n)
                (let ((w_1 (vector-ref env 1))
                      …
                      (w_n (vector-ref env n)))
                  ⟦body⟧[Fv_1 → w_1]…[Fv_n → w_n]))
              Fv_1 Fv_2 …)

where
• t[x → y] denotes the term t where the variable x is substituted by the variable y,
• Fv is an ordering of F[(lambda (v_1 … v_n) body)] and Fv_i is its i-th component,
• the variables underlined on the slide (presumably env and w_1 … w_n, the names introduced by the translation) are fresh.

Finally, application extracts the code pointer from the closure, and invokes it with the closure itself as the first argument, followed by the other arguments:

    ⟦(e_1 e_2 … e_n)⟧ when e_1 is not a primitive =
      (let ((closure ⟦e_1⟧))
        ((vector-ref closure 0) closure ⟦e_2⟧ … ⟦e_n⟧))

    ⟦(e_1 e_2 … e_n)⟧ when e_1 is a primitive =
      (e_1 ⟦e_2⟧ … ⟦e_n⟧)

Closures and objects
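Before turning to objects, the closing rules for abstraction and application can be checked by applying them by hand to the adder example. The result below is a sketch of what the translation would produce, with two simplifications that are assumptions of this illustration: the fresh variables are simply named env, w1 and closure, and the empty let that the abstraction rule would generate for the outer lambda (which has no free variables) is omitted.

```scheme
;; Translation of (define make-adder (lambda (x) (lambda (y) (+ x y))))
(define make-adder
  (vector (lambda (env x)                    ; outer lambda: no free variables
            (vector (lambda (env y)          ; inner lambda: Fv = (x)
                      (let ((w1 (vector-ref env 1)))
                        (+ w1 y)))           ; body with x replaced by w1
                    x))))                    ; slot 1 of the closure holds x

;; Translation of (define increment (make-adder 1)):
;; make-adder is not a primitive, so the application rule extracts the code.
(define increment
  (let ((closure make-adder))
    ((vector-ref closure 0) closure 1)))

;; Translation of (increment 41)
(let ((closure increment))
  ((vector-ref closure 0) closure 41))       ; evaluates to 42
```

Since + is a primitive, the call (+ w1 y) is left as a direct call, as required by the second application rule. Every lambda in the output is closed, so each one could be compiled to a plain code pointer.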