Intermediate forms: A-Normal Form Matt Might University of Utah www.ucombinator.org matt.might.net
Terminology • Intermediate language • Intermediate representation • Intermediate form
Intermediate form An intermediate language is a target language that sits above the output language.
Purpose Scheme Java C++ IR x86 ARM MIPS
Nano-pass compilation 60+ intermediate forms!
History • Register-transfer languages (1950s) • Continuation-passing style (1970s) • Static single assignment (1980s) • A-normal form (1990s) • Hybridization? (2010s)
A pathway Java, C#, Scheme ANF CPS SSA RTL RTL ASM
Lesser-known • Store-passing style • Monadic languages
Standard IRs • CIL • LLVM • GIMPLE • C-- • JIMPLE/SHIMPLE
A-Normal Form All arguments to functions are atomic. Flanagan et al., “The Essence of Compiling with Continuations.”
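To make the rule "all arguments to functions are atomic" concrete, here is a minimal Python sketch of a checker for A-normal form. The tuple encoding of expressions and the names `is_atomic`, `is_call`, and `is_anf` are assumptions made for this illustration; they do not come from the slides or from Flanagan et al.

```python
# Hypothetical encoding (an assumption for this sketch):
#   constants are ints, variables are strings,
#   ("lambda", params, body), ("let", (x, rhs), body), ("if0", c, then, else),
#   and any other tuple (f, a1, ..., an) is a call or primitive operation.

def is_atomic(e):
    """Constants, variables, and lambdas (with ANF bodies) are atomic."""
    if isinstance(e, (int, str)):
        return True
    return (isinstance(e, tuple) and len(e) == 3
            and e[0] == "lambda" and is_anf(e[2]))

def is_call(e):
    """A call or prim-op whose operator and arguments are all atomic."""
    return (isinstance(e, tuple) and len(e) > 0
            and e[0] not in ("lambda", "let", "if0")
            and all(is_atomic(part) for part in e))

def is_anf(e):
    if is_atomic(e):
        return True
    if not isinstance(e, tuple) or not e:
        return False
    if e[0] == "let":
        _, (_x, rhs), body = e
        # The bound expression is a value or a single call on atoms.
        return (is_atomic(rhs) or is_call(rhs)) and is_anf(body)
    if e[0] == "if0":
        _, cond, then_branch, else_branch = e
        return is_atomic(cond) and is_anf(then_branch) and is_anf(else_branch)
    return is_call(e)

# (+ (+ 2 2) (let (x 1) (f x))): arguments are not atomic.
nested = ("+", ("+", 2, 2), ("let", ("x", 1), ("f", "x")))

# The same computation with every intermediate result named.
flat = ("let", ("t1", ("+", 2, 2)),
        ("let", ("x", 1),
         ("let", ("t2", ("f", "x")),
          ("+", "t1", "t2"))))
```

With this encoding, `is_anf(nested)` is false because both arguments of the outer `+` are compound, while `is_anf(flat)` is true.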
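A-Normal Form

Figure 6: The inverse CPS transformation and its output

The inverse CPS transformation, $\mathcal{U} : CPS(CS) \to A(CS)$:

$$
\begin{aligned}
\mathcal{U}[(k\ W)] &= \Psi[W] \\
\mathcal{U}[(\mathbf{let}\ (x\ W)\ P)] &= (\mathbf{let}\ (x\ \Psi[W])\ \mathcal{U}[P]) \\
\mathcal{U}[(\mathbf{if0}\ W\ P_1\ P_2)] &= (\mathbf{if0}\ \Psi[W]\ \mathcal{U}[P_1]\ \mathcal{U}[P_2]) \\
\mathcal{U}[(W\ k\ W_1 \ldots W_n)] &= (\Psi[W]\ \Psi[W_1] \ldots \Psi[W_n]) \\
\mathcal{U}[(W\ (\lambda x.P)\ W_1 \ldots W_n)] &= (\mathbf{let}\ (x\ (\Psi[W]\ \Psi[W_1] \ldots \Psi[W_n]))\ \mathcal{U}[P]) \\
\mathcal{U}[(O'\ k\ W_1 \ldots W_n)] &= (O\ \Psi[W_1] \ldots \Psi[W_n]) \\
\mathcal{U}[(O'\ (\lambda x.P)\ W_1 \ldots W_n)] &= (\mathbf{let}\ (x\ (O\ \Psi[W_1] \ldots \Psi[W_n]))\ \mathcal{U}[P]) \\[6pt]
\Psi[c] &= c \\
\Psi[x] &= x \\
\Psi[\lambda k\, x_1 \ldots x_n.P] &= \lambda x_1 \ldots x_n.\mathcal{U}[P]
\end{aligned}
$$

The language $A(CS)$:

$$
\begin{array}{rcll}
M & ::= & V & \text{(return)} \\
  & \mid & (\mathbf{let}\ (x\ V)\ M) & \text{(bind)} \\
  & \mid & (\mathbf{if0}\ V\ M\ M) & \text{(branch)} \\
  & \mid & (V\ V_1 \ldots V_n) & \text{(tail call)} \\
  & \mid & (\mathbf{let}\ (x\ (V\ V_1 \ldots V_n))\ M) & \text{(call)} \\
  & \mid & (O\ V_1 \ldots V_n) & \text{(prim-op)} \\
  & \mid & (\mathbf{let}\ (x\ (O\ V_1 \ldots V_n))\ M) & \text{(prim-op)} \\
V & ::= & c \mid x \mid (\lambda x_1 \ldots x_n.M) & \text{(values)}
\end{array}
$$

Figure 7: Evaluation contexts and the set of A-reductions

Evaluation contexts:

$$
\mathcal{E} ::= [\ ] \mid (\mathbf{let}\ (x\ \mathcal{E})\ M) \mid (\mathbf{if0}\ \mathcal{E}\ M\ M) \mid (F\ V \ldots V\ \mathcal{E}\ M \ldots M) \qquad \text{where } F = V \text{ or } F = O
$$

The A-reductions:

$$
\begin{array}{lcll}
\mathcal{E}[(\mathbf{let}\ (x\ M)\ N)] & \longrightarrow & (\mathbf{let}\ (x\ M)\ \mathcal{E}[N]) & (A_1) \\
& & \text{where } \mathcal{E} \neq [\ ],\ x \notin FV(\mathcal{E}) & \\[4pt]
\mathcal{E}[(\mathbf{if0}\ V\ M_1\ M_2)] & \longrightarrow & (\mathbf{if0}\ V\ \mathcal{E}[M_1]\ \mathcal{E}[M_2]) & (A_2) \\
& & \text{where } \mathcal{E} \neq [\ ] & \\[4pt]
\mathcal{E}[(F\ V_1 \ldots V_n)] & \longrightarrow & (\mathbf{let}\ (t\ (F\ V_1 \ldots V_n))\ \mathcal{E}[t]) & (A_3) \\
& & \text{where } F = V \text{ or } F = O,\ \mathcal{E} \neq \mathcal{E}'[(\mathbf{let}\ (z\ [\ ])\ M)],\ \mathcal{E} \neq [\ ],\ t \notin FV(\mathcal{E}) &
\end{array}
$$

... subexpression to be evaluated according to the CEK semantics. For example, in an expression (let (x M1) M2), the next reducible expression must occur within M1, hence the definition of evaluation contexts includes the clause (let (x E) M).

The A-reductions transform programs in a natural and intuitive manner. The first two reductions merge code segments across declarations and conditionals. The last reduction lifts redexes out of evaluation contexts and names intermediate results. Using evaluation contexts and the A-reductions, we can ...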
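The effect of the A-reductions (flattening nested lets and naming intermediate results) can be sketched as a small normalizer in Python. This is a continuation-style sketch in the spirit of the linear-time algorithm associated with Flanagan et al., not a transcription of it. The tuple encoding, the `PRIMS` set, and every function name here are assumptions made for the illustration. As a simplification, a conditional in non-tail position is named with a let rather than having its context duplicated into both branches as reduction A2 would do, and fresh names `t1`, `t2`, ... are assumed not to clash with program variables.

```python
import itertools

# Hypothetical encoding: ints are constants, strings are variables,
# ("lambda", params, body), ("let", (x, rhs), body), ("if0", c, t, e),
# and any other tuple is a call or primitive operation.
PRIMS = {"+", "-", "*"}  # assumed set of primitive operators

def is_value(e):
    return isinstance(e, (int, str)) or (
        isinstance(e, tuple) and e[:1] == ("lambda",))

def normalize_term(e, fresh=None):
    """Normalize e in tail position (identity continuation)."""
    if fresh is None:
        counter = itertools.count(1)
        fresh = lambda: f"t{next(counter)}"
    return normalize(e, lambda n: n, fresh)

def normalize(e, k, fresh):
    """Normalize e, passing the result to the context k."""
    if isinstance(e, (int, str)):
        return k(e)
    head = e[0]
    if head == "lambda":
        _, params, body = e
        return k(("lambda", params, normalize_term(body, fresh)))
    if head == "let":
        # Hoist the binding outward; the context k moves into the body.
        _, (x, rhs), body = e
        return normalize(
            rhs, lambda n: ("let", (x, n), normalize(body, k, fresh)), fresh)
    if head == "if0":
        _, cond, then_b, else_b = e
        return normalize_name(
            cond,
            lambda a: k(("if0", a,
                         normalize_term(then_b, fresh),
                         normalize_term(else_b, fresh))),
            fresh)
    if isinstance(head, str) and head in PRIMS:
        return normalize_names(e[1:], lambda args: k((head,) + args), fresh)
    return normalize_name(
        head,
        lambda f: normalize_names(e[1:], lambda args: k((f,) + args), fresh),
        fresh)

def normalize_name(e, k, fresh):
    """Normalize e and make sure k receives an atomic value."""
    def bind(n):
        if is_value(n):
            return k(n)
        t = fresh()  # name the intermediate result
        return ("let", (t, n), k(t))
    return normalize(e, bind, fresh)

def normalize_names(es, k, fresh):
    """Normalize a sequence of arguments left to right."""
    if not es:
        return k(())
    return normalize_name(
        es[0],
        lambda a: normalize_names(es[1:], lambda rest: k((a,) + rest), fresh),
        fresh)

# (+ (+ 2 2) (let (x 1) (f x)))
example = ("+", ("+", 2, 2), ("let", ("x", 1), ("f", "x")))
result = normalize_term(example)
```

For `example`, the sketch names the inner addition first, hoists the binding of `x`, and then names the call to `f`, so `result` is the tuple form of `(let (t1 (+ 2 2)) (let (x 1) (let (t2 (f x)) (+ t1 t2))))`.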
Discovering ANF

Figure 5: The realistic CPS abstract machine: the $C_{cps}EK$ machine.

Semantics: Let $P \in CPS(CS)$,

$$
eval_c(P) = c \quad \text{if } \langle P, \emptyset, \langle \mathbf{ar}\ x, (k\ x), \emptyset, \mathbf{stop} \rangle \rangle \longmapsto_c^{*} \langle (k\ x), \emptyset[x := c], \mathbf{stop} \rangle.
$$

Data Specifications:

$$
\begin{array}{rcll}
S_c \in State_c & = & CPS(CS) \times Env_c \times Cont_c & \text{(machine states)} \\
E^- \in Env_c & = & Variables \rightharpoonup Value_c & \text{(environments)} \\
W^* \in Value_c & = & c \mid \langle \mathbf{cl}\ k\, x_1 \ldots x_n, P, E^- \rangle & \text{(machine values)} \\
E^k \in Cont_c & = & \mathbf{stop} \mid \langle \mathbf{ar}\ x, P, E^-, E^k \rangle & \text{(continuations)}
\end{array}
$$

Transition Rules:

$$
\begin{array}{l}
\langle (k\ W), E^-, E^k \rangle \longmapsto_c \langle P', E_1^-[x := \mu(W, E^-)], E_1^k \rangle \\
\quad \text{where } E^k = \langle \mathbf{ar}\ x, P', E_1^-, E_1^k \rangle \\[4pt]
\langle (\mathbf{let}\ (x\ W)\ P), E^-, E^k \rangle \longmapsto_c \langle P, E^-[x := \mu(W, E^-)], E^k \rangle \\[4pt]
\langle (\mathbf{if0}\ W\ P_1\ P_2), E^-, E^k \rangle \longmapsto_c \langle P_1, E^-, E^k \rangle \text{ where } \mu(W, E^-) = 0 \\
\quad \text{or } \langle P_2, E^-, E^k \rangle \text{ where } \mu(W, E^-) \neq 0 \\[4pt]
\langle (W\ k\ W_1 \ldots W_n), E^-, E^k \rangle \longmapsto_c \langle P', E_1^-[x_1 := W_1^*, \ldots, x_n := W_n^*], E^k \rangle \\
\quad \text{where } \mu(W, E^-) = \langle \mathbf{cl}\ k'\, x_1 \ldots x_n, P', E_1^- \rangle \text{ and for } 1 \le i \le n,\ W_i^* = \mu(W_i, E^-) \\[4pt]
\langle (W\ (\lambda x.P)\ W_1 \ldots W_n), E^-, E^k \rangle \longmapsto_c \langle P', E_1^-[x_1 := W_1^*, \ldots, x_n := W_n^*], \langle \mathbf{ar}\ x, P, E^-, E^k \rangle \rangle \\
\quad \text{where } \mu(W, E^-) = \langle \mathbf{cl}\ k'\, x_1 \ldots x_n, P', E_1^- \rangle \text{ and for } 1 \le i \le n,\ W_i^* = \mu(W_i, E^-) \\[4pt]
\langle (O'\ k\ W_1 \ldots W_n), E^-, E^k \rangle \longmapsto_c \langle P', E_1^-[x := \delta_c(O', W_1^*, \ldots, W_n^*)], E_1^k \rangle \\
\quad \text{if } \delta_c(O', W_1^*, \ldots, W_n^*) \text{ is defined, where } E^k = \langle \mathbf{ar}\ x, P', E_1^-, E_1^k \rangle \text{ and for } 1 \le i \le n,\ W_i^* = \mu(W_i, E^-) \\[4pt]
\langle (O'\ (\lambda x.P)\ W_1 \ldots W_n), E^-, E^k \rangle \longmapsto_c \langle P, E^-[x := \delta_c(O', W_1^*, \ldots, W_n^*)], E^k \rangle \\
\quad \text{if } \delta_c(O', W_1^*, \ldots, W_n^*) \text{ is defined, and for } 1 \le i \le n,\ W_i^* = \mu(W_i, E^-)
\end{array}
$$

Again, the machine ignores the continuation parameter k in the closures and manipulates the "global" register $E^k$ instead.

Undoing CPS. The crucial insight is that the elimination of the redundant information from the $C_{cps}EK$ machine corresponds to an inverse CPS transformation [7, 17] on the intermediate code. The function $\mathcal{U}$ in Figure 6 realizes such an inverse [17]. The inverse transformation formalizes our intuition about the redundancies in the $C_{cps}EK$ machine. It eliminates the variable k from return instructions as well as the parameter k from procedures. The latter change implies that continuations are not passed as arguments in function calls but rather become contexts surrounding the calls. For example, the code segment cps(N) in Section 3 becomes:

A(N) = (let (t1 (+ 2 2))
         (let (x 1)
           (let (t2 (f x))
             (+ t1 t2))))

Based on the above argument, it appears that CPS compilers perform a sequence of three steps:

[Diagram: the three steps CPS, β-normalization, and un-CPS, relating CPS(CS) to A(CS), with a direct arrow labeled A.]

The diagram naturally suggests a direct translation A that combines the effects of the three phases. The identification of the translation A requires a theorem relating β-reductions on CPS terms to reductions on the source language. This correspondence of reductions was the subject of our previous paper [17]. The resulting set of source reductions, the A-reductions, is in Figure 7.[6] Since the A-reductions are strongly normalizing, we can characterize the translation A as any function that applies the A-reductions to a source term until it reaches a normal form [17: Theorem 6.4].

The definition of the A-reductions refers to the concept of evaluation contexts. An evaluation context is a term with a "hole" (denoted by [ ]) in the place of one subterm. The location of the hole points to the next ...

[6] Danvy and Weise [21] also recognize that the compaction [8] of CPS terms can be expressed in the source language, but do not explore this topic systematically.
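The "undoing CPS" step can be sketched in Python as a pair of mutually recursive functions, one for CPS programs and one for CPS values. Everything about the encoding is an assumption made for this illustration: terms are nested tuples, the continuation variable is always the literal string `"k"`, CPS primitives are spelled `"+k"`, `"-k"`, `"*k"` via the hypothetical table `CPS_PRIMS`, and the names `un_cps` and `psi` are invented here. The sample term `cps_n` is a hand-written CPS rendering of `(+ (+ 2 2) (let (x 1) (f x)))`, not a quotation from the paper.

```python
# Hypothetical mapping from CPS'd primitives to their direct-style versions.
CPS_PRIMS = {"+k": "+", "-k": "-", "*k": "*"}

def psi(w):
    """Translate a CPS value: drop the continuation parameter of lambdas."""
    if isinstance(w, tuple) and w[0] == "lambda":
        _, params, body = w
        return ("lambda", params[1:], un_cps(body))  # params[0] is k
    return w  # constants and variables are unchanged

def un_cps(p):
    """Translate a CPS program back to direct style."""
    head = p[0]
    if head == "k":                       # return: (k W)
        return psi(p[1])
    if head == "let":                     # bind: (let (x W) P)
        _, (x, w), body = p
        return ("let", (x, psi(w)), un_cps(body))
    if head == "if0":                     # branch: (if0 W P1 P2)
        _, w, p1, p2 = p
        return ("if0", psi(w), un_cps(p1), un_cps(p2))
    # Call or prim-op: the second element is k or a continuation lambda.
    cont = p[1]
    args = tuple(psi(w) for w in p[2:])
    if isinstance(head, str) and head in CPS_PRIMS:
        op = CPS_PRIMS[head]
    else:
        op = psi(head)
    call = (op,) + args
    if cont == "k":                       # tail call: continuation is implicit
        return call
    _, (x,), body = cont                  # (lambda (x) P) becomes a let context
    return ("let", (x, call), un_cps(body))

# (+k (lambda (t1) (let (x 1) (f (lambda (t2) (+k k t1 t2)) x))) 2 2)
cps_n = ("+k",
         ("lambda", ("t1",),
          ("let", ("x", 1),
           ("f", ("lambda", ("t2",), ("+k", "k", "t1", "t2")), "x"))),
         2, 2)

direct = un_cps(cps_n)
```

Each explicit continuation `(lambda (x) P)` turns into the body of a `let` that names the call's result, which is how continuations "become contexts surrounding the calls." Here `direct` is the tuple form of `(let (t1 (+ 2 2)) (let (x 1) (let (t2 (f x)) (+ t1 t2))))`.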