Compiling with Continuations and LLVM Kavon Farvardin John Reppy - PowerPoint PPT Presentation
Compiling with Continuations and LLVM Kavon Farvardin John Reppy University of Chicago September 22, 2016 Introduction LLVM Introduction to LLVM De facto backend for new language implementations Offers high quality code generation for
Compiling with Continuations and LLVM Kavon Farvardin John Reppy University of Chicago September 22, 2016
Introduction LLVM Introduction to LLVM ◮ De facto backend for new language implementations ◮ Offers high quality code generation for many architectures ◮ Active industry development ◮ Widely used for research ◮ Includes a multitude of features and tools September 22, 2016 ML’16 — CwC and LLVM 2
Introduction LLVM The LLVM Landscape C Clang LLVM Rust Rustc x86-64 MLton SML Compiler LLVM IR ARM64 ErLLVM Erlang GHC Power Manticore Optimizer Haskell … PML … September 22, 2016 ML’16 — CwC and LLVM 3
Introduction LLVM Characteristics of LLVM IR define i32 @factorial ( i32 n ) { isZero = compare eq i32 n , 0 if isZero , label base , label recurse base : res1 = add i32 n , 1 goto label final recurse : minusOne = sub i32 n , 1 retVal = call i32 @factorial ( i32 minusOne ) res2 = mul i32 n , retVal goto label final final : res = phi i32 [ res1 , res2 ] return i32 res } September 22, 2016 ML’16 — CwC and LLVM 4
Introduction Manticore Manticore’s Runtime Model ◮ Efficient first-class continuations are used for concurrency, work-stealing parallelism, exceptions, etc. ◮ As in Compiling with Continuations , return continuations are passed as arguments to functions. ◮ Continuations are heap-allocated, making callcc cheap. ◮ Functions return by throwing to an explicit continuation. Manticore compiler Closure convert CPS convert MLRISC … BOM CPS CFG x86-64 IR IR IR LLVM September 22, 2016 ML’16 — CwC and LLVM 5
Introduction Manticore This Model Poses a Challenge for LLVM We require ◮ Efficient, reliable tail calls ◮ Garbage collection ◮ Preemption and multithreading ◮ First-class continuations ? + September 22, 2016 ML’16 — CwC and LLVM 6
Implementation Challenges Tail Calls Efficient, Reliable Tail Calls ◮ Tail calls are a major correctness and efficiency concern for us. ◮ LLVM’s tail call support is shaky: the issues are numerous and fixes are hard to come by. September 22, 2016 ML’16 — CwC and LLVM 7
Implementation Challenges Tail Calls Anatomy of a Call Stack foo: r12 Save push r12 r13 Save push r13 r14 Save Prologue { push r14 sub sp , 24 24 bytes foo ’s Spill Area ; body of foo call bar after: after ; body of foo SP add sp , 24 pop r14 Epilogue pop r13 pop r12 ret September 22, 2016 ML’16 — CwC and LLVM 8
Implementation Challenges Tail Calls LLVM’s Tail Call Optimization foo: foo: push r12 push r12 push r13 push r13 push r14 push r14 sub sp , 24 sub sp , 24 ; body of foo ; body of foo call bar ; <-- add sp , 24 add sp , 24 pop r14 pop r14 pop r13 pop r13 pop r12 pop r12 ret ; <-- jmp bar ; <-- September 22, 2016 ML’16 — CwC and LLVM 9
Implementation Challenges Tail Calls Avoiding the Tail Call Overhead ◮ MLton uses a trampoline, reducing procedure calls. ◮ GHC’s calling convention removes only callee-save instructions. ◮ We remove all overhead with a new calling convention (JWA) plus the use of naked functions. � Naked functions blindly omit all frame setup, requiring you to handle it yourself! foo: ; body of foo GOAL → jmp bar September 22, 2016 ML’16 — CwC and LLVM 10
Implementation Challenges Tail Calls Using Naked Functions Runtime System’s Frames RTS Register Saves ◮ Runtime system sets up frame Reusable ◮ Compiler limits number of spills Spill Area ◮ All functions reuse same frame SP ◮ FFI calls are transparent 8 byte slot 16-byte boundary Foreign Function Space September 22, 2016 ML’16 — CwC and LLVM 11
Implementation Challenges Garbage Collection Garbage Collection ◮ Cannot use LLVM’s GC support; assumes a stack runtime model. ◮ Manticore’s stack frame is only for temporary register spills. ◮ Thus, no new stack format to parse; our GC remains unchanged. ◮ We insert heap exhaustion checks before LLVM generation. September 22, 2016 ML’16 — CwC and LLVM 12
Implementation Challenges Garbage Collection Example of a Heap Exhaustion Check declare { i64* , i64* } @invoke-gc ( i64* , i64* ) define jwa void @foo ( i64 allocPtr_0 , . . . ) naked { . . . if enoughSpace , label continue , label doGC doGC : roots_0 = allocPtr_0 ; ... save live vals in roots_0 ... allocPtr_1 = getelementptr allocPtr_0 , 5 ; bump fresh = call { i64* , i64* } @invoke-gc ( allocPtr_1 , roots_0 ) allocPtr_2 = extractvalue fresh , 0 roots_1 = extractvalue fresh , 1 ; ... restore live vals ... goto label continue continue : allocPtr_3 = phi i64* [ allocPtr_0 , ] allocPtr_2 liveVal_1 = phi i64* [ . . . ] . . . September 22, 2016 ML’16 — CwC and LLVM 13
Implementation Challenges Preemption Preemption and Multithreading ◮ Continuations are a natural representation for suspended threads. ◮ Multithreaded runtimes must asynchronously suspend execution. ◮ When using a precise GC, safe preemption is challenging. September 22, 2016 ML’16 — CwC and LLVM 14
Implementation Challenges Preemption Preemption at Garbage Collection Safe Points Heap tests can be used for preemption: ◮ Threads keep their heap limit pointer in shared memory. ◮ We preempt by forcing a thread’s next heap test to fail. ◮ Preempted threads reenter runtime system via callcc . ◮ Non-allocating loops are also given a heap test. fun foo x = ... if limitPtr - allocPtr >= bytesNeeded then foo y else (callcc enterRTS ; foo y) ... September 22, 2016 ML’16 — CwC and LLVM 15
Implementation Challenges First-class Continuations First-class Continuations in LLVM ◮ Preemptions need to occur in the middle of a function. ◮ In CwC, we allocate a function closure to capture a continuation. Problem LLVM does not have first-class labels to create the closure! September 22, 2016 ML’16 — CwC and LLVM 16
Implementation Challenges First-class Continuations First-class Labels in LLVM Observations: ◮ The return address of a non-tail call is a label generated at runtime. ◮ Return conventions for C structs specify a mix of stack/registers. Solution We treat the return address like a first-class label by specifying a return convention for C structs that matches calls. September 22, 2016 ML’16 — CwC and LLVM 17
Implementation Challenges First-class Continuations The Jump-With-Arguments Calling Convention Arguments Passed Arg 1 Arg 2 Arg 3 Arg 4 … Location of Value rsi r11 rdi r8 … C Struct Returned Field 1 Field 2 Field 3 Field 4 … September 22, 2016 ML’16 — CwC and LLVM 18
Implementation Challenges First-class Continuations Example of First-class Labels for callcc define jwa void @foo ( . . . ) naked { . . . preempted : env = ; ... save live vars ... closPtr = allocPair ( undef , env ) ret = call jwa { i64* , i64* } @genLabel ( closPtr , @enterRTS ) arg1 = extractvalue ret , 0 arg2 = extractvalue ret , 1 . . . } ; call convention: ; rsi = closPtr , r11 = @enterRTS genLabel : pop rax ; put return addr in rax mov rax , ( rsi ) ; finish closure jmp r11 September 22, 2016 ML’16 — CwC and LLVM 19
Implementation Challenges First-class Continuations Example of First-class Labels for callcc _foo : ... preempted : ; r10 = env , rsi = closPtr (unintialized) mov r10 , 8( rsi ) mov _enterRTS , r11 call genLabel ; return convention: ; rsi = arg1 , r11 = arg2 ... ; call convention: ; rsi = closPtr , r11 = @enterRTS genLabel : pop rax ; put return addr in rax mov rax , ( rsi ) ; finish closure jmp r11 September 22, 2016 ML’16 — CwC and LLVM 20
Evaluation Performance Comparison No Passes "Basic" Passes "Extra" Passes -O1 -O2 -O3 2.2 2.15 2.15 2.13 2.11 2.12 2 Speedup (normalized) 2 1.8 1.6 1.4 1.2 1.12 1.09 1.09 1.08 1.07 1.08 1.08 1.07 1.08 1.07 1.08 1.05 1.02 1.01 0.99 1 1 1 1 1 1 1 0.87 0.86 0.86 0.8 0.6 life nbody queens quicksort takeuchi Figure: Execution time speedups over MLRisc when using LLVM codegen. September 22, 2016 ML’16 — CwC and LLVM 21
Conclusion and Future Work Conclusion and Future Work ◮ Hope to apply this to SML/NJ in the future. ◮ Plan to upstream JWA convention. ◮ More implementation details in our forthcoming tech report! + (with modifications) http://manticore.cs.uchicago.edu September 22, 2016 ML’16 — CwC and LLVM 22
Recommend
More recommend
Explore More Topics
Stay informed with curated content and fresh updates.