Source-to-Source Compilation in Racket
You Want it in Which Language?
Tero Hasu (1), Matthew Flatt (2)
(1) Bergen Language Design Laboratory, University of Bergen
(2) PLT, University of Utah
IFL, 1–3 October 2014
Hasu, Flatt (BLDL, PLT) Source-to-Source Compilation in Racket
key topics
▶ how to implement source-to-source compilers on top of Racket
▶ motivations:
  ▶ language infrastructure reuse
  ▶ support for implementing macro-extensible languages
macros for language definition
▶ Racket macros not only support language extension, but also language definition
  ▶ host language syntax can be hidden entirely
"normal" execution of Racket languages
▶ Racket languages are usually executed within the Racket VM
[diagram: Racket --macroexpand--> core Racket --compile--> bytecode --run--> Racket VM]
source-to-source compilers
▶ or transcompilers
▶ programming language implementations outputting source code
▶ especially nice with exotic platforms
  ▶ have a compiler write what the vendor says you should
don't need no Racket
transcompiler implementation recipe:
1. pick your favorite programming language
2. pick useful libraries (parsing, pretty printing, etc.)
3. write an implementation
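To make the recipe concrete, here is a minimal sketch of step 3, written in Racket only to stay in one language throughout. It translates a tiny S-expression arithmetic language into C-like expression text. The function name `expr->c` and the input language are assumptions made for illustration; they are not taken from the talk.

```racket
#lang racket/base
(require racket/match racket/string)

;; expr->c : S-expression -> string
;; Hypothetical toy translator: numbers, variables, binary arithmetic,
;; and function calls are rendered as C-like expression text.
(define (expr->c e)
  (match e
    [(? number? n) (number->string n)]
    [(? symbol? x) (symbol->string x)]
    [(list (and op (or '+ '- '* '/)) a b)
     (format "(~a ~a ~a)" (expr->c a) op (expr->c b))]
    [(list f args ...)
     (format "~a(~a)" f (string-join (map expr->c args) ", "))]))

(define sample-output (expr->c '(+ x (* 2 (inc y)))))
(displayln sample-output) ; prints (x + (2 * inc(y)))
```

The sketch does no name mangling, so identifiers that are not valid in the target language (for example, ones containing hyphens) would need extra handling in a real transcompiler.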
can get back-end side infrastructure reuse
▶ typically target language libraries
  ▶ e.g., language standard libraries, libuv, OpenGL, SQLite, …
what about front-end side?
▶ reuse of language facilities?
  ▶ macro systems, module systems, …
▶ reuse of dev tools?
  ▶ IDEs, documentation tools, macro debuggers, …
language embedding
▶ can use some host language functionality and tools
  ▶ still syntactically correct language
  ▶ might e.g. get type checking from host
Approaches in Haskell, Scala, etc.:
▶ shallow embedding
  ▶ language encoded directly as host operations
▶ deep embedding
  ▶ expressions evaluate to ASTs, which can then be evaluated or translated
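The difference between the two embedding styles can be sketched in a few lines. This is an illustrative example only; the names `s-lit`, `s-add`, `Lit`, `Add`, `evaluate`, and `translate` are hypothetical and do not come from the talk.

```racket
#lang racket/base
(require racket/match)

;; Shallow embedding: each construct is directly a host operation,
;; so an embedded expression evaluates to its value.
(define (s-lit n) n)
(define (s-add a b) (+ a b))

;; Deep embedding: each construct builds an AST node ...
(struct Lit (n) #:transparent)
(struct Add (a b) #:transparent)

;; ... which can then be evaluated ...
(define (evaluate e)
  (match e
    [(Lit n) n]
    [(Add a b) (+ (evaluate a) (evaluate b))]))

;; ... or translated to target-language source text.
(define (translate e)
  (match e
    [(Lit n) (number->string n)]
    [(Add a b) (format "(~a + ~a)" (translate a) (translate b))]))

(define shallow (s-add (s-lit 1) (s-add (s-lit 2) (s-lit 3))))
(define deep (Add (Lit 1) (Add (Lit 2) (Lit 3))))
```

Only the deep form keeps enough structure for a transcompiler: `shallow` is already the number 6, whereas `deep` can still be passed to either `evaluate` or `translate`.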
language embedding in Racket
▶ difference: Racket has a compile-time phase built-in
  ▶ gives more options for embedding
An attractive option:
▶ macro expressions evaluate to ASTs, which, still at compile-time:
  ▶ are made to encode Racket VM operations
    ▶ bonus: might write YourLang macros in YourLang
  ▶ are also made available for transcompilation
phase separation
▶ Racket's phase separation guarantees that compile time and run time have distinct bindings and state
▶ particularly crucial for a transcompiled language
  ▶ run time state: TargetLang (not Racket VM)
  ▶ run time bindings: YourLang (not Racket)
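A small sketch of what distinct state at compile time and run time means in plain Racket. The names `expansions`, `calls`, `tick`, and `f` are made up for this illustration.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Compile-time (phase 1) state: exists only while the module is expanded.
(begin-for-syntax
  (define expansions 0))

;; Run-time (phase 0) state: exists only while the module is running.
(define calls 0)

;; Each *expansion* of (tick) bumps the compile-time counter and bakes its
;; current value into the generated code; each *evaluation* of the generated
;; code bumps the run-time counter.
(define-syntax (tick stx)
  (set! expansions (add1 expansions))
  (with-syntax ([n expansions])
    #'(begin (set! calls (add1 calls)) 'n)))

;; (tick) occurs once in the source, so it is expanded once.
(define (f) (tick))

;; Calling f twice runs the expanded code twice.
(define result (list (f) (f) calls))
```

Here `result` is `'(1 1 2)`: the macro was expanded once, but its expansion ran twice. Run-time code cannot refer to `expansions` at all, which is the property a transcompiled language relies on when its run time is not the Racket VM.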
transcompilation via Racket bytecode
▶ suitable when implementing Racket
▶ bytecode is optimized for efficiency; it does not retain all of the original (core) syntax
▶ there is an API for parsing bytecode
[diagram: Racket --macroexpand--> core Racket --compile--> bytecode --run--> Racket VM; bytecode --Whalesong--> JavaScript]
transcompilation via core Racket
▶ core syntax for any Racket module can be extracted externally with read-syntax, then expand
▶ raco expand has the details
[diagram: Racket --macroexpand--> core Racket --compile--> bytecode --run--> Racket VM; core Racket --mzc--> C]
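The read-syntax-then-expand route can be sketched as follows. The helper name `module-source->core-syntax` and the sample module are assumptions for illustration; the module text is read from a string rather than a file to keep the example self-contained.

```racket
#lang racket/base

;; Read one module form from a string and fully expand it to core syntax.
;; A fresh namespace with racket/base attached provides the bindings that
;; `expand` needs in order to recognise the `module` form.
(define (module-source->core-syntax src-text)
  (define in (open-input-string src-text))
  (define stx (read-syntax 'example in))
  (parameterize ([current-namespace (make-base-namespace)])
    (expand stx)))

(define core
  (module-source->core-syntax
   "(module m racket/base (define (inc x) (+ x 1)))"))

;; The result is a syntax object in the fully expanded (core) language;
;; a back end would traverse it instead of just printing it.
(define core-datum (syntax->datum core))
(displayln (car core-datum))  ; prints module
(displayln (cadr core-datum)) ; prints m
```

A tool working on real files would read from a file port instead, and for `#lang` sources would also need to enable the reader parameter `read-accept-reader`.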
macros in transcompiler implementation
A macro expander is a source-to-source "compiler": macros exist to support source-to-source translation.
▶ general advantages:
  ▶ macro-based surface syntax definition gives parsing almost "for free"
  ▶ macros are convenient for "sugary" constructs: syntax and semantics specified at once
  ▶ macros are modular and composable
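As a reminder of what "syntax and semantics specified at once" looks like, here is a classic sugar macro in plain Racket. The macro `swap!` is a standard textbook example, not something from the talk.

```racket
#lang racket/base

;; One definition gives both the surface syntax, (swap! a b), and its
;; meaning, in terms of forms the language already supports.
(define-syntax-rule (swap! a b)
  (let ([tmp a])
    (set! a b)
    (set! b tmp)))

(define x 1)
(define y 2)
(swap! x y)
(define swapped (list x y))
```

After the swap, `swapped` is `'(2 1)`. Because the macro expands to existing forms, a back end that can translate `let` and `set!` needs no extra case for `swap!`.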
further exploitation of macro-expansion?
▶ might do back-end-specific work in macro expansion
  ▶ performing target-specific analyses and transformations
  ▶ collating required metadata
  ▶ encoding code and metadata in the desired format
    ▶ made separately loadable, even
Racket submodules
▶ enable testing time, documentation time, and more
  ▶ adding to Racket's run and compile times
submodule   purpose
"."         Racket VM run-time code
main        code for running the module standalone
test        code for testing the module
srcdoc      "data-as-code" for inline documentation
can also have:
to-c++      code informing a C++ back end
to-java     code informing a Java back end
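A sketch of a module carrying several such submodules. The `to-c++` submodule name follows the slide, but its contents (`c++-signatures` and the shape of the data) are invented for illustration.

```racket
#lang racket/base

;; Run-time code of the enclosing module (the "." module).
(define (inc x) (+ x 1))

;; Hypothetical back-end metadata kept in its own submodule, so that a
;; C++ back end could load it without running the enclosing module.
(module to-c++ racket/base
  (provide c++-signatures)
  (define c++-signatures
    '((inc "int" ("int")))))

;; Test-time code: run with `raco test`.
(module+ test
  (unless (= (inc 1) 2)
    (error "inc is broken")))

;; Code for running the module standalone.
(module+ main
  (displayln (inc 41)))

;; For demonstration only: the enclosing module inspects the metadata.
;; A real back end would load the submodule from outside instead.
(require (submod "." to-c++))
(define exported-names (map car c++-signatures))
```

The `to-c++` submodule is declared with `module`, so it cannot see the enclosing module's bindings. That suits metadata that should be loadable on its own.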
accessing code from within
▶ a possibility unique(?) to Racket
▶ a Racket language can access all the code of a module
  ▶ can inspect it unexpanded, or expand it first
  ▶ can munge it in back-end-specific ways

    (define-syntax (module-begin stx)
      (syntax-case stx ()
        [(module-begin form ...)
         (let ([ast (local-expand
                     #'(#%module-begin form ...)
                     'module-begin null)])
           (do-some-processing-of ast))]))
compilation based on "transcompile-time" code
▶ transcompiler dynamic-requires a submodule prepared for it during macro expansion
  ▶ e.g. encoding a syntax-checked AST with type annotations
[diagram: Magnolisp --macroexpand--> core Racket --compile--> bytecode --run--> Racket VM; bytecode --mglc--> C++]
Magnolisp
▶ a proof-of-concept toy language
▶ surface syntax defined as macros
▶ Racket's macro and module systems exposed
  ▶ macro-programming in any Racket VM based language
▶ execution options:
  1. evaluation in the Racket VM
     ▶ supports "mocking" of primitives, for simulation
  2. by translating runtime code into C++
     ▶ by invoking separate mglc tool
Magnolisp syntax sample

    #lang magnolisp
    (typedef Int (#:annos foreign))
    (function (zero)
      (#:annos foreign [type (fn Int)]))
    (function (inc x)
      (#:annos foreign [type (fn Int Int)]))
    (function (one)
      (inc (zero)))
    (function (two)
      (do (var x (one))
          (return (inc x))))
example Magnolisp to C++ translation
▶ mglc does whole-program optimization, type inference, C++ translation, pretty printing, etc.
▶ more interesting: the Racket language implementation

Magnolisp:

    (function (one)
      (inc (zero)))
    (function (two)
      (do (var x (one))
          (return (inc x))))

C++:

    MGL_FUNC Int one( ) {
      return inc(zero());
    }
    MGL_FUNC Int two( ) {
      Int r;
      {
        Int x = one();
        {
          r = inc(x);
          goto b;
        }
      }
      b:
      return r;
    }
[figure: stages in compiling a.rkt, reconstructed from the slide diagram]

a.rkt (source):

    #lang magnolisp
    (require "num-types.rkt")
    (function (int-id x)
      (#:annos [type (fn int int)] export)
      x)

macroexpand produces a.rkt (core):

    (module a magnolisp/main
      (#%module-begin
        (module magnolisp-s2s racket/base
          (#%module-begin
            ....
            (define-values (def-lst)
              (#%app list
                (#%app DefVar ....)
                ....))
            ....))
        ....
        (#%require "num-types.rkt")
        (define-values (int-id) ....)))

run produces a.rkt magnolisp-s2s (instance):

    def-lst: list of DefVar (annos ...., Id int-id, Lambda ....), ....

translate produces a.cpp:

    #include "a.hpp"
    MGL_API_FUNC int int_id(int const& x) {
      return x;
    }

and a.hpp:

    #ifndef __a_hpp__
    #include "a_config.hpp"
    MGL_API_PROTO int int_id(int const& x);
    #endif
transcompiled language as a library
▶ mostly a matter of exporting macros and variables
  ▶ syntax should be restricted to what can be transcompiled
  ▶ some macros should embed information for transcompilation
E.g., "main.rkt" for plain-magnolisp language:

    #lang racket/base
    (module reader syntax/module-reader
      plain-magnolisp/main)
    (require magnolisp/surface)
    (provide #%app function typedef foreign export type fn)
    (require magnolisp/modbeg)
    (provide (rename-out [module-begin #%module-begin]))
encoding foreign core language
▶ a transcompiled language's core language may differ from Racket's
▶ macros expand to Racket core forms, but:
  ▶ the core forms may have custom syntax properties
  ▶ some variables may have special meaning
  ▶ etc.
E.g., a Magnolisp core form corresponding to a C++ goto label, encoded as a call/ec application with a specific property:

    (define-syntax (let/local-ec stx)
      (syntax-case stx ()
        [(_ . rest)
         (syntax-property
          (syntax/loc stx (let/ec . rest))
          'local-ec #t)]))
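How a back end might recognise such a marked form can be sketched with `syntax-property` alone. The property key `'local-ec` comes from the slide; the predicate name `local-ec?` and the sample syntax objects are hypothetical.

```racket
#lang racket/base

;; Attach the marker property to a piece of syntax, as the macro above does.
(define marked
  (syntax-property #'(let/ec k (k 1)) 'local-ec #t))

;; The same code without the marker.
(define unmarked #'(let/ec k (k 1)))

;; Hypothetical back-end check: does this form carry the marker?
;; syntax-property returns #f when the key is absent.
(define (local-ec? stx)
  (and (syntax-property stx 'local-ec) #t))

(define marker-results (list (local-ec? marked) (local-ec? unmarked)))
```

Both syntax objects have the same datum, so they would behave identically in the Racket VM. Only a tool that inspects the property, such as a transcompiler, can tell them apart.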
defining surface syntax
▶ with macros that expand to supported core language

    (define-syntax-rule (do body ...)
      (let/local-ec k
        (syntax-parameterize
            ([return
              (syntax-rules ()
                [(_ v) (apply/local-ec k v)])])
          body ...
          (values))))
    (provide do)
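The same pattern can be tried in plain Racket by substituting ordinary `let/ec` for the Magnolisp-specific `let/local-ec` and `apply/local-ec`. This is an approximation for experimentation, not the Magnolisp definition; the names `do-block` and `two` are chosen for this sketch, and the `return` syntax parameter is declared here because the slide does not show its declaration.

```racket
#lang racket/base
(require racket/stxparam (for-syntax racket/base))

;; `return` is only meaningful inside a do-block; elsewhere it is an error.
(define-syntax-parameter return
  (lambda (stx)
    (raise-syntax-error #f "return used outside of a do-block" stx)))

;; Approximation of the slide's `do`: an escape continuation delimits the
;; block, and `return` is rebound inside it to jump out with a value.
(define-syntax-rule (do-block body ...)
  (let/ec k
    (syntax-parameterize ([return (syntax-rules ()
                                    [(_ v) (k v)])])
      body ...
      (values))))

(define (two)
  (do-block
    (let ([x 1])
      (return (+ x 1)))
    (error "not reached")))
```

Calling `(two)` gives 2, since `return` escapes before the `error` call is reached. In Magnolisp the escape is additionally marked with a syntax property so that the back end can turn it into a `goto`.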