Direct Reflection for Free!
Joomy Korkut, Princeton University (@cattheory)
February 25th, 2019, NYPLSE '19
Basic terminology
When we write an interpreter or a compiler, we are dealing with two languages:
• Metalanguage: the language in which the interpreter/compiler is implemented.
• Object language: the language that the interpreter/compiler takes as input.
Metaprogramming
(categorization from Martin Berger's 2016 slides)
• Homogeneous (same language)
  • Generative (putting together)
    • Strings (JavaScript's eval)
    • ADTs (Lisp, Haskell, Idris)
    • Quasiquotation (Template Haskell)
  • Intensional (taking apart data types and functions)
• Heterogeneous (different languages), e.g. C preprocessor
Problem statement
• Implementing metaprogramming systems when writing a compiler/interpreter is difficult. Especially for languages still in development, any change in the language requires a lot of work to keep the metaprogramming parts up to date.
• Until recently, we did not have a convincing way to automatically add homogeneous generative metaprogramming to an existing language definition. Now we do, thanks to "Modelling Homogeneous Generative Meta-Programming" by Berger, Tratt and Urban (ECOOP'17). However, their one-size-fits-all method requires adding a new constructor to the AST to represent ASTs, and adding "tags" as well.
• We still do not have a convincing way to automatically add homogeneous generative metaprogramming to an existing language implementation.
My solution
• Find an appropriate representation of the ASTs of an object language inside that language. We can pick a different representation for each language.
• Use Haskell and take advantage of its generic programming techniques to automatically add metaprogramming to an existing language implementation.
• In other words, I want to use the intensional metaprogramming of the meta language to automatically create a generative metaprogramming system for the object language.
Peirce's triangle of signs
• Symbol (the physical sign itself, representamen): a STOP sign
• Object (the referred object, referent): the stop rule
• Sense (the thought/sense made out of it, interpretant): "I should stop here."
[Figure: a triangle connecting Symbol, Object and Sense; edge labels include "materializes into" and "is represented by".]
Peirce's triangle of signs, with a twist (in a language implementation)
• Symbol: the metalanguage term that represents it
• Object: a value
• Sense: the object language term that represents it
(inspired by James Noble and Kumiko Tanaka-Ishii)
The language implementation triangle
• A value: the mathematical value red
• Meta language term that represents it: Red (in meta language)
• Object language term that represents it:
  • Red (if our object language has algebraic data types)
  • λr.λg.λb.r (if our object language is untyped λ-calculus)
  • inl () (if our object language is typed λ-calculus with sums and products)
The language implementation triangle
• A value: the string hello
• Meta language term that represents it: "hello" (in meta language)
• Object language term that represents it:
  • "hello" (if our object language has strings)
  • any other representation our object language supports
Peirce's triangle of signs, with another twist (in a language implementation)
• Symbol: the AST representing that term in the meta language
• Object: a term in the object language
• Sense: the reflection of that term in the object language
The metaprogramming implementation triangle
• Term in the object language: "hello" (in object language)
• AST representing that term in the meta language: StrLit "hello" (in meta language)
• Reflection of that term in the object language: StrLit "hello" (in object language)
The tower of reflection levels
• A value: the string hello
• Meta language term that represents it: "hello" (in meta language)
• Level 0
  • Term in the object language: "hello" (in object language)
  • AST representing that term in the meta language: StrLit "hello" (in meta language)
• Level 1
  • Reflection of that term in the object language: StrLit "hello" (in object language)
  • AST representing the reflected term in the meta language: App (Var "StrLit") (StrLit "hello") (in meta language)
• Level 2
  • Reflection of the reflection of the term in the object language: App (Var "StrLit") (StrLit "hello") (in object language)
  • ...
Reflection and reification relate a meta language term to an AST; quotation and antiquotation relate adjacent levels.
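The step from one level to the next can be sketched as a function that rewrites each AST node into an application of its constructor's name. This is only an illustration of the levels on this slide: the name quote, the cut-down Exp, and the constructor-name representation itself are assumptions made here, not the representation the talk goes on to use.

```haskell
-- A cut-down AST, assumed for this sketch only.
data Exp
  = Var String
  | App Exp Exp
  | Abs String Exp
  | StrLit String
  deriving (Show, Eq)

-- Hypothetical helper: move a term one level up by turning every
-- node into an application headed by the name of its constructor.
quote :: Exp -> Exp
quote (Var x)    = App (Var "Var") (StrLit x)
quote (App f a)  = App (App (Var "App") (quote f)) (quote a)
quote (Abs x e)  = App (App (Var "Abs") (StrLit x)) (quote e)
quote (StrLit s) = App (Var "StrLit") (StrLit s)
```

Loaded in GHCi, quote (StrLit "hello") gives the level 1 AST shown on the slide, and applying quote again gives a level 2 AST.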
    class Bridge a where
      reflect :: a -> Exp
      reify   :: Exp -> Maybe a
    class Bridge a where
      reflect :: a -> Exp
      reify   :: Exp -> Maybe a

    instance Bridge String where
      reflect s = StrLit s
      reify (StrLit s) = Just s
      reify _ = Nothing

    instance Bridge Int where
      reflect n = IntLit n
      reify (IntLit n) = Just n
      reify _ = Nothing
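A minimal, self-contained sketch of the class and the two literal instances, assuming a cut-down Exp that has only the literal constructors (the full Exp appears later in the talk). The roundTrip helper is not from the talk; it is added here only to show that reify is meant to undo reflect.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Cut-down AST, assumed for this sketch: literals only.
data Exp
  = StrLit String
  | IntLit Int
  | MkUnit
  deriving (Show, Eq)

class Bridge a where
  reflect :: a -> Exp        -- meta language value to AST
  reify   :: Exp -> Maybe a  -- AST back to a value, if it has the right shape

-- String is a synonym for [Char], hence FlexibleInstances.
instance Bridge String where
  reflect s = StrLit s
  reify (StrLit s) = Just s
  reify _          = Nothing

instance Bridge Int where
  reflect n = IntLit n
  reify (IntLit n) = Just n
  reify _          = Nothing

-- Hypothetical helper: reflect a value, then try to read it back.
roundTrip :: Bridge a => a -> Maybe a
roundTrip = reify . reflect
```

Reifying at the wrong type fails cleanly: reify (StrLit "hello") :: Maybe Int is Nothing.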
Haskell's generic programming techniques
There are a few alternatives such as GHC.Generics, but I chose Data and Typeable for their expressive power.

    class Typeable a where
      typeOf :: a -> TypeRep

    class Typeable a => Data a where
      ...
      toConstr   :: a -> Constr
      dataTypeOf :: a -> DataType
      gmapQ      :: (forall d. Data d => d -> u) -> a -> [u]
      -- can collect arguments of a value

    fromConstrM :: forall m a. (Monad m, Data a)
                => (forall d. Data d => m d) -> Constr -> m a
    -- monadic helper to construct new value from constructor

Both Data and Typeable are automatically derivable! (for simple Haskell ADTs)
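To make these signatures concrete, here is a small sketch that exercises them on a Peano-style type. The Nat type and the helper names (ctorNames, childCount, buildZ, buildSZ) are assumptions for illustration; the functions they call all come from Data.Data in base.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data

-- Example type, assumed for this sketch.
data Nat = S Nat | Z
  deriving (Show, Eq, Data)

-- Constructor names of a type, in declaration order.
-- dataTypeOf only looks at the type of its argument.
ctorNames :: Data a => a -> [String]
ctorNames x = map showConstr (dataTypeConstrs (dataTypeOf x))

-- Number of immediate arguments of a value, collected with gmapQ.
childCount :: Data a => a -> Int
childCount x = length (gmapQ (\_ -> ()) x)

-- Z has no arguments, so the supplied action is never run.
buildZ :: Maybe Nat
buildZ = fromConstrM Nothing (toConstr Z)

-- S has one argument; cast supplies Z whenever a Nat is requested.
buildSZ :: Maybe Nat
buildSZ = fromConstrM (cast Z) (toConstr (S Z))
```

The polymorphic action handed to fromConstrM is called once per constructor argument, which is what lets a generic reify rebuild a value field by field.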
Cookbook
1. Pick your object language.
2. Define an AST data type for your object language, in the metalanguage.
3. Pick your reflection representation. (There are many options!)
4. Define the Data a => Bridge a instance for the AST data type.
Let's try with the λ-calculus!
Scott encoding for untyped λ-calculus
• A value: the natural number 0
• Meta language term that represents it: Z (in meta language)
• Object language term that represents it: λf.λx.x
Scott encoding for untyped λ-calculus
• A value: the natural number 1
• Meta language term that represents it: S Z (in meta language)
• Object language term that represents it: λf.λx.f (λf.λx.x)
Generalizing Scott encoding

    ⌈Ctor e_1 ... e_n⌉ = λc_1. λc_2. ... λc_m. c_i ⌈e_1⌉ ... ⌈e_n⌉

where Ctor e_1 ... e_n is a term in the meta language and Ctor is the ith constructor out of m constructors.
Key idea: if Ctor constructs a value of a type that has a Data instance, then we can get the Scott encoding automatically.
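Written out by hand for two small types, the formula looks as follows. This is a sketch under assumptions: the Color and Nat types, the cut-down Exp, and the binder names c0, c1, ... are chosen here for illustration, and the two functions are exactly the per-type boilerplate that the Data-based approach is meant to avoid writing.

```haskell
-- Cut-down AST, assumed for this sketch.
data Exp
  = Var String
  | App Exp Exp
  | Abs String Exp
  deriving (Show, Eq)

data Color = Red | Green | Blue
data Nat   = S Nat | Z

-- m = 3 constructors, none with arguments:
-- bind c0, c1, c2 and return the binder of the chosen constructor.
scottColor :: Color -> Exp
scottColor c = Abs "c0" (Abs "c1" (Abs "c2" (Var pick)))
  where
    pick = case c of
      Red   -> "c0"
      Green -> "c1"
      Blue  -> "c2"

-- m = 2 constructors: S is the first (c0) and takes one argument,
-- which is encoded recursively; Z is the second (c1).
scottNat :: Nat -> Exp
scottNat n = Abs "c0" (Abs "c1" body)
  where
    body = case n of
      S k -> App (Var "c0") (scottNat k)
      Z   -> Var "c1"
```

With S declared before Z, scottNat Z is λc0.λc1.c1, the same shape as the λf.λx.x on the earlier slide.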
Implementation of Scott encoding from Data

    instance Data a => Bridge a where
      reflect v
        | getTypeRep @a == getTypeRep @Int    = reflect @Int (unsafeCoerce v)     -- (hack)
        | getTypeRep @a == getTypeRep @String = reflect @String (unsafeCoerce v)
        | otherwise = lams args (apps (Var c : gmapQ reflectArg v))               -- steps 3 and 4
        where
          (args, c) = constrToScott @a (toConstr v)                               -- steps 1 and 2
          reflectArg :: forall d. Data d => d -> Exp
          reflectArg x = reflect @d x
      reify e = ...

1. get all the constructors
2. pick which one you use
3. recurse on the arguments
4. construct the nested lambdas and applications
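The four steps can be tried out in isolation with a standalone function. This is a sketch, not the instance from the slide: it is a plain function named reflectD (a name assumed here), it uses the safe cast from Data.Typeable in place of the unsafeCoerce hack, and it inlines a guess at what constrToScott computes. The definitions of lams and apps are also assumptions, since the slide only names them.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data

data Exp
  = Var String
  | App Exp Exp
  | Abs String Exp
  | StrLit String
  | IntLit Int
  | MkUnit
  deriving (Show, Eq, Data)

-- Example types, assumed for this sketch.
data Color = Red | Green | Blue deriving (Show, Eq, Data)
data Nat   = S Nat | Z          deriving (Show, Eq, Data)

-- Assumed helper: wrap a body in nested lambdas.
lams :: [String] -> Exp -> Exp
lams xs body = foldr Abs body xs

-- Assumed helper: left-nested application of a non-empty list.
apps :: [Exp] -> Exp
apps = foldl1 App

-- Generic Scott encoding. Supports Int, String and algebraic types
-- whose fields are themselves supported; other primitive types
-- (Char, Double, ...) are outside the scope of this sketch.
reflectD :: Data a => a -> Exp
reflectD v
  | Just n <- cast v = IntLit n
  | Just s <- cast v = StrLit s
  | otherwise        = lams names (apps (Var chosen : gmapQ reflectD v))
  where
    -- step 1: one binder per constructor of the type
    names  = ["c" ++ show i | i <- [0 .. length (dataTypeConstrs (dataTypeOf v)) - 1]]
    -- step 2: the binder for this value's constructor (constrIndex is 1-based)
    chosen = names !! (constrIndex (toConstr v) - 1)
```

Because Exp itself derives Data, reflectD also accepts Exp values, which is the self-application the talk builds towards.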
Implementation of Scott encoding from Data

    instance Data a => Bridge a where
      reflect v = ...
      reify e
        | getTypeRep @a == getTypeRep @Int    = unsafeCoerce (reify @Int e)        -- (hack)
        | getTypeRep @a == getTypeRep @String = unsafeCoerce <$> (reify @String e)
        | otherwise =
            case collectAbs e of                 -- dissect the nested lambdas (step 1)
              ([], _) -> Nothing
              (args, body) ->
                case spineView body of           -- dissect the nested application (step 2)
                  (Var c, rest) -> do
                    ctors <- getConstrs @a
                    ctor <- lookup c (zip args ctors)
                    evalStateT (fromConstrM reifyArg ctor) rest    -- step 4
                  _ -> Nothing
        where
          reifyArg :: forall d. Data d => StateT [Exp] Maybe d
          reifyArg = do
            e <- gets head
            modify tail
            lift (reify @d e)                    -- step 3

1. get the nested lambda bindings
2. get the head of the nested application
3. recurse on the arguments
4. construct the Haskell term
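The reverse direction can be sketched the same way. Again this is a standalone function (reifyD, a name assumed here) and not the slide's instance: it uses cast in place of unsafeCoerce, reads the constructor list from dataTypeRep instead of the slide's getConstrs, and takes StateT from the transformers package. The bodies of collectAbs and spineView are guesses based on the comments on the slide.

```haskell
{-# LANGUAGE DeriveDataTypeable, ScopedTypeVariables #-}
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, evalStateT, get, put)
import Data.Data

data Exp
  = Var String
  | App Exp Exp
  | Abs String Exp
  | StrLit String
  | IntLit Int
  | MkUnit
  deriving (Show, Eq, Data)

-- Example type, assumed for this sketch.
data Nat = S Nat | Z deriving (Show, Eq, Data)

-- Assumed helper: peel off leading lambdas, λx.λy.b to (["x","y"], b).
collectAbs :: Exp -> ([String], Exp)
collectAbs (Abs x e) = let (xs, b) = collectAbs e in (x : xs, b)
collectAbs e         = ([], e)

-- Assumed helper: split a left-nested application, f a b to (f, [a, b]).
spineView :: Exp -> (Exp, [Exp])
spineView = go []
  where
    go acc (App f a) = go (a : acc) f
    go acc h         = (h, acc)

-- Generic inverse of the Scott encoding for Int, String and
-- algebraic types built from them.
reifyD :: forall a. Data a => Exp -> Maybe a
reifyD (IntLit n) = cast n   -- succeeds only when a is Int
reifyD (StrLit s) = cast s   -- succeeds only when a is String
reifyD e =
  case collectAbs e of                        -- step 1
    ([], _)      -> Nothing
    (args, body) ->
      case spineView body of                  -- step 2
        (Var c, rest) -> do
          ctor <- lookup c (zip args ctors)
          evalStateT (fromConstrM nextArg ctor) rest   -- step 4
        _ -> Nothing
  where
    -- Constructors of the target type; empty for non-algebraic types.
    ctors = case dataTypeRep (dataTypeOf (undefined :: a)) of
      AlgRep cs -> cs
      _         -> []
    -- step 3: consume one argument from the spine and reify it.
    nextArg :: forall d. Data d => StateT [Exp] Maybe d
    nextArg = do
      es <- get
      case es of
        []       -> lift Nothing
        (x : xs) -> put xs >> lift (reifyD x)
```

Unlike gets head on the slide, nextArg returns Nothing when the spine runs out of arguments, so a malformed term fails to reify instead of raising an exception.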
Tying the knot
Now we have a way to take (pretty much) any Haskell value to its representation in Exp. This can be either a natural number, a color, or ... Exp itself.

    data Exp = Var String       -- x
             | App Exp Exp      -- e1 e2
             | Abs String Exp   -- λx. e
             | StrLit String    -- "hello"
             | IntLit Int       -- 3
             | MkUnit           -- ()
      deriving (Show, Eq, Data, Typeable)
Tying the knot

    λ> reflect Red
    Abs "c0" (Abs "c1" (Abs "c2" (Var "c0")))

    λ> reflect (S Z)
    Abs "c0" (Abs "c1" (App (Var "c0") (Abs "c0" (Abs "c1" (Var "c1")))))

    λ> reflect MkUnit
    Abs "c0" (Abs "c1" (Abs "c2" (Abs "c3" (Abs "c4" (Abs "c5" (Var "c5"))))))

    λ> reflect (reflect Z)
    Abs "c0" (Abs "c1" (Abs "c2" (Abs "c3" (Abs "c4" (Abs "c5"
      (App (App (Var "c2") (StrLit "c0"))
        (Abs "c0" (Abs "c1" (Abs "c2" (Abs "c3" (Abs "c4" (Abs "c5"
          (App (App (Var "c2") (StrLit "c1"))
            (Abs "c0" (Abs "c1" (Abs "c2" (Abs "c3" (Abs "c4" (Abs "c5"
              (App (Var "c0") (StrLit "c1")))))))))))))))))))))