Yhc: The York Haskell Compiler
Tom Shackell
What?
● Yhc is a rewrite of the back end of the nhc98 system:
● The back end of the compiler is replaced.
● The runtime system is replaced.
● The instruction set is different.
● The Prelude is heavily modified.
Why?
● It was written to address some issues with the nhc98 back end.
● In particular: the high bit problem.
● Also as an experiment: can we make nhc98 more portable?
The High Bit Problem
Graph Reduction
● Lazy functional languages are usually implemented using graph reduction.
● Haskell expressions are represented by graphs.

sum :: [Int] -> Int
sum []     = 0
sum (x:xs) = x + sum xs

● The expression 'sum [1,2]' might be represented by the graph:
[Diagram: a sum application node pointing at the list built from (:) 1 ((:) 2 []).]
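To make the picture concrete, here is a small sketch in C of what such a graph could look like in memory; the Node type, its field layout and the names below are hypothetical illustrations, not nhc's or Yhc's actual representation.

/* Illustrative sketch only: a naive graph-node type and the graph for
 * 'sum [1,2]'.  Names and layout are invented, not the real runtime's. */
#include <stdio.h>
#include <stdlib.h>

typedef enum { CON, THUNK, BLACKHOLE, IND } Tag;

typedef struct Node {
    Tag          tag;       /* which of the four node kinds this is      */
    const char  *info;      /* constructor name or function name         */
    long         intval;    /* payload for boxed integers                */
    struct Node *fields[2]; /* argument/field pointers (at most 2 here)  */
} Node;

static Node *node(Tag tag, const char *info) {
    Node *n = calloc(1, sizeof *n);
    n->tag  = tag;
    n->info = info;
    return n;
}

int main(void) {
    /* The list [1,2] as cons cells, and a thunk applying sum to it. */
    Node *one   = node(CON, "Int");  one->intval = 1;
    Node *two   = node(CON, "Int");  two->intval = 2;
    Node *nil   = node(CON, "[]");
    Node *cons2 = node(CON, ":");    cons2->fields[0] = two; cons2->fields[1] = nil;
    Node *cons1 = node(CON, ":");    cons1->fields[0] = one; cons1->fields[1] = cons2;
    Node *expr  = node(THUNK, "sum"); expr->fields[0] = cons1;

    printf("%s applied to a %s cell holding %ld\n",
           expr->info, expr->fields[0]->info, expr->fields[0]->fields[0]->intval);
    return 0;
}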
Reduction
[Animation over four slides: the graph for 'sum [1,2]' is reduced step by step. The sum thunk is blackholed while it is being evaluated, the result 3 is computed, and the thunk is finally overwritten with an indirection (IND) to 3.]
Heap Nodes
● We can see there are 4 types of graph node: constructors, thunks, blackholed thunks, and indirections.
● In nhc and Yhc these graph nodes are represented with 4 types of heap node.
Heap Nodes in nhc
[Diagram: the four nhc heap-node layouts, each a single word plus two tag bits. Constructor nodes encode the constructor information directly in the word; thunks and blackholed thunks hold a function information pointer; indirections hold a redirection pointer.]
The “High Bit” Problem
● nhc assumes that it can use the topmost bit of a pointer to store information.
● This is not always the case: many modern Linux x86 kernels allocate memory at addresses too high to fit in 31 bits.
[Diagram: the nhc heap-node layouts from the previous slide, which rely on the topmost bit of the word.]
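To see why this breaks, here is a hedged sketch of a tag test that relies on the topmost bit of a 32-bit word; the macro and function names are invented for illustration and are not nhc's actual code.

/* Hedged illustration only; nhc's real macros and tag layout may differ.
 * The idea: if the top bit of a heap word is set, the word is treated as
 * encoded constructor information rather than as a plain pointer. */
#include <stdint.h>
#include <stdio.h>

#define TOP_BIT ((uint32_t)1 << 31)

static int looks_like_constructor(uint32_t heap_word) {
    return (heap_word & TOP_BIT) != 0;
}

int main(void) {
    uint32_t low_address  = 0x0804a000u;  /* fits in 31 bits: tag test is safe    */
    uint32_t high_address = 0xbf800000u;  /* above 2GB: a genuine pointer, but    */
                                          /* the tag test misreads it             */
    printf("low:  %d\nhigh: %d\n",
           looks_like_constructor(low_address),
           looks_like_constructor(high_address));
    return 0;
}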
Heap Nodes in Yhc
● Yhc makes sure that all FInfo structures are 4-byte aligned, freeing up a bit at the bottom of the word for thunk nodes.
● It also represents constructors with a pointer to the information about the constructor, rather than encoding the information into the heap word.
[Diagram: the four Yhc heap-node layouts. Every node now holds a pointer (constructor information, function information, or a redirection pointer), with the tag bits in the low bits freed up by alignment.]
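A hedged sketch of the low-bit tagging trick follows; the tag values, names and FInfo stand-in are illustrative, not Yhc's exact encoding. Because a 4-byte-aligned pointer always has 00 in its two lowest bits, those bits can carry the node tag and simply be masked off before the pointer is used.

/* Hedged sketch of low-bit pointer tagging; tag values are illustrative. */
#include <stdint.h>
#include <stdio.h>

enum { TAG_IND = 0, TAG_CON = 1, TAG_THUNK = 2, TAG_BLACKHOLE = 3 };

static uintptr_t tag_word(const void *p, unsigned tag) {
    return (uintptr_t)p | (uintptr_t)tag;   /* safe because p is 4-byte aligned */
}
static unsigned tag_of(uintptr_t word) { return (unsigned)(word & 0x3u); }
static void    *ptr_of(uintptr_t word) { return (void *)(word & ~(uintptr_t)0x3u); }

/* A stand-in for an FInfo record; a struct holding a pointer is naturally
 * at least 4-byte aligned on the platforms of interest. */
struct FInfo { const char *name; };

int main(void) {
    static struct FInfo sum_info = { "sum" };
    uintptr_t word = tag_word(&sum_info, TAG_THUNK);
    printf("tag=%u  name=%s\n", tag_of(word), ((struct FInfo *)ptr_of(word))->name);
    return 0;
}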
Instruction Sets
● The instruction set for Yhc is much simpler than nhc's.
● Both are based on stack machines.
● However, nhc has instructions for directly manipulating both the heap and the stack, whereas Yhc only directly manipulates the stack.
Instructions

main :: IO ()
main = putStrLn (show 42)

nhc instructions:
main():
    HEAP_CVAL show
    HEAP_INT 42
    PUSH_HEAP
    HEAP_CVAL putStrLn
    HEAP_OFF -3
    RETURN_EVAL

Yhc instructions:
main():
    PUSH_INT 42
    MK_AP show
    MK_AP putStrLn
    RETURN_EVAL
nhc instructions
[Animation over several slides: the nhc sequence above executed step by step, with heap, stack and constants panels. The HEAP_CVAL and HEAP_INT instructions write show, 42 and putStrLn into the heap, PUSH_HEAP pushes a reference onto the stack, and HEAP_OFF -3 refers back into the heap by an explicit offset.]
Yhc instructions
[Animation over several slides: the Yhc sequence above executed step by step, with heap and stack panels. PUSH_INT 42 pushes the integer, then MK_AP show and MK_AP putStrLn each build an application node in the heap from the top of the stack and push a reference to it; there are no explicit heap instructions.]
Comparison
● Yhc uses fewer instructions to do the same thing:
● because it doesn't need explicit movements between the heap and the stack,
● ... and because it can reference other nodes implicitly rather than using explicit heap offsets.
● Yhc instructions are also smaller:
● because it has more 'specializations',
● ... and, again, because heap references are implicit.
● These two factors make Yhc about 20% faster than nhc.
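As a rough illustration of why the Yhc encoding is shorter, here is a toy sketch of the Yhc-style main() sequence; everything below (the node layout, push/pop helpers, mk_ap) is invented for illustration and is not the real Yhc interpreter.

/* Toy sketch: MK_AP both allocates the application node and pushes the
 * reference, so no separate HEAP_* or heap-offset instructions are needed. */
#include <stdio.h>
#include <stdlib.h>

typedef struct Node { const char *fn; struct Node *arg; long intval; } Node;

static Node *stack[16];
static int   sp = 0;
static void  push(Node *n) { stack[sp++] = n; }
static Node *pop(void)     { return stack[--sp]; }

/* PUSH_INT n: allocate an integer node and push a reference to it. */
static void push_int(long v) {
    Node *n = calloc(1, sizeof *n);
    n->intval = v;
    push(n);
}

/* MK_AP f: pop the argument, allocate the application node 'f arg', and
 * push it back -- the heap write is implicit in the instruction. */
static void mk_ap(const char *fn) {
    Node *n = calloc(1, sizeof *n);
    n->fn  = fn;
    n->arg = pop();
    push(n);
}

int main(void) {
    /* main(): PUSH_INT 42; MK_AP show; MK_AP putStrLn; RETURN_EVAL */
    push_int(42);
    mk_ap("show");
    mk_ap("putStrLn");
    printf("built: %s (%s %ld)\n",
           stack[0]->fn, stack[0]->arg->fn, stack[0]->arg->arg->intval);
    return 0;
}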
Improving Portability
Bytecode in nhc
● nhc compiles Haskell functions into bytecode for an abstract machine that manipulates graphs: the G-Machine.
● The bytecode is placed in a C source file as an array of bytes. The C source file is then compiled and linked with the nhc interpreter to form an executable.

unsigned char FN_Prelude_46sum[] = {
  NEEDHEAP_I32, HEAP_CVAL_I3, HEAP_ARG, 1, HEAP_CVAL_I4,
  HEAP_ARG, 1, HEAP_CVAL_I5, HEAP_OFF_N1, 3, HEAP_CADR_N1, 1,
  PUSH_HEAP, HEAP_CVAL_P1, 6, HEAP_OFF_N1, 8, HEAP_OFF_N1, 5,
  RETURN, ENDCODE
};
Portable?
● The C code is portable, isn't it?
● Yes, but:
● It creates a dependency on a C compiler.
● There are issues with the nuances of various C compilers.
● The bytecode can't be loaded dynamically.
Improved Portability
● Yhc also compiles Haskell functions into bytecode instructions for a G-Machine.
● However, Yhc places the bytecode in a separate file which is then loaded by the interpreter at runtime, similar to Java's classfile system.
● This is more portable, but it means Yhc has to do its own linking.
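For flavour, a minimal sketch of loading a bytecode file at runtime; the file name and the idea that a whole module is one flat byte buffer are assumptions for illustration only, not Yhc's actual bytecode format or loader (which also carries the symbol information its runtime linker resolves).

/* Minimal sketch, assuming a hypothetical flat-byte-array module file. */
#include <stdio.h>
#include <stdlib.h>

static unsigned char *load_bytecode(const char *path, long *size_out) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);
    unsigned char *buf = malloc((size_t)size);
    if (buf && fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf) *size_out = size;
    return buf;
}

int main(void) {
    long size = 0;
    unsigned char *code = load_bytecode("Main.bytecode", &size);  /* hypothetical file name */
    if (code) {
        printf("loaded %ld bytes of bytecode\n", size);
        free(code);
    }
    return 0;
}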
More Portable Still?
● Can we extend portability to include portability over a network?
● Then we could take a closure on one machine and have it run on another machine.
● Not implemented yet, but some interesting ideas.
Computer A / Computer B
[Animation over many slides: Computer A holds the unevaluated closure 'calc data' and ships it to Computer B, leaving an indirection behind. When Computer B needs the code for calc it sends back a "Need calc" request, and Computer A replies with calc's bytecode:

calc(x):
    PUSH_ARG x
    PUSH_CONST subcalc
    MK_AP iter
    RETURN_EVAL

Computer B then requests iter, subcalc, and so on as they are demanded, evaluates the closure to 42, and sends the result back; Computer A's indirection now points at 42.]
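To make the exchange in the animation concrete, here is a hedged, in-process sketch of the "Need <function>" idea; nothing below is implemented in Yhc, and the names, bytecode strings and store layout are all invented. A real version would use sockets and the actual bytecode format.

/* Hedged sketch of ship-code-on-demand: Computer A keeps a table of compiled
 * functions, Computer B asks for each one it is missing by name. */
#include <stdio.h>
#include <string.h>

struct Entry { const char *name; const char *bytecode; };

/* Computer A's code store (contents invented for illustration). */
static const struct Entry store[] = {
    { "calc",    "PUSH_ARG x; PUSH_CONST subcalc; MK_AP iter; RETURN_EVAL" },
    { "iter",    "(bytecode for iter)" },
    { "subcalc", "(bytecode for subcalc)" },
};

/* Computer A: answer a "Need <name>" request with that function's bytecode. */
static const char *computer_a_serve(const char *name) {
    for (size_t i = 0; i < sizeof store / sizeof store[0]; i++)
        if (strcmp(store[i].name, name) == 0)
            return store[i].bytecode;
    return NULL;
}

/* Computer B: fetch functions lazily, as evaluation demands them. */
int main(void) {
    const char *needed[] = { "calc", "iter", "subcalc" };
    for (size_t i = 0; i < sizeof needed / sizeof needed[0]; i++)
        printf("Need %s -> %s\n", needed[i], computer_a_serve(needed[i]));
    return 0;
}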
Challenges
● Needs concurrency to be useful.
● Complicates garbage collection.
● Level of granularity versus laziness.
● Possible architecture differences.
Other Things!
● Other people have written various interpreters and backends for Yhc bytecode: Java, Python, .NET.
● ... and various related tools, such as interactive interpreters.
● I'm also using Yhc to do my Hat G-Machine work.
Questions?