Ergonomics and verification of a foreign function interface between Coq and C

  1. Ergonomics and verification of a foreign function interface between Coq and C. Joomy Korkut, Princeton University. General Exam Talk, May 14th, 2020.

     Hello everyone! Today I'm gonna talk about the foreign function interface, or FFI, as it is often called, between Coq and C, that I developed for the CertiCoq compiler. I'll explain the mechanisms and design decisions behind the interface from an ergonomics and ease-of-verification point of view. For those who are not familiar, "foreign function interface" means allowing one language to call a function from another and vice versa. Here we want to do that for Coq and C. We will achieve that with the "glue code" that we generate: extra C code, generated by the compiler separately from the compiled program, that helps us interact with Coq programs in C.

  2. [Diagram of the languages (Coq, C, Clight, ASM) and the tools between them (CertiCoq, Clightgen, PrintClight, CompCert). Dashed = not verified, solid = verified. One region is marked "what concerns this talk".]

     To get familiar with the landscape of the problems we're trying to solve, let's first look at the languages we are dealing with and the compilers we use.

     CertiCoq is a verified compiler written in Coq; it compiles Coq to Clight. Clight is a simpler subset of C: it doesn't have typedefs, it doesn't have enum types, it allows function calls only as statements, and so on. Clight does not have a concrete syntax; it only exists as syntax trees.

     But it is possible to print Clight syntax trees in C concrete syntax, which is what CertiCoq does. However, this printing is not verified, which we will overlook for now.

     This subset of C comes from the CompCert compiler, a verified compiler written in Coq that compiles C to different assembly languages. CompCert uses Clight as one of the first intermediate languages in its pipeline.

     That being said, we will not deal with those parts today. We will look at the interaction between Coq and C, and use Clight as a step in between. Not only that, I will try to hide the noise that is inherent to Clight in the code excerpts I will show you today.
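To make the "function calls only as statements" restriction concrete, here is a small sketch in ordinary C. The helper functions `square` and `succ` and the temporaries `t1` and `t2` are invented for this illustration; the second function only mimics the shape Clight imposes (every call is its own statement, and its result goes into a temporary) and is not actual CertiCoq or CompCert output.

```c
/* Two small helper functions, invented for this illustration. */
int square(int x) { return x * x; }
int succ(int x) { return x + 1; }

/* Ordinary C: calls may be nested inside a larger expression. */
int nested(int x) {
    return succ(square(x)) + 1;
}

/* Clight-like shape: each call is a separate statement that stores
   its result in a temporary, and expressions contain no calls. */
int flattened(int x) {
    int t1, t2;
    t1 = square(x);
    t2 = succ(t1);
    return t2 + 1;
}
```

Both functions compute the same result; the second form is simply more verbose, which is part of the "noise" that code printed from Clight tends to carry.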

  3. Definition b := andb true false. Check b. Compute b.

     So how does a user compile their Coq program with CertiCoq? A Coq programming session looks like this: we have a list of commands that we step through one by one.

  4. Definition b := andb true false. Check b. Compute b. [Annotation: "stepping through". Response pane: "b is defined".]

     We can create new definitions and functions with these commands. We get a response from the Coq environment in the bottom-right window pane.

  5. Definition b := andb true false. Check b. Compute b. [Response pane: "b : bool".]

     We can ask for the types of expressions and definitions, and get the result in the bottom-right pane.

  6. Definition b := andb true false. Check b. Compute b. [Response pane: "= false : bool". Annotation: "vernacular commands".]

     We can evaluate expressions and see their results. Through what we call the "vernacular commands", we turn a Coq programming session into a conversation between the programmer and the environment.

  7. Definition b := andb true false. Check b. Compute b. CertiCoq Compile b. [Annotation: "compiling by stepping through".]

     Compiling a Coq definition works just the same way. Unlike with other compilers, you don't compile the full Coq file; you compile a single definition at a time. This definition can be a simple expression, a big function, or a tuple of multiple functions. That is up to you. We step through...

  8. Definition b := andb true false. Check b. Compute b. CertiCoq Compile b. [Response pane: "Printed to file." Annotation: "now we have a C file named after b".]

     And this creates a new C file in the same directory.

  9. The C code the user writes:

         int main()
         {
           struct thread_info* tinfo = make_tinfo();
           value result = body(tinfo);
           print_bool(result);
           return 0;
         }

     [Slide annotations: struct thread_info is "where CertiCoq keeps the state of the Coq runtime"; the calls in the first two lines are "calling functions from compiled C code"; print_bool is "calling function from generated glue code"; value is "the C type representing all Coq values (32/64-bit unsigned int)". Summary: the user writes C code to call and interact with the generated C code.]

     Then what does the user do with that C file? They cannot just compile and run it, because it doesn't have a "main" function. But it can be used as a library. So if the user wants to run it and use the result, they have to write some C code to do that.

     The first two lines in the main function here set up the Coq runtime and run the initial expression.

     Specifically, we set up the part of memory that is used to allocate Coq values and to return Coq values. We do that in this C struct type called thread_info. Any function that needs to allocate new memory on CertiCoq's heap will have to deal with the thread_info.

     The type of all CertiCoq expressions is "value", which is an alias for a 32- or 64-bit unsigned integer. In reality, this can be a pointer or an actual integer, both of which fit in this size.

     Once we have the C representation of the result of the program, we can print it using a print function in the generated glue code. So this is the simplest example of a C file that uses a glue code function. Users of CertiCoq currently have to write a C file like this to do something meaningful with the compiler output. We will see more complex examples of this later in the talk.
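The notes say that a "value" is a machine-word unsigned integer that may hold either a pointer or an actual integer. One common way to tell the two apart, used by OCaml-style runtimes, is to reserve the lowest bit of the word. The sketch below assumes that encoding; the talk itself does not spell out the layout, and the helper names `is_ptr`, `make_unboxed` and `unboxed_payload` are hypothetical names chosen for this illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* A machine-word unsigned integer, as described in the talk. */
typedef uintptr_t value;

/* ASSUMPTION: word-aligned pointers have a 0 in the lowest bit, so an
   odd word is treated as an immediate (unboxed) integer. */
bool is_ptr(value v) { return (v & 1u) == 0; }

/* Hypothetical helper: pack a small integer n into an odd word. */
value make_unboxed(uintptr_t n) { return (n << 1) | 1u; }

/* Hypothetical helper: recover the integer from an unboxed value. */
uintptr_t unboxed_payload(value v) { return v >> 1; }
```

Under this assumption, a function such as print_bool could first check whether its argument is an immediate and only then look at the payload.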

  10. [Diagram: the CertiCoq pipeline (without recent changes), with phases labeled L1, L2, L3, L4, L6 and L7 and representations labeled MetaCoq, λ☐, η-long λ☐, globally nameless, CPS and Clight, plus a separate "glue code + proof generation (future work)" path. Dashed = not verified, solid = verified.]

      The CertiCoq compiler consists of many different phases, each defined in Coq. It starts from the MetaCoq description of the term being compiled, that is, a syntax tree of a Coq program in Coq itself, also often called a reified program. After many phases, it eventually generates a Clight syntax tree. However, most information about inductive types is erased fairly early in this pipeline.

      For that reason, I worked on a glue code generator, which is an extra code generator that takes the MetaCoq description of a term and generates helper functions in C for the types involved.

      One direction we always keep in mind when we do that is the verification of these functions. We are going to use the Verified Software Toolchain to generate specifications and proofs for these helper functions. This part isn't finished yet; we are still exploring the right way to do that. I'll talk about some ideas on this towards the end of this talk.

  11. Example Coq function for the big picture:

          Fixpoint simplify (r : rgx) : rgx :=
            match r with
            | star epsilon => epsilon
            | star (star r') => star (simplify r')
            | or epsilon (star r') => star (simplify r')
            | or empty r' => simplify r'
            | and r1 r2 => and (simplify r1) (simplify r2)
            | or r1 r2 => or (simplify r1) (simplify r2)
            | star r' => star (simplify r')
            | _ => r
            end.

      [Slide annotations: (1) inspecting a term's constructor; (2) inspecting a constructor's arguments; (3) creating new constructor values; (4) calling existing functions; (5) defining new functions. Question: How can we do the same things with Coq values on the C side?]

      Here's some example Coq code that simplifies a regular expression into an equivalent regular expression. We'll look at this code and identify what kinds of expressive capabilities we use when we define Coq programs, to get the big picture. Ideally, whatever we can do in Coq with values of this data type, we should be able to do in C; we want the same expressiveness in C. The glue code that we generate should accommodate the same kinds of actions, but in C, using the C representations of Coq values. And what *can* we do in Coq? Let's look at this example Coq function and identify the different parts.

      We can inspect what constructor a value is created with. We can inspect the arguments carried by those constructors. We can create new values using constructors. We can call existing functions by passing them arguments. We can define new functions.

      So now we'll look at these 5 expressive capabilities... and see how we can do the same things with Coq values, ... but on the C side.
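As a rough picture of what the first three capabilities could look like on the C side, here is a minimal sketch for the rgx type. Everything in it is an assumption made for illustration: the tag numbering, the memory layout (constant constructors as odd immediates, other constructors as pointers to their fields with a header word just before them), and all function and struct names. It also has the caller supply the memory, whereas the talk says real allocation goes through thread_info. It is not the glue code CertiCoq actually emits.

```c
#include <stdint.h>

typedef uintptr_t value;

/* Hypothetical tags for the constructors that appear in simplify. */
enum rgx_tag { RGX_EMPTY, RGX_EPSILON, RGX_STAR, RGX_OR, RGX_AND };

/* ASSUMED encoding: constructors without arguments are odd immediates. */
value make_rgx_empty(void)   { return ((value)0 << 1) | 1u; }
value make_rgx_epsilon(void) { return ((value)1 << 1) | 1u; }

/* (3) Constructor functions: the caller supplies the memory (2 words
   for star, 3 for or). mem[0] is a header holding the tag, and the
   returned value points at the first field. */
value make_rgx_star(value r, value *mem) {
    mem[0] = RGX_STAR;
    mem[1] = r;
    return (value)(mem + 1);
}

value make_rgx_or(value r1, value r2, value *mem) {
    mem[0] = RGX_OR;
    mem[1] = r1;
    mem[2] = r2;
    return (value)(mem + 1);
}

/* (1) Tag getter: which constructor was this value built with? */
enum rgx_tag get_rgx_tag(value v) {
    if (v & 1u)
        return (v >> 1) == 0 ? RGX_EMPTY : RGX_EPSILON;
    return (enum rgx_tag)((value *)v)[-1];
}

/* (2) Argument struct: a typed view of the two fields of an "or" node. */
struct rgx_or_args { value r1; value r2; };

struct rgx_or_args *get_rgx_or_args(value v) {
    return (struct rgx_or_args *)v;
}
```

With helpers of this shape, a C function could mirror a match case such as `or empty r'` by checking the tag of the value, then the tag of its first argument, and finally reading the second argument from the struct.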

  12. How can we do the same things with Coq values on the C side?

          Coq                                          C
          (1) inspecting a term's constructor          constructor tag getter
          (2) inspecting a constructor's arguments     constructor argument structs
          (3) creating new constructor values          constructor functions
          (4) calling existing functions               closure caller
          (5) defining new functions                   closure currier

      We will generate 5 basic kinds of glue code, each corresponding to an expressive capability of Coq.

      The first one is constructor tag getter functions, which tell us which constructor a value was built with. The second one is a way to get the collection of arguments a constructor has taken, ideally in a type that shows how many there are. To create new values, we'll generate a few kinds of functions that put them in different areas of memory. We also have to be mindful of how the garbage collector deals with this. The other kind of value in CertiCoq is the closure. We need to be able to call existing closures... and create new closures that fit a given Coq type.
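For the fourth kind of glue code, the closure caller, a sketch of the general idea follows. It assumes a closure is a pair of a code pointer and a captured environment; the struct layout and the names `call_closure` and `add_env` are hypothetical, and the sketch leaves out the thread_info argument and any calling-convention details that the real generated code would need.

```c
#include <stdint.h>

typedef uintptr_t value;

/* ASSUMED layout: a closure pairs a code pointer with its environment. */
struct closure {
    value (*code)(value env, value arg);
    value env;
};

/* Hypothetical closure caller: apply the closure stored behind clo
   to a single argument. */
value call_closure(value clo, value arg) {
    const struct closure *c = (const struct closure *)clo;
    return c->code(c->env, arg);
}

/* Example code pointer, invented for this sketch: adds the captured
   environment word to the argument. */
value add_env(value env, value arg) {
    return env + arg;
}
```

A caller would build a `struct closure` such as `{ add_env, 3 }`, pass its address as a value, and get back the argument plus 3. A closure currier would go the other way, packaging a C function and its environment so that Coq code can call it.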
