A concrete memory model for CompCert. Frédéric Besson, Sandrine Blazy. Presentation slides.


  1. A concrete memory model for CompCert. Frédéric Besson, Sandrine Blazy, Pierre Wilke. Rennes, France. (Slide footer: P. Wilke, A concrete memory model for CompCert, 1 / 28)

  6. CompCert
     • real-world C to ASM compiler used in industry (commercialised by AbsInt)
     • proven correct in Coq: it does not introduce bugs!

     Compilation chain, with a memory model shared by all its languages: C → Clight → Cminor → RTL → ASM

     Each language has a formal semantics, i.e. a mathematical meaning for programs.

     Proof of semantic preservation: for every source program S that has a defined semantics, if the compiler succeeds in generating a target program T, then T has the same behavior as S.

  7. Goal: Make the semantics of C more defined
     Why did C leave some behaviors undefined?
     • Portability
     • Performance
     Why do we want to make it more defined?
     • real-life programs use features that are undefined according to the C standard
     • the compilation theorem will be more useful
     What kind of undefined behaviors do we aim at?
     • undefined pointer arithmetic, i.e. bitwise operators
     • use of uninitialised memory
     Our starting point: CompCert

  11. An example of low-level C program in CompCert

          int main(){
            int * p = (int *) malloc(sizeof(int));
            *p = 42;
            int * q = p | 5;
            int * r = (q >> 3) << 3;
            return *r;
          }

      Memory diagram: blocks b_p, b, b_q and b_r. b_p holds the pointer (b, 0) and b holds 42; b_q and b_r are shown without contents.

      Bitwise operators on pointers are undefined behavior!
      • CompCert [JAR'09], KCC [POPL'12], Krebbers [POPL'14], Norrish [PhD'98]: undefined behavior
      • Kang et al. [PLDI'15]: don't model bitwise operators
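The slide's program applies `|`, `>>` and `<<` directly to pointers, which an ordinary C compiler rejects. The sketch below is a hedged variant that goes through `uintptr_t` so it can be compiled and run. The function name `tag_and_untag` is hypothetical, and the code assumes that `malloc` returns memory aligned on at least 8 bytes (true on common platforms, but not stated in the slides) and that the pointer/integer round trip behaves as on mainstream implementations.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variant of the slide's main(): same bit manipulation, but
 * done on uintptr_t. Assumes malloc returns 8-byte-aligned memory, so the
 * three low bits of p are zero. Returns -1 if the allocation fails. */
int tag_and_untag(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL) return -1;
    *p = 42;
    uintptr_t q = (uintptr_t)p | 5;      /* set some low bits, e.g. a tag */
    int *r = (int *)((q >> 3) << 3);     /* clear the three low bits again */
    int result = *r;                     /* r designates the same cell as p */
    free(p);
    return result;
}
```

Under those assumptions a call to `tag_and_untag()` reads back the stored 42, which is the behavior the talk wants the formal semantics to account for.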

  12. Contributions
      • Previous work [APLAS'14]: a memory model for low-level programs
      • This work:
        • integration of the memory model inside CompCert
        • correctness proofs of the memory model
        • correctness proofs of the transformations of the frontend (up to Cminor)

  13. Outline
      1 CompCert's memory model
      2 New features of the memory model
      3 Consistency of the memory models
      4 CompCert proof: Overview
      5 Conclusion

  16. New features of the memory model

      Symbolic expressions
      • val ::= i | (b, o) is not expressive enough
      • We change the semantic domain to: expr ::= val | op1 expr | expr op2 expr

      Alignment constraints
      • We need information about some bits of the concrete address of a pointer
      • The alloc primitive takes an extra parameter mask, such that: A(b) & mask = A(b)
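As an illustration of the two features above, here is a minimal C sketch of the grammar `expr ::= val | op1 expr | expr op2 expr` as a tree, together with the alignment condition `A(b) & mask = A(b)`. Every name in it (`expr`, `eval_expr`, `respects_mask`, the operator enumerations) is hypothetical and chosen for this example; the actual development is written in Coq, and the choice of 32-bit unsigned arithmetic is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of  val ::= i | (b, o)  and
 * expr ::= val | op1 expr | expr op2 expr. Not CompCert's definitions. */
typedef enum { E_INT, E_PTR, E_UNOP, E_BINOP } expr_kind;
typedef enum { OP_NEG, OP_NOT } unop_kind;
typedef enum { OP_ADD, OP_OR, OP_AND, OP_SHL, OP_SHR } binop_kind;

typedef struct expr {
    expr_kind kind;
    uint32_t imm;             /* E_INT: the integer i; E_PTR: the offset o */
    unsigned block;           /* E_PTR: the block identifier b */
    int op;                   /* E_UNOP: a unop_kind; E_BINOP: a binop_kind */
    const struct expr *lhs;   /* operand of op1, or left operand of op2 */
    const struct expr *rhs;   /* right operand of op2 */
} expr;

/* Evaluate e once every block b has a concrete address cm[b]. */
uint32_t eval_expr(const expr *e, const uint32_t *cm) {
    switch (e->kind) {
    case E_INT: return e->imm;
    case E_PTR: return cm[e->block] + e->imm;
    case E_UNOP: {
        uint32_t v = eval_expr(e->lhs, cm);
        return e->op == OP_NEG ? (uint32_t)(0u - v) : ~v;
    }
    default: { /* E_BINOP */
        uint32_t a = eval_expr(e->lhs, cm), b = eval_expr(e->rhs, cm);
        switch ((binop_kind)e->op) {
        case OP_ADD: return a + b;
        case OP_OR:  return a | b;
        case OP_AND: return a & b;
        case OP_SHL: return a << (b & 31);
        default:     return a >> (b & 31); /* OP_SHR */
        }
    }
    }
}

/* Alignment constraint from the slide: A(b) & mask = A(b). */
int respects_mask(uint32_t addr, uint32_t mask) {
    return (addr & mask) == addr;
}
```

With a mask of `~7u`, address 8 satisfies the constraint and address 12 does not, which is the kind of information about low address bits that the slide says is needed.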

  17. Interaction with the memory model
      What is the semantics of reading from memory: *p ?
      • In CompCert, p is evaluated into a pointer (b, i), then we can use load(M, b, i)
      • In our model, p is a symbolic expression. It needs to be transformed into a pointer so that we can use load.

      normalise : mem → expr → ⌊val⌋

      We need to modify the semantics to include calls to normalise in:
      • memory accesses (load and store)
      • conditional branches

  22. Back to the example

          int main(){
            int * p = (int *) malloc(sizeof(int));
            *p = 42;
            int * q = p | 5;
            int * r = (q >> 3) << 3;
            return *r;
          }

      Memory diagram with symbolic expressions:
      • b_p holds (b, 0)
      • b (annotated 8) holds 42
      • b_q holds (b, 0) | 5
      • b_r holds (((b, 0) | 5) ≫ 3) ≪ 3, which normalise turns into (b, 0)

  23. Normalisation specification: concrete memories
      [Figure: an abstract memory m, containing the pointer (b2, 2) and the value 5, and six concrete memories cm1 to cm6 of m (cm_i ⊢ m), drawn on an address axis from 0 to 56.]
      Constraints on a concrete memory:
      • range: ]0; 55[
      • no overlap
      • alignment
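The three constraints listed on this slide can be read as a check on a candidate assignment of addresses to blocks. The sketch below is one possible reading, not the definition used in the Coq development: the names `block_info` and `is_concrete_memory` are hypothetical, block sizes are an assumed input, and the range is taken to mean that every byte of every block lies strictly above 0 and below a caller-supplied `limit`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one block of the abstract memory. */
typedef struct {
    uint32_t size;   /* size of the block in bytes (assumed known) */
    uint32_t mask;   /* alignment mask that was passed to alloc */
} block_info;

/* Returns 1 if cm (one address per block) satisfies the three constraints
 * of the slide for the given blocks, and 0 otherwise. */
int is_concrete_memory(const block_info *blocks, const uint32_t *cm,
                       size_t n, uint32_t limit) {
    for (size_t i = 0; i < n; i++) {
        uint32_t lo = cm[i];
        uint32_t hi = cm[i] + blocks[i].size;   /* block occupies [lo, hi) */
        if (lo == 0 || hi < lo || hi > limit) return 0;   /* range */
        if ((lo & blocks[i].mask) != lo) return 0;        /* alignment */
        for (size_t j = 0; j < i; j++) {
            uint32_t lo2 = cm[j];
            uint32_t hi2 = cm[j] + blocks[j].size;
            if (lo < hi2 && lo2 < hi) return 0;           /* no overlap */
        }
    }
    return 1;
}
```

For two 8-byte blocks with mask `~7u` and a limit of 56, the assignment {8, 16} passes, while {8, 8} overlaps, {8, 20} is misaligned and {0, 8} is out of range.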

  24. Normalisation: example 1
      e = (((b, 0) | 5) ≫ 3) ≪ 3

      Value of (b, o) in each concrete memory (address axis from 0 to 56):
      • ⟦(b, o)⟧_cm1 = 8
      • ⟦(b, o)⟧_cm2 = 8
      • ⟦(b, o)⟧_cm3 = 16
      • ⟦(b, o)⟧_cm4 = 24
      • ⟦(b, o)⟧_cm5 = 32
      • ⟦(b, o)⟧_cm6 = 32

      ⟦e⟧_cm1 = (((cm1(b) + 0) | 5) ≫ 3) ≪ 3
              = ((8 | 5) ≫ 3) ≪ 3
              = ((0b1000 | 5) ≫ 3) ≪ 3
              = (0b1101 ≫ 3) ≪ 3
              = 0b0001 ≪ 3
              = 0b1000
              = 8
              = cm1(b)

      ∀ i, ⟦e⟧_cm_i = cm_i(b), hence e normalises into (b, 0)
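The computation on this slide can be replayed mechanically for each listed address of b. The following sketch is only an illustration over a finite list of candidate addresses; the function names `eval_e` and `normalises_to_b0` are hypothetical, and the real specification quantifies over all concrete memories of the abstract memory rather than over a hand-picked array.

```c
#include <stddef.h>
#include <stdint.h>

/* e = (((b, 0) | 5) >> 3) << 3, evaluated when block b sits at addr_b. */
uint32_t eval_e(uint32_t addr_b) {
    return (((addr_b + 0) | 5u) >> 3) << 3;
}

/* Returns 1 if e evaluates to the concrete address of (b, 0) for every
 * candidate address of b, i.e. if (b, 0) is a sound normalisation of e
 * over these candidates; returns 0 otherwise. */
int normalises_to_b0(const uint32_t *addrs_of_b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (eval_e(addrs_of_b[i]) != addrs_of_b[i] + 0) {
            return 0;
        }
    }
    return 1;
}
```

With the addresses from the slide (8, 8, 16, 24, 32, 32) the check succeeds. It fails for an address such as 12, which is why the alignment constraint on b matters: without it, e would not agree with (b, 0) in every concrete memory.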

  25. Normalisation: example 2
      e = (b, 0) > (b′, 0)

      Value of e in each concrete memory (address axis from 0 to 56):
      • cm1: true
      • cm2: true
      • cm3: true
      • cm4: false
      • cm5: false
      • cm6: false

      There is no v such that ∀ i, ⟦e⟧_cm_i = ⟦v⟧_cm_i, hence e doesn't normalise
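The failure to normalise can be sketched in the same style: a comparison between two different blocks gets different truth values depending on where the blocks are placed. In this illustrative code the `layout` type, the function `normalise_gt` and the tri-state return convention are all hypothetical, and the check again runs over a finite list of layouts rather than over all concrete memories.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical concrete layout: addresses chosen for blocks b and b'. */
typedef struct {
    uint32_t b;
    uint32_t b_prime;
} layout;

/* Evaluates (b, 0) > (b', 0) in every layout. Returns 1 or 0 when the
 * comparison has that same truth value everywhere, and -1 when the layouts
 * disagree (or none is given), i.e. when the expression does not normalise. */
int normalise_gt(const layout *cms, size_t n) {
    if (n == 0) return -1;
    int first = cms[0].b > cms[0].b_prime;
    for (size_t i = 1; i < n; i++) {
        if ((cms[i].b > cms[i].b_prime) != first) return -1;
    }
    return first;
}
```

Given one layout with b above b′ and another with b below b′, the function returns -1, mirroring the mix of true and false on the slide.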

  26. CompCert with symbolic expressions
      expr ::= val | op1 expr | expr op2 expr
      [Figure: memory blocks b1, b2 and b3 whose contents include 0, (b2, 2), 5, 7 and the symbolic expression (b, o) | 5. The memory model sits above the chain C → Clight → Cminor → RTL → ASM, and each language is labelled S.]

  28. How does our model compare to CompCert?
      [Figure: two plots of x(t) against t, titled "Behaviors in CompCert" and "Behaviors with symbolic expressions".]
      We are an extension of CompCert

  29. How does our model compare to CompCert?
      Formally,

          Lemma expr_add_ok : ∀ v1 v2 m v,
            sem_add v1 v2 m = ⌊v⌋ →
            ∃ e, sem_add_expr v1 v2 m = ⌊e⌋ ∧ normalise m e = v.

      If the addition of v1 and v2 succeeds in CompCert, then it should succeed in our model as well, and the expression we compute should normalise into the same value.

  30. Discovery of bugs
      2 cases where our model disagrees with CompCert:
      • Bug in CompCert 2.4: pointer comparison to NULL (fixed in CompCert 2.5)
      • Bug in our model: incorrect handling of pointers one past the end
