Run Your Research: On the Effectiveness of Lightweight Mechanization


  1. Run Your Research: On the Effectiveness of Lightweight Mechanization. C Klein, J Clements, C Dimoulas, C Eastlund, M Felleisen, M Flatt, J A McCarthy, J Rafkind, S Tobin-Hochstadt, R B Findler

  2. The Koala, the Orangutan, and the Walrus: One day, Koala decided to build an ftp server [On slide: an FTP session reading "ftp> user anonymous", "331 Guest login ok", "Password:", "230-Welcome to λ.com"]

  3. The Koala, the Orangutan, and the Walrus: and made the unfortunate choice to use the programming language C. [On slide: int main () { if (!(q = 0)) *((int*)p)=12; }]

  4. The Koala, the Orangutan, and the Walrus: We must not be surprised by this choice, however, as C is well-known to be a programming language that is effective for building systems software.

  5. The Koala, the Orangutan, and the Walrus: After a few months of effort, Koala produced a functioning server that was rapidly adopted across the internet and widely used.

  6. The Koala, the Orangutan, and the Walrus: One day, Orangutan decided to apply a new, automated testing technique to Koala’s ftp server and, sure enough, found multiple bugs,

  7. The Koala, the Orangutan, and the Walrus: unsurprising for software of that complexity implemented in a programming language like C. After all, C is designed for performance and provides no help to maintain invariants of data structures or to detect errors early, when they are easy to fix. [On slide: p == 0 ∨ *p == *q]

  8. The Koala, the Orangutan, and the Walrus: So, Orangutan decided to write a paper that explained the mathematical techniques it used to uncover the bugs and made the unfortunate choice to use the programming language LaTeX. [On slide: \[\Gamma\ \vdash\ (\lambda x:\tau_2.e) : \tau_1\rightarrow \tau_2 \]]

  9. The Koala, the Orangutan, and the Walrus: We must not be surprised by this choice, however, as LaTeX is well-known to be a programming language that is effective for typesetting mathematical formulas.

  10. The Koala, the Orangutan, and the Walrus: After a few months of effort, Orangutan produced a paper extolling the virtues of its new techniques; the ideas were adopted across the software engineering community and the paper was widely cited.

  11. The Koala, the Orangutan, and the Walrus: One day, Walrus decided to apply a new, lightweight mechanized metatheory technique to Orangutan’s paper and, sure enough, found multiple bugs,

  12. The Koala, the Orangutan, and the Walrus: unsurprising for a piece of mathematics of that complexity implemented in a programming language like LaTeX. After all, LaTeX is designed for beautiful output and provides no help to check invariants of mathematical formulas or to run examples to ensure they illustrate the intended points.

  13. Moral: bugs are everywhere

  14. A niche for mechanized metatheory: • lightweight: high level of expressiveness (think scripting language) • supports the entire semantics lifecycle: Prototype model, Robust model, Write-up

  15. The Semantics Lifecycle: Prototype model, Robust model, Write-up

  16. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: misrenamed non-terminal

  17. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: misrenamed non-terminal; forgot typing rule

  18. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: misrenamed non-terminal; forgot typing rule; lost a case in a helper function

  19. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: misrenamed non-terminal; forgot typing rule; lost a case in a helper function; added a case to wrong fn

  20. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: swapped args; misrenamed non-terminal; forgot typing rule; lost a case in a helper function; added a case to wrong fn

  21. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: swapped args; misrenamed non-terminal; misused the inductive hyp.; forgot typing rule; lost a case in a helper function; added a case to wrong fn

  22. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: swapped args; misrenamed non-terminal; misused the inductive hyp.; forgot typing rule; didn’t recheck a lemma; lost a case in a helper function; added a case to wrong fn

  23. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: swapped args; misrenamed non-terminal; misused the inductive hyp.; forgot typing rule; didn’t recheck a lemma; lost a case in a helper function; transcribed math wrong; added a case to wrong fn

  24. The Semantics Lifecycle: Prototype model, Robust model, Write-up. Errors along the way: swapped args; misrenamed non-terminal; misused the inductive hyp.; forgot typing rule; didn’t recheck a lemma; lost a case in a helper function; transcribed math wrong; added a case to wrong fn; forgot to recheck example

  25. Redex: our tool, designed to fill this niche

  26. Our study: • Can random testing find bugs in an existing, well-tested Redex model? • Can Redex find bugs in published papers?

  27. Our study: • Can random testing find bugs in an existing, well-tested Redex model? Yes • Can Redex find bugs in published papers? Yes

  28. 10 papers in Redex: 9 ICFP ’09 papers, 8 written by others, 2 mechanically verified

  29. 10 papers with errors. 10 papers in Redex: 9 ICFP ’09 papers, 8 written by others, 2 mechanically verified

  30. Your papers have errors too

  31. Copy & Paste Typesetting Error:

  32. Copy & Paste Typesetting Error:

  33. Copy & Paste Typesetting Error: Typesetting should be automatic

  34. Erroneous Example:

  35. Erroneous Example:

  36. Erroneous Example:

  37. Erroneous Example: Examples can be tested

  38. Unexpected Behavior: select(c, c)

  39. Unexpected Behavior: select(c, c) compile ~ ⊙ c | select(c, c)

  40. Unexpected Behavior: select(c, c) – stuck; compile ~ ⊙ c | select(c, c) – loops forever. Deadlock in source but busy waiting in target

  41. Unexpected Behavior: select(c, c) – stuck; compile ~ ⊙ c | select(c, c) – loops forever. Deadlock in source but busy waiting in target. Found this by playing with examples

  42. False Theorem: If a term reduces with a memo store, then the program without the memo store reduces the same way

  43. False Theorem: If a term reduces with a memo store, then the program without the memo store reduces the same way. Counterexample: If σ = {(δ, 1) → 2} then (λ^δ x. x) 1, σ ⇒* 2, σ, but (λ^δ x. x) 1 ↦ 1. Not a fly-by-night proof; 12 typeset pages in a dissertation chapter

  44. False Theorem: If a term reduces with a memo store, then the program without the memo store reduces the same way. Counterexample: If σ = {(δ, 1) → 2} then (λ^δ x. x) 1, σ ⇒* 2, σ, but (λ^δ x. x) 1 ↦ 1. Not a fly-by-night proof; 12 typeset pages in a dissertation chapter. Random testing easily finds this
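The false theorem on slides 42 to 44 can be illustrated with a small random-testing sketch. This is not Redex and not the model from the paper under study: it is a hypothetical Python stand-in that covers only the labeled identity function (λ^δ x. x) from the counterexample, and the names run_plain, run_memo, random_store, and find_counterexample are invented for illustration. It assumes, as the counterexample suggests, that the memo store may hold arbitrary entries.

```python
import random

LABEL = "δ"  # the single function label used in this sketch (assumption)


def run_plain(arg):
    """(λ^δ x. x) arg without a memo store: the identity returns its argument."""
    return arg


def run_memo(arg, store):
    """The same call with a memo store: a hit on (label, arg) replaces the call."""
    return store.get((LABEL, arg), arg)


def random_store(rng):
    """Build an arbitrary memo store with up to three small entries."""
    size = rng.randint(0, 3)
    return {(LABEL, rng.randint(0, 3)): rng.randint(0, 3) for _ in range(size)}


def find_counterexample(trials=1000, seed=0):
    """Search for an (arg, store) pair on which the two evaluators disagree."""
    rng = random.Random(seed)
    for _ in range(trials):
        arg = rng.randint(0, 3)
        store = random_store(rng)
        if run_memo(arg, store) != run_plain(arg):
            return arg, store
    return None


if __name__ == "__main__":
    # The slide's counterexample: σ = {(δ, 1) → 2}
    print(run_memo(1, {(LABEL, 1): 2}), run_plain(1))
    print(find_counterexample())
```

The property being tested is the theorem itself: evaluation with a store should agree with evaluation without one. Because the generator produces stores that no real run could have built, a disagreeing pair turns up quickly, which is the shape of the bug the slide describes.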

  45. Recap: • Automatic typesetting • Unit Testing • Exploring Examples • Random testing

  46. Grammar:

        p ::= (e ...)
        e ::= (e e ...) | (λ (x : t ...) e) | x | (+ e ...) | number | (amb e ...)
        t ::= (→ t ... t) | num
        P ::= (e ... E e ...)
        E ::= (v ... E e ...) | (+ v ... E e ...) | []
        v ::= (λ (x : t ...) e) | number
        Γ ::= · | (x : t Γ)

      Typing rules:

        Γ ⊢ e_1 : (→ t_2 ... t_3)    Γ ⊢ e_2 : t_2 ...
        -----------------------------------------------
        Γ ⊢ (e_1 e_2 ...) : t_3

        (x_1 : t_1 Γ) ⊢ (λ (x_2 : t_2 ...) e) : (→ t_2 ... t)
        -------------------------------------------------------
        Γ ⊢ (λ (x_1 : t_1 x_2 : t_2 ...) e) : (→ t_1 t_2 ... t)

        Γ ⊢ e : t
        ----------------------
        Γ ⊢ (λ () e) : (→ t)

        (x : t Γ) ⊢ x : t

        Γ ⊢ x_1 : t_1    x_1 ≠ x_2
        ---------------------------
        (x_2 : t_2 Γ) ⊢ x_1 : t_1

        Γ ⊢ e : num ...
        --------------------
        Γ ⊢ (+ e ...) : num

        Γ ⊢ number : num

        Γ ⊢ e : num ...
        ----------------------
        Γ ⊢ (amb e ...) : num

      Reduction rules:

        P[((λ (x : t ..._1) e) v ..._1)] ⟶ P[e{x := v ...}]   [βv]
        P[(+ number_1 ...)] ⟶ P[Σ[[number_1, ...]]]   [+]
        (e_1 ... E[(amb e_2 ...)] e_3 ...) ⟶ (e_1 ... E[e_2] ... e_3 ...)   [amb]
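The typing rules on the final slide can be read as a recursive checker. The following is a minimal sketch in Python rather than Redex, under stated assumptions: the tuple encoding of terms and types, the explicit "app" tag for application, and the names IllTyped, type_of, and well_typed are all invented here for illustration and do not come from the talk. The environment Γ is modeled as a dictionary, so binding parameters left to right gives the same shadowing as the slide's rule that pushes x_1 first and looks up the most recent binding.

```python
# Types: "num" or ("->", (arg_type, ...), result_type)
# Terms: int | variable name (str) | ("lam", ((x, t), ...), body)
#        | ("+", e, ...) | ("amb", e, ...) | ("app", f, arg, ...)


class IllTyped(Exception):
    """Raised when no typing rule applies to a term."""


def type_of(env, term):
    """Return the type of term under environment env, or raise IllTyped."""
    if isinstance(term, int):  # Γ ⊢ number : num
        return "num"
    if isinstance(term, str):  # variable lookup in Γ
        if term not in env:
            raise IllTyped(f"unbound variable {term}")
        return env[term]
    tag = term[0]
    if tag == "lam":  # bind every parameter, then type the body
        _, params, body = term
        inner = dict(env)
        for name, ty in params:
            inner[name] = ty
        return ("->", tuple(ty for _, ty in params), type_of(inner, body))
    if tag in ("+", "amb"):  # every operand must be a num
        for operand in term[1:]:
            if type_of(env, operand) != "num":
                raise IllTyped(f"{tag} expects num operands")
        return "num"
    if tag == "app":  # argument types must match the arrow's domain
        fn_type = type_of(env, term[1])
        arg_types = tuple(type_of(env, arg) for arg in term[2:])
        if not (isinstance(fn_type, tuple) and fn_type[0] == "->"):
            raise IllTyped("applying a non-function")
        if fn_type[1] != arg_types:
            raise IllTyped("argument types do not match")
        return fn_type[2]
    raise IllTyped(f"unknown term {term!r}")


def well_typed(term):
    """True if term has a type in the empty environment."""
    try:
        type_of({}, term)
        return True
    except IllTyped:
        return False


inc = ("lam", (("x", "num"),), ("+", "x", 1))
program = ("app", inc, ("amb", 1, 2))

if __name__ == "__main__":
    print(type_of({}, inc))
    print(type_of({}, program))
    print(well_typed(("app", 1, 2)))
```

A checker like this is the kind of executable artifact the recap slide argues for: once the rules run, each example in a write-up can be checked mechanically instead of by eye.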
