Run Your Research: On the Effectiveness of Lightweight Mechanization
C. Klein, J. Clements, C. Dimoulas, C. Eastlund, M. Felleisen, M. Flatt, J. A. McCarthy, J. Rafkind, S. Tobin-Hochstadt, R. B. Findler
The Koala, the Orangutan, and the Walrus

[Slide art: an FTP session (ftp> user anonymous / 331 Guest login ok / Password: / 230-Welcome to λ.com), the C fragment int main () { if (!(q = 0)) *((int*)p)=12; }, the invariant p == 0 ∨ *p == *q, and the LaTeX fragment \[\Gamma\ \vdash\ (\lambda x:\tau_2.e) : \tau_1\rightarrow \tau_2 \]]

One day, Koala decided to build an FTP server and made the unfortunate choice to use the programming language C. We must not be surprised by this choice, however, as C is well known to be a programming language that is effective for building systems software. After a few months of effort, Koala produced a functioning server that was rapidly adopted across the internet and widely used.

One day, Orangutan decided to apply a new, automated testing technique to Koala’s FTP server and, sure enough, found multiple bugs, which is unsurprising for software of that complexity implemented in a programming language like C. After all, C is designed for performance and provides no help to maintain invariants of data structures or to detect errors early, when they are easy to fix.

So, Orangutan decided to write a paper that explained the mathematical techniques it used to uncover the bugs and made the unfortunate choice to use the programming language LaTeX. We must not be surprised by this choice, however, as LaTeX is well known to be a programming language that is effective for typesetting mathematical formulas. After a few months of effort, Orangutan produced a paper extolling the virtues of its new techniques; the ideas were adopted across the software engineering community, and the paper was widely cited.

One day, Walrus decided to apply a new, lightweight mechanized metatheory technique to Orangutan’s paper and, sure enough, found multiple bugs, which is unsurprising for a piece of mathematics of that complexity implemented in a programming language like LaTeX. After all, LaTeX is designed for beautiful output and provides no help to check invariants of mathematical formulas or to run examples to ensure they illustrate the intended points.
Moral: bugs are everywhere
A niche for mechanized metatheory:
• lightweight: high level of expressiveness (think scripting language)
• supports the entire semantics lifecycle: prototype model, robust model, write-up
The Semantics Lifecycle

Stages: prototype model, robust model, write-up

Errors that arise along the way:
• swapped args
• misrenamed non-terminal
• misused the inductive hyp.
• forgot typing rule
• didn’t recheck a lemma
• lost a case in a helper function
• transcribed math wrong
• added a case to wrong fn
• forgot to recheck example
Redex: our tool designed to fill this niche
Our study:
• Can random testing find bugs in an existing, well-tested Redex model? Yes
• Can Redex find bugs in published papers? Yes
• 10 papers in Redex
• 9 ICFP ’09 papers
• 8 written by others
• 2 mechanically verified
• 10 papers with errors
Your papers have errors too
Copy & Paste Typesetting Error:
[Slide figure not recovered]
Typesetting should be automatic
Erroneous Example:
[Slide figure not recovered]
Examples can be tested
Unexpected Behavior:
• Source: select(c, c) is stuck
• Target: the compiled form of select(c, c) loops forever [compiled term garbled in extraction]
Deadlock in source but busy waiting in target
Found this by playing with examples
False Theorem: If a term reduces with a memo store, then the program without the memo store reduces the same way

Counterexample: If σ = {(δ, 1) → 2} then (λ^δ x. x) 1, σ ⇒* 2, σ, but (λ^δ x. x) 1 ↦ 1

Not a fly-by-night proof; 12 typeset pages in a dissertation chapter

Random testing easily finds this
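To make the random-testing point concrete, here is a minimal Python sketch. It uses a hypothetical toy model, not the semantics of the paper under study: a term is a labelled function applied to an integer, and a memo store maps (label, argument) pairs to results. All names (eval_plain, eval_memo, random_term, random_store, find_counterexample) are assumptions made for illustration, and "delta" stands in for the slide's δ.

```python
import random

# Toy model (an assumption for illustration, not the paper's semantics).
# A term is ("app", label, body, arg): the labelled function
# (lambda^label x. body) applied to an integer arg.
# body is either "x" (identity) or ("add1", "x").
LABELS = ["delta", "gamma"]          # "delta" stands for the slide's δ
BODIES = ["x", ("add1", "x")]

def eval_plain(term):
    """Evaluate without a memo store: just run the function body."""
    _tag, _label, body, arg = term
    return arg if body == "x" else arg + 1

def eval_memo(term, store):
    """Evaluate with a memo store: a hit on (label, arg) short-circuits."""
    _tag, label, _body, arg = term
    if (label, arg) in store:
        return store[(label, arg)]
    return eval_plain(term)

def random_term(rng):
    return ("app", rng.choice(LABELS), rng.choice(BODIES), rng.randrange(3))

def random_store(rng):
    # Up to three arbitrary entries; nothing forces them to be consistent
    # with the functions they claim to memoize.
    return {(rng.choice(LABELS), rng.randrange(3)): rng.randrange(4)
            for _ in range(rng.randint(0, 3))}

def find_counterexample(trials=1000, seed=0):
    """Search for a term and store on which the two evaluators disagree."""
    rng = random.Random(seed)
    for _ in range(trials):
        term, store = random_term(rng), random_store(rng)
        if eval_memo(term, store) != eval_plain(term):
            return term, store
    return None
```

Calling find_counterexample() returns a (term, store) pair on which the two evaluators disagree. The slide's own counterexample corresponds to the term ("app", "delta", "x", 1) with the store {("delta", 1): 2}. In this toy the claimed property fails because the generator is free to produce stores that do not reflect the functions they memoize.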
Recap:
• Automatic typesetting
• Unit testing
• Exploring examples
• Random testing
[Slide figure: a typeset Redex model]

Grammar:
p ::= (e ...)
e ::= (e e ...) | (λ (x : t ...) e) | x | (+ e ...) | number | (amb e ...)
t ::= (→ t ... t) | num
P ::= (e ... E e ...)
E ::= (v ... E e ...) | (+ v ... E e ...) | []
v ::= (λ (x : t ...) e) | number
Γ ::= · | (x : t Γ)

Typing rules:
Γ ⊢ e1 : (→ t2 ... t3)    Γ ⊢ e2 : t2 ...
-----------------------------------------
Γ ⊢ (e1 e2 ...) : t3

(x1 : t1 Γ) ⊢ (λ (x2 : t2 ...) e) : (→ t2 ... t)
------------------------------------------------
Γ ⊢ (λ (x1 : t1 x2 : t2 ...) e) : (→ t1 t2 ... t)

Γ ⊢ e : t
--------------------
Γ ⊢ (λ () e) : (→ t)

(x : t Γ) ⊢ x : t

Γ ⊢ x1 : t1    x1 ≠ x2
----------------------
(x2 : t2 Γ) ⊢ x1 : t1

Γ ⊢ e : num ...
-------------------
Γ ⊢ (+ e ...) : num

Γ ⊢ number : num

Γ ⊢ e : num ...
---------------------
Γ ⊢ (amb e ...) : num

Reduction rules:
[βv]   P[((λ (x : t ..._1) e) v ..._1)]  ⟶  P[e{x := v ...}]
[+]    P[(+ number_1 ...)]  ⟶  P[Σ[[number_1, ...]]]
[amb]  (e1 ... E[(amb e2 ...)] e3 ...)  ⟶  (e1 ... E[e2] ... e3 ...)
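The typing rules on this slide can be read as a recursive checker. Below is a minimal Python sketch, assuming a tuple encoding of terms and types that is not part of the slides: the tags "lam" and "app" and the helper names typeof and arrow are hypothetical, and Γ is represented as a dict. Redex expresses such rules directly in its own notation; this sketch only illustrates how the rules compose.

```python
# Assumed encoding (for illustration only):
#   types: "num" or ("->", (arg_type, ...), result_type)
#   terms: int                          number
#          str                          variable
#          ("lam", ((x, t), ...), body) (λ (x : t ...) e)
#          ("+", e, ...)                (+ e ...)
#          ("amb", e, ...)              (amb e ...)
#          ("app", e1, e2, ...)         (e1 e2 ...)
NUM = "num"

def arrow(args, result):
    """Build the function type (→ args ... result)."""
    return ("->", tuple(args), result)

def typeof(env, e):
    """Return the type of e under env, or None if no typing rule applies."""
    if isinstance(e, int):               # Γ ⊢ number : num
        return NUM
    if isinstance(e, str):               # variable lookup, innermost binding wins
        return env.get(e)
    tag, *rest = e
    if tag == "lam":
        params, body = rest
        inner = dict(env)
        for x, t in params:              # bind parameters left to right
            inner[x] = t
        result = typeof(inner, body)
        if result is None:
            return None
        return arrow([t for _, t in params], result)
    if tag in ("+", "amb"):              # every operand must have type num
        return NUM if all(typeof(env, a) == NUM for a in rest) else None
    if tag == "app":
        f, *args = rest
        ft = typeof(env, f)
        if not (isinstance(ft, tuple) and ft[0] == "->"):
            return None
        arg_types = tuple(typeof(env, a) for a in args)
        return ft[2] if arg_types == ft[1] else None
    return None
```

For example, typeof({}, ("app", ("lam", (("x", NUM),), ("+", "x", 1)), 2)) yields "num", while typeof({}, ("+", 1, ("lam", (), 2))) yields None because one operand of + is a function.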