A SYSTEMATIC STUDY OF AUTOMATED PROGRAM REPAIR: FIXING 55 OUT OF 105 BUGS FOR $8 EACH
Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, Westley Weimer
http://genprog.cs.virginia.edu
PROBLEM: BUGGY SOFTWARE
“Everyday, almost 300 bugs appear […] far too many for only the Mozilla programmers to handle.” – Mozilla Developer, 2005
• Annual cost of software errors in the US: $59.5 billion (0.6% of GDP).
• Average time to fix a security-critical error: 28 days.
• 90%: Maintenance. 10%: Everything Else.
Claire Le Goues, ICSE 2012
HOW BAD IS IT?
…REALLY?
Tarsnap:
• 125 spelling/style
• 63 harmless
• 11 minor + 1 major
• 75/200 = 38% TP rate
• $17 + 40 hours per TP
SOLUTION: PAY STRANGERS
SOLUTION: AUTOMATE
AUTOMATED PROGRAM REPAIR
GENPROG: AUTOMATIC¹, SCALABLE, COMPETITIVE BUG REPAIR.
¹ C. Le Goues, T. Nguyen, S. Forrest, and W. Weimer, “GenProg: A generic method for automated software repair,” Transactions on Software Engineering, vol. 38, no. 1, pp. 54–72, 2012.
W. Weimer, T. Nguyen, C. Le Goues, and S. Forrest, “Automatically finding patches using genetic programming,” in International Conference on Software Engineering, 2009, pp. 364–367.
[Figure: GenProg repair loop with stages INPUT, EVALUATE FITNESS, DISCARD, ACCEPT, MUTATE, OUTPUT.]
BIRD’S EYE VIEW
Search: random (GP) search through nearby patches.
Approach: compose small random edits.
• Where to change?
• How to change it?
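The search outlined above can be sketched as a generate-and-validate loop. This is a simplified illustration rather than GenProg's implementation: it keeps a single candidate (a hill climb) where GenProg evolves a population, and `repair`, `mutate`, `TARGET`, and the toy "program" (a tuple of integers standing in for statements) are all hypothetical names chosen for the example.

```python
import random

def repair(program, tests, mutate, max_iters=10_000, seed=0):
    """Mutate, evaluate fitness, then accept or discard, until all tests pass."""
    rng = random.Random(seed)

    def fitness(candidate):
        # Fitness = number of test cases the candidate passes.
        return sum(1 for test in tests if test(candidate))

    best, best_fit = program, fitness(program)
    for _ in range(max_iters):
        if best_fit == len(tests):
            return best                      # OUTPUT: every test passes
        candidate = mutate(best, rng)        # MUTATE: one small random edit
        cand_fit = fitness(candidate)        # EVALUATE FITNESS
        if cand_fit >= best_fit:             # ACCEPT (otherwise DISCARD)
            best, best_fit = candidate, cand_fit
    return best if best_fit == len(tests) else None


# Toy stand-in: a "program" is a tuple of ints and each test checks one slot.
TARGET = (1, 2, 5)
tests = [lambda p, i=i: p[i] == TARGET[i] for i in range(len(TARGET))]

def mutate(program, rng):
    # Overwrite one randomly chosen slot with a random value.
    i = rng.randrange(len(program))
    return program[:i] + (rng.randrange(10),) + program[i + 1:]

fixed = repair((1, 2, 3), tests, mutate)
```

The test suite is the only oracle here: a candidate counts as a repair exactly when it passes every test, which is why the choice of tests matters as much as the search itself.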
Input: [Figure: program shown as a tree of statements numbered 1–12.]
Legend: each statement is marked as high change probability, low change probability, or not changed.
An edit is:
• Replace statement X with statement Y
• Insert statement X after statement Y
• Delete statement X
[Figure: the statement tree as example edits are applied step by step.]
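The two questions, where to change and how to change it, can be illustrated with a small sampler. This is a sketch under stated assumptions: the numeric weights (1.0, 0.1, 0.0) are hypothetical stand-ins for the high, low, and not-changed categories in the legend, and `random_edit` is an illustrative name, not a GenProg function. Source statements are drawn from the same program, matching the "statement X / statement Y" wording of the edit definitions.

```python
import random

# Hypothetical weights for statements 1..12:
# 1.0 = high change probability, 0.1 = low, 0.0 = not changed.
WEIGHTS = {1: 0.0, 2: 0.0, 3: 0.1, 4: 1.0, 5: 0.0, 6: 0.1,
           7: 0.0, 8: 1.0, 9: 0.0, 10: 0.1, 11: 0.0, 12: 0.0}

def random_edit(weights, rng):
    """Return one random edit as a tuple, ordered like the slide's X and Y."""
    ids = sorted(weights)
    # Where to change: sample a target statement in proportion to its weight.
    target = rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]
    # How to change it: pick an operator and, if needed, a source statement.
    op = rng.choice(["replace", "insert", "delete"])
    if op == "delete":
        return ("delete", target)            # Delete statement X
    source = rng.choice(ids)
    if op == "replace":
        return ("replace", target, source)   # Replace X (target) with Y (source)
    return ("insert", source, target)        # Insert X (source) after Y (target)

rng = random.Random(0)
edits = [random_edit(WEIGHTS, rng) for _ in range(5)]
```

Statements with weight 0.0 are never selected as targets, so the search spends its effort only on code the weighting considers suspicious.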
SCALABLE: SEARCH SPACE
Fix localization: intelligently choose code to move.
SCALABLE: REPRESENTATION
Input: 1 2 3 4 5
• Naïve: each variant is a full copy of the program, e.g. 1 2 4 5 or 1 2 5’ 4 5.
• New: each variant is a list of edits to the input, e.g. Delete(3) or Replace(3,5).
New fitness, crossover, and mutation operators to work with a variable-length genome.
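A minimal sketch of the edit-list representation follows, assuming statements are identified by unique integers as in the figure. `apply_patch` and `crossover` are hypothetical helpers; the one-point crossover is just one plausible operator for variable-length genomes and is not claimed to be the one GenProg uses.

```python
import random

def apply_patch(original, patch):
    """Build a variant by applying a list of edits to the original program."""
    # Each original statement owns a slot; edits rewrite slots, never `original`.
    slots = {stmt: [stmt] for stmt in original}
    for edit in patch:
        op = edit[0]
        if op == "delete":                    # Delete(X)
            slots[edit[1]] = []
        elif op == "replace":                 # Replace(X, Y): X becomes a copy of Y
            slots[edit[1]] = [edit[2]]
        elif op == "insert":                  # Insert(X, Y): copy of X after Y
            slots[edit[2]].append(edit[1])
        else:
            raise ValueError(f"unknown edit: {edit!r}")
    return [stmt for orig in original for stmt in slots[orig]]

def crossover(p, q, rng):
    """One-point crossover on two patches of possibly different lengths."""
    i, j = rng.randint(0, len(p)), rng.randint(0, len(q))
    return p[:i] + q[j:], q[:j] + p[i:]

program = [1, 2, 3, 4, 5]
print(apply_patch(program, [("delete", 3)]))        # [1, 2, 4, 5]
print(apply_patch(program, [("replace", 3, 5)]))    # [1, 2, 5, 4, 5]
```

Storing only the edits keeps each variant's size proportional to the number of changes rather than to the size of the program, which is what makes the representation attractive for large code bases.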
SCALABLE: PARALLELISM
Fitness:
• Subsample test cases.
• Evaluate in parallel.
Random runs:
• Multiple simultaneous runs on different seeds.
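The two fitness optimizations can be sketched together. This is an illustration under assumptions, not GenProg's code: `sampled_fitness` is a hypothetical name, tests are modelled as Python callables (real test cases would typically be external processes), and a thread pool stands in for whatever parallel infrastructure is actually used.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sampled_fitness(candidate, tests, sample_size, rng, workers=4):
    """Fraction of a random subsample of tests that the candidate passes.

    Assumes `tests` is a non-empty list and `sample_size` is at least 1.
    """
    # Subsample test cases.
    sample = rng.sample(tests, min(sample_size, len(tests)))
    # Evaluate the sampled tests in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda test: test(candidate), sample))
    return sum(results) / len(sample)

# Toy tests: does the candidate number divide evenly by k?
tests = [lambda x, k=k: x % k == 0 for k in (1, 2, 3, 4)]
rng = random.Random(0)
quick = sampled_fitness(12, tests, sample_size=2, rng=rng)   # 12 passes any sample
full = sampled_fitness(6, tests, sample_size=4, rng=rng)     # 6 fails only k=4
```

A subsample gives only an estimate, so a candidate that scores perfectly on it would still need to be checked against the full test suite before being reported as a repair.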
COMPETITIVE
• How many bugs can GenProg fix?
• How much does it cost?
SETUP
Goal: systematically evaluate GenProg on a general, indicative bug set.
General approach:
• Avoid overfitting: fix the algorithm in advance (no per-bug tuning).
• Systematically create a generalizable benchmark set.
• Try to repair every bug in the benchmark set and establish grounded cost measurements.
CHALLENGE: INDICATIVE BUG SET
SYSTEMATIC BENCHMARK SELECTION
Goal: a large set of important, reproducible bugs in non-trivial programs.
Approach: use historical data to approximate discovery and repair of bugs in the wild.
SYSTEMATIC BENCHMARK SELECTION
Consider top programs from SourceForge, Google Code, Fedora SRPM, etc.:
• Find pairs of viable versions where test case behavior changes.
• Take all tests from the most recent version.
• Go back in time through the source control.
Each such pair corresponds to a human-written repair for the bug tested by the failing test case(s).
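The selection procedure above can be sketched as a backward walk over version history. This is a rough illustration under assumptions: `find_bug_scenarios` and `run_test` are hypothetical names, versions and tests are represented by plain strings, and comparing each viable version only with the next newer one is a simplification that the slides do not spell out.

```python
def find_bug_scenarios(versions, tests, run_test):
    """Find (older, newer, fixed_tests) pairs where test behavior changes.

    versions: viable versions ordered oldest to newest.
    tests:    names of all tests taken from the most recent version.
    run_test: callable (version, test) -> True if the test passes.
    """
    scenarios = []
    newer = versions[-1]
    newer_pass = {t for t in tests if run_test(newer, t)}
    # Go back in time through the history, one viable version at a time.
    for older in reversed(versions[:-1]):
        older_pass = {t for t in tests if run_test(older, t)}
        # Tests that fail on the older version but pass on the newer one:
        # the change between the two is a human-written repair.
        fixed = newer_pass - older_pass
        if fixed:
            scenarios.append((older, newer, sorted(fixed)))
        newer, newer_pass = older, older_pass
    return scenarios

# Toy history: which tests pass at each (hypothetical) version.
PASSING = {"v1": {"a"}, "v2": {"a", "b"}, "v3": {"a", "b", "c"}}
scenarios = find_bug_scenarios(
    ["v1", "v2", "v3"], ["a", "b", "c"],
    run_test=lambda version, test: test in PASSING[version],
)
print(scenarios)   # [('v2', 'v3', ['c']), ('v1', 'v2', ['b'])]
```

In each scenario the older version plays the role of the buggy program, the fixed tests become the failing test cases handed to the repair tool, and the newer version records what a human developer did about the same bug.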