REPRESENTATIONS AND OPERATORS FOR IMPROVING EVOLUTIONARY SOFTWARE REPAIR

  1. REPRESENTATIONS AND OPERATORS FOR IMPROVING EVOLUTIONARY SOFTWARE REPAIR. Claire Le Goues, Westley Weimer, Stephanie Forrest. http://genprog.cs.virginia.edu

  2. PROBLEM: BUGGY SOFTWARE
     • “Everyday, almost 300 bugs appear […] far too many for only the Mozilla programmers to handle.” (Mozilla Developer, 2005)
     • Annual cost of software errors in the US: $59.5 billion (0.6% of GDP).
     • Average time to fix a security-critical error: 28 days.
     • Pie chart: 90% Maintenance, 10% Everything Else.
     Claire Le Goues, GECCO 2012

  3. APPROACH: EVOLUTIONARY COMPUTATION

  4. Input: source code, specification → Genetic Programming Search → Output: repaired version of the program

  5. SEARCH SPACE

  6. OTHER GP PROBLEMS
     GECCO GP-track best papers:
     • Learning: expression trees or lists.
     • Population: 64 – 2500.
     • Iterations: 50 – 10000.
     • Max variant size: 16 operations, 48 operations, 17 levels, 11 levels.
     Program repair:
     • Learning: patches or repaired programs.
     • Population: 40.
     • Iterations: 10.
     • Max variant size: unbounded.
     • Largest benchmark program: 2.8 million lines of C code.

  7. SEARCH SPACE: EC-based repair starts with a large genome. The starting individual is mostly correct.

  9. OUR GOAL: IN-DEPTH STUDY OF REPRESENTATION AND OPERATORS FOR EVOLUTIONARY PROGRAM REPAIR.

  10. [Diagram: INPUT → EVALUATE FITNESS → DISCARD or ACCEPT → OUTPUT, with a loop through CROSSOVER, MUTATE, SELECT.]

  14. [Figure: diagram with nodes numbered 1–12.]

  15. REPRESENTATION: AST/WP vs. Patch. [Figure: an input program with statements 1–5 and a variant containing a modified statement 5’, shown under both representations.] Edits shown: Delete(3), Insert(2,4), Replace(5,1), Insert(5,4), Insert(3,3), Delete(4), …, Replace(3,5).
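A minimal sketch of what a patch representation might look like, assuming a program is a flat mapping from statement IDs to statement text. The `Edit` class, `apply_patch`, and the exact semantics of insert and replace (insert appends a copy of the source after the target; replace overwrites the target with a copy of the source) are illustrative assumptions, not GenProg's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Edit:
    op: str                    # "delete", "insert", or "replace"
    dst: int                   # ID of the statement being edited
    src: Optional[int] = None  # ID of the statement copied from (insert/replace)


def apply_patch(program: Dict[int, str], patch: List[Edit]) -> List[str]:
    """Apply an edit list to the original program; the original is not modified."""
    # Each original statement ID owns a slot holding the code now at that position.
    slots = {sid: [stmt] for sid, stmt in program.items()}
    for e in patch:
        if e.op == "delete":
            slots[e.dst] = []
        elif e.op == "insert":   # assumed: append a copy of src after dst
            slots[e.dst] = slots[e.dst] + [program[e.src]]
        elif e.op == "replace":  # assumed: dst becomes a copy of src
            slots[e.dst] = [program[e.src]]
        else:
            raise ValueError(f"unknown edit: {e.op}")
    # Flatten the slots in original statement order.
    return [stmt for sid in program for stmt in slots[sid]]


original = {1: "s1", 2: "s2", 3: "s3", 4: "s4", 5: "s5"}
variant = apply_patch(original, [Edit("delete", 3), Edit("replace", 3, 5)])
```

Under these assumptions an individual is only the edit list, so its size tracks the number of changes rather than the size of the program being repaired.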

  17. GENETIC OPERATORS
     Aside: mutation operator selection matters!
     Mutation operators: delete, swap, insert.
     • Manipulate only existing genetic material.
     • Semantic checking improves probability that mutation is viable.
     Crossover:
     • One-point: on the weighted path or edit list.
     • Patch-subset: uniform, on the edit list.
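The operators on this slide can be sketched roughly as follows, assuming an individual is a list of edit tuples. The function names, the tuple encoding, and the coin-flip form of the patch-subset crossover are assumptions made for illustration; semantic checking is not modeled.

```python
import random


def mutate(patch, stmt_ids, rng):
    """Append one random edit; sources come only from existing statements."""
    op = rng.choice(["delete", "insert", "swap"])  # equal probability
    dst = rng.choice(stmt_ids)
    if op == "delete":
        edit = (op, dst)
    else:
        edit = (op, dst, rng.choice(stmt_ids))
    return patch + [edit]  # returns a new list, parent is untouched


def one_point_crossover(p1, p2, rng):
    """Cut each parent's edit list once and exchange the tails."""
    c1 = rng.randint(0, len(p1))
    c2 = rng.randint(0, len(p2))
    return p1[:c1] + p2[c2:], p2[:c2] + p1[c1:]


def patch_subset_crossover(p1, p2, rng):
    """Assumed form: each child keeps every parental edit with probability 0.5."""
    def pick(edits):
        return [e for e in edits if rng.random() < 0.5]
    return pick(p1) + pick(p2), pick(p2) + pick(p1)
```

Because every edit refers to statement IDs drawn from `stmt_ids`, no new code is invented: variants only delete, copy, or exchange statements the program already contains.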

  18. [Figure: input program, statements 1–12, shaded by how likely each is to be faulty. Legend: likely faulty; maybe faulty; not faulty.]

  19. [Figure: the same input, shaded by change probability. Legend: high change probability; low change probability; not changed.]

  20. [Figure: the same input, shaded by change probability and annotated “Default: 10 : 1 ratio”. Legend: high change probability; low change probability; not changed.]

  22. EXPERIMENTAL SETUP
     Benchmarks: 105 bugs in 8 real-world programs [1].
     • 5 million lines of C code, 10,000 test cases.
     • Bugs correspond to human-written repairs for regression test failures.
     Default parameters, for comparison:
     • Patch representation.
     • Mutation operators selected with equal random probability.
     • 1 mutation, 1 crossover/individual/iteration.
     • Population size: 40.
     • 10 iterations or 12 wall-clock hours, whichever comes first.
     • Tournament size: 2.
     [1] Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, and Westley Weimer, “A Systematic Study of Automated Program Repair: Fixing 55 out of 105 bugs for $8 Each.” International Conference on Software Engineering (ICSE), 2012, pp. 3 – 13.
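As a rough illustration of the selection step under the default parameters, the sketch below implements tournament selection with a tournament size of 2 over a population of 40. The function names and the assumption that higher fitness is better are illustrative, not taken from the slides.

```python
import random

POP_SIZE = 40        # default population size from the slide
TOURNAMENT_SIZE = 2  # default tournament size from the slide


def tournament_select(population, fitness, rng, k=TOURNAMENT_SIZE):
    """Draw k distinct individuals at random and keep the fittest (higher is better)."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)


def next_parents(population, fitness, rng):
    """Run one tournament per slot so the population size stays constant."""
    return [tournament_select(population, fitness, rng) for _ in population]
```

With a tournament size of 2 the selection pressure is mild: a strictly worst individual can never win, but any other individual still has a chance of being carried forward.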

  23. EXPERIMENTAL SETUP
     55/105 bugs repaired using default parameters. Some bugs are more difficult to repair than others!
     • Easy: 100% success rate on default parameters.
     • Medium: 50 – 100% success rate on default parameters.
     • Hard: 1 – 50% success rate on default parameters.
     • Unfixed: 0% success rate.
     Metrics:
     • # fitness evaluations to a repair.
     • GP success rate.
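The difficulty classes can be expressed as a small classifier over the per-bug success rate. This is a sketch only: the slide does not say which class a rate of exactly 50% belongs to, so placing it in "medium" is an assumption, as are the function names.

```python
def success_rate(repairs: int, trials: int) -> float:
    """Percentage of GP trials that produced a repair."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return 100.0 * repairs / trials


def difficulty(rate: float) -> str:
    """Map a success rate (in percent) to the slide's difficulty classes."""
    if rate >= 100.0:
        return "easy"
    if rate >= 50.0:   # assumed: exactly 50% counts as medium
        return "medium"
    if rate > 0.0:
        return "hard"
    return "unfixed"
```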

  24. RESEARCH QUESTIONS
     Representation:
     • Which representation choice gives better results?
     • Which representation features contribute most to success?
     Crossover: Which crossover operator is best?
     Operators:
     • Which operators contribute the most to success?
     • How should they be selected?
     Search space: How should the representation weight program statements to best define the search space?

  27. REPRESENTATION: RESULTS
     Procedure: Compare AST/WP to PATCH on original benchmarks with default parameters. For both representations, test effectiveness of:
     1. Crossover.
     2. Semantic check.
     Results:
     1. Patch outperforms AST/WP (14 – 30%).
     2. Semantic check strongly influences success rate of both representations.
     3. Crossover also improves results.

  29. CROSSOVER: RESULTS
     Crossover operator: success rate, fitness evaluations to repair
     • None: 54.4%, 82.43
     • Default/“Uniform”: 61.1%, 163.05
     • One-Point/AST-WP: 63.7%, 114.12
     • One-Point/Patch: 65.2%, 118.20

  31. SEARCH SPACE: SETUP
     Hypothesis: statements executed only by the failing test case(s) should be mutated more often than those also executed by the passing test cases.
     Procedure: examine that ratio in actual repairs.
     Result: expected 10 : 1; actual 1 : 1.85.
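The weighting in the hypothesis can be sketched as follows, assuming test coverage is available as sets of statement IDs. The function names and the choice to give zero weight to statements that no failing test executes are assumptions; the two ratios are the ones quoted on the slide.

```python
import random


def statement_weights(statements, failing_cov, passing_cov, ratio=(10.0, 1.0)):
    """Weight each statement by which tests execute it.

    ratio = (weight if executed only by failing tests,
             weight if also executed by passing tests).
    """
    only_fail, also_pass = ratio
    weights = {}
    for s in statements:
        if s not in failing_cov:
            weights[s] = 0.0        # assumed: never mutated
        elif s in passing_cov:
            weights[s] = also_pass
        else:
            weights[s] = only_fail
    return weights


def pick_statement(weights, rng):
    """Sample a mutation target in proportion to its weight."""
    candidates = [s for s, w in weights.items() if w > 0]
    return rng.choices(candidates, weights=[weights[s] for s in candidates], k=1)[0]


stmts = range(1, 13)
default = statement_weights(stmts, failing_cov={3, 4, 5, 6}, passing_cov={1, 2, 3, 4})
observed = statement_weights(stmts, {3, 4, 5, 6}, {1, 2, 3, 4}, ratio=(1.0, 1.85))
```

Changing only the `ratio` argument moves the search between the expected 10 : 1 weighting and the 1 : 1.85 ratio observed in actual repairs, without touching the rest of the search.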

  32. SEARCH SPACE: REPAIR TIME [Bar chart: # fitness evaluations to repair (y-axis, 0 to 110) by search difficulty (Easy, Medium, Hard, All) for the Default, Realistic, and Equal weightings. Bar labels as extracted, assignment to bars not recoverable: 103.1, 93.6, 75.7, 67.0, 66.0, 59.5, 57.7, 58.6, 49.1, 36.3, 34.3, 27.1.]
