
AUTOMATIC PROGRAM REPAIR USING GENETIC PROGRAMMING
Slides by Claire Le Goues (mostly), with some by Mahsa Varshosaz & Andrzej Wasowski
Westley Weimer, ThanhVu Nguyen, Claire Le Goues, and Stephanie Forrest. 2009. Automatically finding patches using genetic programming. In ICSE '09. IEEE Computer

PROBLEM: BUGGY SOFTWARE
• “Everyday, almost 300 bugs appear […] far too many for only the Mozilla programmers to handle.” (Mozilla Developer, 2005)
• Annual cost of software errors in the US: $59.5 billion (0.6% of GDP).
• Average time to fix a security-critical error: 28 days.
• 90%: Maintenance; 10%: Everything Else
http://www.clairelegoues.com

HOW DO HUMANS FIX NEW BUGS?

Mike (developer)

??! (Mike’s project)

printf transformer

Input: [figure: program with statements numbered 1-12]

Input: [figure: the same statements 1-12, shaded by fault likelihood. Legend: Likely faulty, Maybe faulty, Not faulty]

SECRET SAUCES
• Test cases are a scalable source of information about program behavior
• Use test cases to evaluate candidate repairs
• Existing program code contains the seeds of many repairs
• Better to reuse existing developer expertise than to invent new code

APPROACH
Given a program and a set of test cases, conduct a biased, random search for a set of edits to the program that fixes a given bug.

GENETIC PROGRAMMING: the application of evolutionary or genetic algorithms to program source code.

[figure: search loop with the stages INPUT, EVALUATE FITNESS, DISCARD, ACCEPT, MUTATE, OUTPUT]

GENETIC SEARCH
Fig. courtesy of Hossam Faris. https://www.researchgate.net/figure/Flow-chart-of-the-genetic-programming-approach_fig2_253458069

INDIVIDUAL CANDIDATES (INITIAL POPULATION)
An individual is a candidate patch or set of changes to the input program.
A patch is a series of statement-level edits:
• delete X
• replace X with Y
• insert Y after X
Replace/insert: pick Y from somewhere else in the program.
Reduces search space by at least 2 to 10x.
We are not touching the tests.

MUTATION: HOW
To mutate an individual, we add a new random edit to a given patch.
• (Or we create a new individual by generating a couple of random edits to make a new patch.)
• We are not touching the tests.
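The patch representation and the mutation step described above can be sketched in C as follows. This is a minimal illustration, not GenProg's actual data structures: the names Edit, Patch, MAX_EDITS, and patch_mutate are hypothetical, statements are identified by integer index, and the target is drawn uniformly at random, whereas the real search biases the choice with fault localization.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; an illustration of "a patch is a
   series of statement-level edits", not GenProg's real code. */
typedef enum { EDIT_DELETE, EDIT_REPLACE, EDIT_INSERT_AFTER } EditKind;

typedef struct {
    EditKind kind;
    int target;  /* X: index of the statement being edited */
    int source;  /* Y: index of a statement copied from elsewhere in
                    the program; -1 (unused) for delete */
} Edit;

#define MAX_EDITS 32  /* assumed cap, chosen only for this sketch */

typedef struct {
    Edit edits[MAX_EDITS];
    int count;
} Patch;

/* Mutation: append one random edit to the patch.
   Returns 1 on success, 0 if the patch is already full. */
int patch_mutate(Patch *p, int num_statements) {
    assert(num_statements > 0);
    if (p->count >= MAX_EDITS) {
        return 0;
    }
    Edit e;
    e.kind = (EditKind)(rand() % 3);
    e.target = rand() % num_statements;   /* uniform here; assumption */
    e.source = (e.kind == EDIT_DELETE) ? -1 : rand() % num_statements;
    p->edits[p->count++] = e;
    return 1;
}
```

Note that the test suite appears nowhere in the structure: a patch only names statements of the program under repair, which matches "we are not touching the tests".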

SEARCH SPACE: FAULT LOCALIZATION
Hypothesis: statements executed only by the failing test case(s) should be weighted more heavily than those also executed by the passing test cases.

FAULT LOCALIZATION
• Instrument the program to record lines visited during tests
• The positive test case gcd(1071,1029) visits lines 2-3 and 6-13
• The negative test case gcd(0,55) visits lines 2-5, 6-7, and 9-10
When selecting portions of the program to modify, we favor those that:
• Were visited during the negative test case
• Were not also visited during the positive one
In this example, repairs are focused on lines 4-5.
This particular fault localization heuristic (custom for this paper) turned out not to be very good in the long run. We return to this later.
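The weighting rule above can be written down as a small C function. This is a sketch under stated assumptions: the function name localize_faults and the parameter shared_weight (the reduced weight for lines visited by both kinds of test) are hypothetical, and the slide does not give a numeric value for that weight.

```c
#include <assert.h>
#include <stddef.h>

/* neg_cov[i] / pos_cov[i] are nonzero if line i was visited by a
   negative / positive test. weights[i] receives the mutation weight:
     visited only by the negative test -> 1.0
     visited by both                   -> shared_weight (assumed knob)
     not visited by the negative test  -> 0.0 (never modified) */
void localize_faults(const int *neg_cov, const int *pos_cov, size_t n,
                     double shared_weight, double *weights) {
    assert(shared_weight >= 0.0 && shared_weight <= 1.0);
    for (size_t i = 0; i < n; i++) {
        if (!neg_cov[i]) {
            weights[i] = 0.0;
        } else if (pos_cov[i]) {
            weights[i] = shared_weight;
        } else {
            weights[i] = 1.0;
        }
    }
}
```

Fed the line sets quoted on the slide (positive: 2-3 and 6-13; negative: 2-5, 6-7, 9-10), only lines 4 and 5 receive the full weight, which agrees with the statement that repairs are focused there.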

>
 1  void gcd(int a, int b) {
 2    if (a == 0) {
 3      printf("%d", b);
 4    }
 5    while (b > 0) {
 6      if (a > b)
 7        a = a - b;
 8      else
 9        b = b - a;
10    }
11    printf("%d", a);
12    return;
13  }

Running the gcd program above:
> gcd(4,2)
> 2
> gcd(1071,1029)
> 21
> gcd(0,55)
> 55
(looping forever)

Trace of gcd(0,55), annotations on the right:
 1  void gcd(int a, int b) {      (a=0; b=55)
 2    if (a == 0) {               true
 3      printf("%d", b);          > 55
 4    }
 5    while (b > 0) {             (a=0; b=55) true
 6      if (a > b)                false
 7        a = a - b;
 8      else
 9        b = b - a;              b = 55 - 0  !
10    }
11    printf("%d", a);
12    return;
13  }

Input: [figure: statement tree of gcd. Nodes: {block}, if(a==0), printf(b), while(b>0), if(a>b), a = a - b, b = b - a, printf(a), return. Legend: High change probability, Low change probability, Not changed]

Input: [figure: statement tree of gcd, repeated over several frames; in the last frame an additional return node appears in the tree]
An edit is:
• Insert statement X after statement Y
• Replace statement X with statement Y
• Delete statement X

MOTIVATING EXAMPLE (CONT…)
• Consider the following program variant: gcd_2(1071,1029) produces 1029 instead of 21
• Thus, the variants must pass the negative test case while retaining other core functionality
• This is enforced through positive test cases

FITNESS FUNCTION
• The fitness function returns a number indicating the acceptability of the program
• We first compile the variant’s AST to an executable program
• Then record which test cases are passed by that executable
• A program variant that does not compile: fitness zero
• 32.19% of variants failed to compile in our experiment
• The weights W_PosT and W_NegT should be positive values
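A fitness function of this shape might look like the C sketch below. It assumes the fitness is a weighted count of passed tests, with W_PosT applied to positive tests and W_NegT to negative tests; the slide names the weights but the formula itself is not reproduced in this transcript, so treat that form as an assumption. TestResult and its fields are hypothetical names.

```c
#include <assert.h>

/* Outcome of compiling one variant and running the test suite. */
typedef struct {
    int compiled;    /* 0 if the variant failed to compile */
    int pos_passed;  /* number of positive test cases passed */
    int neg_passed;  /* number of negative test cases passed */
} TestResult;

/* Assumed form: weighted sum of passed positive and negative tests.
   A variant that does not compile gets fitness zero. */
double fitness(const TestResult *r, double w_pos, double w_neg) {
    assert(w_pos > 0.0 && w_neg > 0.0);  /* weights must be positive */
    if (!r->compiled) {
        return 0.0;
    }
    return w_pos * r->pos_passed + w_neg * r->neg_passed;
}
```

For illustration, with w_pos = 1 and w_neg = 10 (example values, not taken from the slide), a variant passing five positive tests and one negative test scores 15, while a non-compiling variant scores 0 regardless of the weights.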

PATCH MINIMIZATION
• exit(0) is inserted correctly
• a = a - b in line 5 is extraneous
• Patch minimization (by search, delta debugging)
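The idea of dropping extraneous edits can be sketched in C with a simple greedy pass. This is deliberately simpler than full delta debugging: it tries removing one edit at a time and keeps the removal whenever the remaining patch still passes every test. Edits are reduced to integer ids, and minimize_patch, PassesAllTests, and the demo oracle contains_edit_7 are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Oracle: nonzero if the patch made of these edits passes all tests. */
typedef int (*PassesAllTests)(const int *edits, size_t n);

/* Greedy one-at-a-time minimization (not full delta debugging).
   Compacts the surviving edits in place and returns the new length. */
size_t minimize_patch(int *edits, size_t n, PassesAllTests passes) {
    assert(passes(edits, n));  /* start from a patch that works */
    size_t i = 0;
    while (i < n) {
        int removed = edits[i];
        /* Tentatively delete edits[i] by shifting the tail left. */
        for (size_t j = i; j + 1 < n; j++) {
            edits[j] = edits[j + 1];
        }
        if (passes(edits, n - 1)) {
            n--;  /* the edit was extraneous: keep it removed */
        } else {
            /* The edit is needed: shift the tail back and restore it. */
            for (size_t j = n - 1; j > i; j--) {
                edits[j] = edits[j - 1];
            }
            edits[i] = removed;
            i++;
        }
    }
    return n;
}

/* Demo oracle (assumption): the repair works iff edit 7 is present. */
int contains_edit_7(const int *edits, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (edits[i] == 7) {
            return 1;
        }
    }
    return 0;
}
```

With the demo oracle, a patch {3, 7, 5} shrinks to {7}: the two edits that do not affect the test outcome are discarded, much as the extraneous a = a - b edit is on the slide.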

CLAIMS
• GenProg can generically fix a variety of bugs in real programs without a priori knowledge.
• GenProg is human competitive in both expressive power and actual cost.

Program      Description                  LOC      Bug Type                         Time (s)
gcd          example                      22       infinite loop                    153
nullhttpd    webserver                    5575     heap buffer overflow (code)      578
zune         example                      28       infinite loop                    42
uniq         text processing              1146     segmentation fault               34
look-u       dictionary lookup            1169     segmentation fault               45
look-s       dictionary lookup            1363     infinite loop                    55
units        metric conversion            1504     segmentation fault               109
deroff       document processing          2236     segmentation fault               131
indent       code processing              9906     infinite loop                    546
flex         lexical analyzer generator   18774    segmentation fault               230
openldap     directory protocol           292598   non-overflow denial of service   665
ccrypt       encryption utility           7515     segmentation fault               330
lighttpd     webserver                    51895    heap buffer overflow (vars)      394
atris        graphical game               21553    local stack buffer exploit       80
php          scripting language           764489   integer overflow                 56
wu-ftpd      FTP server                   67029    format string vulnerability      2256
leukocyte    computational biology        6718     segmentation fault               360
tiff         image processing             84067    segmentation fault               108
imagemagick  image processing             450516   wrong output                     2160

CONCLUSIONS
GenProg: scalable, generic, expressive automatic bug repair
• Genetic programming search for a patch that addresses a given bug.
• Render the search tractable by restricting the search space intelligently.
It works!
• Fixes a variety of bugs in a variety of programs.
• Repaired 60 of 105 bugs for < $8 each, on average.
Benchmarks/results/source code/VM images available:
• http://genprog.cs.virginia.edu

WHAT COULD’VE GONE WRONG?
• What if we write a new test case? What do we do about that?
• Machine learning folks have known for years that minimization does not by itself improve quality: model size can be independent of the degree of overfitting. How could we evaluate overfitting?

SEMFIX: PROGRAM REPAIR VIA SEMANTIC ANALYSIS
SemFix: program repair via semantic analysis. In Proceedings of the 2013 International Conference on Software Engineering (ICSE '13)

REPAIRING PROGRAMS WITH SEMANTIC CODE SEARCH
Yalin Ke (Iowa State), Kathryn T. Stolee (Iowa State), Claire Le Goues (Carnegie Mellon), Yuriy Brun (UMass Amherst)
Y. Ke, K. T. Stolee, C. Le Goues, and Y. Brun. Repairing Programs with Semantic Code Search. In ASE ’15

OVERFITTING
Does the patch generalize beyond the test cases used to create it?
Edward K. Smith, Earl Barr, Claire Le Goues, and Yuriy Brun. Is the Cure Worse than the Disease? Overfitting in Automated Program Repair. ESEC/FSE 2015.
