LAVA: Large-scale Automated Vulnerability Addition

Tim Leek, Patrick Hulin, Ryan Whelan (MIT/LL), Brendan Dolan-Gavitt (NYU), Fredrick Ulrich, Andrea Mambretti, Wil Robertson, and Engin Kirda (Northeastern)
May 22, 2016

This work is sponsored by the Assistant Secretary of Defense for Research and Engineering under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government.

The problem: vulnerability discovery
[Figure: labels NEWS, ACADEMIA, INDUSTRY; years 1990, 1995, 2005, 2016]
Tim Leek- 2 TRL 02/25/16

Existing vulnerability corpora
[Figure: Forbes, 2012]

Vulnerability corpora sources

Source            Cost   Realism    Yield
Accident                 High       Tiny
Search            $$$$   Med-High   Low
Injection         $$     Med        Low-Med
Synthesis (LAVA)  $      Low        High

LAVA concept
• Vulnerability corpus requirements
  – Cheap and plentiful
  – Realistic
  – Triggering input
  – Manifest only for one or very few inputs
  – Security-critical effect
• Caveats
  – Works only on source
  – C programs
  – Linux
  – Buffer overflows
• Large-scale Automated Vulnerability Addition
  – Uses static and dynamic analysis to find attacker-controlled data that can be used to introduce new code that creates a bug
  – Change program and input at same time to insert bugs in known places
  – Special sauce: new taint-based measures

Dynamic taint analysis
• PANDA dynamic taint
  – Whole system (all processes + kernel)
  – Works on binaries
  – Includes all library code
  – Oddball x86 instructions all analyzed, including FPU and SSE
  – Many labels supported: every byte in a 10MB file
  – Labels combine into sets to represent computation
  – Fast (enough): 50-100x
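
The idea that labels combine into sets can be illustrated with a minimal sketch. This is not PANDA's implementation: the `label_set` type, the 64-label limit, and the function names below are assumptions made for illustration only. Each input byte gets its own label, and a value computed from two operands carries the union of their label sets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical label set: bit i set means "depends on input byte i".
   Limited to 64 labels here purely for illustration. */
typedef uint64_t label_set;

/* Label one input byte by its offset in the input. */
label_set label_input_byte(unsigned offset) {
    assert(offset < 64);
    return (label_set)1 << offset;
}

/* A value computed from two operands depends on every input byte
   that either operand depends on: the union of the two sets. */
label_set combine(label_set a, label_set b) {
    return a | b;
}

/* Number of distinct input bytes a value depends on. */
unsigned label_count(label_set s) {
    unsigned n = 0;
    while (s) {
        s &= s - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

For example, combining the labels of input bytes 0 and 3 yields a set with two members, and combining that set with byte 0 again leaves it unchanged.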

Taint-based measures
• Liveness: number of branches an input byte is used to decide. How much effect upon control flow do specific input bytes have?
• Taint compute number (TCN): depth of lval tree of computation. How complicated a function of input bytes is an lval?
• DEAD, UNCOMPLICATED, and AVAILABLE data (DUA): attacker-controlled data that can be used to create a vulnerability
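
A minimal sketch of how these measures might be tracked follows. The data structures, function names, and thresholds are hypothetical and are not taken from LAVA or PANDA. It assumes that a value copied directly from the input has TCN 0, that each computation adds one level to the deepest operand, and that liveness is a per-byte counter bumped whenever a branch depends on that byte.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INPUT_BYTES 16 /* illustrative limit, not a LAVA constant */

/* liveness[i]: number of branches input byte i has helped decide */
typedef struct {
    unsigned liveness[MAX_INPUT_BYTES];
} taint_stats;

/* Hypothetical tracked value (an lval): the input bytes it depends on
   (bitmask) and the depth of the computation that produced it. */
typedef struct {
    unsigned labels;
    unsigned tcn;
} tainted;

/* A direct copy of an input byte: no computation yet, so TCN is 0. */
tainted from_input(unsigned offset) {
    assert(offset < MAX_INPUT_BYTES);
    tainted t = { 1u << offset, 0 };
    return t;
}

/* Result of an operation on two values: union of labels,
   one level deeper than the deeper operand. */
tainted derive(tainted a, tainted b) {
    tainted t;
    t.labels = a.labels | b.labels;
    t.tcn = (a.tcn > b.tcn ? a.tcn : b.tcn) + 1;
    return t;
}

/* A branch decided by cond makes every byte it depends on more live. */
void on_branch(taint_stats *s, tainted cond) {
    for (size_t i = 0; i < MAX_INPUT_BYTES; i++)
        if (cond.labels & (1u << i))
            s->liveness[i]++;
}

/* DUA check: tainted, TCN within bound, and all of its bytes
   within the liveness bound. Thresholds are assumptions. */
int is_dua(const taint_stats *s, tainted v,
           unsigned max_liveness, unsigned max_tcn) {
    if (v.labels == 0 || v.tcn > max_tcn)
        return 0;
    for (size_t i = 0; i < MAX_INPUT_BYTES; i++)
        if ((v.labels & (1u << i)) && s->liveness[i] > max_liveness)
            return 0;
    return 1;
}
```

Under this sketch, a byte that is copied around but never branched on and never computed with is the ideal DUA candidate, while a byte that has decided several branches or passed through several operations is rejected.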

LAVA taint-based bug injection
• Clang: instrument source with taint queries
• PANDA record: run instrumented program on inputs (from the input corpus)
• PANDA replay + taint analysis: find attacker-controlled data and attack points (yields injectable bugs)
• Clang: inject bug into program source, compile and test with modified input (yields the bug corpus)

LAVA bug example
• PANDA taint analysis tells us that bytes 0-3 in the buffer buf at line 115 of src/encoding.c are attacker-controlled
• We also learn from PANDA that there is a pointer we can corrupt, '&info', later in the execution, in src/readcdf.c

Attacker-controlled data (encoding.c):
    115: } else if (looks_extended(buf, nbytes, *ubuf, ulen)) {

Corruptible pointer, target of the new data flow (readcdf.c):
    365: if (cdf_read_header(&info, &h) == -1)

LAVA bug example

    // encoding.c:
    } else if (({int rv = looks_extended(buf, nbytes, *ubuf, ulen);
        if (buf) {
            int lava = 0;
            lava |= ((unsigned char *)buf)[0];
            lava |= ((unsigned char *)buf)[1] << 8;
            lava |= ((unsigned char *)buf)[2] << 16;
            lava |= ((unsigned char *)buf)[3] << 24;
            lava_set(lava);
        };
        rv; })) {

    // readcdf.c:
    if (cdf_read_header((&info) + (lava_get()) * (0x6c617661 == (lava_get()) || 0x6176616c == (lava_get())), &h) == -1)
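
The trigger logic in the injected code can be isolated in a small standalone sketch. The slide shows only calls to `lava_set` and `lava_get`; their bodies below, along with the helper names `pack_dua` and `lava_offset`, are assumptions for illustration. The sketch also uses `uint32_t` rather than `int` to keep the shifts well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed implementation: a global that carries the DUA value from
   the point where it is read to the attack point. */
static uint32_t lava_val;

void lava_set(uint32_t v) { lava_val = v; }
uint32_t lava_get(void) { return lava_val; }

/* Pack bytes 0-3 of the buffer, lowest byte first, as on the slide. */
uint32_t pack_dua(const unsigned char *buf) {
    assert(buf != NULL);
    return (uint32_t)buf[0]
         | (uint32_t)buf[1] << 8
         | (uint32_t)buf[2] << 16
         | (uint32_t)buf[3] << 24;
}

/* Offset added to the pointer at the attack point. The comparison
   yields 0 or 1, so the offset is zero unless the DUA equals one of
   the two magic values (the same four bytes in either byte order). */
uint32_t lava_offset(void) {
    uint32_t v = lava_get();
    return v * (v == 0x6c617661u || v == 0x6176616cu);
}
```

With this packing, an input whose first four bytes are "aval" or "lava" produces one of the two magic values and a nonzero offset; any other input leaves the pointer untouched, which is why the bug manifests for only a very small set of inputs.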

Vulnerability injection effectiveness
• Four open source programs, 10K -> 2M LOC
• 2000 injection attempts per target (of over 1M)
• LAVA yield (validated injected bugs): 10 -> 50%
• Over 2000 bugs injected
• Over 200K possible?

Using LAVA to evaluate tools
• Created two corpora using LAVA
  – LAVA-1: programs containing individual bugs of varying difficulty
  – LAVA-M: programs each with more than one bug
• Evaluated two open-source vulnerability discovery tools by ability to detect LAVA bugs
  – Fuzzer
  – Symbolic execution + SAT solving
• Detection < 2%

LAVA vulnerability realism
• Realism is a concern, but it is hard to quantify
• One possible measure is the fraction of the trace that is unaffected by LAVA yet must be analyzed correctly to discover the vulnerability
• LAVA's bugs are generally inserted quite far along in the trace. If anything, we need some easier ones
[Figure: execution trace with the DUA and the attack point (ATP) marked]

Summary and future directions
• Summary
  – Working system automates construction of large corpora for study and assessments
  – Novel taint-based measures are key: liveness and TCN
• Future directions
  – Continuous on-line competition to encourage self-eval
  – Use in security competitions like Capture the Flag to re-use and construct challenges on-the-fly
  – Assess and improve realism of LAVA bugs
  – More types of vulnerabilities
  – More interesting effects (exploitable ones)
