JOE ROZNER / @JROZNER RE-TARGETABLE GRAMMAR BASED TEST CASE GENERATION
TESTING PARSERS IS HARD
HOW WE GOT HERE
▸ Mostly clean-room(ish) implementation of complex languages (context-free-ish)
▸ ~35k lines of grammar in total (ANTLR)
▸ Implemented from incomplete, inaccurate, and contradictory documentation
▸ Radically different parsing algorithm(s) from the original implementations
▸ Lack of public test cases for most dialects
ARE WE CORRECT?
▸ What is correct?
▸ Can we be 100% correct?
▸ How do we quantify how correct an implementation is?
▸ How do we test the implementations?
▸ How do we get better?
GETTING MORE TEST CASES?
▸ Request query logs from customers
▸ Stand up applications and record their queries
▸ Automatically generate test cases with a fuzzer
PROBLEMS WITH TRADITIONAL TEST CASE GENERATION
▸ Inflexibility in how test cases can be used
▸ Inflexibility in how feedback can be provided
▸ Existing tools solve many common cases, but they become less useful the further you deviate
STYLES OF FUZZING
INSTRUMENTATION + RANDOM MUTATION
▸ Focus on path exploration and code coverage
▸ No concept of syntax/semantics
▸ Won't necessarily provide lots of coverage for variations of a specific parse tree
▸ Might spend a lot of time on uninteresting/non-relevant code paths
▸ Not immediately clear how to build a proper test harness
▸ Example of this strategy is AFL (American Fuzzy Lop); a minimal sketch of the core loop follows
▸ https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
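To make the contrast with grammar-aware approaches concrete, here is a minimal Python sketch of the blind mutation loop at the heart of this style. The ./target binary and seed.sql input are hypothetical placeholders; real tools such as AFL layer coverage instrumentation, corpus management, and smarter mutators on top of this loop.

import random
import subprocess
import tempfile

def mutate(data: bytes, flips: int = 8) -> bytes:
    buf = bytearray(data)
    for _ in range(flips):
        pos = random.randrange(len(buf))
        buf[pos] ^= 1 << random.randrange(8)   # flip one random bit
    return bytes(buf)

seed = open("seed.sql", "rb").read()           # any valid starting input (placeholder)
for _ in range(1000):
    candidate = mutate(seed)
    with tempfile.NamedTemporaryFile() as f:
        f.write(candidate)
        f.flush()
        result = subprocess.run(["./target", f.name])   # hypothetical target binary
        if result.returncode < 0:              # killed by a signal -> likely crash
            open("crash.bin", "wb").write(candidate)
            break

Nothing in this loop knows the input is supposed to be SQL, JSON, or a PNG; that is exactly the limitation the following quotes describe.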
“THE FIRST IMAGE, HIT AFTER ABOUT SIX HOURS ON AN 8-CORE SYSTEM…”
“…CERTAIN TYPES OF ATOMICALLY EXECUTED CHECKS WITH A LARGE SEARCH SPACE MAY POSE AN INSURMOUNTABLE OBSTACLE TO THE FUZZER…”
if (strcmp(header.magic_password, "h4ck3d by p1gZ")) goto terminate_now;
“IN PRACTICAL TERMS, THIS MEANS THAT AFL- FUZZ WON'T HAVE AS MUCH LUCK ‘INVENTING’ PNG FILES OR NON-TRIVIAL HTML DOCUMENTS FROM SCRATCH…”
INSTRUMENTATION + SOLVING
▸ Focus on path exploration and code coverage
▸ Instrument the code and solve for new paths
▸ Still doesn't care about syntax/semantics
▸ Still not clear how to build a more custom test harness
▸ Not necessarily easy to gate off specific paths that are uninteresting
▸ Example of this is KLEE
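The payoff of solving over guessing can be illustrated with the z3 SMT solver directly: rather than mutating bytes and hoping, a KLEE-style tool collects the branch condition along a path as a constraint and asks the solver for an input that takes the other side. The branch below is an illustrative stand-in, not taken from any real target.

from z3 import BitVec, Solver, sat

x = BitVec("x", 32)            # symbolic stand-in for an input value
s = Solver()
s.add(x * 3 + 7 == 58)         # path constraint guarding the "interesting" branch

if s.check() == sat:
    print(s.model()[x])        # -> 17, an input random mutation may never stumble on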
GRAMMAR BASED
▸ Uses a grammar to generate syntactically correct sentences
▸ Typically provide their own grammar language
▸ Mostly targeted at regular/context-free text-based languages
▸ Example of this is Mozilla Dharma
▸ https://github.com/MozillaSecurity/dharma
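At its core, this style is a random walk over the grammar's productions. The sketch below shows the essential loop with a toy SQL-ish grammar; Dharma and similar tools elaborate on this with their own grammar languages, weights, and built-ins. The dictionary format here is illustrative, not any tool's actual syntax.

import random

GRAMMAR = {
    "query":   [["SELECT ", "columns", " FROM ", "table"]],
    "columns": [["*"], ["column"], ["column", ", ", "columns"]],
    "column":  [["id"], ["name"], ["price"]],
    "table":   [["users"], ["orders"]],
}

def generate(symbol: str) -> str:
    if symbol not in GRAMMAR:                  # terminal: emit as-is
        return symbol
    production = random.choice(GRAMMAR[symbol])
    return "".join(generate(s) for s in production)

print(generate("query"))   # e.g. "SELECT name, id FROM orders"

Every output is syntactically valid by construction, which is exactly what a coverage-guided mutator struggles to achieve.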
LET'S ITERATE
▸ Create a platform with parser primitives that can generate instead of parse
▸ Provide support for multiple frontends so manual translation from one grammar language to another is not required
▸ Be expressive enough for regular, context-free, and context-sensitive languages, both text and binary
▸ Embeddable and usable from any language
▸ Composable and flexible
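One way to read "parser primitives that can generate instead of parse" is as combinators: the same sequence/choice/repetition building blocks a parser library exposes, inverted so that each one produces text. The names below are invented for illustration and are not the project's actual API.

import random
from typing import Callable

Gen = Callable[[], str]

def lit(s: str) -> Gen:
    return lambda: s

def choice(*gens: Gen) -> Gen:
    return lambda: random.choice(gens)()

def seq(*gens: Gen) -> Gen:
    return lambda: "".join(g() for g in gens)

def many(g: Gen, lo: int = 0, hi: int = 3) -> Gen:
    return lambda: "".join(g() for _ in range(random.randint(lo, hi)))

digit  = choice(*[lit(str(d)) for d in range(10)])
number = seq(digit, many(digit))
expr   = choice(number, seq(lit("("), number, lit(" + "), number, lit(")")))
print(expr())   # e.g. "(42 + 7)"

Because generators are plain values, they compose across rule boundaries, and a frontend could in principle translate an existing grammar language (e.g. ANTLR) into these calls mechanically.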
STRUCTURE
▸ Composable libraries
▸ Target + generation
▸ Frontends
▸ Test harnesses
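As a rough sketch of how the test harness layer might sit on top of generation (the names are placeholders, not the project's API): every generated sentence should be accepted by the implementation under test, and anything it rejects is a candidate bug in either the grammar or the parser.

def run_harness(generate, parse_under_test, iterations=10_000):
    # Generate sentences and record anything the parser under test rejects.
    failures = []
    for _ in range(iterations):
        sentence = generate()
        try:
            parse_under_test(sentence)   # generated input should always parse
        except Exception as exc:
            failures.append((sentence, exc))
    return failures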
DEMO
WHAT'S NEXT?
▸ Expose a C-compatible API
▸ Starting work on frontends
▸ Better negation logic
▸ Context-sensitive/introspective generators
▸ Functional comparison of results against traditional fuzzers
QUESTIONS?
JOE@DEADBYTES.NET / @JROZNER