Security Testing: fuzzing, protocol fuzzing, model-based testing (PowerPoint PPT Presentation)



slide-1
SLIDE 1

Security Testing

fuzzing, protocol fuzzing, model-based testing, automated reverse engineering

Erik Poll, Radboud University Nijmegen

slide-2
SLIDE 2

Testing Ingredients

Two things are needed to test a SUT (System Under Test):

1. a test suite, i.e. a collection of input data
2. a test oracle that decides if a test was OK or reveals an error, i.e. some way to decide if the SUT behaves as we want

A nice & simple test oracle: just seeing if the SUT crashes.

Defining both test suites and test oracles can be a lot of work: for each individual test case, the test oracle may need to be tweaked by specifying exactly what should happen.

slide-3
SLIDE 3

Coverage Criteria

Measures of how good a test suite is:

  • statement coverage
  • branch coverage

Statement coverage does not imply branch coverage; e.g. for

    void f(int x, int y) { if (x > 0) { y++; } y--; }

statement coverage needs 1 test case, branch coverage needs 2.

  • More complex coverage criteria exist, e.g. MC/DC (Modified Condition/Decision Coverage), which is used in avionics
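A minimal Python sketch of the example above, to make the two counts concrete. The function mirrors the slide's f (returning y so its effect is observable); the helper branches_taken and the concrete test inputs are assumptions for illustration, not part of the original slides.

```python
def f(x: int, y: int) -> int:
    """Python rendering of the slide's f; y is returned so its effect is visible."""
    if x > 0:
        y += 1
    y -= 1
    return y


def branches_taken(test_inputs):
    """Outcomes of the decision `x > 0` exercised by a test suite (hypothetical helper)."""
    return {x > 0 for x, _ in test_inputs}


# One test with x > 0 executes every statement, but only the True branch.
one_test = [(1, 0)]
# Adding a test with x <= 0 also exercises the False branch.
two_tests = [(1, 0), (0, 0)]

print(branches_taken(one_test))
print(branches_taken(two_tests))
```

The single test reaches all statements yet never skips the `y += 1`, which is exactly the gap that branch coverage closes.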


slide-4
SLIDE 4

Possible Perverse Effect of Coverage Criteria

High coverage criteria may discourage defensive programming:

    void m(File f) {
      if <security_check_fails> { throw (SecurityException) }
      try { <the main part of the method> }
      catch (SomeException) {
        <take some measures>;
        throw (SecurityException)
      }
    }

If the green defensive code is hard to trigger in tests, programmers may be tempted (or forced) to remove it to improve coverage in testing...

slide-5
SLIDE 5

Security testing is HARD, in general

  • Normal testing will look at right, wanted behaviour for sensible inputs, and some inputs on borderline conditions
  • Security testing also involves looking for the wrong, unwanted behaviour for really silly inputs
  • Similarly, normal use of a system is more likely to reveal functional problems (users will complain) than security problems (hackers won’t complain)

slide-6
SLIDE 6

Security testing is HARD, in general

[Figure: the set of all possible inputs, with the subset of normal inputs, and scattered points marking inputs that trigger a security bug]

slide-7
SLIDE 7

JML annotations as test oracle

Tools for runtime assertion checking of JML annotations can be used when testing

  • code is instrumented with checks to test annotations, which throw special exceptions for violations
  • effectively, the annotations serve as test oracle

Benefits:

  • Test oracle for free: you can test by sending random data
  • More precise and detailed feedback: adding

    //@ invariant contents != null;

    an application may crash with an Invariant Violation in line 18000 after 1 minute with runtime assertion checking, whereas otherwise it would crash with a NullPointerException in line 12000 after 5 minutes. The invariant violation points to the real origin of the problem, not the eventual effect
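A rough Python analogue of this idea, since the slides give no runnable JML example. The class Box, its seeded bug, and the exception name InvariantViolation are all invented for illustration; a real JML runtime assertion checker instruments Java code automatically rather than relying on a hand-written check.

```python
import random


class InvariantViolation(Exception):
    """Stands in for the special exception thrown by a runtime assertion checker."""


class Box:
    """Hypothetical class with the slide's invariant: contents is never null/None."""

    def __init__(self):
        self.contents = []
        self._check_invariant()

    def _check_invariant(self):
        if self.contents is None:
            raise InvariantViolation("invariant violated: contents != null")

    def put(self, item: int):
        if item < 0:
            self.contents = None  # seeded bug: negative items wipe the contents
        else:
            self.contents.append(item)
        self._check_invariant()  # checked at the end of every method


def random_test(runs: int = 1000, seed: int = 0):
    """Send random data; the invariant check is the only test oracle."""
    rng = random.Random(seed)
    box = Box()
    for i in range(runs):
        item = rng.randint(-5, 100)
        try:
            box.put(item)
        except InvariantViolation:
            return i, item  # fails at the call that breaks the invariant
    return None


print(random_test())
```

Without the check, the failure would only surface later, at the next append on None, which is the slide's point about origin versus eventual effect.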

slide-8
SLIDE 8

Symbolic Execution for test suites

  • Symbolic execution can be used to generate test suites with good coverage
  • Basic idea of symbolic execution: instead of giving variables a concrete value (say 42), variables are given a symbolic value (say N), and the program is executed with these symbolic values to see when certain program points are reached

slide-9
SLIDE 9

Symbolic Execution

    m(int x, int y) {
      x = x + y;
      y = y - x;
      if (2*y > 8) { ... }
      else if (3*x < 10) { ... }
    }

slide-10
SLIDE 10

Symbolic Execution

    m(int x, int y) {              // let x == N and y == M
      x = x + y;                   // x becomes N+M
      y = y - x;                   // y becomes M-(N+M) == -N
      if (2*y > 8) { ... }         // taken if 2*-N > 8, i.e. N < -4
      else if (3*x < 10) { ... }   // taken if N >= -4 and 3(M+N) < 10
    }

There are tools that, given such sets of constraints, try to produce test data that meets these constraints.
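The path conditions derived above can be sanity-checked by brute force. This is only a sketch: it compares concrete runs against the derived constraints over a small range, and uses a naive search where a real tool would use a constraint solver. The names predicted_branch and find_input are made up here, and Python integers do not overflow, unlike int in the original code.

```python
def m(x: int, y: int) -> str:
    """Concrete version of the slide's method, reporting which branch is taken."""
    x = x + y
    y = y - x
    if 2 * y > 8:
        return "first"
    elif 3 * x < 10:
        return "second"
    return "neither"


def predicted_branch(n: int, k: int) -> str:
    """Branch predicted by the symbolic path conditions, with x == n and y == k initially."""
    if n < -4:
        return "first"
    if 3 * (k + n) < 10:
        return "second"
    return "neither"


def find_input(branch: str, lo: int = -10, hi: int = 10):
    """Naive stand-in for a constraint solver: search for inputs reaching a branch."""
    for n in range(lo, hi + 1):
        for k in range(lo, hi + 1):
            if predicted_branch(n, k) == branch:
                return n, k
    return None


# The symbolic conditions agree with concrete execution on the whole range.
assert all(m(n, k) == predicted_branch(n, k)
           for n in range(-10, 11) for k in range(-10, 11))

for branch in ("first", "second", "neither"):
    print(branch, find_input(branch))
```

One input per branch, as found here, is already a test suite with full branch coverage for this method.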

slide-11
SLIDE 11

Symbolic Execution

Symbolic execution can also be used for program verification:

1. symbolically execute a method (or piece of code)
2. assuming precondition (and invariant) on initial values, prove postcondition (and invariant) for final values

slide-12
SLIDE 12

Fuzzing


slide-13
SLIDE 13

Fuzzing

Fuzzing: try really long inputs for string arguments to trigger segmentation faults and hence find buffer overflows.

Benefit: can be automated, because the test suite of long inputs can be automatically generated, and the test oracle is trivial: looking if the program crashes.

This original idea has been generalised to other settings. The general idea of fuzzing: using semi-random, automatically generated test data that is likely to trigger security problems.
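A toy illustration of the original idea, in Python. The target parse_name is a made-up stand-in for a C routine with a fixed-size buffer; an uncaught exception plays the role of a segmentation fault, so this shows the shape of the loop rather than a real memory-safety bug.

```python
def parse_name(data: bytes) -> bytes:
    """Hypothetical SUT: copies its input into a 64-byte buffer without a length check."""
    buf = bytearray(64)
    for i, b in enumerate(data):
        buf[i] = b  # IndexError once i reaches 64, standing in for a memory error
    return bytes(buf)


def long_inputs(max_len: int = 4096):
    """Automatically generated test suite: inputs of doubling length."""
    n = 1
    while n <= max_len:
        yield b"A" * n
        n *= 2


def fuzz(target, inputs):
    """Trivial oracle: record every input on which the target 'crashes'."""
    crashes = []
    for data in inputs:
        try:
            target(data)
        except Exception as exc:
            crashes.append((len(data), type(exc).__name__))
    return crashes


print(fuzz(parse_name, long_inputs()))
```

For a real native program the same loop would run the target as a separate process and look at how it exits.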

slide-14
SLIDE 14

Fuzzing in memory safe languages

For memory safe languages such as Java or C#, fuzzing can still reveal bugs in a VM, bytecode verifier, or libraries with native code.

E.g., fast graphics libraries often rely on native code.

CVE reference: CVE-2007-0243
Release Date: 2007-01-17
Sun Java JRE GIF Image Processing Buffer Overflow Vulnerability
Critical: Highly critical
Impact: System access
Where: From remote
Description: A vulnerability has been reported in Sun Java Runtime Environment (JRE), which can be exploited by malicious people to compromise a vulnerable system. The vulnerability is caused due to an error when processing GIF images and can be exploited to cause a heap-based buffer overflow via a specially crafted GIF image with an image width of 0. Successful exploitation allows execution of arbitrary code.

slide-15
SLIDE 15

File format fuzzing

Incorrectly formatted files, or corner cases in file formats, can cause trouble.

E.g.

  • GIF image with width 0 on previous slide
  • Microsoft Security Bulletin MS04-028

    Buffer Overrun in JPEG Processing (GDI+) Could Allow Code Execution
    Impact of Vulnerability: Remote Code Execution
    Maximum Severity Rating: Critical
    Recommendation: Customers should apply the update immediately
    Root cause: a zero sized comment field, without content.
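A small sketch of how such corner cases can be generated mechanically. It builds only the 13-byte start of a GIF (signature, version and logical screen descriptor) and overwrites the 16-bit width field at offset 6 with boundary values. The helper names and the list of boundary values are assumptions; whether this is the exact field involved in the CVE on the previous slide is not claimed here.

```python
import struct


def minimal_gif_header(width: int, height: int) -> bytes:
    """'GIF89a', then width and height (16-bit little-endian), packed flags,
    background colour index and pixel aspect ratio."""
    return b"GIF89a" + struct.pack("<HHBBB", width, height, 0, 0, 0)


# Corner-case values for an unsigned 16-bit field (illustrative choice).
BOUNDARY_U16 = [0, 1, 0x7FFF, 0x8000, 0xFFFF]


def fuzz_u16_field(sample: bytes, offset: int):
    """Yield copies of sample with the 16-bit field at offset set to each boundary value."""
    for value in BOUNDARY_U16:
        mutated = bytearray(sample)
        struct.pack_into("<H", mutated, offset, value)
        yield bytes(mutated)


WIDTH_OFFSET = 6  # the width field follows the 6-byte signature and version
for blob in fuzz_u16_field(minimal_gif_header(10, 10), WIDTH_OFFSET):
    print(blob.hex())
```

Each mutated file would then be fed to the image parser under test, with a crash as the oracle.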

slide-16
SLIDE 16

Fuzzing web-applications?

  • Could we fuzz a web application in the hope of finding security flaws?
      • SQL injection
      • XSS
      • ...
  • What would be needed?
      • test inputs that trigger these security flaws
      • some way of detecting if a security flaw occurred
          • looking at website response, or log files

slide-17
SLIDE 17

Fuzzing web-applications

  • There are many tools to fuzz web-applications
      • Spike proxy, HP WebInspect, AppScan, WebScarab, Wapiti, w3af, RFuzz, WSFuzzer, SPI Fuzzer, Burp, Mutillidae, ...
  • Some fuzzers crawl a website, generating traffic themselves; other fuzzers modify traffic generated by some other means.
  • As usual, there will be false positives & negatives, e.g.
      • false negatives for SQL injection due to not recognizing some SQL database errors
      • false positives for XSS due to signalling a correctly quoted echoed response as XSS

[Frank van der Loo, Comparison of penetration testing tools for web applications, MSc thesis]
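The XSS false positive mentioned above is easy to reproduce in miniature. In this sketch the two page functions and both oracles are invented for illustration; real scanners use more elaborate detection, so this only shows why echoing of a correctly quoted payload can be misreported.

```python
import html

PAYLOAD = "<script>alert(1)</script>"


def vulnerable_page(query: str) -> str:
    """Hypothetical page that echoes user input without quoting."""
    return f"<p>Results for {query}</p>"


def safe_page(query: str) -> str:
    """Hypothetical page that echoes user input correctly quoted."""
    return f"<p>Results for {html.escape(query)}</p>"


def naive_oracle(response: str) -> bool:
    """Flags XSS whenever part of the payload is echoed at all."""
    return "alert(1)" in response


def stricter_oracle(response: str) -> bool:
    """Flags XSS only if the payload comes back unquoted."""
    return PAYLOAD in response


for page in (vulnerable_page, safe_page):
    response = page(PAYLOAD)
    print(page.__name__, naive_oracle(response), stricter_oracle(response))
```

The naive oracle reports the safe page as vulnerable, which is the false positive; the stricter one does not, at the price of possibly missing other injection contexts.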

slide-18
SLIDE 18

Protocol Fuzzing

Protocol fuzzing based on known protocol format, i.e. format of packets or messages.

Typical things to try in protocol fuzzing:

  • trying out many/all possible values for specific fields, esp. undefined values, or values Reserved for Future Use (RFU)
  • giving incorrect lengths, lengths that are zero, or payloads that are too short/long

Tools for protocol fuzzing exist, e.g. SNOOZE.
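Both kinds of test input can be sketched for a toy message format. The format used here (1 byte type, 1 byte length, then payload) and the set of defined types are assumptions made up for this example; they do not describe any particular protocol or the SNOOZE tool.

```python
def encode(msg_type: int, payload: bytes, length=None) -> bytes:
    """Toy format: 1 byte type, 1 byte length, then the payload.
    Passing length explicitly lets the claimed length disagree with the payload."""
    if length is None:
        length = len(payload)
    return bytes([msg_type & 0xFF, length & 0xFF]) + payload


DEFINED_TYPES = {0x01, 0x02, 0x03}  # hypothetical; all other values are undefined/RFU


def fuzz_types(payload: bytes = b"hello"):
    """Try every undefined or reserved value of the type field."""
    for t in range(256):
        if t not in DEFINED_TYPES:
            yield encode(t, payload)


def fuzz_lengths(msg_type: int = 0x01, payload: bytes = b"hello"):
    """Claimed lengths that are zero, too short, too long, or maximal."""
    n = len(payload)
    for claimed in (0, n - 1, n + 1, 255):
        yield encode(msg_type, payload, length=claimed)


print(sum(1 for _ in fuzz_types()))
for msg in fuzz_lengths():
    print(msg.hex())
```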

slide-19
SLIDE 19

Example: GSM protocol fuzzing

  • GSM is an extremely rich & complicated protocol

slide-20
SLIDE 20

SMS message fields

    Field                          Size
    Message Type Indicator         2 bit
    Reject Duplicates              1 bit
    Validity Period Format         2 bit
    User Data Header Indicator     1 bit
    Reply Path                     1 bit
    Message Reference              integer
    Destination Address            2-12 byte
    Protocol Identifier            1 byte
    Data Coding Scheme (CDS)       1 byte
    Validity Period                1 byte/7 bytes
    User Data Length (UDL)         integer
    User Data                      depends on CDS and UDL

slide-21
SLIDE 21

Example: GSM protocol fuzzing

  • Lots of stuff to fuzz!
  • We can use a USRP with open source cell tower software (OpenBTS) to fuzz phones

[Mulliner et al, SMS of Death: from analyzing to attacking mobile phones on a large scale]
[Brinio Hond, Fuzzing the GSM protocol, MSc thesis]

slide-22
SLIDE 22

Example: GSM protocol fuzzing

  • Fuzzing SMS layer of GSM reveals weird functionality in GSM standard and in phones

slide-23
SLIDE 23

Example: GSM protocol fuzzing

  • Fuzzing SMS layer of GSM reveals weird functionality in GSM standard and on phones
      • e.g. possibility to send faxes (!?)

[Figure: phone showing a "you have a fax!" icon]

Only way to get rid of this icon: reboot the phone

slide-24
SLIDE 24

Example: GSM protocol fuzzing

  • Malformed SMS text messages showing raw memory contents, rather than content of the text message

slide-25
SLIDE 25

Example: GSM protocol fuzzing

  • Lots of success to DoS phones: phones crashing, disconnecting from the network, or stopping accepting calls
      • e.g. requiring reboot or battery removal to restart, to accept calls again, or to remove weird icons
      • after reboot, the network might redeliver the SMS message, if no acknowledgement was sent before crashing

    But: not all these SMS messages could be sent over a real network

  • There is not always a correlation between problems and phone brands & firmware versions
      • how many implementations of the GSM stack does Nokia have?
  • The scary part: what would happen if we fuzz base stations...

slide-26
SLIDE 26

Example: fuzzing e-passports

  • e-passports implement a protocol to prevent giving any info to a passive eavesdropper or active attacker
      • correct protocol runs don’t leak info to an eavesdropper
  • Fuzzing unexpected but correctly formatted instructions leaks a unique fingerprint per implementation, and hence (almost) unique per country
      • for Australian, Belgian, Dutch, French, German, Greek, Italian, Polish, Spanish, Swedish passports

Here we don’t fuzz to crash, but to see if there is information leakage.

[Henning Richter et al., Fingerprinting passports, NLUUG 2009]
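The fingerprinting idea can be sketched as follows. Everything concrete here is invented: the probe instruction bytes, the three implementations and their response tables are placeholders, not measurements from real passports. The point is only that the tuple of responses to unexpected instructions can identify an implementation.

```python
# Instruction bytes sent at an unexpected point in the protocol (illustrative values).
PROBES = (0x44, 0x82, 0x84, 0x88)

# Invented response tables (status words) standing in for real cards.
CARDS = {
    "impl_A": {0x44: 0x6D00, 0x82: 0x6982, 0x84: 0x9000, 0x88: 0x6D00},
    "impl_B": {0x44: 0x6D00, 0x82: 0x6985, 0x84: 0x9000, 0x88: 0x6982},
    "impl_C": {0x44: 0x6E00, 0x82: 0x6982, 0x84: 0x6982, 0x88: 0x6D00},
}


def fingerprint(send):
    """Responses to all probes; `send` maps an instruction byte to a status word."""
    return tuple(send(ins) for ins in PROBES)


def identify(send, known):
    """Names of the known implementations whose fingerprint matches."""
    observed = fingerprint(send)
    return [name for name, fp in known.items() if fp == observed]


KNOWN = {name: fingerprint(table.__getitem__) for name, table in CARDS.items()}

print(identify(CARDS["impl_B"].__getitem__, KNOWN))
```

With a real card, `send` would transmit the instruction over the contactless interface and return the status word that comes back.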

slide-27
SLIDE 27

State-based Protocol Fuzzing

Instead of fuzzing the content of individual messages, we can also fuzz the order of messages, using the protocol state-machine to

1. reach an interesting state in the protocol and then fuzz content of messages there;
2. fuzz the order of messages to discover effect of strange sequences

slide-28
SLIDE 28

State-based Protocol Fuzzing

  • Most protocols have different types of messages, which should come in a particular order
  • We can fuzz a protocol by trying out the different types of messages in all possible orders
  • This can reveal loop-holes in the application logic

Essentially this is a form of model-based testing, where we automatically test if an implementation conforms to a model.

[Tools for this: Peach, jTor]
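A minimal sketch of fuzzing message order. The four message types, the server SloppyServer with its seeded bug, and the notion of a valid sequence are all assumptions for this example; the tools named above are not used.

```python
from itertools import product

MESSAGES = ("HELLO", "KEX", "AUTH", "DATA")  # hypothetical message types


class SloppyServer:
    """Hypothetical SUT that forgets to insist on key exchange before authentication."""

    def __init__(self):
        self.state = "start"

    def handle(self, msg: str) -> str:
        if msg == "HELLO" and self.state == "start":
            self.state = "hello"
            return "OK"
        if msg == "KEX" and self.state == "hello":
            self.state = "keyed"
            return "OK"
        if msg == "AUTH" and self.state in ("hello", "keyed"):  # bug: "hello" allowed
            self.state = "authed"
            return "OK"
        if msg == "DATA" and self.state == "authed":
            return "OK"
        return "ERR"


def sequences(max_len: int):
    """All message sequences up to the given length."""
    for n in range(1, max_len + 1):
        yield from product(MESSAGES, repeat=n)


def accepted(seq) -> bool:
    """Run the sequence against a fresh server; True if every message was accepted."""
    server = SloppyServer()
    return all(server.handle(m) == "OK" for m in seq)


def valid(seq) -> bool:
    """Intended order: HELLO, KEX, AUTH, then any number of DATA messages."""
    head = ("HELLO", "KEX", "AUTH")
    return seq[:3] == head[:len(seq[:3])] and all(m == "DATA" for m in seq[3:])


findings = [s for s in sequences(3) if accepted(s) and not valid(s)]
print(findings)
```

Every finding is a sequence the implementation accepts although the intended protocol order forbids it.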

slide-29
SLIDE 29

Protocol Complexity

NB most real protocols are much more complicated than the ones you study in Verification of Security Protocols.

Essence of SSH transport layer:

    1. C -> S: NC
    2. S -> C: NS
    3. C -> S: exp(g,X)
    4. S -> C: k_S.exp(g,Y).{H}_inv(k_S)
       with K = exp(exp(g,X),Y), H = hash(NC.NS.k_S.exp(g,X).exp(g,Y).K)
    5. C -> S: {XXX}_KCS
       with SID = H, KCS = hash(K.H.c.SID)
    6. S -> C: {YYY}_KSC
       with SID = H, KSC = hash(K.H.d.SID)
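A toy numeric walk-through of steps 3 and 4, to show why both sides end up with the same K and H. The tiny modulus, the secret exponents, the nonce and key values, and the way fields are joined before hashing are all assumptions for illustration; real SSH uses large groups and its own wire encoding.

```python
import hashlib

p, g = 23, 5   # toy group parameters, far too small for real use
X, Y = 6, 15   # secret exponents of client and server (illustrative)

gx = pow(g, X, p)  # exp(g,X), sent by the client in step 3
gy = pow(g, Y, p)  # exp(g,Y), sent by the server in step 4

# Each side raises the value it received to its own secret exponent.
K_client = pow(gy, X, p)
K_server = pow(gx, Y, p)


def exchange_hash(nc: bytes, ns: bytes, k_s: bytes, gx: int, gy: int, k: int) -> str:
    """H = hash(NC.NS.k_S.exp(g,X).exp(g,Y).K), with an ad hoc '.'-joined encoding."""
    parts = [nc, ns, k_s, str(gx).encode(), str(gy).encode(), str(k).encode()]
    return hashlib.sha256(b".".join(parts)).hexdigest()


NC, NS, K_S = b"client-nonce", b"server-nonce", b"server-public-key"
H_client = exchange_hash(NC, NS, K_S, gx, gy, K_client)
H_server = exchange_hash(NC, NS, K_S, gx, gy, K_server)

print(K_client, K_server, H_client == H_server)
```

Since H covers both nonces and both exponentials, tampering with any of the first four messages changes H and makes the server's signature on H fail to verify.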

slide-30
SLIDE 30

Protocol Complexity

NB most real protocols are much more complicated than the ones you study in Verification of Security Protocols.

Essence of SSH transport layer: the six protocol steps listed on the previous slide.

Real SSH transport layer: [Figure: state machine of the real SSH transport layer], excluding all the error transitions back to the initial state.

slide-31
SLIDE 31

Model based testing

General framework for automating testing:

1. make a formal model M of (some aspect of) the SUT
2. fire random inputs to M and the SUT
3. look for differences in the response

Such a difference means an error in the SUT, or the model...
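The three steps can be sketched on a toy system. The turnstile model, the buggy Sut class and the function find_difference are all invented for this example; real model-based testing tools add test selection and reporting on top of the same loop.

```python
import random

# Step 1: a formal model M, here a Mealy machine: (state, input) -> (next state, output).
MODEL = {
    ("locked", "coin"): ("unlocked", "ok"),
    ("locked", "push"): ("locked", "refuse"),
    ("unlocked", "coin"): ("unlocked", "refuse"),
    ("unlocked", "push"): ("locked", "ok"),
}


class Model:
    def __init__(self):
        self.state = "locked"

    def step(self, inp: str) -> str:
        self.state, out = MODEL[(self.state, inp)]
        return out


class Sut(Model):
    """Hypothetical implementation with a seeded bug: a second coin locks it again."""

    def step(self, inp: str) -> str:
        if self.state == "unlocked" and inp == "coin":
            self.state = "locked"
            return "refuse"
        return super().step(inp)


def find_difference(runs: int = 200, length: int = 6, seed: int = 1):
    """Steps 2 and 3: fire the same random inputs at both, stop at the first mismatch."""
    rng = random.Random(seed)
    for _ in range(runs):
        model, sut = Model(), Sut()
        trace = []
        for _ in range(length):
            inp = rng.choice(("coin", "push"))
            trace.append(inp)
            if model.step(inp) != sut.step(inp):
                return trace
    return None


print(find_difference())
```

Note that the bug itself produces the expected output; the difference only shows up one input later, which is why sequences rather than single inputs are needed.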

slide-32
SLIDE 32

Example: model based testing of e-passport

[Figure: a test tool connected to both the SUT and the model]

Test tool sends the same random sequence of commands to the model and the SUT, and checks if the responses match.

slide-33
SLIDE 33

Example: model based testing of MIDPSSH

MIDPSSH: implementation of SSH for Java-enabled feature phones.

Implementors of MIDPSSH forgot to track the protocol state: any sequence of messages would be accepted.

So a Man-in-the-Middle attacker could e.g. ask the client for a username/password before a session key had been agreed.

[Figure: the state machine implemented in MIDPSSH, with a transition labelled "any message", next to the state machine model of SSH]

[Aleksy Schubert et al, Verifying an implementation of SSH, WITS 2007]

slide-34
SLIDE 34

Reverse Engineering


slide-35
SLIDE 35

In the other direction:

Instead of using protocol knowledge when testing, in protocol fuzzing or model-based fuzzing, we can also use testing to gain knowledge about a protocol, or a particular implementation of a protocol

In order to

  • analyse your own code and hunt for bugs, or
  • reverse-engineer someone else’s unknown protocol, e.g. a botnet, to fingerprint or to analyse (and attack) it

slide-36
SLIDE 36

What to reverse engineer?

Different aspects that can be learned:

  • timing/traffic analysis
  • protocol formats, i.e. format of protocol packets
    [e.g. using Discoverer, Dispatcher, Tupni, ...]
  • protocol state-machine
    [e.g. using LearnLib]
  • both protocol format & state-machine
    [e.g. using Prospex]

slide-37
SLIDE 37

How to reverse engineer?

  • passive vs active learning, i.e. passive observing or active testing
      • active learning involves a form of fuzzing
      • active learning is harder, as it requires more software in the test harness that produces meaningful data
      • these approaches learn different things; passive learning produces statistics on normal use, active learning will more aggressively try out strange things
  • black box vs white box, i.e. only observing in/output or also looking inside running code

slide-38
SLIDE 38

Reverse engineering encrypted traffic?

  • Can we reverse engineer protocol formats if traffic is encrypted? Say, for a botnet
  • Trace the encrypted data through the code, to see where it gets decrypted, and then look at the parsing and case distinctions made on the buffer containing the decrypted data
  • Such white-box analyses of encrypted traffic, by looking at handling of data after decryption, are done by ReFormat and TaintScope

slide-39
SLIDE 39

Active learning with Angluin’s L* algorithm

Basic idea: compare a deterministic system’s response to

  • a
  • b ; a

If the response is different, then [Figure: state diagram in which b leads to a different state], otherwise [Figure: state diagram in which b leads back to the same state]
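The comparison at the heart of this idea can be sketched in a few lines. The system BlackBox and both helper functions are invented for illustration; the full L* algorithm organises many such queries in an observation table, which this sketch does not attempt.

```python
class BlackBox:
    """Hypothetical deterministic system: 'a' is only answered with 'ok' after a 'b'."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.armed = False

    def step(self, inp: str) -> str:
        if inp == "b":
            self.armed = True
            return "ack"
        return "ok" if self.armed else "err"


def response_after(system, prefix, inp: str) -> str:
    """Reset, replay the prefix, then return the response to inp."""
    system.reset()
    for p in prefix:
        system.step(p)
    return system.step(inp)


def leads_to_new_state(system, prefix, inp: str) -> bool:
    """True if inp is answered differently after the prefix than from the initial state."""
    return response_after(system, (), inp) != response_after(system, prefix, inp)


box = BlackBox()
print(leads_to_new_state(box, ("b",), "a"))  # compares a with b ; a
print(leads_to_new_state(box, ("a",), "a"))  # compares a with a ; a
```

A False result only means this one probe saw no difference; the learner then assumes the same state until some later query proves otherwise.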

slide-40
SLIDE 40

Active learning with L*

Implemented in the LearnLib library. The learner builds a hypothesis H of what the real system M is.

[Figure: the Learner (holding H) sends reset and input to the Teacher (holding M) and receives output; it also asks the equivalence query "M = H?", answered with yes or a counterexample]

Equivalence can only be approximated in a black box setting, by doing model-based testing to see if a difference can be detected.

slide-41
SLIDE 41

Learning set-up for EMV banking cards

[Figure: the Learner (hypothesis H) exchanges abstract instructions and responses (instruction INS, status word SW) with a test harness, which exchanges concrete instructions and responses (2 byte INS + args, data + SW) with the card M, acting as Teacher]

[Fides Aarts et al, Formal models of banking cards for free, SECTEST 2013]

slide-42
SLIDE 42

Test harness for EMV

Our test harness implements standard EMV instructions, e.g.

  • SELECT (to select application)
  • INTERNAL AUTHENTICATE (for a challenge-response)
  • VERIFY (to check the PIN code)
  • READ RECORD
  • GENERATE AC (to generate application cryptogram)

LearnLib then tries to learn all possible combinations

  • Most commands with fixed parameters, but some with different options

slide-43
SLIDE 43

Maestro application on Volksbank bank card

[Figure: learned state machine, raw result]

slide-44
SLIDE 44

Maestro application on Volksbank bank card

[Figure: learned state machine, after merging arrows with identical outputs]

slide-45
SLIDE 45

Maestro application on Volksbank card

[Figure: learned state machine, after merging all arrows with same start & end state]
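Merging all arrows with the same start and end state is a simple grouping step. A possible sketch, assuming the learned machine is available as a list of (source, label, target) triples; the function merge_arrows and the example transitions are made up here and are not taken from the learned card models.

```python
from collections import defaultdict


def merge_arrows(transitions):
    """Group transition labels by (source, target), one merged arrow per pair."""
    merged = defaultdict(set)
    for src, label, dst in transitions:
        merged[(src, dst)].add(label)
    return {edge: sorted(labels) for edge, labels in merged.items()}


# Invented example transitions.
raw = [
    ("selected", "READ RECORD", "selected"),
    ("selected", "VERIFY", "selected"),
    ("selected", "GENERATE AC", "done"),
    ("done", "READ RECORD", "done"),
]

for (src, dst), labels in merge_arrows(raw).items():
    print(f"{src} -> {dst}: {' / '.join(labels)}")
```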

slide-46
SLIDE 46

Formal models of banking cards for free!

  • Experiments with Dutch, German and Swedish banking and credit cards
  • Learning takes between 9 and 26 minutes
  • Editing by hand to merge arrows and give sensible names to states
      • could be automated
  • Limitations
      • We do not try to learn response to incorrect PIN as cards would quickly block...
      • We cannot learn about one protocol step which requires knowledge of card’s secret 3DES key
      • We would also like to learn some integer parameter used in protocol
  • No security problems found, but interesting insight in implementations

slide-47
SLIDE 47

SecureCode application on Rabobank card

used for internet banking, hence entering PIN with VERIFY obligatory

slide-48
SLIDE 48

Understanding & comparing implementations

[Figure: learned state machines of the Volksbank Maestro implementation and the Rabobank Maestro implementation]

Are both implementations correct & secure? And compatible?

Presumably they both passed a Maestro-approved compliance test suite...

slide-49
SLIDE 49

Differences between TLS implementations (work in progress)

[Figure: learned state machines of GnuTLS and OpenSSL]

slide-50
SLIDE 50

Using such protocol state diagrams

  • Analysing the models by hand, or with model checker, for flaws
      • to see if all paths are correct & secure
  • Fuzzing or model-based testing
      • using the diagram as basis for “deeper” fuzz testing
      • e.g. fuzzing also parameters of commands
      • which Erik Boss did for SSH
  • Program verification
      • proving that there is no functionality beyond that in the diagram, which using testing you can never establish
      • which we did for MIDPSSH, using ESC/Java2
  • Using it when doing a manual code review
      • which we did for OpenSSH

slide-51
SLIDE 51

Learning human interfaces?

We would like to extend such learning to also take into account the human user interface (keyboard & display).

Then reverse engineering the state diagram of an ATM or smartcard reader could be automated.

E.g., security bug in ABN-AMRO’s e.dentifier2 could have been found by automated learning.

[Arjan Blom et al, Designed to Fail: a USB-connected reader for online banking, NORDSEC 2012]

slide-52
SLIDE 52

Conclusions

  • Various forms of fuzzing are great techniques to spot some security flaws
  • More advanced forms of (protocol) fuzzing and automated reverse engineering (or learning) are closely related
  • State machines are a great specification formalism
      • easy to draw on white boards, typically omitted in official specs
    and you can extract them for free from implementations
      • using standard, off-the-shelf tools like LearnLib

Useful for security analysis of protocol implementations

  • for reverse engineering, fuzz testing, code reviews, or formal program verification

slide-53
SLIDE 53

Questions?