2005-05-04

Static Analysis methods and tools
An industrial study
Pär Emanuelsson, Ericsson AB and LiU
Prof Ulf Nilsson, LiU

Outline
- Why static analysis?
- What is it? Underlying technology
- Some tools (Coverity, Klocwork, PolySpace, …)
- Some case studies from Ericsson
- Conclusions

2012-10-30

Method used
- Tool comparison based on
  - White papers
  - Research reports from research groups behind tools
  - Interviews with Ericsson staff
  - Interviews with technical staff from tool vendors

What is SA and what can it be used for?
- Definition:
  - Analysis that does not actually run the code
- Our interest is:
  - Finding defects (preventing run-time errors)
  - Finding security vulnerabilities
- Other uses
  - Code optimization (e.g. removing run-time checks in safe languages)
  - Metrics
  - Impact analysis

Pros and cons of static analysis
- Pros
  - No test case design needed
  - No test oracle needed
  - May detect hard-to-find bugs
  - Analyzed program need not be complete
  - Stub writing easier
- Cons
  - Potentially large number of "false positives"
  - Does not relate to functional requirements
  - Takes programming competence to understand reports

Comparison to other techniques
- Compared to testing
  - No test case design needed
  - No test oracle needed
  - Can find defects that no amount of testing can find
- Compared to formal proofs (e.g. model checking)
  - More lightweight
  - SA is much easier to use
  - SA does not need formal requirements

Software defects and errors
- Software defect: an anomaly in code that might manifest itself as an error at run-time
- Types of defects found by static analysis
  - Abrupt termination (e.g. division by zero)
  - Undefined behavior (e.g. array index out of bounds)
  - Performance degradation (e.g. memory leaks, dead code)
  - Security vulnerabilities (e.g. buffer overruns, tainted data)
- Defects not (easily) found by static analysis
  - Functional incorrectness
  - Infinite loops/non-termination

Examples of checkers (C code)
- Null pointer dereference
- Uninitialized data
- Buffer/array overruns
- Dead code/unused data
- Bad return values
- Return pointers to local data
- Arithmetic operations with undefined result
- Arithmetic over-/underflow
- (Stack use)
- (Non-termination)

Security vulnerabilities
- Unsafe system calls
- Weak encryption
- Access problems
- Unsafe string operations
- Buffer overruns
- Race conditions (time-of-check, time-of-use)
- Command injections
- Tainted (untrusted) data

Buffer overflow

    char dst[256];
    char *s = read_string();
    strcpy(dst, s);   /* overflows dst when s holds 256 or more characters */
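One way to remove the overflow on the Buffer overflow slide is to tell the copy how large the destination is. The following is a minimal sketch, assuming that silently truncating over-long input is acceptable to the caller; `copy_bounded` is a hypothetical helper name introduced here for illustration and does not come from the slides or from any of the tools discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the slides): copy src into dst without ever
 * writing more than dst_size bytes. Returns 1 if the whole string fit,
 * 0 if it was truncated or dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    if (dst_size == 0)
        return 0;                      /* no room even for the terminator */

    size_t len = strlen(src);
    size_t n = (len < dst_size) ? len : dst_size - 1;

    memcpy(dst, src, n);               /* copies at most dst_size - 1 bytes */
    dst[n] = '\0';                     /* result is always terminated */
    return len < dst_size;
}
```

With `char dst[256]`, the call `copy_bounded(dst, sizeof dst, s)` would replace the `strcpy`; the return value lets the caller decide whether truncated input should be rejected.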

Imprecision of analyses
- The defects checked for by static analysis are undecidable properties
- Analyses are therefore necessarily imprecise
- As a consequence
  - Code that is reported may be correct (false positives)
  - Code that is not reported may be defective (false negatives)
- Classic approaches to static analysis (sound analyses) report all defects checked for (no false negatives), but sometimes produce large numbers of false positives
- Most industrial systems try to eliminate false positives but introduce false negatives as a consequence

Example

    int fact(int n) {
    1)  int f = 1;
    2)  while( n > 0 ) {
    3)    f = f * n;
    4)    n = n - 1;
        }
    5)  return f;
    }

Control Flow Graph (CFG)

    1: f = 1        -> 2
    2: n > 0        -> 3 (yes), 5 (no)
    3: f = f * n    -> 4
    4: n = n - 1    -> 2
    5: return f
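
Program states (configurations)
- A program state is a mapping (function) from program variables to values. For example

  $\sigma_1 = \{ n \mapsto 1, f \mapsto 0 \}$
  $\sigma_2 = \{ n \mapsto 3, f \mapsto 0 \}$
  $\sigma_3 = \{ n \mapsto 5, f \mapsto 0 \}$

Semantic equations
- We associate a set $x_i$ of states with node $i$ of the CFG (the set of states that can be observed upon reaching the node)

  $x_1 = \{ \{ n \mapsto 1, f \mapsto 0 \}, \{ n \mapsto 3, f \mapsto 0 \} \}$ (example)
  $x_2 = \{ \sigma \mid \sigma' \in x_1 \wedge \sigma(n) = \sigma'(n) \wedge \sigma(f) = 1 \} \cup \{ \sigma \mid \sigma' \in x_4 \wedge \sigma(n) = \sigma'(n) - 1 \wedge \sigma(f) = \sigma'(f) \}$
  $x_3 = \{ \sigma \mid \sigma \in x_2 \wedge \sigma(n) > 0 \}$
  $x_4 = \{ \sigma \mid \sigma' \in x_3 \wedge \sigma(n) = \sigma'(n) \wedge \sigma(f) = \sigma'(f) \cdot \sigma'(n) \}$
  $x_5 = \{ \sigma \mid \sigma \in x_2 \wedge \sigma(n) \le 0 \}$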
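The semantic equations can be solved by starting from empty sets and re-applying them until no set grows. The sketch below does this for the factorial example under stated assumptions: states are restricted to n in 0..3 and f in 0..6, which is enough for the entry states n = 1 and n = 3, and the names `StateSet`, `add`, `step`, `collect`, `N_MAX` and `F_MAX` are illustrative choices, not from the slides.

```c
#include <assert.h>
#include <string.h>

#define N_MAX 3   /* assumption: entry values of n lie in 0..3 */
#define F_MAX 6   /* 3! = 6 bounds f for those inputs */

/* A set of states, one flag per (n, f) pair. */
typedef struct { unsigned char has[N_MAX + 1][F_MAX + 1]; } StateSet;

static void add(StateSet *s, int n, int f)
{
    assert(n >= 0 && n <= N_MAX && f >= 0 && f <= F_MAX);
    s->has[n][f] = 1;
}

/* Apply the equations for x[2]..x[5] once; x[1] holds the entry states.
 * Returns 1 if any set grew. */
static int step(StateSet x[6])
{
    StateSet old[6];
    memcpy(old, x, sizeof old);

    for (int n = 0; n <= N_MAX; n++) {
        for (int f = 0; f <= F_MAX; f++) {
            if (x[1].has[n][f]) add(&x[2], n, 1);             /* after f = 1 */
            if (x[4].has[n][f]) add(&x[2], n - 1, f);         /* after n = n - 1 */
            if (x[2].has[n][f] && n > 0)  add(&x[3], n, f);   /* loop test holds */
            if (x[3].has[n][f]) add(&x[4], n, f * n);         /* after f = f * n */
            if (x[2].has[n][f] && n <= 0) add(&x[5], n, f);   /* loop test fails */
        }
    }
    return memcmp(old, x, sizeof old) != 0;
}

/* Caller fills x[1]; all other sets start empty and grow to the fixpoint. */
void collect(StateSet x[6])
{
    memset(&x[0], 0, sizeof x[0]);
    for (int i = 2; i < 6; i++)
        memset(&x[i], 0, sizeof x[i]);
    while (step(x)) {
        /* sets only grow and are finite, so this loop terminates */
    }
}
```

The order in which the equations are applied differs from a hand-run on paper, but because the sets only ever grow, any order reaches the same least solution.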

Example run

    Initially x1 = x2 = x3 = x4 = x5 = ∅

    x1 = {{n=1,f=0},{n=3,f=0}}
    x2 = {{n=1,f=1},{n=3,f=1}}
    x3 = {{n=1,f=1},{n=3,f=1}}
    x4 = {{n=1,f=1},{n=3,f=3}}
    x2 = {{n=0,f=1},{n=1,f=1},{n=2,f=3},{n=3,f=1}}
    x3 = {{n=1,f=1},{n=2,f=3},{n=3,f=1}}
    x4 = {{n=1,f=1},{n=2,f=6},{n=3,f=3}}
    x2 = {{n=0,f=1},{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}}
    x3 = {{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}}
    x4 = {{n=1,f=1},{n=1,f=6},{n=2,f=6},{n=3,f=3}}
    x2 = {{n=0,f=1},{n=0,f=6},{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}}
    x3 = {{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}}
    x5 = {{n=0,f=1},{n=0,f=6}}

Abstract descriptions of data

    ?  = the set of all integers
    +  = the set of all positive integers
    0  = the set { 0 }
    -  = the set of all negative integers
    ⊥  = the empty set (= unreachable)

    Ordering: ⊥ lies below +, 0 and -, which all lie below ?
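The five abstract descriptions can be represented as a small enumeration. The sketch below is one possible encoding, not the representation used by any of the tools; the names `Sign`, `sign_of`, `lub` and `covers` are chosen here for illustration, with `TOP` standing for "?" and `BOT` for the empty set.

```c
#include <assert.h>

/* Abstract values from the slide: BOT = unreachable, TOP = any integer ("?"). */
typedef enum { BOT, NEG, ZERO, POS, TOP } Sign;

/* Abstraction of a single concrete integer. */
Sign sign_of(int v)
{
    return v > 0 ? POS : (v < 0 ? NEG : ZERO);
}

/* Least upper bound: the smallest description that covers both a and b. */
Sign lub(Sign a, Sign b)
{
    if (a == BOT) return b;
    if (b == BOT) return a;
    return a == b ? a : TOP;   /* two different non-empty signs only fit in "?" */
}

/* Does description d include the concrete value v? */
int covers(Sign d, int v)
{
    assert(d >= BOT && d <= TOP);
    return d == TOP || (d != BOT && d == sign_of(v));
}
```

Because the domain has only five elements, every chain of `lub` applications stabilises after a few steps, which is what makes iteration over such descriptions terminate.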

Abstract operations

Abstract multiplication (row = left operand, column = right operand)

         |  ?   +   0   -
    -----+----------------
      ?  |  ?   ?   0   ?      ? : any integer
      +  |  ?   +   0   -      + : > 0
      0  |  0   0   0   0      0 : = 0
      -  |  ?   -   0   +      - : < 0

Abstract operations

Abstract subtraction (row = left operand, column = right operand)

         |  ?   +   0   -
    -----+----------------
      ?  |  ?   ?   ?   ?      ? : any integer
      +  |  ?   ?   +   +      + : > 0
      0  |  ?   -   0   +      0 : = 0
      -  |  ?   -   -   ?      - : < 0
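The two tables translate directly into lookup arrays. This is a sketch under one assumption that the slides do not state: an operation with an unreachable (bottom) operand is treated as unreachable. The names `Sign`, `MUL`, `SUB`, `idx`, `abs_mul` and `abs_sub` are illustrative.

```c
#include <assert.h>

typedef enum { BOT, NEG, ZERO, POS, TOP } Sign;   /* TOP is "?" on the slides */

/* Row = left operand, column = right operand, both in the order ?, +, 0, -. */
static const Sign MUL[4][4] = {
    /*  ?     +     0     -   */
    { TOP,  TOP,  ZERO, TOP  },   /* ? */
    { TOP,  POS,  ZERO, NEG  },   /* + */
    { ZERO, ZERO, ZERO, ZERO },   /* 0 */
    { TOP,  NEG,  ZERO, POS  },   /* - */
};

static const Sign SUB[4][4] = {
    /*  ?    +    0     -   */
    { TOP, TOP, TOP,  TOP },      /* ? */
    { TOP, TOP, POS,  POS },      /* + */
    { TOP, NEG, ZERO, POS },      /* 0 */
    { TOP, NEG, NEG,  TOP },      /* - */
};

/* Map a non-bottom sign to its table index (order ?, +, 0, -). */
static int idx(Sign s)
{
    assert(s != BOT);
    switch (s) {
    case TOP:  return 0;
    case POS:  return 1;
    case ZERO: return 2;
    default:   return 3;          /* NEG */
    }
}

/* Assumption: any operation on an unreachable operand is unreachable. */
Sign abs_mul(Sign a, Sign b)
{
    return (a == BOT || b == BOT) ? BOT : MUL[idx(a)][idx(b)];
}

Sign abs_sub(Sign a, Sign b)
{
    return (a == BOT || b == BOT) ? BOT : SUB[idx(a)][idx(b)];
}
```

For instance, `abs_sub(POS, POS)` yields `TOP`: subtracting one positive number from another can give a positive, zero or negative result, so the sign is lost.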

Abstract semantic equations

  $x_1 = \{ n = +, f = ? \}$
  $x_2 = \{ n = \mathrm{lub}(x_1(n), x_4(n) \ominus +), f = \mathrm{lub}(+, x_4(f)) \}$
  $x_3 = \{ n = +, f = x_2(f) \}$
  $x_4 = \{ n = x_3(n), f = x_3(f) \otimes x_3(n) \}$
  $x_5 = \{ n = ?, f = x_2(f) \}$

  Here $\ominus$ and $\otimes$ denote abstract subtraction and abstract multiplication, and $\mathrm{lub}(A,B)$ is the smallest description that contains both $A$ and $B$ (a kind of set union).

Example abstract run

    Initially x1 = x2 = x3 = x4 = x5 = { n=⊥, f=⊥ }

    x1 = { n=(+), f=? }
    x2 = { n=(+), f=(+) }
    x3 = { n=(+), f=(+) }
    x4 = { n=(+), f=(+) }
    x2 = { n=?,   f=(+) }
    x3 = { n=(+), f=(+) }
    x5 = { n=?,   f=(+) }
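The abstract run can be reproduced mechanically by iterating the abstract semantic equations from the all-unreachable start until nothing changes. The sketch below does so under the assumption that operations on unreachable operands stay unreachable; the names `Sign`, `AbsState`, `lub`, `abs_mul`, `abs_sub`, `step` and `analyze_fact` are illustrative, and the sign rules are written as conditionals equivalent to the two abstract operation tables.

```c
#include <assert.h>
#include <string.h>

typedef enum { BOT, NEG, ZERO, POS, TOP } Sign;   /* TOP is "?" on the slides */
typedef struct { Sign n, f; } AbsState;

static Sign lub(Sign a, Sign b)
{
    if (a == BOT) return b;
    if (b == BOT) return a;
    return a == b ? a : TOP;
}

/* Abstract multiplication; unreachable operands stay unreachable (assumption). */
static Sign abs_mul(Sign a, Sign b)
{
    if (a == BOT || b == BOT) return BOT;
    if (a == ZERO || b == ZERO) return ZERO;
    if (a == TOP || b == TOP) return TOP;
    return a == b ? POS : NEG;
}

/* Abstract subtraction a - b. */
static Sign abs_sub(Sign a, Sign b)
{
    if (a == BOT || b == BOT) return BOT;
    if (b == ZERO) return a;
    if (a == ZERO) return b == POS ? NEG : (b == NEG ? POS : TOP);
    if (a == TOP || b == TOP || a == b) return TOP;
    return a;                       /* (+) - (-) = (+) and (-) - (+) = (-) */
}

/* Apply the five equations once; x[1..5] are the CFG nodes, x[0] is unused.
 * Returns 1 if any description changed. */
static int step(AbsState x[6])
{
    AbsState old[6];
    memcpy(old, x, sizeof old);

    x[1] = (AbsState){ POS, TOP };
    x[2] = (AbsState){ lub(x[1].n, abs_sub(x[4].n, POS)), lub(POS, x[4].f) };
    x[3] = (AbsState){ POS, x[2].f };
    x[4] = (AbsState){ x[3].n, abs_mul(x[3].f, x[3].n) };
    x[5] = (AbsState){ TOP, x[2].f };

    return memcmp(old, x, sizeof old) != 0;
}

/* Iterate from "everything unreachable" until a fixpoint is reached. */
void analyze_fact(AbsState x[6])
{
    int rounds = 0;
    for (int i = 0; i < 6; i++)
        x[i] = (AbsState){ BOT, BOT };
    while (step(x)) {
        rounds++;
        assert(rounds < 100);       /* finite domain: a few rounds suffice */
    }
}
```

After `analyze_fact`, `x[5]` describes the states at `return f`: the sign of `n` is unknown while `f` is positive, so for positive inputs the analysis can rule out a zero or negative result without running the program.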

SA techniques
1. Pattern matching
2. Data flow analysis
3. Value analysis
   1. Intervals
   2. Aliasing analysis
   3. Variable dependencies
4. Abstract interpretation

Examples of dataflow analysis
- Reaching definitions (which definitions reach a point)
- Liveness (variables that may be read before they are redefined)
- Definite assignment (a variable is always assigned before it is read)
- Available expressions (already computed expressions)
- Constant propagation (replace variable with value)
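As an illustration of one of the dataflow analyses listed, the sketch below computes liveness for the five-node CFG of the factorial example. It uses the standard textbook equations (live-in = use ∪ (live-out minus def), live-out = union of the successors' live-in) and is not taken from any of the tools; the use/def sets and names such as `VAR_N`, `VAR_F`, `use_set`, `def_set`, `succ` and `liveness` are assumptions made for this example.

```c
#include <assert.h>

enum { VAR_N = 1 << 0, VAR_F = 1 << 1 };   /* one bit per variable */
#define NODES 6                             /* nodes 1..5; index 0 is unused */

/* Variables read (use) and written (def) by each CFG node of fact(). */
static const unsigned use_set[NODES] = { 0, 0, VAR_N, VAR_N | VAR_F, VAR_N, VAR_F };
static const unsigned def_set[NODES] = { 0, VAR_F, 0, VAR_F, VAR_N, 0 };

/* Up to two successors per node; 0 means "no successor". */
static const int succ[NODES][2] = {
    { 0, 0 }, { 2, 0 }, { 3, 5 }, { 4, 0 }, { 2, 0 }, { 0, 0 }
};

/* Fill live_in[1..5] by iterating the liveness equations to a fixpoint. */
void liveness(unsigned live_in[NODES])
{
    int changed = 1;

    for (int i = 0; i < NODES; i++)
        live_in[i] = 0;

    while (changed) {
        changed = 0;
        for (int i = NODES - 1; i >= 1; i--) {     /* backward analysis */
            unsigned out = 0;
            for (int k = 0; k < 2; k++)
                if (succ[i][k] != 0)
                    out |= live_in[succ[i][k]];

            unsigned in = use_set[i] | (out & ~def_set[i]);
            if (in != live_in[i]) {
                live_in[i] = in;
                changed = 1;
            }
        }
    }
    assert(live_in[5] == VAR_F);   /* sanity check: only f is read at the return */
}
```

At node 1 only `n` is live: `f` is assigned there before any read, so whatever value `f` held on entry can never be observed.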

Aliasing

    x[i] = 5          x = 5
    x[j] = 10         y = 10
      = x[i]            = x

Tool comparison

    Tool              Coverity     Klocwork      Polyspace    Flexelint
    Language          C/C++/Java   C/C++/Java    C/C++/Ada    C/C++
    Program size      MLOC         MLOC          60 KLOC      MLOC
    Soundness         Unsound      Unsound       Sound        Unsound
    False positives   few          few           many         many
    Analysis          def, sec     def, sec, met def          def
    Incrementality    yes          no            no           no

    (def = defects, sec = security vulnerabilities, met = metrics)
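Returning to the array fragment on the Aliasing slide: the value read back depends on whether the two indices refer to the same element, which is exactly what an aliasing analysis has to determine. A minimal sketch, with the hypothetical function name `read_after_two_writes` introduced only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* The array fragment from the slide wrapped in a function. Whether the
 * second write overwrites the first depends on i == j, which a static
 * analysis may not be able to decide. */
int read_after_two_writes(int *x, size_t i, size_t j)
{
    assert(x != NULL);
    x[i] = 5;
    x[j] = 10;
    return x[i];      /* 10 if i == j (the writes alias), otherwise 5 */
}
```

An analysis that cannot prove `i != j` has to assume both outcomes are possible, which is one source of the imprecision discussed earlier.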

Coverity Prevent
- Company founded in 2002
- Originates from Dawson Engler's research at Stanford
- Well documented through research papers
- Commonly viewed as the market-leading product
- Good results from Homeland Security's audit project
- Coverity Extend allows user-defined checks (Metal language)
- Good explanations of faults
- Good support for libraries
- Incremental

Klocwork K7
- Company founded by a development group at Nortel in 2001
- Similar to Coverity (in checkers provided)
- Besides finding defects: refactoring, code metrics, architecture analysis
- Easy to get started and use
- Good explanations of faults
- Good support for foreign libraries

Polyspace Verifier/Desktop
- French company co-founded by students of Patrick Cousot in 1999. Acquired by MathWorks in 2007.
- Claims to intercept 100% of the runtime errors checked for in C/C++/Ada programs
- Customers in the airline industry and the European space program (embedded software)
- Very thorough, especially on arithmetic
- Can be slow and produces many false positives
- Documentation hard to read
- Restricted support for security vulnerabilities and management of dynamic memory