Semantics: Application to C Programs
Lecture Slides by Dr. Marie-Christine Jakobs
Prof. Dr. Dirk Beyer
Dirk.Beyer@sosy-lab.org
SoSy-Lab, LMU Munich, Germany

Organization
Prof. Dr. Dirk Beyer, SoSy-Lab, LMU Munich, Germany

Lecture and Tutorial
Lecture: Feb 27, 2019, 10:00 – 14:00, Munich, Oettingenstr. 67, C003
Tutorial: Feb 27, 2019, 14:00 – 15:30, Munich, Oettingenstr. 67, C003

Course Material
https://www.sosy-lab.org/Teaching/2019-SS-Semantik/

Introduction

Software Analysis
Computes an (over-)approximation of a program’s behavior.
Applications
◮ Optimization
◮ Correctness (i.e., whether the program satisfies a given property)

Software Verification
Formally proves whether a program P satisfies a property ϕ.
◮ Requires program semantics, i.e., the meaning of the program
◮ Relies on mathematical methods:
  ◮ logic
  ◮ induction
  ◮ ...

Software Verification
Formally proves whether a program P satisfies a property ϕ.
[Figure: program P and property ϕ are the inputs of a verifier, which answers TRUE (✓) or FALSE (×).]
Disprove (×): Find a program execution (counterexample) that violates the property ϕ.
Prove (✓): Show that every execution of the program satisfies the property ϕ.

What Could an Analysis Find Out?
    double divTwiceCons(double y) {
      int cons = 5;
      int d = 2 * cons;
      if (cons != 0)
        return y/(2 * cons);
      else
        return 0;
    }

Some Analysis Results
    double divTwiceCons(double y) {
      int cons = 5;
      // expression 2*cons has value 10
      // variable d not used
      int d = 2 * cons;
      if (cons != 0)
        // expression 2*cons evaluated before
        return y/(2 * cons);
      else
        // dead code
        return 0;
    }

One Resulting Code Optimization
Applying the analysis results (constant value of 2*cons, unused variable d, dead else-branch) to divTwiceCons yields:
    double divTwiceConsOptimized(double y) {
      return y/10;
    }

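The optimization can be sanity-checked by running both versions on a few inputs. The following is a minimal C sketch of the two functions from the slides; the helper `sameResult` is a hypothetical name introduced here, and agreeing on sample inputs is only evidence of equivalence, not a proof.

```c
#include <stdbool.h>

/* Original function from the slide: cons is always 5, so the else-branch is dead. */
double divTwiceCons(double y) {
    int cons = 5;
    int d = 2 * cons;   /* unused, kept only to mirror the slide */
    (void)d;
    if (cons != 0)
        return y / (2 * cons);
    else
        return 0;
}

/* Version after constant propagation and removal of dead code and unused variables. */
double divTwiceConsOptimized(double y) {
    return y / 10;
}

/* Hypothetical helper: compares both versions on a single input. */
bool sameResult(double y) {
    return divTwiceCons(y) == divTwiceConsOptimized(y);
}
```

Both functions divide by the same value 10, so they should agree on every input; a verifier would establish this for all inputs instead of a sample.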
Does This Code Work?
    double avgUpTo(int[] numbers, int length) {
      double sum = 0;
      for (int i=0; i<length; i++)
        sum += numbers[i];
      return sum/(double)length;
    }

Problems With This Code
    double avgUpTo(int[] numbers, int length) {
      double sum = 0;
      for (int i=0; i<length; i++)
        // possible null pointer access (numbers==null)
        // index out of bounds (length>numbers.length)
        // integer overflow
        sum += numbers[i];
      // division by zero (length==0)
      return sum/(double)length;
    }

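Some of these problems can be guarded against explicitly. The sketch below is one possible C rendering, not the authors' code: the name `avgUpToChecked` and the out-parameter `avg` are assumptions made for this example. The bounds problem cannot be checked inside the function in C, because a pointer carries no length, so it remains an obligation of the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Computes the average of the first `length` entries of `numbers` into *avg.
 * Returns false, without computing anything, when the input is unusable.
 * Assumption: the caller guarantees that `numbers` holds at least `length`
 * elements; this cannot be verified here. */
bool avgUpToChecked(const int *numbers, int length, double *avg) {
    if (numbers == NULL || avg == NULL)   /* guards the null pointer access */
        return false;
    if (length <= 0)                      /* guards the division by zero */
        return false;
    double sum = 0;
    for (int i = 0; i < length; i++)
        sum += numbers[i];
    *avg = sum / (double)length;
    return true;
}
```

Returning a status flag keeps the error case separate from the numeric result, so a caller cannot mistake a failure for an average of 0.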
Why Should One Care for Bugs?
Costs
◮ Intel Pentium FDIV bug
◮ Ariane V88
◮ Mars Polar Lander
◮ ...
Safety-criticality (endanger human lives)
◮ Therac-25
◮ Uber autonomous car
◮ ...

Analysis and Verification Tools
Sapienz, Klee, PeX, SLAM, Infer, Lint, Error Prone, SpotBugs, UltimateAutomizer, CBMC, CPAchecker, ...

Overview on Analysis and Verification Techniques
[Figure: taxonomy of techniques. Boxes shown: Type Systems; Dynamic (Runtime Verification, Testing); Static (Interactive: Theorem Proving; Automatic: Program Analysis, Model Checking); Program Analysis (Dataflow Analysis, Abstract Interpretation). A marker "This lecture" highlights part of the taxonomy.]

Why Different Static, Automatic Techniques?
Rice's theorem: Any non-trivial semantic property of programs is undecidable.
Consequences: Techniques are
◮ incomplete, e.g., answer UNKNOWN, or
◮ unsound, i.e., report
  ◮ false alarms (non-existing bugs),
  ◮ false proofs (missed bugs).

Verifier Design Space
[Figure: two verifiers, each taking program P and property ϕ as input and answering TRUE (✓), UNKNOWN, or FALSE (×).]
◮ Ideal verifier
◮ Unreliable verifier: TRUE may be a false proof or correct; FALSE may be a false alarm or a violation

Verifier Design Space
[Figure: two verifiers, each taking program P and property ϕ as input and answering TRUE (✓), UNKNOWN, or FALSE (×).]
◮ Overapproximating verifier (superset of program behavior) without precise counterexample check: FALSE may be a false alarm or a violation
◮ Underapproximating verifier (subset of program behavior): TRUE may be a false proof or correct

Other Reasons to Use Different Static Techniques
◮ State space grows exponentially with the number of variables
◮ (Syntactic) paths grow exponentially with the number of branches
⇒ Precise techniques may require too many resources (memory, time, ...)
⇒ Trade-off between precision and costs

Flow-Insensitivity
Order of statements not considered
E.g., does not distinguish between these two programs
Program 1:
    x=0;
    y=x;
    x=x+1;
Program 2:
    x=0;
    x=x+1;
    y=x;
⇒ very imprecise

Flow-Sensitivity Plus Path-Insensitivity
◮ Takes order of statements into account
◮ Mostly ignores infeasibility of syntactic paths
◮ Ignores branch correlations
E.g., does not distinguish between these two programs
Program 1:
    if (x>0)
      y=1;
    else
      y=0;
    if (x>0)
      y=y+1;
    else
      y=y+2;
Program 2:
    if (x>0)
      y=1;
    else
      y=0;
    if (x>0)
      y=y+2;
    else
      y=y+1;

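Why the two programs look the same to such an analysis can be seen by tracking, at each point, the set of values y may have. The following C sketch assumes a deliberately simple setting that is not from the slides: value sets are bitmasks over 0..31, and the names `ValueSet`, `assignConst`, `addConst`, `join`, and `analyzePathInsensitive` are hypothetical.

```c
/* Set of possible values of y as a bitmask: bit v set means "y may be v".
 * Only the values 0..31 are representable, which suffices for this example. */
typedef unsigned int ValueSet;

/* Effect of "y = c;": afterwards y is exactly c. */
static ValueSet assignConst(int c) { return 1u << c; }

/* Effect of "y = y + k;" for a small k >= 0: every possible value moves up by k. */
static ValueSet addConst(ValueSet s, int k) { return s << k; }

/* Path-insensitive handling of a branch: both sides are assumed possible and
 * their results are merged, regardless of the branch condition. */
static ValueSet join(ValueSet a, ValueSet b) { return a | b; }

/* Analyzes: if (x>0) y=1; else y=0;  if (x>0) y=y+thenAdd; else y=y+elseAdd; */
ValueSet analyzePathInsensitive(int thenAdd, int elseAdd) {
    ValueSet y = join(assignConst(1), assignConst(0));
    return join(addConst(y, thenAdd), addConst(y, elseAdd));
}
```

For both programs the result is the set {1, 2, 3}, because the analysis forgets which branch of the first `if` was taken before it reaches the second.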
Path-Sensitivity
◮ Takes (execution) paths into account
◮ Excludes infeasible syntactic paths (not necessarily all infeasible ones)
◮ Covers flow-sensitivity
Example:
    if (x>0)
      y=1;
    else
      y=0;
    if (x>0)
      y=y+2;
    else
      y=y+1;
To detect that y has value 0, 1, or 3 (and never 2)
◮ must exclude the infeasible syntactic path along the first else-branch and the second if-branch
◮ need to detect the correlation between the if-conditions
◮ requires path-sensitivity
⇒ very precise

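The correlation between the two conditions can be made explicit by splitting on the truth value of `x>0` once and reusing it for both branches. The following C sketch is an illustration under that assumption; `runProgram`, `analyzePathSensitive`, and the bitmask encoding of value sets are hypothetical names and choices, not part of the lecture.

```c
/* The example program wrapped in a function so that it can be executed. */
int runProgram(int x) {
    int y;
    if (x > 0) y = 1; else y = 0;
    if (x > 0) y = y + 2; else y = y + 1;
    return y;
}

/* Set of possible final values of y as a bitmask: bit v set means "y may be v". */
typedef unsigned int ValueSet;

/* Path-sensitive sketch: one case per truth value of x>0, used for both ifs,
 * so the two infeasible syntactic paths are never explored. */
ValueSet analyzePathSensitive(void) {
    ValueSet result = 0;
    for (int xPositive = 0; xPositive <= 1; xPositive++) {
        int y = xPositive ? 1 : 0;        /* first if */
        y = xPositive ? y + 2 : y + 1;    /* second if, same truth value */
        result |= 1u << y;
    }
    return result;
}
```

The final values found are 1 and 3. The value 2 would only arise on a path that takes different branches in the two `if` statements, and no execution does that.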
Precision vs. Costs
[Figure: spectrum from imprecise and cheap to precise and expensive. Sensitivity levels along the axis: flow-insensitive, flow-sensitive, path-sensitive. Techniques placed on the spectrum: Dataflow Analysis, Abstract Interpretation, Program Analysis, Model Checking.]

Program Syntax and Semantics

Programs
Theory: simple while-programs
◮ Restriction to integer constants and variables
◮ Minimal set of statements (assignment, if, while)
◮ Techniques easier to teach/understand
Practice: C programs
◮ Widely-used language
◮ Tool support

While-Programs
◮ Arithmetic expressions
    aexpr := ℤ | var | -aexpr | aexpr op_a aexpr
  ◮ op_a: standard arithmetic operation like +, −, /, %, ...
◮ Boolean expressions
    bexpr := aexpr | aexpr op_c aexpr | !bexpr | bexpr op_b bexpr
  ◮ integer value 0 ≡ false, remaining values represent true
  ◮ op_c: comparison operator like <, <=, >=, >, ==, !=
  ◮ op_b: logic connective like && (∧), || (∨), ^ (xor), ...
◮ Program
    S := var=aexpr; | while bexpr S | if bexpr S else S | if bexpr S | S;S

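The grammar for arithmetic expressions maps directly onto a small tree data type with a recursive evaluator. The following C sketch is one possible encoding, assumed for illustration only: the type `AExpr`, the functions `evalA` and `asBool`, and the representation of variables as indices into an integer array are all hypothetical, and only the operators listed on the slide are covered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical AST for aexpr: constant, variable, negation, binary operation. */
typedef enum { A_CONST, A_VAR, A_NEG, A_BIN } AKind;

typedef struct AExpr {
    AKind kind;
    int value;                  /* A_CONST: the integer constant */
    int var;                    /* A_VAR: index into the variable store */
    char op;                    /* A_BIN: one of '+', '-', '/', '%' */
    const struct AExpr *left;   /* A_NEG, A_BIN: (left) operand */
    const struct AExpr *right;  /* A_BIN: right operand */
} AExpr;

/* Evaluates e in a store that maps variable indices to integer values.
 * Assumption: divisors are non-zero; this sketch does not check that. */
int evalA(const AExpr *e, const int *store) {
    switch (e->kind) {
    case A_CONST: return e->value;
    case A_VAR:   return store[e->var];
    case A_NEG:   return -evalA(e->left, store);
    case A_BIN: {
        int l = evalA(e->left, store);
        int r = evalA(e->right, store);
        switch (e->op) {
        case '+': return l + r;
        case '-': return l - r;
        case '/': return l / r;
        default:  return l % r;   /* '%' */
        }
    }
    }
    return 0;  /* not reached for well-formed expressions */
}

/* An arithmetic expression used as bexpr: 0 is false, any other value is true. */
bool asBool(const AExpr *e, const int *store) {
    return evalA(e, store) != 0;
}
```

With a store in which variable 0 holds 7, the expression `x % 2` evaluates to 1 and therefore counts as true, while `x - x` evaluates to 0 and counts as false.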
Syntax vs. Semantics
Syntax: Representation of a program
Semantics: Meaning of a program

How to Represent a Program?
1. Source code
    if (x>0)
      abs = x;
    else
      abs = -x;
    i = 1;
    while (i<abs)
      i = 2*i;
◮ Basically a sequence of characters
◮ No explicit information about the structure or paths of programs

How to Represent a Program?
2. Abstract-syntax tree (AST)
    Program
      Sequence
        if
          Condition: x>0
          if-Block
            Assignment: abs=x;
          else-Block
            Assignment: abs=-x;
        Sequence
          Assignment: i=1;
          while
            Condition: i<abs
            while-Block
              Assignment: i=2*i;
◮ Hierarchical representation
◮ Flow, paths hard to detect

How to Represent a Program?
3. Control-flow graph
[Figure: nodes x>0 (with TRUE and FALSE successors), abs=x;, abs=-x;, i=1;, i<abs (with TRUE and FALSE successors), i=2*i;]
4. Control-flow automaton
[Figure: locations l_0 to l_6; edges labeled x>0, !(x>0), abs=x;, abs=-x;, i=1;, i<abs, !(i<abs), i=2*i;]

Control-Flow Automaton
Definition: A control-flow automaton (CFA) is a three-tuple P = (L, l_0, G) consisting of
◮ the set L of program locations (domain of program counter),
◮ the initial program location l_0 ∈ L, and
◮ the control-flow edges G ⊆ L × Ops × L.

Operations Ops
Two types
◮ Assumes (boolean expressions)
◮ Assignments (var=aexpr;)

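The tuple definition and the two kinds of operations translate into a small data structure. The following C sketch encodes a CFA for the example program from the "How to Represent a Program?" slides. The type and function names, the use of integers for locations, the storage of operations as source text, and the exact assignment of location numbers to edges are assumptions made for this illustration.

```c
#include <stddef.h>

/* The two kinds of operations on CFA edges. */
typedef enum { OP_ASSUME, OP_ASSIGN } OpKind;

/* One control-flow edge (l, op, l') from G ⊆ L × Ops × L.
 * Locations are integers; the operation is kept as source text for simplicity. */
typedef struct {
    int from;
    OpKind kind;
    const char *op;
    int to;
} Edge;

/* A CFA P = (L, l_0, G) with L = {0, ..., numLocations-1}. */
typedef struct {
    int numLocations;
    int initial;
    const Edge *edges;
    size_t numEdges;
} CFA;

/* Edges of the example program; the location numbering is an assumption. */
static const Edge exampleEdges[] = {
    {0, OP_ASSUME, "x>0",      1},
    {0, OP_ASSUME, "!(x>0)",   2},
    {1, OP_ASSIGN, "abs=x;",   3},
    {2, OP_ASSIGN, "abs=-x;",  3},
    {3, OP_ASSIGN, "i=1;",     4},
    {4, OP_ASSUME, "i<abs",    5},
    {4, OP_ASSUME, "!(i<abs)", 6},
    {5, OP_ASSIGN, "i=2*i;",   4},
};

static const CFA exampleCfa = {
    7, 0, exampleEdges, sizeof exampleEdges / sizeof exampleEdges[0]
};

/* Counts the control-flow edges that leave location l. */
size_t countSuccessors(const CFA *cfa, int l) {
    size_t n = 0;
    for (size_t i = 0; i < cfa->numEdges; i++)
        if (cfa->edges[i].from == l)
            n++;
    return n;
}
```

In this encoding a branch appears as two assume edges with complementary conditions leaving the same location, and the loop appears as an edge that leads back to the loop head.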