SMT-BASED ANALYSIS OF BIOLOGICAL SYSTEMS
Nicola Paoletti, CS department, Oxford University
Molecular Programming and Biological Computation reading group, 8th June 2016
Motivation
Information processing in biosystems → Biological computation
• Examples:
  • Synthetic biology
  • Molecular computing
• Applications:
  • Smart therapeutics
  • Biofuels, bioremediation, …
[Figure: Medicine in 2050: "Doctor in a Cell", with labels "Molecular Input" and "Programmable Computer"]
U. Shapiro, E. Shapiro, "How many computers can fit into a drop of water?"
Motivation
Understand life
• Development
• Disease onset and progression
• Target discovery, pluripotent stem cells, …
Dunn, S-J., et al., Science 344.6188 (2014): 1156-1160.
Motivation
Formal computational modelling and analysis
• Verified behaviour of synthetic biosystems / DNA circuits
• Verify known biological hypotheses
• Derive new hypotheses, suggest lab experiments
• Design automation of molecular programs
SMT-based analysis
• Expressive framework
• Supports both verification and synthesis
• Scalable (can handle models of practical interest)
Outline
TODAY
• Overview of SMT solving
• Bounded Model Checking
• Analysis of Chemical Reaction Networks
NOT TODAY
• SAT and SMT algorithms
• Analysis of Gene Regulatory Networks
SMT SOLVING
SAT
Boolean Satisfiability Problem (SAT)
• IN: Boolean formula (CNF), e.g. φ = (x_1 ∨ ¬x_2) ∧ x_2
• OUT: is φ satisfiable? + Truth assignment
• NP-complete problem
• Modern SAT solvers can handle millions of variables and clauses
• Established technique for verification and synthesis of hardware and logic circuits
• Limited expressiveness (cannot encode e.g. arithmetic or datatypes)
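To make the IN/OUT contract concrete, here is a minimal sketch that decides the slide's formula φ by exhaustive enumeration. It is only a stand-in for a real SAT solver (which would use far more sophisticated search), and the helper name `brute_force_sat` is hypothetical.

```python
from itertools import product


def phi(x1: bool, x2: bool) -> bool:
    """The slide's CNF formula: (x1 OR NOT x2) AND x2."""
    return (x1 or not x2) and x2


def brute_force_sat(formula, num_vars):
    """Return the first satisfying assignment as a tuple, or None if UNSAT.

    Hypothetical helper: tries all 2**num_vars assignments, so it only
    illustrates the problem statement, not how real solvers work.
    """
    for assignment in product([False, True], repeat=num_vars):
        if formula(*assignment):
            return assignment
    return None


model = brute_force_sat(phi, 2)
print("SAT" if model is not None else "UNSAT", model)
```

The only satisfying assignment sets both variables to true: the unit clause x_2 forces x_2, and then the first clause forces x_1.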
SMT
Satisfiability Modulo Theories Problem (SMT)
• IN: First-Order-Logic (FOL) formula over one or more background theories, e.g. φ = x_1 ≤ x_2 ⟹ f(x_1) ≤ f(x_2)
• OUT: is φ satisfiable? + Interpretation of the free variables in their respective domains
• In the general case, FOL is undecidable [Church, Turing]
• But in SMT, background theories fix the interpretation of (function, predicate and constant) symbols
• With decidable theories, SMT is decidable too
• With undecidable theories, semi-decision procedures often work well in practice
SMT
COMMON (LAZY) APPROACH:
• Provide ad-hoc, specialised theory solvers, able to handle negation and conjunction of atomic propositions
• Integrate with a SAT solver to handle arbitrary Boolean structures
• Methods exist for the combination of multiple theories
APPLICATIONS (in CS):
• Program verification (Boogie, Spec#, …)
• Model checking (BLAST, CBMC, nuXmv, …)
• Symbolic execution
• Program synthesis
Some examples
• (Non-)Linear Integer/Real Arithmetic
  a > b + 2 ∧ a = 2·c + 10 ∧ b + c ≤ 1000    SAT, [a = 10, b = 0, c = 0]
  x·x = x + 2.0 ∧ x·y = x ∧ (y − 1.0)·z = 1.0    UNSAT
• Bit-vectors
  Validity of De Morgan's law: ¬(~(x & y) = (~x | ~y))    UNSAT
Play with http://rise4fun.com/Z3/tutorial/
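The two verdicts that can be checked by hand are sketched below without an SMT solver. The first part plugs the slide's model into the linear integer constraint. The second assumes De Morgan's law in the form ~(x & y) = (~x | ~y) and an 8-bit vector width (both are assumptions, the slide fixes neither), and searches exhaustively for a counterexample. Finding none corresponds to the UNSAT answer for the negated law.

```python
# Part 1: check the claimed model of the linear integer constraint.
a, b, c = 10, 0, 0
linear_ok = (a > b + 2) and (a == 2 * c + 10) and (b + c <= 1000)

# Part 2: De Morgan's law on fixed-width bit-vectors.
WIDTH = 8                 # assumed width, not given on the slide
MASK = (1 << WIDTH) - 1   # keeps results within WIDTH bits


def bvnot(v: int) -> int:
    """Bitwise complement restricted to WIDTH bits."""
    return ~v & MASK


def de_morgan_counterexample():
    """Return (x, y) violating ~(x & y) == (~x | ~y), or None if the law is valid."""
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            if bvnot(x & y) != (bvnot(x) | bvnot(y)):
                return (x, y)
    return None


print("linear model satisfies the constraint:", linear_ok)
print("De Morgan counterexample:", de_morgan_counterexample())
```

An SMT solver reaches the same bit-vector verdict symbolically, without enumerating all 65,536 pairs.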
Some examples
• Strings
  a.b = "abc".a [1]    SAT, [a = 'bcc', b = 'a']
• Uninterpreted Functions
  f(10) > f(2) ∧ f(10) > f(a)    SAT, [a = 3, f(x) = {0 if x = 10, −1 if x = 2, −1 if x = 3, −1 otherwise}]
Less conventional, but useful:
• Optimisation (http://rise4fun.com/z3opt/tutorial/)
• Theory of ODEs: delta-satisfiability (http://dreal.github.io/)
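For the uninterpreted-function example, a model has to supply both a value for the free variable a and an interpretation of f. The sketch below writes the slide's interpretation as an ordinary Python function and evaluates the formula under it. This only checks the model, it does not find one.

```python
def f(x: int) -> int:
    """Interpretation of f from the slide's model: 0 at x = 10, -1 everywhere else."""
    return 0 if x == 10 else -1


a = 3  # value of the free variable a in the slide's model

# Evaluate f(10) > f(2) AND f(10) > f(a) under this interpretation.
holds = f(10) > f(2) and f(10) > f(a)
print(holds)
```

Under the same interpretation of f, choosing a = 10 would falsify the second conjunct, since f(10) > f(10) is false.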
BOUNDED MODEL CHECKING
Bounded Model Checking (BMC) [Biere et al, TACAS 99]
• A SAT/SMT-based method to verify properties on finite paths
• Looks for counter-examples (CEs) of finite length, by negating the property and finding a satisfying assignment
• Based on unrolling the transition relation
INVARIANT/SAFETY PROPERTY
  I(ρ[0]) ∧ ⋀_{0 ≤ i < k} T(ρ[i], ρ[i+1]) ∧ ⋁_{0 ≤ i ≤ k} ¬φ(ρ[i])
• If SAT, the property DOES NOT hold
• If UNSAT, increase the length k until:
  • A CE is found (SAT)
  • The search becomes intractable
  • A fixed bound is reached
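The loop described above can be sketched over an explicit, finite state set. For each bound k the sketch enumerates candidate paths ρ[0..k] and tests the same three ingredients as the formula: the initial condition, the unrolled transition relation, and a violation of φ somewhere on the path. The enumeration replaces the SAT/SMT call, and the function `bmc` and the toy system are hypothetical, for illustration only.

```python
from itertools import product


def bmc(states, init, trans, prop, max_k):
    """Return (k, path) for the smallest bound k <= max_k admitting a
    counter-example path, or None if none exists within max_k.

    Hypothetical helper: brute-force path enumeration stands in for the
    satisfiability check of the unrolled formula.
    """
    for k in range(max_k + 1):
        for path in product(states, repeat=k + 1):
            if not init(path[0]):
                continue  # I(rho[0]) fails
            if not all(trans(path[i], path[i + 1]) for i in range(k)):
                continue  # some T(rho[i], rho[i+1]) fails
            if any(not prop(s) for s in path):
                return k, path  # some state violates the invariant
    return None


# Toy system (assumed): states 0..4, start at 0, each step adds 1,
# invariant "state is never 3".
result = bmc(
    states=range(5),
    init=lambda s: s == 0,
    trans=lambda s, t: t == s + 1,
    prop=lambda s: s != 3,
    max_k=4,
)
print(result)
```

Because k grows from 0, the first path returned is a shortest counter-example, which matches the way BMC is used for bug finding.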
Example
• 2-bit counter
• x_i[j]: value of the j-th bit at the i-th step
• At each step, increment the counter: T(x, x′) = (x′[1] = x[1] ⊕ x[0]) ∧ (x′[0] = ¬x[0])
• Verify (bounded) invariant: φ(x) = ¬x[0] ∨ ¬x[1] (it always holds that at least one of the two bits is 0)
• Initial state: I(x) = ¬x[0] ∧ ¬x[1] (counter set to 00)
Example
I(x) = ¬x[0] ∧ ¬x[1]
φ(x) = ¬x[0] ∨ ¬x[1]
T(x, x′) = (x′[1] = x[1] ⊕ x[0]) ∧ (x′[0] = ¬x[0])
[Diagram: unrolling x_0 → x_1 → x_2 → x_3 through T(x_0, x_1), T(x_1, x_2), T(x_2, x_3); each state x_i has bits x_i[0] and x_i[1]]
NEED TO CHECK:
  I(x_0) ∧ ⋀_{0 ≤ i < 3} T(x_i, x_{i+1}) ∧ ⋁_{0 ≤ i ≤ 3} ¬φ(x_i)
Example
I(x) = ¬x[0] ∧ ¬x[1]
φ(x) = ¬x[0] ∨ ¬x[1]
T(x, x′) = (x′[1] = x[1] ⊕ x[0]) ∧ (x′[0] = ¬x[0])
Step 1: I(x_0) ∧ ¬φ(x_0)    [UNSAT, safe so far]
Step 2: I(x_0) ∧ T(x_0, x_1) ∧ ¬φ(x_1)    [UNSAT, safe so far]
Step 3: I(x_0) ∧ T(x_0, x_1) ∧ T(x_1, x_2) ∧ ¬φ(x_2)    [UNSAT, safe so far]
Step 4: I(x_0) ∧ T(x_0, x_1) ∧ T(x_1, x_2) ∧ T(x_2, x_3) ∧ ¬φ(x_3)    [SAT, CE found: x_3[0]=1, x_3[1]=1, …]
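The four steps can be replayed mechanically. The sketch below encodes I, T and φ as Python predicates over a state given as a pair (x[0], x[1]), then decides each step's formula by trying every assignment to the Boolean variables of the unrolled states. The exhaustive search is a stand-in for a SAT solver, and `check_step` is a hypothetical name.

```python
from itertools import product


def I(x):
    """Initial state: both bits are 0 (counter set to 00)."""
    return (not x[0]) and (not x[1])


def T(x, nx):
    """Increment: nx[1] = x[1] XOR x[0], and nx[0] = NOT x[0]."""
    return nx[1] == (x[1] != x[0]) and nx[0] == (not x[0])


def phi(x):
    """Invariant: at least one of the two bits is 0."""
    return (not x[0]) or (not x[1])


def check_step(n):
    """Decide step n: I(x_0), T along x_0..x_{n-1}, and NOT phi(x_{n-1}).

    Returns the satisfying list of states, or None if the formula is UNSAT.
    """
    for bits in product([False, True], repeat=2 * n):
        xs = [(bits[2 * i], bits[2 * i + 1]) for i in range(n)]
        if (I(xs[0])
                and all(T(xs[i], xs[i + 1]) for i in range(n - 1))
                and not phi(xs[n - 1])):
            return xs
    return None


for n in range(1, 5):
    ce = check_step(n)
    print(f"Step {n}:", "UNSAT, safe so far" if ce is None else f"SAT, CE found: {ce}")
```

Steps 1 to 3 come out UNSAT and step 4 yields the path 00 → 01 → 10 → 11 (written here with x[0] first in each pair), matching the verdicts on the slide.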
The Length Problem
• Finite k → incomplete (cannot capture CEs at k′ > k)
• Complexity of BMC depends on k
• How to find k such that BMC is complete?
• A possible solution is to use the (recurrence) diameter of the transition system: the length of the longest loop-free path (…but diameter computation can be very expensive)
In practice…
• "BMC is normally used for detecting bugs, not for proving their absence."
• So when the property doesn't hold, BMC is efficient, since it returns the CE with minimal length
• Many problems consider bounded properties
• BMC remains the most effective SAT/SMT-based method in general
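As a rough illustration of the completeness bound, the sketch below computes the length of the longest loop-free path from a given start state by depth-first search over an explicit successor function. The function names are hypothetical, and the 2-bit counter is modelled as integers modulo 4 for brevity. The search is exponential in general, which reflects the cost warning above.

```python
def longest_loop_free_path(successors, start):
    """Number of transitions on the longest path from start that never
    revisits a state (hypothetical helper, exponential in the worst case)."""
    def dfs(state, visited):
        best = 0
        for nxt in successors(state):
            if nxt not in visited:
                best = max(best, 1 + dfs(nxt, visited | {nxt}))
        return best

    return dfs(start, frozenset([start]))


def counter_successors(s):
    """2-bit counter as an integer in 0..3: the only successor is s + 1 mod 4."""
    return [(s + 1) % 4]


bound = longest_loop_free_path(counter_successors, 0)
print(bound)
```

For the counter the bound is 3 transitions, so checking all paths up to k = 3 already covers every reachable state.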
Alternative SAT/SMT Methods
• k-induction [Sheeran et al, FMCAD 2000]
• CEGAR [Chauhan et al, FMCAD 2002]
• Interpolation [McMillan, CAV 2003]
• IC3 [Bradley, VMCAI 2011]