SAT and SMT algorithms
Paul Jackson, School of Informatics, University of Edinburgh
Formal Verification, Spring 2018
Basic question
Given a propositional logic formula, is it satisfiable?
Standard to always put formulas into Conjunctive Normal Form (CNF).
◮ By introducing new variables this can be done with only constant-factor growth in formula size.
Terminology
◮ An atom p is a propositional symbol.
◮ A literal l is an atom p or the negation of an atom ¬p.
◮ A clause C is a disjunction of literals l₁ ∨ ... ∨ lₙ.
◮ A CNF formula F is a conjunction of clauses C₁ ∧ ... ∧ Cₘ.
Abstract rules for DPLL
Core algorithms used in SAT and SMT solvers are derived from the DPLL algorithm (Davis, Putnam, Logemann, Loveland) from 1962.
Here we present the algorithms using an abstract rule-based system due to Nieuwenhuis, Oliveras and Tinelli.
◮ General structure of algorithms easy to see
◮ Can work through simple examples on paper
General approach
◮ Try to incrementally build a satisfying truth assignment M for a CNF formula F
◮ Grow M by
  ◮ guessing the truth value of a literal not assigned in M
  ◮ deducing a truth value from the current M and F.
◮ If we reach a contradiction (M ⊨ ¬C for some C ∈ F), undo some assignments in M and start growing M again in a different way.
◮ If all variables of F are assigned in M and there is no contradiction, a satisfying assignment has been found for F
◮ If the possibilities for M are exhausted and no satisfying assignment is found, F is unsatisfiable
Assignments and States
States: fail or M ∥ F, where
◮ M is a sequence of literals and decision points • denoting a partial truth assignment
◮ F is a set of clauses denoting a CNF formula
The first literal after each • is called a decision literal.
Decision points start suffixes of M that might be discarded when choosing a new search direction.
Def: If M = M₀ • M₁ • ··· • Mₙ where each Mᵢ contains no decision points
◮ Mᵢ is decision level i of M
◮ M[i] = M₀ • ··· • Mᵢ
Initial and final states
Initial state
◮ () ∥ F₀
Expected final states
◮ fail if F₀ is unsatisfiable
◮ M ∥ G otherwise, where
  ◮ G is equivalent to F₀
  ◮ M satisfies G
Classic DPLL rules
Decide: M ∥ F ⟹ M • l ∥ F
  if l or ¬l occurs in a clause of F, and l is undefined in M
UnitPropagate: M ∥ F, C ∨ l ⟹ M l ∥ F, C ∨ l
  if M ⊨ ¬C, and l is undefined in M
Fail: M ∥ F, C ⟹ fail
  if M ⊨ ¬C, and • ∉ M
Backtrack: M • l N ∥ F, C ⟹ M ¬l ∥ F, C
  if M • l N ⊨ ¬C, and • ∉ N
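The four classic rules can be read as a small program over a trail M. The following is a minimal sketch rather than a tuned solver, assuming literals are nonzero integers (negation by sign) and using `0` as a stand-in for the decision point •; the names `value`, `find_unit` and `dpll` are illustrative. Conflicts are checked first, then UnitPropagate, then Decide.

```python
DECISION = 0  # stands in for the decision point "•"; literals are nonzero ints

def value(lit, trail):
    """True/False if lit is assigned by the trail, None if undefined."""
    if lit in trail:
        return True
    if -lit in trail:
        return False
    return None

def find_unit(formula, trail):
    """Return l from some clause C ∨ l with M ⊨ ¬C and l undefined, else None."""
    for clause in formula:
        undefined = [l for l in clause if value(l, trail) is None]
        others_false = all(value(l, trail) is False
                           for l in clause if l not in undefined)
        if len(undefined) == 1 and others_false:
            return undefined[0]
    return None

def dpll(formula):
    """Return a satisfying list of literals, or None for the fail state."""
    trail = []  # the sequence M
    atoms = sorted({abs(l) for clause in formula for l in clause})
    while True:
        conflict = any(all(value(l, trail) is False for l in clause)
                       for clause in formula)
        if conflict:
            if DECISION not in trail:
                return None                      # Fail
            # Backtrack: M • l N  becomes  M ¬l
            i = len(trail) - 1 - trail[::-1].index(DECISION)
            flipped = -trail[i + 1]
            del trail[i:]
            trail.append(flipped)
            continue
        unit = find_unit(formula, trail)
        if unit is not None:
            trail.append(unit)                   # UnitPropagate
            continue
        undefined = [a for a in atoms if value(a, trail) is None]
        if not undefined:
            return [l for l in trail if l != DECISION]
        trail += [DECISION, undefined[0]]        # Decide (smallest atom, positive)

# ¬x1 ∨ x2,  ¬x3 ∨ x4,  ¬x5 ∨ ¬x6,  x6 ∨ ¬x5 ∨ ¬x2
example = [[-1, 2], [-3, 4], [-5, -6], [6, -5, -2]]
print(dpll(example))
```

The Decide step here always picks the smallest undefined atom with positive polarity, which is an arbitrary choice made for brevity; any other choice of undefined literal is equally allowed by the rule.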
Strategies for applying rules
◮ There are many heuristics for choosing the literal l in the Decide rule.
  ◮ MOMS: choose the literal with the Maximum number of Occurrences in Minimum Size clauses.
  ◮ VSIDS: choose the literal that has most frequently been involved in recent conflict clauses.
◮ UnitPropagate is applied with higher priority than Decide since it does not introduce branching in the search
  ◮ Typically many UnitPropagate applications for each Decide
  ◮ BCP (Boolean Constraint Propagation): repeated application of UnitPropagate
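As an illustration of the MOMS heuristic, the sketch below counts literal occurrences in the shortest clauses that are not yet satisfied. It assumes integer literals (negation by sign) and an assignment given as the set of literals currently true; the function name `moms_literal` and the tie-breaking (first literal encountered) are choices made for this example only.

```python
from collections import Counter

def moms_literal(formula, assigned):
    """Pick a Decide literal by MOMS; `assigned` is the set of literals true in M."""
    open_clauses = []
    for clause in formula:
        if any(l in assigned for l in clause):
            continue  # clause already satisfied, ignore it
        # keep only the literals that are still undefined
        open_clauses.append([l for l in clause if -l not in assigned])
    if not open_clauses:
        return None
    smallest = min(len(c) for c in open_clauses)
    counts = Counter(l for c in open_clauses if len(c) == smallest for l in c)
    if not counts:
        return None  # an open clause is empty, i.e. there is a conflict
    return counts.most_common(1)[0][0]

# The two shortest clauses both contain literal 1, so MOMS picks it.
print(moms_literal([[1, 2], [1, -3], [2, 3, 4]], set()))
```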
Strategies for applying rules (cont)
◮ After each Decide or UnitPropagate, check for a conflicting clause: a clause C for which M ⊨ ¬C. If there is a conflicting clause, Backtrack or Fail is applied immediately to avoid pointless search.
Example execution
C₁ = ¬x₁ ∨ x₂, C₂ = ¬x₃ ∨ x₄, C₃ = ¬x₅ ∨ ¬x₆, C₄ = x₆ ∨ ¬x₅ ∨ ¬x₂
Each row gives the assignment M, the value (0, 1, or u for undefined) of each literal occurrence in C₁ to C₄ in the order written above, and the rule applied next.

M                           C₁    C₂    C₃    C₄      Rule
()                          u u   u u   u u   u u u   Decide x₁
• x₁                        0 u   u u   u u   u u u   UnitProp C₁
• x₁ x₂                     0 1   u u   u u   u u 0   Decide x₃
• x₁ x₂ • x₃                0 1   0 u   u u   u u 0   UnitProp C₂
• x₁ x₂ • x₃ x₄             0 1   0 1   u u   u u 0   Decide x₅
• x₁ x₂ • x₃ x₄ • x₅        0 1   0 1   0 u   u 0 0   UnitProp C₃
• x₁ x₂ • x₃ x₄ • x₅ ¬x₆    0 1   0 1   0 1   0 0 0   Backtrack C₄
• x₁ x₂ • x₃ x₄ ¬x₅         0 1   0 1   1 u   u 1 0   Decide ¬x₆
• x₁ x₂ • x₃ x₄ ¬x₅ • ¬x₆   0 1   0 1   1 1   0 1 0

◮ Last state here is final: no further rules apply
◮ Derivation shows that C₁ ∧ C₂ ∧ C₃ ∧ C₄ is satisfiable
◮ Final M is a satisfying assignment
Implication graphs
An implication graph describes the dependencies between literals in an assignment
◮ 1 node per assigned literal
◮ Node label l@i indicates literal l is assigned true at decision level i.
◮ Roots of graph (nodes without in-edges) are literals in M₀ and decision literals
◮ Edges l₁ → l, ···, lₙ → l added if unit propagation with clause ¬l₁ ∨ ··· ∨ ¬lₙ ∨ l sets literal l
  ◮ Each edge labelled with clause
◮ When current assignment is conflicting with conflicting clause ¬l₁ ∨ ··· ∨ ¬lₙ, then conflict node κ and edges l₁ → κ, ···, lₙ → κ are added
  ◮ Each edge labelled with conflicting clause
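Partial implication graph example
Only shows current decision-level nodes and immediately-preceding nodes.
C₁ = ¬a ∨ ¬b ∨ c, C₂ = ¬c ∨ d, C₃ = ¬d ∨ ¬f, C₄ = ¬d ∨ e ∨ g, C₅ = f ∨ ¬g
[Figure: implication graph]
◮ Nodes: b@2, ¬e@1, a@4 (decision literal), c@4, d@4, ¬f@4, g@4, κ
◮ Edges labelled C₁: a → c, b → c
◮ Edge labelled C₂: c → d
◮ Edge labelled C₃: d → ¬f
◮ Edges labelled C₄: d → g, ¬e → g
◮ Edges labelled C₅: ¬f → κ, g → κ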
Backjump clause inference
The implication graph enables inference of new clauses entailed by the current formula F and made false by the current assignment.
◮ Consider any cut of an implication graph with
  ◮ On right: conflict node κ
  ◮ On left: decision literal for current level and all literals at lower levels
◮ If the literals on the immediate left of the cut are l₁, ..., lₙ, then we can infer the new clause (l₁ ∧ ··· ∧ lₙ) ⇒ false, or equivalently ¬l₁ ∨ ··· ∨ ¬lₙ
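Clause inference example
C₁ = ¬a ∨ ¬b ∨ c, C₂ = ¬c ∨ d, C₃ = ¬d ∨ ¬f, C₄ = ¬d ∨ e ∨ g, C₅ = f ∨ ¬g
[Figure: the implication graph of the previous example (nodes b@2, ¬e@1, a@4, c@4, d@4, ¬f@4, g@4, κ) with two cuts marked, Cut 1 and Cut 2]
◮ Cut 1: literals on immediate left are b, a, ¬e. Backjump clause: ¬b ∨ ¬a ∨ e
◮ Cut 2: literals on immediate left are d, ¬e. Backjump clause: ¬d ∨ e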
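The smaller backjump clause of this example can also be obtained by resolution, which is how solvers commonly compute such cuts. The sketch below starts from the conflicting clause C₅ and resolves with the clauses that propagated the most recent literals until only one literal of the current decision level remains. It assumes the encoding a..g = 1..7 with negation by sign, and reads the example clauses as C₁ = ¬a ∨ ¬b ∨ c, C₂ = ¬c ∨ d, C₃ = ¬d ∨ ¬f, C₄ = ¬d ∨ e ∨ g, C₅ = f ∨ ¬g; the function and variable names are illustrative.

```python
def resolve(c1, c2, atom):
    """Resolvent of clauses c1 and c2 (sets of ints) on the given atom."""
    return {l for l in c1 | c2 if abs(l) != atom}

def analyse_conflict(conflict, trail, level, reason, current):
    """Resolve backwards along the trail until one current-level literal is left."""
    clause = set(conflict)
    for lit in reversed(trail):
        at_current = [l for l in clause if level[abs(l)] == current]
        if len(at_current) <= 1:
            break
        if -lit in clause and lit in reason:
            clause = resolve(clause, set(reason[lit]), abs(lit))
    return clause

a, b, c, d, e, f, g = range(1, 8)
trail = [-e, b, a, c, d, -f, g]              # order of assignment
level = {e: 1, b: 2, a: 4, c: 4, d: 4, f: 4, g: 4}
reason = {                                   # clause that propagated each literal
    c: [-a, -b, c],                          # C1
    d: [-c, d],                              # C2
    -f: [-d, -f],                            # C3
    g: [-d, e, g],                           # C4
}
learned = analyse_conflict([f, -g], trail, level, reason, current=4)  # C5 conflicts
print(learned)  # the clause ¬d ∨ e, as a set of integers
```

Resolving C₅ with C₄ on g gives f ∨ ¬d ∨ e, and resolving that with C₃ on f gives ¬d ∨ e, which has d as its only literal from level 4.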
Backjumping
If
◮ the current assignment has form M • l N, and
◮ the inferred clause has form C′ ∨ l′ where l′ is the only literal at the current decision level, and
◮ all literals of C′ are assigned in M,
then it is legitimate to
◮ backjump, setting the assignment to M, and
◮ noting that C′ ∨ l′ has exactly one literal unassigned in M, apply unit propagation to extend the assignment to M l′.
Such a clause C′ ∨ l′ is called a backjump clause.
A backjump clause can always be formed using the decision literal from the current level.
Smaller backjump clauses can sometimes be discovered that exploit unique implication points (UIPs): literals on every path from the current decision literal to the conflict node κ.
Backjump rule
Replaces and generalises the Backtrack rule in modern DPLL implementations.
Backjump: M • l N ∥ F, C ⟹ M l′ ∥ F, C
  if M • l N ⊨ ¬C, and there is some clause C′ ∨ l′ such that:
  − F, C ⊨ C′ ∨ l′,
  − M ⊨ ¬C′,
  − l′ is undefined in M, and
  − l′ or ¬l′ occurs in F or in M • l N
◮ C is the conflicting clause
◮ C′ ∨ l′ is the backjump clause
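The rule leaves open which prefix M to return to. A common choice, assumed in the sketch below, is the shortest prefix in which every literal of C′ is already assigned, i.e. the highest decision level occurring in C′. The trail is represented as (literal, level) pairs with integer literals; this representation and the name `backjump` are illustrative.

```python
def backjump(trail, clause, current_level):
    """Apply Backjump with backjump clause `clause` (C′ ∨ l′).

    trail: list of (literal, decision level) pairs for the assignment M • l N.
    Returns the new trail M l′.
    """
    level_of = {abs(lit): lvl for lit, lvl in trail}
    # l′ is the unique clause literal whose atom was assigned at the current level
    (l_prime,) = [l for l in clause if level_of[abs(l)] == current_level]
    # jump to the highest level mentioned in C′ (level 0 if C′ is empty)
    target = max((level_of[abs(l)] for l in clause if l != l_prime), default=0)
    kept = [(lit, lvl) for lit, lvl in trail if lvl <= target]
    return kept + [(l_prime, target)]

# Hypothetical trail with atoms a..g = 1..7: ¬e@1, b@2, then a, c, d, ¬f, g at level 4.
trail = [(-5, 1), (2, 2), (1, 4), (3, 4), (4, 4), (-6, 4), (7, 4)]
print(backjump(trail, [-4, 5], current_level=4))  # backjump clause ¬d ∨ e
```

With the backjump clause ¬d ∨ e, the only lower level involved is level 1 (where e was set false), so levels 2 to 4 are discarded and ¬d is added by unit propagation at level 1.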
Learning
Learn: M ∥ F ⟹ M ∥ F, C
  if each atom of C occurs in F or in M, and F ⊨ C
◮ Commonly, the learned clauses C are the backjump clauses from the Backjump rule.
◮ Learned clauses record information about parts of the search space to be avoided in future search
◮ CDCL (Conflict Driven Clause Learning) = Backjump + Learn
Forgetting
Forget: M ∥ F, C ⟹ M ∥ F
  if F ⊨ C
◮ Applied to clauses C considered less important.
◮ Essential for controlling growth of required storage.
◮ Performance can degrade as F grows, so shrinking F can improve performance.
Restarting
Restart: M ∥ F ⟹ () ∥ F
◮ Only used if F has grown through learning.
◮ The additional knowledge causes Decide heuristics to work differently and often explore the search space in a more compact way.
◮ To preserve completeness, restarts are applied repeatedly with increasing period, i.e. with ever longer intervals between successive restarts.
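One simple way to obtain ever longer intervals between restarts is a geometric schedule on the number of conflicts allowed before the next Restart. The sketch below is only an illustration: the starting limit of 100 conflicts and the growth factor of 1.5 are hypothetical values, and the slides do not prescribe any particular schedule.

```python
def restart_limits(first=100, factor=1.5, count=5):
    """Yield `count` conflict limits, each `factor` times the previous one."""
    limit = float(first)
    for _ in range(count):
        yield int(limit)
        limit *= factor

# Number of conflicts to allow before each successive restart.
print(list(restart_limits()))
```

Because the limits grow without bound, the solver is eventually given a run long enough to finish without being interrupted.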
Why is DPLL correct? 1
Lemma (1 - nature of reachable states)
Assume () ∥ F ⟹* M ∥ F′. Then
1. F and F′ are equivalent
2. If M is of the form M₀ • l₁ M₁ ··· • lₙ Mₙ where all Mᵢ are •-free, then F, l₁, ..., lᵢ ⊨ Mᵢ for all i in 0 ... n.
Lemma (2 - nature of final states)
If () ∥ F ⟹* S and S is final (no further transitions possible), then either
1. S = fail, or
2. S = M ∥ F′ where M ⊨ F
Why is DPLL correct? 2
Lemma (3 - transition sequences never go on for ever)
Every derivation () ∥ F ⟹ S₁ ⟹ S₂ ⟹ ··· is finite.
Proof. Given M of the form M₀ • M₁ ··· • Mₙ where all Mᵢ are •-free, define the rank of M, ρ(M), as ⟨r₀, r₁, ..., rₙ⟩ where rᵢ = |Mᵢ|. Every derivation must be finite as each basic DPLL rule strictly increases the rank in a lexicographic order and the image of ρ is finite.
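The rank argument can be checked on a concrete derivation. The sketch below computes ρ(M) for a sequence of assignments written as strings, with "•" as the decision point, and compares successive ranks using Python's tuple ordering, which is lexicographic and treats a proper prefix as smaller. The state sequence is a Decide / UnitPropagate / Backtrack run over six atoms x1..x6, written out by hand here as an assumed example.

```python
DECISION = "•"

def rank(m):
    """ρ(M): tuple of the sizes |M_0|, |M_1|, ..., |M_n| of the decision levels."""
    sizes = [0]
    for item in m:
        if item == DECISION:
            sizes.append(0)      # a decision point opens a new level
        else:
            sizes[-1] += 1       # a literal extends the current level
    return tuple(sizes)

states = [
    "",
    "• x1",
    "• x1 x2",
    "• x1 x2 • x3",
    "• x1 x2 • x3 x4",
    "• x1 x2 • x3 x4 • x5",
    "• x1 x2 • x3 x4 • x5 -x6",
    "• x1 x2 • x3 x4 -x5",        # after Backtrack
    "• x1 x2 • x3 x4 -x5 • -x6",
]
ranks = [rank(s.split()) for s in states]
for r in ranks:
    print(r)
print(all(r1 < r2 for r1, r2 in zip(ranks, ranks[1:])))
```

Note how Backtrack shortens the tuple but still increases it: ⟨0, 2, 2, 2⟩ becomes ⟨0, 2, 3⟩, which is larger at the third position.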
Why is DPLL correct? 3
Theorem (1 - termination in fail state)
If () ∥ F ⟹* S and S is final, then
1. if S is fail, then F is unsatisfiable
2. if F is unsatisfiable, then S is fail