Satisfiability: Sudoku’s story
◮ According to Wikipedia, Sudoku was invented by Howard Garns, a retired architect from Connersville, IN, in 1979 and was published regularly in Dell Puzzles
◮ (But surely Leonhard Euler must have known sudokus, because a sudoku is just a Latin square, an object Euler studied, with an additional constraint)
◮ It was popularized by the Japanese (because it is hard to produce meaningful crossword puzzles in Japanese)
◮ A correct Sudoku puzzle possesses exactly one solution (a fact that is sometimes used in the solving process)
Sudokus, cont’d
◮ But what does it have to do with SAT?
◮ Let us start with a set of propositional variables that has exactly 729 elements (why 729?)
◮ We will list all these variables as a three-dimensional array, but instead of a[i][j][k] we will write $a_{i,j,k}$ with $i, j, k$ ranging over 1..9
◮ The intention is that the atom $a_{i,j,k}$ is true if the cell in the $i$th row and $j$th column contains the number $k$
◮ So, a Sudoku problem is then something like this: the designer tells us that a certain number of these 729 variables are true
◮ And she tells us: there is a solution (a fill of the grid), in fact exactly one
◮ And she tells us: find it!
Sudokus, cont’d
◮ If we think about our problem, then we see that the data given to us by the designer is a partial assignment of our 729 variables
◮ The issue is to find an extension of the given partial assignment (data) to a complete assignment so that all variables are assigned values, and the grid is completely filled
◮ Now we have two choices. We can either use paper and pencil to solve it, or maybe, just maybe, out of laziness, we use a computer to solve it
Sudokus, cont’d
◮ We could write a program in C that will search the grid for solutions
◮ Or we could write once and for all a single collection of constraints $T$ so that out of a satisfying assignment for $T \cup D$ (where $D$ is the representation of the data provided by the designer) we can read off the solution
◮ (Of course there are numerous Sudoku solvers on the WWW, so the list of alternatives above is not complete)
Sudokus, cont’d
◮ So we now have these 729 variables and what do we do with them?
◮ We write a couple of clauses. For instance, we want to say that every cell contains at most one number: $\neg a_{i,j,k_1} \lor \neg a_{i,j,k_2}$ whenever $k_1 < k_2$
◮ Actually, for each cell there are $\binom{9}{2}$, i.e. 36, of these. As there are 81 cells, we have $36 \cdot 81 = 2916$ such clauses
◮ But it is not all ...
Sudokus, cont’d
◮ We need to make sure that each row has each number in it: $a_{1,1,1} \lor a_{1,2,1} \lor \ldots \lor a_{1,9,1}$ (And this says ...)
◮ 9 rows, 9 numbers, so altogether 81 such clauses (each of length 9)
◮ And 81 similar clauses for columns ...
Sudokus, cont’d
◮ And now about sections: $a_{1,1,1} \lor a_{1,2,1} \lor a_{1,3,1} \lor \ldots \lor a_{3,3,1}$ (This one was about the symbol 1 in the first section)
◮ Altogether 81 clauses of this sort
◮ Thus 2916 + 243 clauses, i.e. 3159 clauses (assuming there were no errors ...)
◮ The following is pretty obvious: there is a one-to-one correspondence between sudoku solutions and the satisfying assignments for the collection of clauses defined above
Sudokus, cont’d
◮ But this was about grids, not about solutions of the individual sudoku problems
◮ We have the following fact, not especially deep (a kind of Bayes Theorem, only simpler): Let $T$, $S$ be two propositional theories in the language $L_{At}$. Then the models of $T \cup S$ are precisely the models of $T$ that satisfy $S$
◮ So, adding clues (such as $a_{1,7,2}$) selects out of all valid sudoku grids those which satisfy the clues
Sudokus, cont’d
◮ But what are the pragmatics of encoding? Surely solvers cannot understand expressions such as $a_{1,7,2}$
◮ Indeed, there is a standard, called the DIMACS format, for describing SAT (clausal) theories
◮ Actually, it is quite simple; the literals are represented as signed integers, and the number 0 is used as a clause separator
◮ Since the only functors are negation (applied to variables) and disjunction, we drop disjunction, and a clause is modeled as a list of signed integers separated by spaces and ending with 0
◮ We also need to tell the solver how many propositional variables there are, and the number of clauses (although surely we can derive this)
Coding sudoku, cont’d
◮ Let us represent the variable $a_{i,j,k}$ by the integer $81 \cdot (i-1) + 9 \cdot (j-1) + k$ (why $k$ and not $k-1$?)
◮ So, for instance, $a_{1,2,1}$ is 10
◮ And the clause $\neg a_{1,2,1} \lor \neg a_{1,2,7}$ becomes: -10 -16 0
◮ Altogether we get 3159 clauses and then the header line: p cnf 729 3159
◮ (Comment lines begin with c followed by a space as their first two characters)
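The counting and the numbering above can be checked mechanically. The following is a minimal Python sketch, not the script used in the lecture; the names `var`, `sudoku_clauses` and `to_dimacs` are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations

def var(i, j, k):
    """DIMACS index of atom a_{i,j,k} (row i, column j, digit k; all in 1..9)."""
    return 81 * (i - 1) + 9 * (j - 1) + k

def sudoku_clauses():
    """Clauses describing all valid filled grids (no clues yet)."""
    R = range(1, 10)
    clauses = []
    # Every cell holds at most one digit: 81 * C(9,2) = 2916 binary clauses.
    for i in R:
        for j in R:
            for k1, k2 in combinations(R, 2):
                clauses.append([-var(i, j, k1), -var(i, j, k2)])
    # Every digit occurs in every row and in every column: 81 + 81 clauses.
    for k in R:
        for i in R:
            clauses.append([var(i, j, k) for j in R])
        for j in R:
            clauses.append([var(i, j, k) for i in R])
    # Every digit occurs in every 3x3 section: 81 clauses.
    for k in R:
        for bi in (1, 4, 7):
            for bj in (1, 4, 7):
                clauses.append([var(i, j, k)
                                for i in range(bi, bi + 3)
                                for j in range(bj, bj + 3)])
    return clauses

def to_dimacs(clauses, n_vars=729):
    """Render the clauses in DIMACS: header line, then one clause per line ending in 0."""
    lines = [f"p cnf {n_vars} {len(clauses)}"]
    lines += [" ".join(map(str, c)) + " 0" for c in clauses]
    return "\n".join(lines) + "\n"
```

Writing `to_dimacs(sudoku_clauses())` to a file gives the clue-free theory; clues are then appended as unit clauses and the clause count in the header is adjusted.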
Coding sudoku, cont’d
◮ In the process the meaning of this collection of lines as a sudoku description entirely disappears
◮ So a translation of whatever (if anything) the solver returns to us is needed to recover the meaning
◮ Let us observe that knowledge representation is a major step in the use of solvers
◮ Knowledge representation is not automated in SAT; only solving is push-button
Solving
◮ So let us assume we downloaded minisat or any other reasonable SAT solver
◮ We also assume that we wrote a script that produced the sudoku constraints
◮ We also grabbed a sudoku problem from the newspaper
◮ That problem had 32 clues
◮ For instance 4 in the (1,2) cell; this generated the (unit) clause 13 0 (since $81 \cdot 0 + 9 \cdot 1 + 4 = 13$)
◮ We computed all 32 lines describing the clues
◮ Then we added these at the end and changed the header line to p cnf 729 3191
◮ Saved the file as sudoku.cnf
Solving, cont’d
◮ We now run: » minisat sudoku.cnf > res
◮ We get some gibberish telling us which variables are true
◮ We decode the content of the grid that is the solution
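The decoding step simply inverts the numbering $81 \cdot (i-1) + 9 \cdot (j-1) + k$. Here is a small sketch, assuming the solver's answer has already been reduced to a list of signed integers (the exact output layout differs between solvers and is not specified here); `decode` is a hypothetical helper name.

```python
def decode(model):
    """Turn a model (iterable of signed DIMACS literals) into a 9x9 grid of digits.

    Cells for which no atom is true are left as 0.
    """
    grid = [[0] * 9 for _ in range(9)]
    for lit in model:
        if lit > 0:                       # only true atoms carry information
            i, rest = divmod(lit - 1, 81) # i: zero-based row
            j, k = divmod(rest, 9)        # j: zero-based column, k: digit - 1
            grid[i][j] = k + 1
    return grid
```

For instance, the literal 13 puts the digit 4 into row 1, column 2, which matches the clue used as an example earlier.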
Formats
◮ Every Boolean function $f$ is represented by a CNF $T$ so that $f(v) = 1$ if and only if $v \models T$
◮ Thus every Boolean function is represented by a DIMACS file
◮ There are competing formats; we can represent functions by DNFs, and then the natural way to think is in terms of equations $f(v) = 0$
◮ There is a representation standard for And-Inverter Graphs (you sure know who invented it)
◮ An important format is the one for representing BDDs (much liked by EE)
So now we know that solving can be done
◮ But how is it done?
◮ What kind of algorithms are used?
◮ The fundamental algorithm for solving is the so-called DPLL algorithm, named after Davis, Putnam, Logemann and Loveland
◮ This algorithm is based on two basic ideas: (1) Boolean constraint propagation (2) Backtracking search in the space of partial assignments
◮ (Significant improvements to idea (2), based on an analysis of idea (1), resulted in the power of modern solvers)
Boolean constraint propagation (BCP)
◮ The idea is simple: Say I have a clause $C := l_1 \lor l_2 \lor \ldots \lor l_{n-1} \lor l_n$
◮ And I already established that the literals $\bar{l}_1, \ldots, \bar{l}_{n-1}$ are true (whatever it means ...)
◮ Therefore to make the clause $C$ true I have to commit to the truth of $l_n$
◮ (The order of literals in a clause is immaterial)
◮ In terms of partial assignments this means that if my current partial assignment $w$ contains $\bar{l}_1, \ldots, \bar{l}_{n-1}$ then I need to commit to $l_n$
◮ But can I always do this?
◮ What if my $w$ contains $\bar{l}_n$?
BCP, cont’d
◮ We can formalize BCP as an operation $BCP(\cdot, \cdot)$ where the first argument is a partial assignment, and the second argument is a clausal theory
◮ We need to be a bit careful, because the result does not have to be a partial assignment (i.e. a consistent set of literals); it may become inconsistent
◮ Example: we have $w = \{\neg p, q, r\}$, and $T$ contains the clause $p \lor \neg q \lor \neg r$
◮ From $\neg p$ and $q$ we compute $\neg r$, and since $r \in w$ we no longer have a partial assignment, but a set of literals that is inconsistent
◮ This means that the choices we made on our way to the assignment $w$ resulted in an inconsistency and hence a backtrack will be needed
BCP, cont’d
◮ But if we are careful, and extend the definition of BCP so that it accepts a set of literals as the first argument, then we can check that BCP is monotone, so we can use the Knaster-Tarski theorem and claim the existence of a least fixpoint
◮ We are interested in the case when the first argument is a partial assignment, and we would like the output to be a partial assignment, but if it is not, then we also get useful information
◮ But what is BCP, and does it have a logical meaning, or just an algebraic one?
◮ It turns out it has logical, algebraic and graph-theoretical meanings
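To make the fixpoint view concrete, here is a naive Python sketch of BCP on sets of literals. It is an illustration under stated assumptions, not the data structure real solvers use: literals are non-zero integers, `-x` is the dual of `x`, and the names `bcp` and `is_consistent` are hypothetical.

```python
def bcp(lits, clauses):
    """Least fixpoint of unit propagation containing the set `lits`.

    A literal of a clause is added as soon as the duals of all the other
    literals of that clause are present. The result may be inconsistent,
    i.e. contain both x and -x.
    """
    derived = set(lits)
    changed = True
    while changed:                        # iterate until nothing new is derived
        changed = False
        for clause in clauses:
            for lit in clause:
                if lit not in derived and all(
                        -m in derived for m in clause if m != lit):
                    derived.add(lit)
                    changed = True
    return derived

def is_consistent(lits):
    """True iff the set of literals is a partial assignment."""
    return all(-l not in lits for l in lits)
```

With p, q, r numbered 1, 2, 3, the call `bcp({-1, 2, 3}, [[1, -2, -3]])` derives `-3` alongside `3`, so the result is not a partial assignment, exactly as in the example with $w = \{\neg p, q, r\}$. The repeated scan makes this sketch quadratic; it is meant for reading, not for speed.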
Resolution
◮ Resolution is the following rule of proof: $$\frac{C_1 \lor l \qquad C_2 \lor \bar{l}}{C_1 \cup C_2}$$
◮ So what happens is that if one of the clauses contains $l$ and another $\bar{l}$, then we eliminate $l$, $\bar{l}$ from both and take the disjunction of the remaining literals
◮ Example: Given $C_1 := p \lor q \lor \neg r$, $C_2 := p \lor r \lor s$, we conclude the clause $C_1 \otimes C_2 := p \lor q \lor s$
◮ The resolution rule is sound: if an assignment $v$ satisfies both $C_1$ and $C_2$, then it also satisfies $C_1 \otimes C_2$
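As a quick illustration of the rule, clauses can be treated as sets of signed integers. The function name `resolve` below is a hypothetical helper, and the sketch resolves on one explicitly given literal.

```python
def resolve(c1, c2, lit):
    """Resolvent of c1 (which contains lit) and c2 (which contains -lit)."""
    if lit not in c1 or -lit not in c2:
        raise ValueError("the clauses do not clash on the given literal")
    # Drop the clashing pair and take the union of what remains.
    return (frozenset(c1) - {lit}) | (frozenset(c2) - {-lit})
```

With p, q, r, s numbered 1 to 4, `resolve({1, 2, -3}, {1, 3, 4}, -3)` yields the set `{1, 2, 4}`, i.e. the clause $p \lor q \lor s$ of the example.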
Subsumption (weakening)
◮ The subsumption rule is $$\frac{C}{D}$$ where $C \subseteq D$
◮ The subsumption rule is also sound
◮ Quine Theorem: A non-tautological clause $C$ is a consequence of a clausal theory $T$ if and only if some resolution consequence $D$ of $T$ subsumes $C$
◮ Thus the proof system based on resolution and subsumption is complete for clausal logic
◮ (Actually resolution alone is not complete for clausal logic)
◮ A slightly stronger form of completeness, limiting proofs to at most one application of subsumption, is also valid by the second Quine Theorem
Resolution and satisfiability
◮ Resolution is a (partial) binary function on the set of all clauses
◮ So, given a clausal theory $T$, we define $Res(T)$ as the closure of $T$ under resolution
◮ Quine Theorem: a CNF theory $T$ is satisfiable if and only if the empty clause does not belong to $Res(T)$
◮ Unfortunately, there are $O(3^n)$ clauses (why?)
◮ So, testing satisfiability via closure under resolution is not a feasible option
Resolution and BCP
◮ $BCP(v, T)$ is closely related to resolution, though
◮ Specifically, unit resolution is a restriction of resolution where one of the inputs must be a unit
◮ Then it turns out that the literals computed with $BCP(T)$ are precisely the literals that possess a unit resolution proof out of $T$
◮ Thus we now have a proof-theoretic representation of BCP
◮ Of course, a proof is a labeled directed acyclic graph
◮ When we think about BCP in these terms, we get graphs labeled both with clauses and literals
◮ Specifically, the literals labeling immediate predecessors form inputs to the clause labeling the node
◮ (So a node is a kind of machine: on some inputs it produces outputs)
Reduct of a CNF by a partial assignment
◮ The reduct of a clausal theory $T$ by a partial assignment $w$ is defined as follows:
  ◮ If there is a literal $l \in w$ such that $l \in C$, eliminate $C$ (i.e. $C$ becomes null)
  ◮ If there is a literal $l \in w$ such that $\bar{l} \in C$, eliminate $\bar{l}$ from $C$
  ◮ This is done until a fixpoint is reached (i.e. no more eliminations are possible)
◮ The result is called the reduct of $T$ by $w$
◮ (We will denote it by $T / w$)
Reduct and satisfiability
◮ Example: $v = \{p, \bar{q}\}$, $T = \{p \lor r,\ \neg p \lor s \lor t,\ q \lor u\}$
◮ The first clause is eliminated, the second is shortened to $s \lor t$, and the third one is also shortened, producing a new unit literal $u$ (which could be further used if we had more clauses)
◮ (The reduct of a theory by a partial assignment may be contradictory; can you craft an example?)
◮ A clausal theory $T$ is satisfiable if and only if $BCP(T)$ is a partial assignment, and the reduct of $T$ by that partial assignment is satisfiable
◮ In fact, when the r.h.s. is true, every satisfying assignment for the reduct expands to a satisfying assignment for $T$
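The reduct is easy to compute in one pass over the clauses. The sketch below assumes literals are signed integers and `w` is a consistent set of literals; `reduct` is a hypothetical helper name.

```python
def reduct(clauses, w):
    """Reduct T / w of a list of clauses by a partial assignment w."""
    result = []
    for clause in clauses:
        if any(l in w for l in clause):
            continue                      # clause satisfied by w: eliminate it
        # Remove the literals whose duals belong to w.
        result.append(frozenset(l for l in clause if -l not in w))
    return result
```

With p, q, r, s, t, u numbered 1 to 6, `reduct([{1, 3}, {-1, 4, 5}, {2, 6}], {1, -2})` leaves the two clauses `{4, 5}` and `{6}`, matching the example above.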
BCP and satisfiability, cont’d
◮ The last fact of the previous viewgraph tells us that we can use BCP for preprocessing and also for inprocessing in every node of the search tree
◮ (We do not know yet what this search tree looks like, but we know that we will have such a tree)
◮ It turns out that BCP contains much more information, due to the fact that we have the proof-theoretic interpretation, and that there is an ordering of decisions on each branch of the search tree
Basic DPLL algorithm
◮ Given an input CNF theory $T$ we compute $BCP(T)$
◮ If $BCP(T)$ is contradictory we leave with failure
◮ O/w $v := BCP(T)$
◮ If $v_3(T) = 1$ we return any complete assignment of $At$ extending $v$
◮ O/w we select a literal $l$ whose underlying variable is not in the domain of $v$, and set the variable first to 1
◮ We set $w := BCP(v \cup \{l\}, T)$
◮ If $w$ is non-contradictory, we set $v := w$ and call our algorithm recursively (thus a new decision will be made, and the variable first will be reset to 1)
◮ If $w$ is contradictory and first equals 1, we set first to 0 and call the algorithm recursively (with the same $v$, not $w$, and with $\bar{l}$ in place of $l$)
◮ If $w$ is contradictory and first equals 0, then we backtrack to the last decision where first was 1, change first to 0 and continue
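The steps above can be sketched in a few lines of Python. This is a simplified recursive variant, not the bookkeeping described on the slide: the call stack plays the role of the history, and trying `x` before `-x` plays the role of the first bit. Literals are signed integers, and the names `bcp` and `dpll` are hypothetical.

```python
def bcp(lits, clauses):
    """Naive unit propagation: close `lits` under the clauses (may become inconsistent)."""
    derived = set(lits)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            for lit in clause:
                if lit not in derived and all(
                        -m in derived for m in clause if m != lit):
                    derived.add(lit)
                    changed = True
    return derived

def dpll(clauses, v=frozenset()):
    """Return a partial assignment (set of literals) satisfying all clauses, or None."""
    v = bcp(v, clauses)
    if any(-l in v for l in v):
        return None                       # BCP produced a contradiction
    if any(all(-l in v for l in c) for c in clauses):
        return None                       # some clause (e.g. the empty one) is falsified
    if all(any(l in v for l in c) for c in clauses):
        return v                          # every clause is already satisfied
    # Decision: pick a variable that is not yet in the domain of v.
    x = next(abs(l) for c in clauses for l in c
             if l not in v and -l not in v)
    for decision in (x, -x):              # first = 1, then first = 0
        result = dpll(clauses, v | {decision})
        if result is not None:
            return result
    return None                           # both branches failed: backtrack
```

Any complete assignment extending the returned set of literals satisfies the input. Modern solvers replace the plain backtracking in the last lines by conflict analysis, which this sketch does not attempt.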
Data structures
◮ We need appropriate data structures to support this algorithm
◮ The basic one is the history that tells us:
  ◮ What decisions were taken and at what level
  ◮ What were the values of the first bit
  ◮ What were the results of the BCP after the decisions
  ◮ Additional information that will support conflict analysis, for instance the levels of the literals computed by BCP
Autarkies
◮ A partial valuation $v$ touches a clause $C$ if one of the variables on which $v$ is defined occurs (positively or negatively) in $C$
◮ For instance $\{p\}$ touches $\neg p \lor q \lor r$
◮ A partial valuation $v$ is an autarky for a clausal theory $T$ if for all clauses $C$ of $T$, if $v$ touches $C$ then $v$ satisfies $C$ (i.e. $v_3(C) = 1$)
◮ A pure literal for $T$ is a literal $l$ such that $\bar{l}$ does not occur in $T$
◮ Of course, if $l$ is pure in $T$ then $\{l\}$ is an autarky for $T$
◮ Autarkies are very desirable, because if $v$ is an autarky for $T$ then $T$ is satisfiable if and only if the reduct $T / v$ is satisfiable
◮ But unfortunately deciding the existence of nontrivial autarkies is an NP-complete problem (but see below, when we discuss Krom clauses)
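Checking whether a given partial valuation is an autarky is straightforward (finding one is the hard part). A minimal sketch, with valuations as sets of signed integers and the hypothetical helper names `touches` and `is_autarky`:

```python
def touches(v, clause):
    """v touches the clause if some variable of the clause is assigned by v."""
    return any(l in v or -l in v for l in clause)

def is_autarky(v, clauses):
    """v is an autarky if it satisfies every clause it touches."""
    return all(any(l in v for l in c) for c in clauses if touches(v, c))
```

With p, q, r numbered 1, 2, 3, the valuation `{1}` touches `[-1, 2, 3]` but does not satisfy it, so it is not an autarky for a theory containing that clause; a pure literal, on the other hand, always passes the test.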
’Easy cases’ of SAT
◮ There are numerous classes of SAT problems where solving (i.e. searching for a solution) is easy (i.e. in the class P)
◮ In fact there are various hierarchies of SAT problems with increasing degree of difficulty
◮ We will discuss here only 7 classes:
  ◮ Positive formulas
  ◮ Negative formulas
  ◮ Krom formulas
  ◮ Horn formulas
  ◮ Dual-Horn formulas
  ◮ Renameable-Horn formulas
  ◮ Systems of linear equations over $Z_2$ (affine formulas)
Positive formulas
◮ A positive formula is one where on any path from the root to a leaf there is an even number of nodes labeled by the negation symbol
◮ (Equivalently: if we “push negation inward”, the result will have no negations)
◮ It is easy to test a positive CNF for satisfiability; it is satisfiable if and only if none of the clauses is $\bot$
◮ For positive formulas that are not CNFs it is a bit harder, but also easy
◮ Moreover, if such a formula is satisfiable then it is satisfied by the constant function $1_{At}$
Negative formulas
◮ A negative formula is one where on any path from the root to a leaf there is an odd number of nodes labeled by the negation symbol
◮ (Equivalently: if we “push negation inward”, the result will have negations over all atoms in the leaves)
◮ It is easy to test a negative CNF for satisfiability; it is satisfiable if and only if none of the clauses is $\bot$
◮ For negative formulas that are not CNFs it is a bit harder, but also easy
◮ Moreover, if such a formula is satisfiable then it is satisfied by the constant function $0_{At}$
◮ (There is also a proof using permutations, can you guess it?)
Krom CNFs
◮ A Krom clause is one of size $\le 2$
◮ A Krom CNF is one consisting of Krom clauses
◮ The collection of all Krom clauses is closed under resolution
◮ When BCP is run over a set of Krom clauses and we do not get an inconsistency, the result is a partial assignment and a set of clauses of length exactly 2
◮ When $l \lor m$ is a Krom clause, we can assign to it two edges: one from $\bar{m}$ to $l$, another from $\bar{l}$ to $m$
◮ Then, a set $K$ of clauses of length 2 determines a directed graph $G_K$
Krom formulas, cont’d
◮ (What is a strongly connected component in a digraph?)
◮ A Krom CNF is satisfiable if and only if no strongly connected component in $G_K$ contains a pair of dual literals
◮ This can be used to devise a polynomial time algorithm for finding a satisfying assignment for a given Krom CNF $K$
◮ We topologically sort (what is it?) the strongly connected components of $G_K$ and then, going through that order backwards, we assign values to literals
◮ The set of duals of a strongly connected component is also a strongly connected component and, because no component has a pair of dual literals, a different one
Krom formulas, cont’d
◮ So, we repeatedly take the first (remember the reverse order) unassigned strongly connected component, set the value 1 on it and the value 0 on its dual, and repeat until all variables are assigned values
◮ The resulting assignment satisfies $K$
◮ The reason is that if both literals $l$, $m$ in a clause $l \lor m$ are evaluated as 0, then $\bar{l}$ (which is evaluated as 1) must be assigned 1 after $\bar{m}$ (because $\bar{l}$ precedes $m$ in $G_K$), and likewise $\bar{m}$ must be assigned 1 after $\bar{l}$, which is impossible
Krom formulas, example
◮ Let us look at this $K$:
  ◮ ¬p ∨ q
  ◮ ¬q ∨ r
  ◮ ¬r ∨ p
  ◮ p ∨ s
◮ The graph $G_K$ has 4 strongly connected components
◮ Two of these are $\{p, q, r\}$ and $\{\neg s\}$
◮ (The other two are immaterial, why?)
◮ We get an assignment $s = 0$, $p = q = r = 1$
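The graph-based procedure can be sketched as follows. This is one possible implementation, assuming Kosaraju's two-pass algorithm for the strongly connected components (the slides do not prescribe a particular SCC algorithm); literals are signed integers, a unit clause $l$ is passed as the pair `(l, l)`, and `solve_krom` is a hypothetical name.

```python
def solve_krom(n, clauses):
    """Solve a Krom CNF over variables 1..n; clauses are pairs of literals.

    Returns a dict {variable: bool} or None if the CNF is unsatisfiable.
    """
    lits = [l for x in range(1, n + 1) for l in (x, -x)]
    succ = {l: [] for l in lits}
    pred = {l: [] for l in lits}
    for l, m in clauses:
        for a, b in ((-l, m), (-m, l)):   # the two edges assigned to l ∨ m
            succ[a].append(b)
            pred[b].append(a)

    # Pass 1: record vertices in order of DFS completion.
    seen, order = set(), []
    def visit(u):
        seen.add(u)
        for w in succ[u]:
            if w not in seen:
                visit(w)
        order.append(u)
    for u in lits:
        if u not in seen:
            visit(u)

    # Pass 2: DFS on the reversed graph; components come out numbered
    # in a topological order of the component graph.
    comp = {}
    def label(u, c):
        comp[u] = c
        for w in pred[u]:
            if w not in comp:
                label(w, c)
    count = 0
    for u in reversed(order):
        if u not in comp:
            label(u, count)
            count += 1

    if any(comp[x] == comp[-x] for x in range(1, n + 1)):
        return None                       # a component contains a dual pair
    # Going backwards: the literal whose component comes later gets value 1.
    return {x: comp[x] > comp[-x] for x in range(1, n + 1)}
```

On the example $K$ (with p, q, r, s numbered 1 to 4) the call `solve_krom(4, [(-1, 2), (-2, 3), (-3, 1), (1, 4)])` returns a satisfying assignment. It need not be the one shown on the slide, since the topological order of the components is not unique. The recursion is fine for small inputs; a large instance would need an iterative DFS.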
Another SAT algorithm for Krom theories
◮ Krom CNFs have the following desirable property: given any partial valuation $v$, if $w = BCP(v, T)$ is consistent then it is an autarky for $T$
◮ So here is an algorithm (we assume $BCP(\emptyset, T) = \emptyset$):
  ◮ Select any $l$, compute $w = BCP(\{l\}, T)$, set visited to 1
  ◮ If $w$ is consistent:
    ◮ Reduce $T$ by $w$, and if the reduct is empty, return any complete extension of $w$
    ◮ O/w call the algorithm recursively
  ◮ O/w set visited to 0, call the algorithm recursively but with $\bar{l}$ instead of $l$. If the computed BCP is inconsistent, leave with failure
  ◮ Otherwise reduce, choose a new literal, continue
Krom formulas, cont’d
◮ The operation maj is a ternary Boolean function that returns the value of the majority of its arguments
◮ (Thus maj(1, 0, 0) is ...)
◮ Various programming languages have built-in bitwise operations on Boolean strings of the same length, say & or |
◮ We can imagine a ternary operation maj on Boolean strings of some fixed length $n$
◮ (When the ordering of propositional variables in $At$ is fixed, such a string is a code for an assignment)
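Bitwise maj can be assembled from the built-in bitwise operations. A short sketch, assuming Boolean strings are stored as Python integers (one bit per propositional variable); the name `maj` follows the slide.

```python
def maj(a, b, c):
    """Bitwise majority: a result bit is 1 iff at least two of the three input bits are 1."""
    return (a & b) | (a & c) | (b & c)
```

For instance, `maj(0b1100, 0b1010, 0b0110)` equals `0b1110`: the three leftmost positions each have two 1s among the arguments, the rightmost has none.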
Krom formulas, cont’d
◮ Let $V$ be a set of assignments of the set of atoms $At$ and let us think about $V$ as a collection of Boolean strings
◮ Then there is a Krom CNF $K$ such that $V$ is the set of all satisfying assignments for $K$ if and only if $V$ is closed under bitwise maj
◮ This can be used to devise yet another polynomial time algorithm for finding a satisfying assignment for a given Krom CNF $K$
Krom clauses, conclusion
◮ Krom clauses are quite useful in the processing of clausal theories (we will see it below)
◮ One important property is that if $|C| = n$, $D$ is Krom and $C$, $D$ are resolvable, then $|C \otimes D| \le n$, so resolving with Krom clauses does not increase the size of the resolvent
◮ (There is plenty of literature on the use of Krom clauses)
Horn clauses
◮ A Horn clause is one that has at most one positive literal (i.e. one or none)
◮ A clause with one positive literal is called (depending on the community) a definite Horn clause, or a program clause
◮ A clause with no positive literal is sometimes called a constraint (quite misleading ...)
◮ The reason why a program clause is so called is that the clause $p \lor \bar{q}_1 \lor \ldots \lor \bar{q}_n$ is really an implication $(q_1 \land \ldots \land q_n) \Rightarrow p$, that is, a rule: once $q_1, \ldots, q_n$ are computed (derived), compute $p$
◮ A Horn CNF splits naturally into two parts: $P$, consisting of program rules, and $C$, consisting of constraints
Horn clauses, cont’d
◮ A definite Horn theory (i.e. a program) is satisfiable (obviously, why?)
◮ Given a set of atoms $A$ and a definite clause $C := (q_1 \land \ldots \land q_n) \Rightarrow p$, we say that $A$ matches $C$ if $q_1, \ldots, q_n$ all belong to $A$
◮ ($p$ is called the head of $C$, denoted $head(C)$)
◮ Then we define an operator $T_P$ on the family of subsets of $At$ by: $T_P(A) = \{head(C) : C \in P \text{ and } A \text{ matches } C\}$
◮ The operator $T_P$ is monotone, so it possesses a least fixpoint, called $M_P$
Horn clauses, cont’d
◮ We can think about assignments as sets of atoms
◮ The reason is that an assignment is the characteristic function of a subset $M \subseteq At$
◮ If $H$ is definite-Horn (i.e. consists of definite Horn clauses) then there is a $\subseteq$-least set that satisfies $H$
◮ This set coincides with the least fixed point of the operator $T_H$
◮ When $H$ is a Horn theory but not definite-Horn, then $H$ uniquely splits into the disjoint union $H_1 \cup H_2$ where $H_1$ is definite-Horn and $H_2$ consists of constraints
◮ If $H$ is Horn, then $H$ is consistent if and only if the least model of $H_1$ satisfies $H_2$
Horn clauses, cont’d
◮ This last fact implies an algorithm for testing Horn theories for satisfiability and computing a satisfying assignment (if satisfiable)
◮ Given an input Horn theory $H$, we first split $H$ into $H_1 \cup H_2$ (where $H_1$ consists of definite-Horn clauses, $H_2$ consists of constraints)
◮ Next, we compute the least fixpoint of the operator $T_{H_1}$
◮ Finally, we test if the least model $M_{H_1}$ satisfies $H_2$
◮ (But are these two steps algorithmic? And if so, how complex?)
Computing the least model of $H_1$, cont’d
◮ It should be obvious that we can do this in quadratic time
◮ The reason is that we can start with the empty set of atoms, and each time the body of a clause is matched, add the head
◮ We will run no more than $m$ times (where $m$ is the number of clauses)
◮ This gives the quadratic upper bound on the complexity
Computing the least model of $H_1$, cont’d
◮ But we can do better with the Dowling-Gallier algorithm
◮ In this algorithm we associate with each definite-Horn clause $C$ a counter holding the number of atoms in the body of $C$ which are not yet computed
◮ When that counter drops to 0, we add the head of $C$ to the list $M_P$
◮ With an appropriate data structure this results in a linear number of steps (each atom can be computed only once)
Testing satisfaction of constraints
◮ Once $M_{H_1}$ is computed, we go through the list of constraints only once
◮ For each constraint $\neg q_1 \lor \ldots \lor \neg q_m$ in $H_2$ we test if $\{q_1, \ldots, q_m\} \subseteq M_{H_1}$ (i.e. whether every literal of the constraint is false in $M_{H_1}$)
◮ If so we leave with failure
◮ If all constraints are satisfied, we return $M_{H_1}$
◮ With appropriate data structures this can be done in linear time
◮ (The Dowling-Gallier algorithm allows for computation of $BCP(T)$ for a clausal theory $T$ in linear time, why?)
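The counter idea and the constraint test fit into one short function. The following is a sketch in the spirit of the Dowling-Gallier algorithm rather than a faithful reproduction of it; the input representation (rules as `(head, body)` pairs, constraints as collections of atoms standing for $\neg q_1 \lor \ldots \lor \neg q_m$) and the name `horn_sat` are assumptions made for illustration.

```python
from collections import defaultdict

def horn_sat(rules, constraints):
    """Return the least model of the rules if it satisfies all constraints, else None."""
    # Counter per rule: number of body atoms not yet computed.
    remaining = [len(set(body)) for _, body in rules]
    watch = defaultdict(list)             # atom -> indices of rules with it in the body
    for idx, (_, body) in enumerate(rules):
        for atom in set(body):
            watch[atom].append(idx)

    model = set()
    queue = [head for (head, _), cnt in zip(rules, remaining) if cnt == 0]
    while queue:
        atom = queue.pop()
        if atom in model:
            continue                      # each atom is computed only once
        model.add(atom)
        for idx in watch[atom]:
            remaining[idx] -= 1
            if remaining[idx] == 0:       # body fully matched: derive the head
                queue.append(rules[idx][0])

    # A constraint is violated exactly when all of its atoms are in the model.
    for body in constraints:
        if set(body) <= model:
            return None
    return model
```

For example, with the rules `[("p", ["q", "r"]), ("r", ["s"]), ("r", []), ("q", [])]` the least model is `{"p", "q", "r"}`; the constraint `["q", "r"]` is violated by it, while `["q", "s"]` is satisfied because `s` is not in the model.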
Horn clauses, cont’d
◮ Assuming $At$ finite, any family of sets closed under intersections (equivalently: a family of assignments closed under bitwise conjunction) is the family of models of a suitably chosen Horn clausal theory $H$
◮ (Actually, an old but important piece by Makowsky explains why Horn theories matter in CS)
◮ Any monotone operator on a finite set of atoms $At$ is of the form $T_H$ for a suitably chosen definite-Horn theory $H$
◮ Any Horn theory that contains no positive units is satisfiable (why?)
So, maybe this could be of use?
◮ Let $T$ be a CNF; we can split $T = H \cup R$ where $H$ is the Horn part of $T$
◮ Then we could quickly compute $M_H$. These guys have to be evaluated as 1 by every assignment satisfying $H$, thus by every assignment satisfying $T$
◮ But the gain is illusory. The reason is that the elements of $M_H$ are computed by $BCP(H)$ and, since BCP is monotone, also by BCP out of $T$
◮ Thus DPLL will compute $M_H$ in the first step
Horn Theories, example
◮ $H := \{p \lor \neg q \lor \neg r,\ r \lor \neg s,\ r,\ q,\ \neg q \lor \neg r\}$
◮ $H_1 := \{p \lor \neg q \lor \neg r,\ r \lor \neg s,\ r,\ q\}$
◮ $H_2 := \{\neg q \lor \neg r\}$
◮ $M_{H_1} = \{p, q, r\}$
◮ $M_{H_1}$ does not satisfy $H_2$ because $\{q, r\} \subseteq \{p, q, r\}$
◮ If instead of $\neg q \lor \neg r$ we have the constraint $\neg q \lor \neg s$, then $M_{H_1}$ satisfies $H_2$ (since $s \notin M_{H_1}$), thus $H$
◮ (Why do we give such trivial examples?)
Permutations of atoms
◮ Given a permutation $\pi$ of the set $At$ we can extend the action of $\pi$ to formulas
◮ Since the formulas are trees with atoms and constants in the leaves, we just put $\pi(x)$ for $x$ in the corresponding leaf
◮ Then, we can extend the action of permutations to assignments, namely $\pi(v)(x) = v(\pi^{-1}(x))$, so that $\pi(v)$ gives $\pi(x)$ the value that $v$ gives $x$
◮ Then, kind of obviously, $v \models \varphi$ iff $\pi(v) \models \pi(\varphi)$
◮ But there is more to permutations
◮ Since clauses consist of literals, we want to deal with permutations of literals
Permutations of literals
◮ Obviously, we need to be careful, because if we are not then we may not preserve logical properties
◮ For instance, say we have a formula $\varphi := p \land q$, and we map the literal $p$ to $r$, and the literal $q$ to $\neg r$
◮ Then $\varphi$ is satisfiable, but $\pi(\varphi)$ is not satisfiable
◮ This motivates the following definition: a permutation $\pi$ of $Lit$ is consistent if for all literals $l$, $\pi(\bar{l}) = \overline{\pi(l)}$
◮ Thus we can change the variable, and we can change the sign, but in a consistent way
◮ Clearly, a consistent permutation of literals is determined by two objects:
  ◮ A permutation of variables, and
  ◮ A binary sequence (keep the sign or not)
Permutations of literals, cont’d
◮ Thus there are precisely $2^n \cdot n!$ consistent permutations of literals when $|At| = n$
◮ As before we can extend the action of consistent permutations of literals to formulas, and to assignments (partial assignments, too)
◮ The permutation lemma above holds for consistent permutations of literals
◮ A shift permutation of literals is one which does not change variables, but only the signs (that is, it moves $l$ to $l$ or to $\bar{l}$)
Permutations of literals, cont’d
◮ Shifts form a group of size $2^n$
◮ Shifts commute with permutations of atoms
◮ The square of a shift is always the identity
◮ The image of a Horn clause under a permutation of atoms is Horn; satisfiability is preserved
◮ But a consistent permutation of literals does not have to preserve Horn theories (although it preserves satisfiability)
◮ The image of a Krom clause under a consistent permutation of literals is Krom; satisfiability is preserved
Dual-Horn theories
◮ A dual-Horn clause is one that has at most one negative literal
◮ A dual-Horn theory is a CNF consisting of dual-Horn clauses
◮ The shift sending $l \mapsto \bar{l}$ sends Horn theories to dual-Horn theories and conversely, too
◮ Thus every result we had on Horn clauses has a corresponding result for dual-Horn theories
Dual-Horn theories, cont’d
◮ Also the algorithms that we have for Horn theories can be “refurbished” to be used for dual-Horn theories
◮ And, of course, we can shift a dual-Horn theory to a Horn one, do something, and then shift back
◮ (State theorems, get proofs)
◮ Sets of all satisfying assignments for dual-Horn CNFs can be characterized by a property analogous to the one characterizing Horn formulas; can you characterize it?
Renameable-Horn theories
◮ A renameable-Horn theory is one for which there exists a shift permutation transforming it into a Horn theory
◮ Certainly it is a very desirable property (why?)
◮ The issue is whether we could find such a shift, if one exists
◮ And of course at an insignificant cost
◮ Miraculously, this can be done
Example
◮ Here is a simple example of what happens when we try
◮ Let us look at a clause $C := p \lor q \lor \neg r \lor \neg s \lor \neg u$
◮ We introduce a new atom $shift(x)$ for each atom $x$
◮ What are the constraints on shift?
◮ We need to shift at least one of $p$ or $q$ (if there were more positive literals, we would have more similar constraints)
◮ We cannot shift $r$ and $s$ at the same time, likewise $r$ and $u$ at the same time, and $s$ and $u$ at the same time
◮ If we shift $r$ then we must shift both $p$ and $q$ (and the same for $s$ and for $u$)
◮ I believe that I specified 10 constraints on our shift, one for each pair of literals of $C$ (but check)
◮ Let us denote this set of clauses by $S_C$
Renameable-Horn theories, cont’d
◮ The clauses in $S_C$ are all Krom
◮ Given a CNF $T$, let us form $S_T = \bigcup_{C \in T} S_C$
◮ Clearly, $S_T$ is Krom, and $|S_T| = O(|T|^2)$
◮ Hence, in principle, we can solve $S_T$
◮ (Why “in principle”?)
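The constraints on shift have a compact description: after shifting, no two literals of a clause may both be positive, and writing this out for a pair of literals $l$, $m$ gives exactly the Krom clause "$l \lor m$" read over the shift atoms (a positive literal $x$ becomes $shift(x)$, a negative literal $\neg x$ becomes $\neg shift(x)$). A minimal sketch under that reading, with literals as signed integers and the hypothetical names `shift_constraints` and `shift_theory`:

```python
from itertools import combinations

def shift_constraints(clause):
    """S_C: one Krom clause over the shift atoms for each pair of literals of the clause.

    In the result, x stands for shift(x) and -x for "not shift(x)".
    """
    return {frozenset(pair) for pair in combinations(sorted(set(clause)), 2)}

def shift_theory(clauses):
    """S_T: the union of S_C over all clauses C of T."""
    result = set()
    for clause in clauses:
        result |= shift_constraints(clause)
    return result
```

For the clause $p \lor q \lor \neg r \lor \neg s \lor \neg u$ (atoms numbered 1 to 5) this produces $\binom{5}{2} = 10$ constraints, among them `{1, 2}` (shift p or q), `{-3, -4}` (do not shift both r and s) and `{1, -3}` (if r is shifted, so is p).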
Renameable-Horn theories, cont’d
◮ Lewis Theorem: There is a one-to-one correspondence between satisfying assignments for $S_T$ and shift permutations that transform $T$ into a Horn theory
◮ Thus we can form $S_T$ and use the algorithm for solving Krom theories
◮ If we succeed, we have a shift $\pi$ turning $T$ into a $T'$ that is Horn
◮ Now we solve $T'$
◮ If we succeed, we get a solution $w$ for $T'$; the assignment $\pi(w)$ is a solution for the theory $T$ (how?)
Algebraic methods
◮ The structure $Z_2 := \langle Bool, +, \land, 0, 1 \rangle$ is a field
◮ (Here $+$ is XOR, not $\lor$; $\land$ is in our context denoted $\cdot$)
◮ $Z_2$ is a very nice field because over that field every function is a polynomial (actually, finite fields are the only fields with this property)
◮ That is, for every function $f : Bool^n \to Bool$ there is $g \in Bool[x_1, \ldots, x_n]$ so that $f \equiv g$
◮ Also, Boolean polynomials provide a canonical representation of Boolean functions (once the order of variables is fixed), like tables and ROBDDs
◮ (Actually, polynomials are very closely related to ROBDDs)
Algebraic methods, cont’d
◮ This has consequences; algebraic methods such as Gröbner bases, etc. apply (not that I am a specialist)
◮ For a time, esp. in the 1990s, there was a great hope that the so-called Nullstellensatz of Hilbert would provide a processing method for satisfiability
◮ Here is the idea: Let $\varphi[x_1, \ldots, x_n]$ be a formula. Then there is a polynomial $f[x_1, \ldots, x_n]$ so that for any assignment $\langle a_1, \ldots, a_n \rangle$, $f(a_1, \ldots, a_n) = 0$ if and only if $\langle a_1, \ldots, a_n \rangle \models \varphi$
◮ But now, satisfiability of a set of formulas (maybe clauses) is transformed into finding common zeros of a set of polynomials
◮ Here is where the Nullstellensatz comes in, for it gives a criterion for a (power of a) polynomial belonging to the ideal generated by a set of polynomials
Affine formulas
◮ A linear equation over $At = \{x_1, \ldots, x_n\}$ is one of the form $\varepsilon_1 x_1 + \varepsilon_2 x_2 + \ldots + \varepsilon_n x_n + \varepsilon_{n+1} = 0$
◮ An affine formula is a conjunction of linear equations
◮ Since $Z_2$ is a field, we have various techniques for solving affine formulas (i.e. finding satisfying assignments)
Affine formulas
◮ Gaussian elimination, Gauss-Jordan variation
◮ (See Apt’s book)
◮ (Because the characteristic of $Z_2$ is 2, determinants do not apply)
◮ It is possible to extract an implied affine formula out of a CNF
◮ This can be (and in fact was) used for preprocessing and inprocessing (see below)
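Gauss-Jordan elimination over $Z_2$ is particularly simple because adding rows is just XOR. The sketch below assumes each equation $\varepsilon_1 x_1 + \ldots + \varepsilon_n x_n + \varepsilon_{n+1} = 0$ is given as the list of bits $[\varepsilon_1, \ldots, \varepsilon_n, \varepsilon_{n+1}]$; `solve_affine` is a hypothetical name, and free variables are simply set to 0.

```python
def solve_affine(equations, n):
    """Return one solution (list of n bits) of the system over Z_2, or None."""
    rows = [list(eq) for eq in equations]
    pivots = []                           # (row index, column) of each pivot
    r = 0
    for col in range(n):
        # Find a row at or below r with a 1 in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue                      # no pivot: x_col is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:   # clear the column everywhere else
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append((r, col))
        r += 1
    # A row reading "1 = 0" means the system has no solution.
    if any(row[n] and not any(row[:n]) for row in rows):
        return None
    x = [0] * n
    for i, col in pivots:
        x[col] = rows[i][n]               # x_col + const = 0 gives x_col = const in Z_2
    return x
```

For the system $x_1 + x_2 + 1 = 0$, $x_2 + x_3 = 0$, $x_1 + x_3 + 1 = 0$, i.e. `[[1, 1, 0, 1], [0, 1, 1, 0], [1, 0, 1, 1]]`, the function returns an assignment satisfying all three equations.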
Affine formulas, cont’d
◮ The idea is the same one that failed us in the Horn case but succeeds here
◮ Let us extract out of a CNF $T$ a set of linear equations $A$
◮ If $A$ determines some specific values, we can assign these values for all putative satisfying assignments
◮ For other values, we get additional, quickly computable constraints
◮ Example: if we found that $A$ entails $p_1 = 0$ we can skip the branch where $p_1 = 1$; if we found that $p_2 = p_5 + p_7 + 1$ then each time we assign values to $p_5$ and $p_7$ we know how to assign the value to $p_2$
Satisfiability Affine formulas, cont’d ◮ There is a characterization of the collections of assignments that are the satisfying assignments of a set of linear equations ◮ Let $A$ be a set of assignments. Then there is an affine formula $\varphi$ such that $A$ is the set of satisfying assignments for $\varphi$ if and only if $A$ is closed under the bitwise sum (over $\mathbb{Z}_2$) of any three of its elements ◮ Generally, affine formulas are closely related to linear error-correcting codes, which is an important (and very applicable) area of Combinatorics 83 / 1
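The closure condition is easy to test by brute force on small examples. The sketch below represents assignments as tuples of bits; the function name and the two sample sets are assumptions made for illustration.

```python
from itertools import product


def closed_under_triple_sum(assignments):
    """Check that a XOR b XOR c is in the set for all a, b, c in the set."""
    s = set(assignments)
    return all(
        tuple(a ^ b ^ c for a, b, c in zip(x, y, z)) in s
        for x, y, z in product(s, repeat=3)
    )


# Satisfying assignments of the linear equation x1 + x2 + x3 = 1 over Z_2.
affine_set = {(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)}

# Satisfying assignments of the clause x1 OR x2, which is not affine.
clause_set = {(0, 1), (1, 0), (1, 1)}

affine_ok = closed_under_triple_sum(affine_set)
clause_ok = closed_under_triple_sum(clause_set)
```

For the clause, (0,1) + (1,0) + (1,1) = (0,0), which falsifies x1 OR x2, so the set of its models cannot be described by any affine formula.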
Why these characterizations? Satisfiability ◮ In various characterizations we had “inverse” results (all of these were of the form “$A$ is the set of assignments satisfying a theory $T$ of some syntactic form”) ◮ These characterizations connect the logic of SAT with Universal Algebra, to be precise with an important object called the Post Lattice, studied by people in computational Universal Algebra ◮ (They talk about clones, i.e. classes of functions closed under substitutions, and about polymorphisms; those are not related to polymorphic data structures, etc.) ◮ (The mathematics of the Post Lattice is breathtakingly beautiful) ◮ (See the book by Dietlinde Lau, Function Algebras on Finite Sets) 84 / 1
Are there more “easy classes”? Satisfiability ◮ Of course there are ◮ A trivial example is the class of formulas consisting of Horn clauses and a bounded number (say 7) of other clauses ◮ But there are others; one is called SLUR, the single lookahead unit resolution class of Franco and van Gelder, where contradictory sets of clauses generate a contradiction via BCP ◮ There is a beautiful result of Kučera et al. showing that every Boolean function possesses such a representation (but checking that a representation is of this kind is a co-NP-complete problem) 85 / 1
Opportunity for ’Divide-and-Conquer’? Satisfiability ◮ Let us explore the case when $T$ can be represented as $T_1 \cup T_2$, with both $T_1$ and $T_2$ belonging to “easy cases” ◮ Unfortunately this approach fails spectacularly ◮ Here is an example: there is a polynomial-time transformation of clause sets $T$ into clause sets $T'$ so that: ◮ $T' = T_1 \cup T_2$ ◮ $T_1$ consists of positive clauses ◮ $T_2$ consists of Krom clauses ◮ There is a one-to-one polynomial-time correspondence between the satisfying assignments of $T$ and the satisfying assignments of $T'$ 86 / 1
‘Divide and conquer’, cont’d Satisfiability ◮ Let us look at an example. Say I have a clause $C := p \vee q \vee \bar{s} \vee \bar{t} \vee \bar{u}$ ◮ I introduce three new atoms $s'$, $t'$, and $u'$ ◮ I introduce six new Krom clauses: $s \vee s'$, $\bar{s} \vee \bar{s}'$ (you guess the remaining 4) and get $T'$ ◮ ($s'$ plays the role of $\neg s$) ◮ My original clause becomes $C' := p \vee q \vee s' \vee t' \vee u'$ ◮ Obviously, there is a one-to-one, polynomial correspondence between satisfying assignments for $T := \{C\}$ and $T'$ ◮ (Here $T_1$ consists of $C'$, $T_2$ consists of the newly introduced Krom clauses) 87 / 1
‘Divide and conquer’, cont’d Satisfiability ◮ The construction outlined above is general: we do this for all clauses in $T$, take the union, and get $T'$, obviously in time polynomial in the size of $T$ ◮ $T' = T_1 \cup T_2$, $T_1$ is positive, $T_2$ is Krom, and there is a one-to-one polynomial correspondence between satisfying assignments for $T$ and $T'$ ◮ So, if we could solve the positive/Krom case in polynomial time, we could solve the general case in polynomial time 88 / 1
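The construction can be sketched in a few lines. The encoding below (integers for atoms, negative integers for negated atoms, fresh atoms numbered above the original ones) and the function names are assumptions for this illustration; the brute-force model counter is there only to check the one-to-one correspondence on a tiny instance.

```python
from itertools import product


def split_positive_krom(clauses):
    """Rewrite a clause set as T1 (positive clauses) plus T2 (Krom clauses).

    Literals are nonzero ints: k is atom k, -k its negation.
    For every atom s occurring negatively, a fresh atom s' is introduced
    together with the Krom clauses (s OR s') and (NOT s OR NOT s').
    Returns (t1, t2, primed), where primed maps s to s'.
    """
    clauses = [list(c) for c in clauses]
    next_atom = max(abs(lit) for c in clauses for lit in c)
    primed, t1, t2 = {}, [], []
    for clause in clauses:
        positive = []
        for lit in clause:
            if lit > 0:
                positive.append(lit)
                continue
            atom = -lit
            if atom not in primed:
                next_atom += 1
                primed[atom] = next_atom
                t2.append([atom, next_atom])      # s OR s'
                t2.append([-atom, -next_atom])    # NOT s OR NOT s'
            positive.append(primed[atom])
        t1.append(positive)
    return t1, t2, primed


def count_models(clauses, num_vars):
    """Brute-force model count, usable only for tiny instances."""
    return sum(
        all(any((bits[abs(lit) - 1] == 1) == (lit > 0) for lit in c) for c in clauses)
        for bits in product((0, 1), repeat=num_vars)
    )


# The clause from the slides with p, q, s, t, u numbered 1..5.
T = [[1, 2, -3, -4, -5]]
T1, T2, primed = split_positive_krom(T)
```

On this input `T1` is the single positive clause over p, q, s', t', u', and `T2` holds the six Krom clauses. Both `T` and `T1 + T2` have 31 models, as the correspondence predicts.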
‘Divide and conquer’, cont’d Satisfiability ◮ With the seven classes I considered above there are 21 (why?) cases to consider ◮ There are two trivial cases (all negative clauses are Horn, all positive clauses are dual-Horn) ◮ The remaining 19 cases are all NP-complete ◮ (We are not finite-group theorists and do not consider so many cases) ◮ The only mildly interesting case is the case of linear equations and Krom clauses ◮ In that case we first reduce SAT to 3-SAT (what is 3-SAT?) and then transform each 3-clause into one linear equation and two Krom clauses 89 / 1
Satisfiability Closer look at BCP ◮ Let us fix $F$ and write $BCP(v)$ instead of $BCP(v, F)$ ◮ As we progress on a branch of the search tree we make consecutive assumptions ◮ The branch itself determines the order of these assumptions ◮ Let us use the notation $l_1@1, l_2@2, \dots$ ◮ Example: our clause set contains, among others, $p_1 \vee p_{17} \vee \bar{p}_9$ and $\bar{p}_9 \vee p_7$; at level 1 we make the decision $p_9$ (i.e. $p_9@1$) and at level 2 we make the decision $\bar{p}_8$ (i.e. $\bar{p}_8@2$) ◮ Literal $p_7$ is computed (using BCP) also at level 1, so we write $p_7@1$ ◮ Formally, if $l_1@1, \dots, l_n@n$ is the sequence of decisions taken on the branch $B$, then we write $m@k$ if $m \in BCP(\{l_1, \dots, l_k\}) \setminus BCP(\{l_1, \dots, l_{k-1}\})$ ◮ Thus on a given branch, every computed literal gets its level 90 / 1
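A naive way to compute these levels is to rerun unit propagation after each decision and stamp every newly derived literal with the current level. The sketch below does exactly that. It is not how real solvers implement BCP (they use watched literals and a trail), and the function name and the integer encoding of literals are assumptions for illustration. Literals already in BCP of the empty set get level 0.

```python
def bcp_with_levels(clauses, decisions):
    """Unit-propagate after each decision and record the level of every literal.

    Literals are nonzero ints (k for p_k, -k for its negation).
    Returns (levels, conflict): `levels` maps each true literal to the level
    at which it first appeared; `conflict` is True if BCP is contradictory.
    """
    levels = {}
    # Level 0 has no decision: it only propagates unit clauses of F itself.
    for level, decision in enumerate([None] + list(decisions)):
        if decision is not None:
            if -decision in levels:
                return levels, True
            levels.setdefault(decision, level)
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(lit in levels for lit in clause):
                    continue  # clause already satisfied
                open_lits = [lit for lit in clause if -lit not in levels]
                if not open_lits:
                    return levels, True  # every literal is false
                if len(open_lits) == 1:
                    levels[open_lits[0]] = level  # unit clause: derive literal
                    changed = True
    return levels, False


# The two clauses from the slide: p1 OR p17 OR NOT p9, and NOT p9 OR p7.
F = [[1, 17, -9], [-9, 7]]
levels, conflict = bcp_with_levels(F, [9, -8])
```

With the decisions p9@1 and NOT p8@2, the result assigns level 1 to p9 and p7 and level 2 to NOT p8, in agreement with the example.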
Satisfiability What about a contradiction? ◮ Say we are on a branch $B$, we have made the sequence of decisions $l_1@1, \dots, l_n@n$, and $BCP(\{l_1, \dots, l_n\})$ is contradictory ◮ The set of decisions $\{l_1, \dots, l_n\}$ is then called a no-good ◮ When $\{l_1, \dots, l_n\}$ is a no-good, then $F \models \bar{l}_1 \vee \dots \vee \bar{l}_n$ ◮ The reason is that a unit-resolution proof of $\bot$ can be constructed (recall the connection between BCP and unit resolution discussed above) ◮ Let us also observe that $BCP(v) = BCP(BCP(v))$ ◮ Thus there may be other no-goods on the branch, kind of secondary 91 / 1
Contradiction analysis Satisfiability ◮ Let us look at a kind of drastic case ◮ In our $F$ we may have $\bar{p}_1 \vee p_{34} \vee p_{57}$ and also $p_{34} \vee \bar{p}_{57}$ ◮ We make decisions $p_1@1$, $p_{12}@2$, $\bar{p}_{17}@3$ and then at level 4 we make the decision $\bar{p}_{34}$ ◮ We get a contradictory BCP, but if we look closely, we see that really only the decisions $p_1@1$ and $\bar{p}_{34}@4$ are responsible for the contradiction ◮ This is an important discovery, because instead of a longer no-good $\bar{p}_1 \vee \bar{p}_{12} \vee p_{17} \vee p_{34}$ we get a short one: $\bar{p}_1 \vee p_{34}$ ◮ We say that we learned $\bar{p}_1 \vee p_{34}$ 92 / 1
Contradiction analysis, cont’d Satisfiability ◮ During the current search the original no-good $\bar{p}_1 \vee \bar{p}_{12} \vee p_{17} \vee p_{34}$ cannot be reused to close branches in other parts of the search tree ◮ But the new one, $\bar{p}_1 \vee p_{34}$, can: if we ever encounter $p_1$ and $\bar{p}_{34}$ on some other branch, we know we need to backtrack, because the reasoning that generated $\bot$ out of $F$, $p_1$, and $\bar{p}_{34}$ can be repeated ◮ So, the issue is how we can learn new no-goods ◮ (This discovery, made in the GRASP project, was a major improvement to DPLL) 93 / 1
Contradiction analysis, cont’d Satisfiability ◮ The literature reports various techniques ◮ The one used in GRASP, then fine-tuned in zChaff and now used in most solvers, is based on a graph-theoretic method ◮ Whenever $BCP(\{l_1, \dots, l_n\})$ is contradictory (i.e. contains a pair of contradictory literals) we can build a directed graph $G$ describing the contradiction proof 94 / 1
Contradiction analysis, cont’d Satisfiability ◮ The decisions are sources in that graph; the sink is $\bot$ ◮ A contradiction may be obtained out of $\{l_1, \dots, l_n\}$ in more than one way; we select one ◮ (Can you cook up an example where there is more than one contradiction after some decision?) ◮ Nodes are labeled by the clauses used to derive the literal that is the output of such a node ◮ (Incoming) edges are labeled by the literals used to reduce the clause to a single literal 95 / 1
Contradiction analysis, cont’d Satisfiability ◮ In our example we had a node labeled with the clause $\bar{p}_1 \vee p_{34} \vee p_{57}$ ◮ Into that node came edges labeled with $p_1$ and $\bar{p}_{34}$ ◮ The first one was the decision, the second came from some other derivation ◮ The third literal, $p_{57}$, was the output ◮ Actually, in our example we had, eventually, the node producing the output $p_{57}$ and another with $\bar{p}_{57}$ (which resulted in $\bot$) 96 / 1
Contradiction analysis, cont’d Satisfiability ◮ Graph theorists know that there is a node $x$ at level $n$ with the following property: every path from the decision point of level $n$ to the sink must pass through $x$ ◮ That node is called the (first) unique implication point, 1-UIP ◮ (There are other implication points, also used) ◮ Then we look at that UIP and do the following: all literals that are proved (remember, our graph is a proof) after $x$ are on one side of our cut (Malik and the coauthors of zChaff call this side the conflict side) ◮ The remaining literals (including $x$) are on the other side of the cut (called the reason side) 97 / 1
Contradiction analysis, cont’d Satisfiability ◮ The maximal elements of the reason side have the property that they entail the contradiction, because once we have them, we can use our graph $G$ to formally derive the contradiction ◮ Let $R$ be the conjunction of these maximal elements ◮ We see that $F \models \neg R$ ◮ Therefore $F \models \bigvee_{l \in R} \bar{l}$ ◮ This clause $\bigvee_{l \in R} \bar{l}$ is what we learn 98 / 1
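The cut at the first UIP is usually computed without building the graph explicitly: one starts from the falsified clause and resolves it, in reverse order of derivation, against the clauses that produced the literals of the last level, stopping as soon as a single literal of that level remains. The sketch below follows this standard resolution-based formulation. The trail format (literal, level, reason clause or None for a decision) and the function name are assumptions for illustration, and the conflict is assumed to occur at the highest level on the trail.

```python
def learn_first_uip(trail, conflict_clause):
    """Derive the clause learned at the first unique implication point.

    `trail` lists (literal, level, reason) in the order literals became true;
    `reason` is the clause that forced the literal, or None for a decision.
    `conflict_clause` is a clause all of whose literals are false on the trail.
    """
    level_of = {lit: level for lit, level, _ in trail}
    current = max(level_of.values())
    learned = set(conflict_clause)
    for lit, _, reason in reversed(trail):
        # Literals of the learned clause falsified at the conflict level.
        at_current = [l for l in learned if level_of[-l] == current]
        if len(at_current) <= 1:
            break  # only the UIP is left at this level
        if -lit in learned and reason is not None:
            # Resolve the learned clause with the reason of `lit`.
            learned.remove(-lit)
            learned.update(l for l in reason if l != lit)
    return sorted(learned, key=abs)


# The example from the slides: decisions p1@1, p12@2, NOT p17@3, NOT p34@4,
# then p57 is derived at level 4 and the clause p34 OR NOT p57 is falsified.
trail = [
    (1, 1, None),
    (12, 2, None),
    (-17, 3, None),
    (-34, 4, None),
    (57, 4, [-1, 34, 57]),
]
learned = learn_first_uip(trail, [34, -57])
```

On this trail a single resolution step replaces NOT p57 by NOT p1, leaving NOT p1 OR p34, which is the short clause learned in the example. Here the first UIP happens to be the level-4 decision itself.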
Contradiction analysis, cont’d Satisfiability ◮ But really, it should be obvious that the following holds: let $R \subseteq BCP(L)$, where $L$ is the set of decisions made when we got a contradictory BCP, be the conjunction of an inclusion-minimal set of literals such that $BCP(R)$ is contradictory. Then $F \models \neg R$, that is, $F \models \bigvee_{l \in R} \bar{l}$ ◮ Thus we have a method to learn new clauses ◮ The method to learn from 1-UIP is effective (with the appropriate data structures); the general method mentioned on this slide is not ◮ Let us observe, though, that such an inclusion-minimal set must contain at least one element of level $n$ (why?) ◮ So, the 1-UIP method tells us which element of the last level to choose, and moreover it chooses a specific one 99 / 1
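For illustration, an inclusion-minimal contradictory set can be found by greedy deletion: drop one literal at a time and keep the smaller set whenever BCP is still contradictory. The sketch below simplifies the statement above by searching only among the decisions rather than all of BCP(L), and it reruns BCP from scratch for every candidate, which is exactly the kind of cost that makes the general method impractical. The function names are assumptions.

```python
def bcp_conflict(clauses, literals):
    """Return True if unit propagation from `literals` yields a contradiction."""
    assigned = set(literals)
    if any(-lit in assigned for lit in assigned):
        return True
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assigned for lit in clause):
                continue  # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in assigned]
            if not open_lits:
                return True  # all literals false: contradiction
            if len(open_lits) == 1:
                assigned.add(open_lits[0])
                changed = True
    return False


def minimal_no_good(clauses, decisions):
    """Greedily shrink a contradictory set of decisions to an inclusion-minimal one."""
    kept = list(decisions)
    for lit in list(decisions):
        trial = [l for l in kept if l != lit]
        if bcp_conflict(clauses, trial):
            kept = trial  # `lit` is not needed for the contradiction
    return kept


# The example from the slides.
F = [[-1, 34, 57], [34, -57]]
core = minimal_no_good(F, [1, 12, -17, -34])
```

One pass suffices because contradiction under BCP is monotone: if a set of literals is contradictory, so is every superset. On the example the decisions p12 and NOT p17 are discarded, leaving p1 and NOT p34, whose negation is the learned clause NOT p1 OR p34.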
Contradiction analysis, cont’d Satisfiability ◮ But there is no reason to learn just one clause in this fashion ◮ As mentioned above, each proof of contradiction (and there can be several) allows us to learn a new clause ◮ But we do not have to stop at 1-UIP (i.e. at the last level) ◮ We can use more than one UIP, and learn more clauses ◮ But this is an “embarras de richesses”. The reason is that one, presumably, learns at each backtrack ◮ As there are, in principle, $O(3^m)$ potential backtracks, one needs to manage the learned clauses ◮ zChaff proposed heuristics for managing learned clauses ◮ These heuristics drop learned clauses (why only learned?) when they are no longer likely to contribute (zChaff’s well-known VSIDS heuristic, by contrast, scores variables for branching) 100 / 1