advanced algorithms


Advanced algorithms
Based on texts from: H. Cormen, S. Dasgupta, G. Bertrand, and M. Couprie
X. Hilaire, ESIEE Paris, IT department
October 15, 2018

Outline: Introduction, Divide and conquer, D.P., G.A., NP-completeness


  1. Theorem (weak Master)
     Suppose that $T(n) = aT(n/b) + \alpha n^d$ for some $a > 0$, $b > 1$, $d \ge 0$, and $n = b^r$ (with $r > 0$ integer). Then it holds that
     $T(n) = O(n^d)$ if $d > \log_b a$
     $T(n) = O(n^d \log n)$ if $d = \log_b a$
     $T(n) = O(n^{\log_b a})$ if $d < \log_b a$
     Proof. Observe first that the recurrence turns into a tree, whose height is exactly $r = \log_b n$.
     [Figure: recursion tree with root T(n), children T(n/b), grandchildren T(n/b^2), and so on down to the leaves T(1)]

  2. Substitute the definition of $T(n)$ in itself $r$ times:
     $T(n) = aT(n/b) + \alpha n^d$
     $= a\left(aT(n/b^2) + \alpha (n/b)^d\right) + \alpha n^d$
     $= a^2 T(n/b^2) + \alpha\left(a(n/b)^d + n^d\right)$
     $= a^3 T(n/b^3) + \alpha\left(a^2 (n/b^2)^d + a(n/b)^d + n^d\right)$
     $= \dots$
     $= a^r T(1) + \alpha n^d \sum_{i=0}^{r-1} \frac{a^i}{(b^d)^i} = a^r T(1) + \alpha n^d \sum_{i=0}^{r-1} \left(\frac{a}{b^d}\right)^i$
     From the last term, we have to distinguish whether the series ratio $a/b^d$ is equal to 1 or not to continue. If $a/b^d = 1$, or equivalently $d = \log_b a$, then the terms under summation all equal 1 and it remains
     $T(n) = a^r T(1) + \alpha n^d r = a^{\log_b n} T(1) + \alpha n^d \log_b(n) = O(n^d \log n)$
     (using $a^{\log_b n} = n^{\log_b a} = n^d$), which proves the second case of the theorem.

  3. Suppose now $d \neq \log_b a$. Then the summation in $T(n)$ evaluates to
     $\frac{(a/b^d)^r - 1}{(a/b^d) - 1} = \beta\left(\left(\frac{a}{b^d}\right)^r - 1\right) = \beta\left(n^{\log_b a - d} - 1\right)$
     putting $\beta = \left(\frac{a}{b^d} - 1\right)^{-1}$. Therefore, we always have that
     $T(n) = a^{\log_b n} T(1) + \alpha\beta n^d \left(n^{\log_b a - d} - 1\right) = n^{\log_b a} T(1) + \alpha\beta\left(n^{\log_b a} - n^d\right)$
     provided $d \neq \log_b a$. Two cases need be distinguished:
     Case 1: $d < \log_b a$. Then the $n^{\log_b a}$ term dominates, and $\beta > 0$, so
     $T(n) \le n^{\log_b a} T(1) + \alpha\beta n^{\log_b a} = O(n^{\log_b a})$
     which proves the last case of the theorem.
     Case 2: $d > \log_b a$. Then this time the $n^d$ term dominates, and $\beta < 0$, so
     $T(n) = n^{\log_b a} T(1) + \alpha|\beta|\left(n^d - n^{\log_b a}\right) \le a^{\log_b n} T(1) + \alpha|\beta| n^d = O(n^d)$
     as claimed in the first case of the theorem. □
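As a sanity check of the closed form $T(n) = n^{\log_b a} T(1) + \alpha\beta(n^{\log_b a} - n^d)$ with $\beta = (a/b^d - 1)^{-1}$, the following minimal Python sketch compares it with a direct evaluation of the recurrence. The function names are illustrative, and the sketch assumes integer $a$, $b$, $d$ with $a \neq b^d$ and $n = b^r$, so that $n^{\log_b a} = a^r$ and $n^d = b^{rd}$ can be computed exactly.

```python
from fractions import Fraction


def T_recursive(n, a, b, d, alpha, T1):
    """Evaluate T(n) = a*T(n/b) + alpha*n**d directly, for n a power of b."""
    if n == 1:
        return T1
    return a * T_recursive(n // b, a, b, d, alpha, T1) + alpha * n**d


def T_closed(r, a, b, d, alpha, T1):
    """Closed form for n = b**r; only valid when a != b**d."""
    beta = 1 / (Fraction(a, b**d) - 1)
    # n^(log_b a) equals a**r and n^d equals b**(r*d) when n = b**r.
    return a**r * T1 + alpha * beta * (a**r - b ** (r * d))


# Case d < log_b a (a=3, b=2, d=1) and case d > log_b a (a=2, b=2, d=2).
for a, b, d in [(3, 2, 1), (2, 2, 2)]:
    for r in range(1, 8):
        assert T_recursive(b**r, a, b, d, 1, 1) == T_closed(r, a, b, d, 1, 1)
```

Exact rational arithmetic (`Fraction`) is used so that the comparison does not depend on floating-point rounding.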

  4. A similar result holds with integer parts, $n$ not necessarily a power of $b$, and big O notation:
     Theorem (Master)
     Suppose that $T(n) = aT(\lceil n/b \rceil) + O(n^d)$ for some $a > 0$, $b > 1$, and $d \ge 0$. Then it holds that
     $T(n) = O(n^d)$ if $d > \log_b a$
     $T(n) = O(n^d \log n)$ if $d = \log_b a$
     $T(n) = O(n^{\log_b a})$ if $d < \log_b a$
     Back to our multiplication, since we can now answer the question: $T(n) = 4T(n/2) + O(n)$. So $a = 4$, $b = 2$, $d = 1$, and the master theorem tells us that our D&Q strategy runs in $O(n^2)$ :(
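The case analysis of the master theorem is mechanical, so it can be sketched as a small Python helper. The function name `master_bound` and the string output format are illustrative choices, not part of the course material; the comparison uses `math.isclose` because $\log_b a$ is computed in floating point.

```python
import math


def master_bound(a, b, d):
    """Return the O-bound given by the master theorem, as a string."""
    if a <= 0 or b <= 1 or d < 0:
        raise ValueError("need a > 0, b > 1, d >= 0")
    critical = math.log(a, b)  # the exponent log_b(a)
    if math.isclose(d, critical):
        return f"O(n^{d} log n)"
    if d > critical:
        return f"O(n^{d})"
    return f"O(n^{critical:.3f})"


print(master_bound(4, 2, 1))  # four half-size products plus linear work
print(master_bound(2, 2, 1))  # two half-size subproblems plus linear work
```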

  5. Can we do better?
     $DQ(x, y) = 2^n x_H y_H + 2^{n/2}(x_H y_L + x_L y_H) + x_L y_L$
     $x_H y_L + x_L y_H = (x_H + x_L)(y_H + y_L) - x_H y_H - x_L y_L$
     where $x_H y_H$ and $x_L y_L$ are already known. So ...
         DQ_mul(xH, xL, yH, yL, n) {
             a = xH*yH;
             b = xL*yL;
             c = (xH+xL)*(yH+yL) - a - b;
             /* parentheses needed: + binds tighter than << */
             return (a << n) + (c << (n/2)) + b;
         }
     which contains 6 additions, but only 3 multiplications now. So now, $a = 3$, $b = 2$, $d = 1$, and an application of the Master theorem tells us that DQ runs in $O(n^{\log_2 3}) = O(n^{1.585})$ time.
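The three-product trick can be sketched as a complete recursive routine in Python. This is only an illustration under stated assumptions: non-negative integers, a split on bit positions, and an arbitrary cut-off of 16 below which the built-in product is used. The name `karatsuba` is ours, not the slides'.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers using 3 recursive products."""
    if x < 16 or y < 16:
        # Small operands: fall back to the built-in product (base case).
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask  # x = x_hi * 2^half + x_lo
    y_hi, y_lo = y >> half, y & mask
    a = karatsuba(x_hi, y_hi)
    b = karatsuba(x_lo, y_lo)
    # Cross terms x_hi*y_lo + x_lo*y_hi from one extra product.
    c = karatsuba(x_hi + x_lo, y_hi + y_lo) - a - b
    return (a << (2 * half)) + (c << half) + b


print(karatsuba(123456789, 987654321) == 123456789 * 987654321)
```

Here `half` plays the role of $n/2$ in the formula, so the high product is shifted by `2 * half` bits.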

  6. Example 2: merge sort (MS)
     The MS algorithm sorts an input array of $n$ elements by recursively splitting it into 2 subarrays of size $n/2$, and merges the results of the sorted subarrays to keep the solution sorted itself.
         function MS(tab[begin..end])
             if begin = end then
                 return tab;
             else
                 s1 ← MS(tab[begin..⌊(begin+end)/2⌋]);
                 s2 ← MS(tab[⌊(begin+end)/2⌋+1..end]);
                 return merge(s1,s2);
             end if
         end function
     [Figure: merge sort on the array 1 8 4 2 7 3 9 4; successive merges give (1 8)(2 4)(3 7)(4 9), then (1 2 4 8)(3 4 7 9), then 1 2 3 4 4 7 8 9]

  7. The merge function is easily achieved in linear time:
         function merge(tab1[b1..e1], tab2[b2..e2])
             p1 ← b1, p2 ← b2, s ← (e1-b1+1) + (e2-b2+1);
             res ← new array(1..s);
             for t = 1..s do
                 // take from tab1 if tab2 is exhausted, or if tab1 holds the smaller head
                 if p2 > e2 or (p1 ≤ e1 and tab1[p1] < tab2[p2]) then
                     res[t] ← tab1[p1]; p1 ← p1+1;
                 else
                     res[t] ← tab2[p2]; p2 ← p2+1;
                 end if
             end for
             return res;
         end function
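For reference, the same split-and-merge scheme can be sketched in runnable Python. The sketch uses list slicing (which copies the subarrays, as the pseudocode implicitly does) and 0-based indices; the function names are illustrative.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of the two lists still has elements left.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(tab):
    """Sort a list by recursively sorting both halves and merging them."""
    if len(tab) <= 1:
        return list(tab)
    mid = len(tab) // 2
    return merge(merge_sort(tab[:mid]), merge_sort(tab[mid:]))


print(merge_sort([1, 8, 4, 2, 7, 3, 9, 4]))
```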

  8. What is the time complexity of mergesort? For an array of size $n$:
     $T(2p) = 2T(p) + O(2p)$
     $T(2p+1) = T(p) + T(p+1) + O(2p+1)$
     $\Rightarrow 2T(\lfloor n/2 \rfloor) + O(n) \le T(n) \le 2T(\lceil n/2 \rceil) + O(n)$
     The master theorem, applied to the right-hand side, tells us that $T(n) = O(n \log n)$ since $a = b = 2$, and $\log_2(2) = 1 = d$. The left-hand side gives a matching lower bound, so mergesort has a time complexity of $\Theta(n \log n)$.
     This is its biggest advantage: $n \log n$ operations ensured whatever the input.
     Its drawback: running time constants (must copy all subarrays... see exercise on quicksort).

  9. Exercise
     Find the $\Omega$ and $O$ bounds for each of the following recurrences (floors $\lfloor \cdot \rfloor$ have been omitted to simplify notations):
     $T(n) = 2T(n/3) + n^2$
     $T(n) = T(n-1) + n$
     $T(n) = T(n-1) + 1/n$
     $T(n) = T(n-1) + T(n-2) + n/2$
     $T(n) = T(n/2) + T(n/3)$
     $T(n) = 2T(n/2) + \log n$
     $T(n) = T(\sqrt{n}) + 1$
     $T(n) = \sqrt{n}\,T(\sqrt{n}) + n$

  10. Exercise
     Professor Sempron has $n$ chipsets, not all of which are in a good state. He can only test them pairwise, according to the following rules:
     - if a chip is good, it will always tell the truth about the other chip (that is, whether the other chip is good or damaged)
     - if a chip is damaged, its opinion on the other chip is unpredictable
     1. Show it is impossible to diagnose which chips are reliable if more than $n/2$ of them are damaged.
     2. Assuming that more than $n/2$ chips are good, show that $\Theta(n)$ operations are sufficient to find a good chip amongst the $n$.
     3. Show that $\Theta(n)$ operations are sufficient to diagnose the good chips, still assuming more than $n/2$ of them are good.

  11. Exercise
     You are given two sorted arrays A and B of integers of size $m$ and $n$. Give an $O(\log m + \log n)$ algorithm to find the $k$th smallest element of $A \cup B$.
     Exercise
     You are given $k$ sorted arrays of integers of $n$ elements each, and you would like to merge them into a single array of $kn$ elements. Give an efficient algorithm to do this.

  12. Exercise
     Suppose you are given a set S of $n$ points of the plane $P_i = (x_i, y_i)$, $i = 1, \dots, n$. You would like to find which pair of points $P_i, P_j$ has smallest Euclidean distance:
     $(i, j) = \arg\min_{u, v \in [1:n],\ u \neq v} \|P_u - P_v\|$
     - How can you find a value $x$ such that the sets of points L and R for which $x_i \le x$ and $x_i > x$ are of equal size (up to a unit)?
     - Say that S has been split that way. Recursively find the pairs of points $(p_L, q_L) \in L \times L$ and $(p_R, q_R) \in R \times R$ whose Euclidean distance is smallest in each subset. Put $\delta = \min(\|p_L - q_L\|, \|p_R - q_R\|)$.
     - It remains to check whether we can find $(p, q) \in L \times R$ such that $\|p - q\| < \delta$. How can you achieve this? (hint: sort only a very specific set of points along the y axis)

  13. Exercise (continued)
     - Show that your algorithm is correct (hint: show that any box of size $\delta \times \delta$ contains at most 4 points)
     - Write the full pseudocode of your algorithm, and show that its running time obeys $T(n) = 2T(n/2) + O(n \log n)$
     - Show that this recurrence solves to $O(n \log^2 n)$
     - Can you bring the time complexity down to $O(n \log n)$?
     - What happens if your points live in $\mathbb{R}^d$, $d \ge 3$, rather than $\mathbb{R}^2$?
     (From “Algorithms” by Dasgupta, Papadimitriou, and Vazirani. Related problem: Guibas-Stolfi & Leach algorithms for Delaunay triangulation)

  14. Dynamic programming

  15. What is dynamic programming (DP)?
     In short: a technique suitable to solve problems which exhibit two main features:
     1. Optimal substructure: an optimal solution (1) for a problem of size $n$ can be expressed as a combination of the optimal solutions of subproblems of sizes less than $n$. Subproblems may partially (and not completely) overlap each other.
     2. Any subproblem of size less than $n$ is easier to solve than a problem of size $n$.
     In other words, a solution S can be expressed as
     $I = \bigcup_{i=1}^{k} A_i$
     such that $|A_i| < |I|$, $\forall i = 1, \dots, k$, and $A_i \subseteq A_j$ does not hold for any pair $(i, j)$ with $i \neq j$. All problems pertaining to the $A_i$'s must be solvable.
     (1) Uniqueness is not required

  16. The word “programming” does not stand for “writing code”, but rather for “programming” (planning) the solution by means of tables.
     Example 1: the shortest path problem
     Let $G = (V, E)$ be a graph (fig. 1) with vertices $V$ and edges $E$, and let $s, t \in V$. A path $P$ between $s$ and $t$ is a sequence of vertices $(x_1, \dots, x_n)$ such that $x_1 = s$, $x_n = t$, all $x_i$'s are different, and $(x_i, x_{i+1}) \in E$, $i = 1, \dots, n-1$. The length of $P$ is $|P| = n$.
     [Figure 1: The shortest path problem between S and T, on a graph with vertices a (= S), b, c, d, e, f, g, h, i, j (= T)]

  17. How to compute a shortest path $SP(s, t; G)$ between $s$ and $t$ given $G$?
     Suppose $s = a$ and $t = j$ as in the figure. Whatever the solution (if it exists), a shortest path ending at $j$ must necessarily pass through one of $g$ or $i$. Moreover, the “cost” for moving from $g$ or $i$ is the same in both cases: unitary. Therefore:
     $cost(a, j; G) = 1 + \min_{x \in \{g, i\}} cost(a, x; G_x)$
     $SP(a, j; G) = \{j\} \cup \{\arg\min_{x \in \{g, i\}} cost(a, x; G_x)\}$
     or more generally:
     $cost(u, u) = 0$
     $cost(u, v; G) = 1 + \min_{x \in pred(v)} cost(u, x; G_x)$
     $SP(u, v; G) = \{v\} \cup \{\arg\min_{x \in pred(v)} cost(u, x; G_x)\}$

  18. where $pred(x)$ is the set of neighbours of $x$ not visited yet, and $G_x$ is the subgraph of $G$ with nodes $V \setminus \{x\}$ and edges $E \setminus \{(z, x) : z \in V\}$.
     Can we exploit these relations directly, in a recursive manner? From Fig. 1, we see that after at most 2 recursions, both subproblems require solving $SP(a, f; G_{ghij})$, as any path from $a$ to $j$ must pass through $f$. More glaring:
     [Figure 2: A failure case for recursive SP (nodes s, t, x_{-1l}, x_{-2l}, ..., x_{-nl} and x_{-1r}, x_{-2r}, ..., x_{-nr})]

  19. Running time: $T(n) = 2T(n-2) + O(1)$, which solves to $T(n) = \Theta(2^{n/2})$, that is, exponential in $n$.
     Ideas:
     1. Wouldn't it be better to use a table so as to avoid recomputing the solutions we already know?
     2. Do we really need recursion?
     Towards the solution...
     Observe first that $SP(a, j; G) = SP(j, a; G)$ (“backpropagating” towards $a$, or “propagating” from $a$ towards $j$, are equivalent).
     Basic idea of the algorithm: keep two tables C[] and P[], both indexed by the vertices of $G$, up to date so that at each iteration and for any $x \in V$:
     - C[x] contains either the length of a shortest path known from $a$ to $x$, or $\infty$ if no path is known yet.

  20. - P[x] contains the node $y$ such that $yx$ belongs to the shortest path, if $C[x] \neq \infty$
     Suppose we start from $a$, which we insert into $A$, the set of active vertices. Informal algorithm:
     1. Initialize $A$ to $\{a\}$, $C[a]$ to 0, and the other C's to $\infty$. P need not be initialized.
     2. For all active vertices $x \in A$, look for neighbours $y$ (that is, all $y \in V$ such that $(x, y) \in E$)
     3. For all such $y$: if $C[y] > C[x] + 1$ then the path stemming from $x$ is better than the best path known so far, so we should update the tables: $C[y] = C[x] + 1$, $P[y] = x$
     4. Insert $y$ in $A$
     5. Iterate to step 3 until no more $y$
     6. Remove $x$ from $A$
     7. Iterate to step 3 unless $y = j$
     8. Iterate to step 2 unless $y = j$ or $A$ is empty

  21. More formally:
     Input: $G = (V, E)$ a non-empty graph, $s, t \in V$.
     Output: either $A = \emptyset$, and there is no solution; or a shortest path can be derived by walking P (see exercise next frame)
     Init: C[s] ← 0, C[V \ {s}] ← ∞, A ← {s}, P created
         while A ≠ ∅ do
             A′ ← ∅
             for all x ∈ A do
                 if x = t then
                     stop
                 else
                     for all (x, y) ∈ E do
                         if C[y] > C[x] + 1 then
                             C[y] ← C[x] + 1, P[y] ← x, A′ ← A′ ∪ {y}
                         end if
                     end for
                 end if
             end for
             A ← A′
         end while
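The frame above translates almost line by line into Python. The sketch below is one possible rendering under explicit assumptions: the graph is a dictionary mapping every vertex to the list of its neighbours (a representation chosen here for convenience), and the function also walks P backwards to return the path, which the slides leave as an exercise. The small graph used in the example is hypothetical and is not the graph of Fig. 1.

```python
import math


def shortest_path(graph, s, t):
    """Return a shortest path from s to t as a list of vertices, or None."""
    C = {v: math.inf for v in graph}  # best known path length to each vertex
    P = {}                            # predecessor on the best known path
    C[s] = 0
    active = {s}
    while active:
        if t in active:
            # Walk the predecessor table back from t to s.
            path = [t]
            while path[-1] != s:
                path.append(P[path[-1]])
            return path[::-1]
        next_active = set()
        for x in active:
            for y in graph[x]:
                if C[y] > C[x] + 1:
                    C[y] = C[x] + 1
                    P[y] = x
                    next_active.add(y)
        active = next_active
    return None


# Hypothetical example graph (undirected, given as adjacency lists).
g = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
     "d": ["b", "c", "e"], "e": ["d"], "z": []}
print(shortest_path(g, "a", "e"))
```

When several shortest paths exist (here through b or through c), which one is returned depends on the iteration order of the active set.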

  22. Remarks
     - Within the while loop, we built a subset $A'$, which we finally substituted for $A$
     - This is for the sake of clarity: the loop should indeed not modify $A$ itself
     - Entering the while loop for the $n$-th time, we update all nodes touched by a path whose length is exactly $n$
     - The very first time $t$ is hit therefore corresponds to the shortest possible path that effectively reaches $t$
     More remarks:
     - The solution returned is not unique, and the only case in which we terminate with no solution is one where $s$ and $t$ belong to different connected components

  23. - The above algorithm would work equally well if we used positive weighted edges
     - It is a variant of Dijkstra's algorithm, which computes the shortest paths to all nodes reachable from $s$: for this, it suffices to remove the if test.
     Exercise
     1. Run the above algorithm on the data of Fig. 1, and for each while iteration, give the values of A, C, and P.
     2. Write the pseudo-code to print the solution of the shortest path in case the algorithm has stopped with $t \in A \neq \emptyset$.
     3. What is the complexity of the above algorithm? Could you improve it?

  24. Remarks (cont'd):
     - Bellman-Ford's algorithm does the same, but weights can be < 0 too, provided G carries no negative-weight cycle.
     - The Floyd-Warshall algorithm does the same job for all source nodes
     Example 2: the Levenshtein distance of two strings
     Suppose you are given two strings S (source) and T (target), defined over the same alphabet:
     - S is a corrupted version of T: some characters have either been replaced, inserted, or deleted with a unitary cost (2): cost(insertion) = cost(replacement) = cost(deletion) = 1, cost(match) = 0
     - T is not corrupted
     Which sequence of insertions/replacements/deletions transforms S into T with minimal cost?
     (2) Only to keep the problem simple...

  25. T = h o r s e s
     S = h e r b i s t s
     Could be: e → o, b → ∅, i → ∅, t → e: cost = C(S,T) = 4
     Put $s \in [1 : |S|]$ and $t \in [1 : |T|]$, and consider $S[s]$ and $T[t]$ alone. The crucial point: how $S[s]$ matches $T[t]$ does not depend on how $S[1..s-1]$ matched $T[1..t-1]$. Indeed, when we arrive at $s$:
     - If $S[s]$ is neither an inserted nor a deleted character, then $C(S[1..s], T[1..t]) = C(S[s], T[t]) + C(S[1..s-1], T[1..t-1])$
     - If $S[s]$ is an inserted character, then $C(S[1..s], T[1..t]) = 1 + C(S[1..s-1], T[1..t])$
     - If $S[s]$ is a deleted character, then $C(S[1..s], T[1..t]) = 1 + C(S[1..s], T[1..t-1])$

  26. Of all these costs, only the one that is minimal is of interest. In other words, we always have that
     $C(S[1..s], T[1..t]) = \min \{\ C(S[s], T[t]) + C(S[1..s-1], T[1..t-1]),\ \ 1 + C(S[1..s-1], T[1..t]),\ \ 1 + C(S[1..s], T[1..t-1])\ \}$
     The relation is trivially true at $s = |S|$ and $t = |T|$, and true at $s = t = 1$ if we agree that $S[1..0]$ and $T[1..0]$ are empty strings.
     As usual, direct use of recursion is a bad idea. The Levenshtein algorithm reads as follows:

  27. Input: two strings S and T
     Output: the Levenshtein distance in C[|S|, |T|]
         C ← array(0..|S|, 0..|T|); s ← |S|, t ← |T|
         for i = 0..s do C[i,0] ← i; end for
         for i = 0..t do C[0,i] ← i; end for
         for j = 1..t do
             for i = 1..s do
                 ok ← C[i-1,j-1] + (0 if S[i] = T[j], 1 otherwise);
                 ins ← C[i-1,j] + 1;
                 del ← C[i,j-1] + 1;
                 C[i,j] ← min(ok, ins, del);
             end for
         end for
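A direct Python transcription of this table-filling scheme is sketched below. It assumes the unit costs stated earlier (insertion, deletion and replacement cost 1, a match costs 0) and uses 0-based string indexing, so `S[i-1]` plays the role of S[i] in the pseudocode; the function name is ours.

```python
def levenshtein(S: str, T: str) -> int:
    """Return the unit-cost edit distance between strings S and T."""
    s, t = len(S), len(T)
    # C[i][j] = distance between the prefixes S[1..i] and T[1..j].
    C = [[0] * (t + 1) for _ in range(s + 1)]
    for i in range(s + 1):
        C[i][0] = i
    for j in range(t + 1):
        C[0][j] = j
    for j in range(1, t + 1):
        for i in range(1, s + 1):
            ok = C[i - 1][j - 1] + (0 if S[i - 1] == T[j - 1] else 1)
            ins = C[i - 1][j] + 1
            dele = C[i][j - 1] + 1
            C[i][j] = min(ok, ins, dele)
    return C[s][t]


print(levenshtein("herbists", "horses"))  # prints 4
```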

  28. Todo (in class): run the alg. on the example
         blade05$ gnatmake dp_levenstein
         gcc-4.4 -c dp_levenstein.adb
         gnatbind -x dp_levenstein.ali
         gnatlink dp_levenstein.ali
         blade05$ ./dp_levenstein
               h  e  r  b  i  s  t  s
            0  1  2  3  4  5  6  7  8
         h  1  0  1  2  3  4  5  6  7
         o  2  1  1  2  3  4  5  6  7
         r  3  2  2  1  2  3  4  5  6
         s  4  3  3  2  2  3  3  4  5
         e  5  4  3  3  3  3  4  4  5
         s  6  5  4  4  4  4  3  4  4
         blade05$

  29. Example 3: matrix multiplication
     Suppose A, B, C, D are 4 matrices with dimensions $m \times n$, $n \times p$, $p \times q$, and $q \times r$. We want to evaluate $A \times B \times C \times D$.
     Matrix multiplication is not commutative, but it is associative: $A \times B \times C = (A \times B) \times C = A \times (B \times C)$
     Multiplying two matrices of dimensions $m \times n$ and $n \times p$ takes $mnp$ multiplications and $m(n-1)p$ additions, so the running time is dominated by the $mnp$ multiplications.
         operation    multiplications      numeric
         ((AB)C)D     mnp + mpq + mqr      72300
         A((BC)D)     npq + nqr + mnr      24000
         A(B(CD))     pqr + npr + mnr      35000
         (AB)(CD)     mnp + pqr + mpr      44800
     Table 1: Effect of different parenthesizing orders ($m = 30$, $n = 8$, $p = 20$, $q = 50$, $r = 25$)

  30. How to compute the parenthesizing order that yields the lowest number of multiplications?
     Take a tree representation of the solution. At toplevel:
     $cost(ABCD) = mnr + cost(A) + cost(BCD)$
     where $mnr$ is the cost of the product at the root ABCD, and $cost(BCD)$ has to be in turn optimal.
     [Figure: tree of the parenthesization A((BC)D): root ABCD with children A and BCD; BCD with children BC and D; BC with children B and C]
     We can easily generalize to any chain of $j - i + 1$ matrices:
     $M_i \in \mathcal{M}_{d_{i-1} \times d_i}$
     $M_{i+1} \in \mathcal{M}_{d_i \times d_{i+1}}$
     .....
     $M_j \in \mathcal{M}_{d_{j-1} \times d_j}$

  31. If we decide to cut at index $k$ (matrices $M_i, \dots, M_k$ for the left child), then:
     - We have two matrices of dimensions $d_{i-1} \times d_k$ and $d_k \times d_j$ to multiply → cost = $d_{i-1} d_k d_j$
     - Final cost = $d_{i-1} d_k d_j + cost(i, \dots, k) + cost(k+1, \dots, j)$
     Of all possible costs, we want the lowest possible, so:
     $cost(i, j) = \min_{k = i, \dots, j-1} \{ d_{i-1} d_k d_j + cost(i, k) + cost(k+1, j) \}$
     The code readily follows:

  32. Input: dimensions $d_0, \dots, d_n$ of $n$ matrices
     Output: C[1, n] as the lowest cost
     Init: C ← array(n, n)
         for i = 1, ..., n do
             C[i, i] ← 0
         end for
         for size = 1, ..., n − 1 do
             for i = 1, ..., n − size do
                 j ← size + i
                 C[i, j] ← min over k = i, ..., j − 1 of { d[i−1] d[k] d[j] + C[i, k] + C[k+1, j] }
             end for
         end for
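The same bottom-up computation can be sketched in Python. The list `d` holds the dimensions $d_0, \dots, d_n$, and the table is allocated with one extra row and column so that the 1-based indices of the pseudocode can be kept; the function name is illustrative.

```python
def matrix_chain_cost(d):
    """Lowest number of scalar multiplications to multiply M_1 ... M_n,
    where M_i has dimensions d[i-1] x d[i]."""
    n = len(d) - 1
    # C[i][j] = lowest cost for the chain M_i ... M_j (row/column 0 unused).
    C = [[0] * (n + 1) for _ in range(n + 1)]
    for size in range(1, n):
        for i in range(1, n - size + 1):
            j = i + size
            C[i][j] = min(
                d[i - 1] * d[k] * d[j] + C[i][k] + C[k + 1][j]
                for k in range(i, j)
            )
    return C[1][n]


# Four matrices with m = 30, n = 8, p = 20, q = 50, r = 25.
print(matrix_chain_cost([30, 8, 20, 50, 25]))
```

The three nested loops (over `size`, `i` and `k`) give a running time in $O(n^3)$ for a chain of $n$ matrices.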

  33. Exercise
     A contiguous subsequence of a list S is a subsequence made up of consecutive elements of S. For instance, if $S = \{1, -3, 2, 7, 8, 0, 3\}$, then $\{-3, 2, 7\}$ is a contiguous subsequence, but $\{2, 7, 3\}$ is not. Give a linear time algorithm to compute the contiguous subsequence of maximum sum. Hint: consider subsequences ending exactly at position $j$.
     Exercise
     You are given a piece of fabric with integer dimensions $X \times Y$. You have a set of $n$ template objects, each of which requires a piece of fabric with integer dimensions $x_i \times y_i$ to be copied. If you produce a copy of object $i$, your profit is $c_i$; you can produce as many copies of any object as you want, or none. You have a machine that can cut any piece of fabric into two pieces, either horizontally or vertically. Propose an algorithm which tells you how to maximize your profit.

  34. Exercise
     Let P be a convex polygon with $n$ vertices. A triangulation of P is a set of $n - 3$ diagonals of P, no two of which intersect each other. The cost of a triangulation is the sum of the lengths of all diagonals. Give an efficient algorithm to compute a triangulation of P of minimal cost, and evaluate its complexity.
     Exercise
     The travelling salesman problem is NP-complete (see forthcoming chapter). A relaxed version restricts the problem to bitonic cycles: in such cycles, one is only allowed to visit points from left to right, then from right to left. Propose an $O(n^2)$ algorithm to solve the TSP under this hypothesis.

  35. Exercise
     You have a machine that can process only a single task at a time. You have $n$ such tasks $a_1, \dots, a_n$, whose respective durations are $t_i$ seconds, and whose (absolute) execution deadlines are $d_i$. If you terminate task $a_i$ before $d_i$, you earn $p_i$ Euros; otherwise, you earn nothing. Propose an algorithm to find the scheduling that maximises your benefit, and evaluate its complexity.

  36. Greedy algorithms

  37. Consider a slightly modified version (3) of the last exercise from the dynamic programming chapter:
     Suppose you have a machine that can process only a single task at a time. You have a set S of $n$ such tasks $S = \{\tau_1, \dots, \tau_n\}$. Task $\tau_i$ must begin at time $b_i$, and must be terminated before time $e_i$. Propose an algorithm to find the scheduling that maximises the number of tasks performed.
     Can we do better than the solution of D.P.?
     Assume (wlog) that the tasks are indexed by increasing ending times: $e_1 \le e_2 \le \dots \le e_n$
     Agree that $b_0 = -\infty$, $e_0 = 0$, and $b_{n+1} = \infty$, $e_{n+1} = 1 + \infty$.
     (3) We have changed the profits $c_i$'s to 1's

  38. [Figure 3: Sorted intervals for task sel. problem (interval endpoints b1, e1, b2, e2, ..., e7, with sentinels e0 and b8)]
     Define $S_{i,j} = \{\tau_k : e_i \le b_k < e_k < b_j\}$. The sought solution is $S_{0,n+1}$.
     First observe that $\forall i \ge j$, $S_{ij} = \emptyset$: if there were $\tau_k \in S_{ij}$, then $e_i \le b_k < e_k < b_j < e_j$, so $e_i < e_j$ with $i \ge j$, a contradiction (the $e_i$'s are in increasing order).

  39. Suppose $S^\star_{ij}$ is an optimal subset for $S_{ij}$: every subset $T \subseteq S_{ij}$ of pairwise compatible tasks satisfies $|T| \le |S^\star_{ij}|$. Moreover assume $S^\star_{ij} \neq \emptyset$, so that some $\tau_k \in S^\star_{ij}$ exists. Then
     $S^\star_{ij} = S^\star_{ik} \cup \{\tau_k\} \cup S^\star_{kj}$
     Because all three subsets in the union are disjoint (4), it follows that
     $|S^\star_{ij}| = |S^\star_{ik}| + 1 + |S^\star_{kj}|$
     Recall that $S_{ij} = \emptyset$ whenever $i \ge j$, and so is $S^\star_{ij}$. Thus, if $P[i, j]$ is a table representing our profit, then
     $P[i, j] = 0$ if $S_{ij} = \emptyset$
     $P[i, j] = 1 + \max_{i < k < j} \{ P[i, k] + P[k, j] \}$ otherwise
     A DP code could readily follow... however there is this:
     (4) For otherwise, the subset would not be optimal

  40. Theorem
     Assume $S_{ij} \neq \emptyset$, and denote by $\tau_m \in S_{ij}$ the task with lowest ending date. We claim that:
     1. $\tau_m \in S^\star_{ij}$ (for some optimal subset $S^\star_{ij}$)
     2. $S_{im} = \emptyset$
     Proof.
     Point 2: suppose $S_{im} \neq \emptyset$ and take $\tau_k \in S_{im}$. By definition of $S_{im}$: $e_i \le b_k < e_k < b_m < e_m$, so $e_k < e_m$, which contradicts the choice of $\tau_m$.
     Point 1: let $\tau_k$ be the task of $S^\star_{ij}$ with lowest ending time. If it happens that $k = m$, the proof is achieved. If not, observe that all intervals $(b, e)$ in $S^\star_{ij}$ must be disjoint. Since $e_m \le e_k$, task $\tau_m$ ends no later than $\tau_k$, hence before the beginning date of the next task in $S^\star_{ij}$. Replacing $\tau_k$ by $\tau_m$ therefore gives a subset of disjoint tasks of the same size: using $\tau_k$ or $\tau_m$ makes no difference, and we may take $\tau_m \in S^\star_{ij}$. □

  41. Very important consequences:
     1. Don't bother evaluating $S_{im}$: you won't find anything interesting there
     2. This results in a dramatically simple algorithm:
        - Find the task $a$ with lowest ending date
        - Add it to the optimal solution ← the greedy choice
        - Discard all tasks whose beginning date is before the ending of $a$
        - Loop to step 1 until no more tasks
     3. Complexity: $\Theta(n)$ if tasks are given sorted by their ending date, $\Theta(n \log n)$ if they are not.
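The greedy loop of point 2 can be sketched in a few lines of Python. The sketch represents each task as a pair `(b, e)` of beginning and ending dates (a representation chosen here), sorts by ending date, and treats a task starting exactly when the previous one ends as compatible, in line with the condition $e_i \le b_k$ used to define $S_{ij}$. The sample data are hypothetical.

```python
def select_tasks(tasks):
    """Greedy selection of a maximum-size set of non-overlapping tasks.

    tasks: iterable of (begin, end) pairs. Returns the chosen pairs."""
    chosen = []
    t_max = float("-inf")  # ending date of the last task selected so far
    for b, e in sorted(tasks, key=lambda task: task[1]):
        if b >= t_max:     # the task starts after the last selected one ends
            chosen.append((b, e))
            t_max = e      # the greedy choice is never reconsidered
    return chosen


# Hypothetical set of 11 tasks.
tasks = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
         (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(select_tasks(tasks))
```

Sorting dominates the cost, which matches the $\Theta(n \log n)$ bound for unsorted input.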

  42. Resulting algorithm:
     Input: $n$ tasks $\tau_i = (b_i, e_i)$
     Output: A = optimal scheduling of the $\tau$'s
         Sort the τ_i's by increasing e_i's
         tmax ← 0, A ← ∅
         for i = 1..n do
             if tmax ≤ b_i then
                 tmax ← e_i
                 A ← A ∪ {τ_i}
             end if
         end for
     In short:
     - Greedy algorithms pick the most immediately profitable choice among a (possibly very large) set
     - That choice is never reconsidered later on

  43. - They might yield an optimal solution (as in task scheduling): don't miss that chance, and try to derive the proof.
     - They might be suboptimal (in time or objective) as well
     - Quite often, the greedy strategy finds reasonably good solutions to NP-complete problems
     Example 2: Huffman coding
     Assume a file F containing only 5 characters: a, b, c, d, e. Suppose $n_a = 50000$, $n_b = n_c = 800$, $n_d = 600$, $n_e = 300$. Since $2^2 < 5 < 2^3$, a fixed-length coding requires 3 bits/character, so
     size(F) = 52500 × 3 = 157500 bits = 19688 bytes

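  44. However, if a “=” 1, b “=” 010, c “=” 011, d “=” 000, and e “=” 001, then
     size(F) = 50000 × 1 + (800 + 800 + 600 + 300) × 3 = 57500 bits = 7188 bytes
         symbol   code   length
         a        1      1
         b        010    3
         c        011    3
         d        000    3
         e        001    3
     [Figure: Huffman tree. From the root, branch 1 leads to leaf a (50000) and branch 0 to a node of weight 2500. From that node, branch 0 leads to a node of weight 900 with leaves d (600, branch 0) and e (300, branch 1); branch 1 leads to a node of weight 1600 with leaves b (800, branch 0) and c (800, branch 1).]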

  45. In a Huffman tree, symbols are stored in leaves only ⇒ coding sequences are variable in length, but unique per symbol.
     Questions:
     1. What is the optimal structure (5) of a tree given an alphabet $\Lambda$ with 2 or more symbols, and number of occurrences $n : s \in \Lambda \mapsto n(s)$?
     2. Assuming the optimal structure is known, where should symbols be stored?
     We can answer both thanks to 2 lemmas.
     Lemma
     Let T be a Huffman tree (optimal), and $x, y \in \Lambda$ denote the 2 symbols with lowest number of occurrences amongst all. Then $x$ and $y$ have to be stored on some leaves of T with highest possible depth.
     (5) The one which yields the lowest coding cost, not accounting for the Huffman tree itself

  46. Proof. Call $d : s \in \Lambda \mapsto d(s) \in \mathbb{N}$ the function that gives the depth of any symbol $s$ of $\Lambda$ in T. Then the cost of T is
     $C(T) = \sum_{s \in \Lambda} d(s) n(s)$
     Suppose $x$ is not located on a leaf of T with maximal depth: $\exists z \in \Lambda : d(z) > d(x),\ n(z) \ge n(x)$
     Call $T'$ a copy of T in which $x$ and $z$ have been swapped. Denote by $d'$ the related depth function. Then
     $C(T') - C(T) = d'(x) n(x) + d'(z) n(z) + \sum_{s \in \Lambda \setminus \{x, z\}} d'(s) n(s) - d(x) n(x) - d(z) n(z) - \sum_{s \in \Lambda \setminus \{x, z\}} d(s) n(s)$

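  47. $= d'(x) n(x) + d'(z) n(z) - d(x) n(x) - d(z) n(z)$
     $= n(x)\left(d'(x) - d(x)\right) + n(z)\left(d'(z) - d(z)\right)$
     Since $d'(x) = d(z)$ and $d'(z) = d(x)$, the first difference is $> 0$, the second is $< 0$, and the sum equals $\left(d(z) - d(x)\right)\left(n(x) - n(z)\right) \le 0$. It is $< 0$ when $n(z) > n(x)$, which contradicts optimality of T; when $n(z) = n(x)$, the swap leaves the cost unchanged, so $x$ can be moved to maximal depth at no cost. The same result holds for $y$ too, without it being necessary to allocate any new node. □
     Lemma
     Consider $x$, $y$, $\Lambda$, and T as defined in the previous lemma. Call $z$ a dummy character on $\Lambda' = \Lambda \setminus \{x, y\} \cup \{z\}$, such that $n'(z) = n(x) + n(y)$, and let $T'$ be the tree obtained from T by replacing $x$, $y$ and their common parent by the single leaf $z$. If $T'$ is optimal for $\Lambda'$, then T is optimal for $\Lambda$.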

  48. Proof. Because $x$ and $y$ are both sons of $z$, $d(x) = d(y) = d'(z) + 1$. This implies
     $d(x) n(x) + d(y) n(y) = (d'(z) + 1) n(x) + (d'(z) + 1) n(y)$
     $= d'(z)\left(n(x) + n(y)\right) + n(x) + n(y)$
     $= d'(z) n'(z) + n(x) + n(y)$
     Therefore (copy-paste argument from previous proof)
     $C(T) - C(T') = n(x) d(x) + n(y) d(y) - n'(z) d'(z)$
     $= n'(z) d'(z) + n(x) + n(y) - n'(z) d'(z)$
     $= n(x) + n(y)$
     Now suppose T is not optimal. Then there exists an optimal coding O, which must have $x$ and $y$ as extreme leaves according to the first lemma.

  49. Denote by $O'$ a copy of O with $\{x, y\}$ replaced by $z$. Then
     $C(O') = C(O) - n(x) - n(y) < C(T) - n(x) - n(y) = C(T')$
     which contradicts optimality of $T'$. □

  50. Resulting algorithm:
     Input: $\Lambda$ and $n$
     Output: Huffman tree of $(\Lambda, n)$
         Q ← ∅                      (Q is a priority queue ranked by increasing n's)
         for s ∈ Λ do
             insert((s, n(s)), Q)
         end for
         for i = 1..|Λ| − 1 do
             left ← pop(Q)
             right ← pop(Q)
             z ← new node(left, right)
             insert((z, n(left) + n(right)), Q)
         end for
         return pop(Q)
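A compact Python rendering of this algorithm is sketched below, using the standard `heapq` module as the priority queue. Several choices are ours rather than the slides': internal nodes are plain `(left, right)` tuples, symbols are assumed not to be tuples themselves, a running counter breaks ties between equal weights, and the function returns the code table instead of the tree.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Build a Huffman tree for {symbol: occurrences} and return the codes."""
    tie = count()  # tie-breaker so the heap never has to compare nodes
    heap = [(n, next(tie), s) for s, n in freq.items()]
    heapq.heapify(heap)
    for _step in range(len(freq) - 1):
        n_left, _, left = heapq.heappop(heap)    # two least frequent nodes
        n_right, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n_left + n_right, next(tie), (left, right)))
    root = heap[0][2]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                        # leaf: a symbol
            codes[node] = prefix or "0"

    walk(root, "")
    return codes


freq = {"a": 50000, "b": 800, "c": 800, "d": 600, "e": 300}
codes = huffman_codes(freq)
print(sum(freq[s] * len(codes[s]) for s in freq))  # total size in bits
```

The exact bit patterns depend on how ties are broken and on which child is labelled 0, but the code lengths, and therefore the total cost, do not.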

  51. Example 3: Minimal spanning trees
     → Jean Cousty's lecture on Kruskal's algorithm and the cut property of MST.

  52. Example 4: bounded fractional knapsack problem
     During a burglary, a robber finds $n$ bags containing different kinds of powders. Powder $i$ is worth $p_i > 0$ Euros a gram, but is available in limited quantity, say $q_i > 0$ grams. Also, the robber cannot carry more than $b$ grams of product altogether. How much of each bag should he steal? We wish to maximise
     $S = \sum_{i=1}^{n} x_i p_i$
     subject to
     $\sum_{i=1}^{n} x_i \le b$
     $0 \le x_i \le q_i, \ \forall i = 1, \dots, n$
     Assuming (wlog) that the $p_i$'s are sorted by decreasing order, does the following greedy algorithm provide the right answer?

  53. Input: $p_i$'s, $q_i$'s and $b$
     Output: amount $x_i$'s to steal in each bag
         r ← b
         for i = 1..n do
             x_i ← min(q_i, r)
             r ← r − x_i
         end for
     Justification:

  54. Example 5: bounded integer knapsack problem
     Same problem as before, but bag $i$ now contains $n_i$ objects, each of which is worth $p_i$ Euros and weighs $w_i$ grams. Assuming bags are ordered by decreasing $p_i$'s, does the following code produce the right answer?
     Input: $p_i$'s, $w_i$'s, $n_i$'s and $b$
     Output: number $x_i$'s of objects to steal in each bag
         r ← b
         for i = 1..n do
             x_i ← min(n_i, r ÷ w_i)
             r ← r − x_i · w_i
         end for
     Justification:

  55. Example 6: vertex cover (to show in class)

  56. Summary:
     Dynamic programming:
     - Make a choice at each step $k$
     - Tabulate all known optimal solutions to sub-problems at steps $< k$ before
     - Bottom-up approach
     Greedy strategy:
     - Make the most profitable choice at each step
     - Solve the remaining problem after
     - Top-down approach
     Optimal sub-structure must be shown in both cases.
     Greedy additionally requires proving that the most profitable choice leads to an optimal solution.

  57. Exercise
     You are going on a long journey between Antwerpen and Napoli. Once the tank of your car is filled, you know you can do at most $n$ km. You have a roadmap that tells you where the fuel stations are located, and you would like to make as few stops as possible. Give an efficient algorithm to solve this problem.
     Exercise
     Suppose you have a set of $n$ lectures that need to be scheduled in classrooms. Each lecture has fixed (non-modifiable) starting and ending times. You would like to use as few classrooms as possible to schedule all lectures.
     1. Describe a naive $\Theta(n^2)$ algorithm to determine the scheduling of lectures
     2. Try to improve this solution to an $O(n \log n)$ time algorithm, and possibly $O(n)$ under some conditions.

  58. Exercise
     Suppose you have two sequences of $n$ positive numbers $A = \{a_i\}_{i=1}^{n}$ and $B = \{b_i\}_{i=1}^{n}$. You are free to reorganize them as you want, after which you get a profit of $\prod_{i=1}^{n} a_i^{b_i}$. Give a strategy to maximize your profit, and justify it.
     Exercise
     A service has $n$ customers waiting to be served, and can only serve one at a time. Customer $i$ will spend $d_i$ minutes in the service, and will have to wait $\sum_{j=1}^{i-1} d_j$ minutes before being served. The penalty for making customer $i$ wait $m$ minutes is $m p_i$, where $p_i > 0$ is some constant. You would like to schedule the customers, that is, find a permutation $\varphi : [1:n] \mapsto [1:n]$ so as to minimize the overall penalty $P(\varphi) = \sum_{i=1}^{n} p_{\varphi(i)} \sum_{j=1}^{i-1} d_{\varphi(j)}$.

  59. Exercise (cont'd). (1) Consider 3 customers $C_1, C_2, C_3$, with service durations 3, 5, 7 and priorities 6, 11, 9. Among the possible schedulings $\varphi_1(1) = 1, \varphi_1(2) = 3, \varphi_1(3) = 2$ and $\varphi_2(1) = 3, \varphi_2(2) = 1, \varphi_2(3) = 2$, which one is preferable? (2) Consider two schedulings $\varphi_1$ and $\varphi_2$, identical everywhere except that $\varphi_1$ serves customer $j$ immediately after customer $i$, while $\varphi_2$ does just the opposite. What does $\Delta = P(\varphi_1) - P(\varphi_2)$ equal? (3) Derive the expression of an evaluation function $f$ which associates a number with any customer $i$ and decides whether $\Delta > 0$ or not. (4) Derive an algorithm for this problem, and justify it. Complexity?

  60. Exercise. Consider the problem of giving change for $n$ cents using as few coins as possible. (1) Give a greedy algorithm that gives back change using coins of 50, 20, 10, 5 and 1 cents. Show that it is optimal. (2) Suppose now you have $k + 1$ coins whose values are powers of some constant $c > 0$, that is, $1, c, c^2, \dots, c^k$. Prove that your greedy algorithm is still optimal. (3) Give a set of values for the coins for which the greedy solution is not optimal. (4) Give an optimal $O(nk)$ algorithm which gives back change whatever the values of the coins, assuming only that there is always a coin of 1 cent.

  61. Exercise. Your boss asks you to organize a party in which new colleagues can meet. For the party to be successful, you believe it reasonable not to invite someone if he/she knows more than $n - p$ persons out of the $n$, or fewer than $p$. However, you would like to invite as many people as possible. (1) Propose an algorithm to solve this problem for $p = 1$. Complexity? (2) Can you generalize your solution to $p \ge 2$? Exercise. Given a graph $G = (V, E)$, a matching is a subset $E'$ of the edges $E$ such that no two edges of $E'$ share a node. A matching is perfect if the set of all vertices touched by $E'$ is exactly $V$. A tree is a graph that has no cycle (the path that links any vertices $x$ and $y$ is unique). A forest is a collection of trees. Give an efficient algorithm which determines whether a forest $G$ has a perfect matching or not. What happens if $G$ is a general graph?

  62. Introduction to NP-completeness

  63. So far, we have dealt with algorithms whose worst-case time complexity was polynomial: $O(n^k)$, where $k > 0$ and $n = |I|$ is the size of an instance of the problem (in bits). They were at least "acceptable": they could determine a solution in polynomial time in a solution space of exponential size. Examples: sorting, in $\Theta(n \log n)$ time, whereas $n!$ candidates exist; shortest path in a graph, in $O(|V|^2)$ (Dijkstra's algorithm), amongst a set of up to $(|V| - 2)!$ candidates. Are there problems for which no polynomial-time solution is known (which does not mean that none exists)? Equivalently, are there problems for which all we can propose is to enumerate all possible solutions and retain one of them?

  64. NP-completeness theory distinguishes optimization and decision problems, and restricts itself to the study of the latter. Definition. An optimization problem (or search problem) is one of the form: find an $x \in X$, where $X$ is a set (possibly infinite) and $f : X \to Y$ is a function into some ordered space $(Y, \le)$, such that $f(x)$ is maximal. Classical examples of optimization problems: (1) QCQIP: maximize $f(x, y) = 2x^2 - 3y + 7$ over $\mathbb{Z}^2$ subject to $x^2 + 2(y - 1)^2 \le 5^{10}$ and $x^2 + y > 1$. (2) TSP: given a graph $G = (V, E)$, find a cycle visiting all vertices exactly once, and having minimal total length. (3) CLIQUE: given a graph $G = (V, E)$, find a clique of $G$ with maximal number of vertices.

  65. Definition. A decision problem is one of the form: does there exist $x \in X$ such that $f(x)$ is true? Decision versions of the above problems: (1) QCQIP: does there exist $(x, y) \in \mathbb{Z}^2$ such that $f(x, y) = 2x^2 - 3y + 7 \ge k$, $x^2 + 2(y - 1)^2 \le 5^{10}$, and $x^2 + y > 1$? (2) TSP: does $G = (V, E)$ admit a cycle with length $\le k$ visiting all vertices exactly once? Some cases are trivial: $k \ge \sum_{e \in E} \mathrm{length}(e)$, and $k \le \sum_{x \in V} \min_{y \in V, y \ne x} \mathrm{length}((x, y))$. (3) CLIQUE: does $G = (V, E)$ admit a clique of order $k$? In contrast, there is no trivial case for this problem.

  66. Why study only decision problems? Because if one doesn't know how to solve a decision problem (DP) in reasonable time, there is no hope of solving its optimization (OP) version in reasonable time either. Equivalently, because except in very particular cases, in which the search space is of infinite dimension or the objective function is piecewise monotonic but exhibits an exponential number of discontinuities, it takes $O(\log_{\varepsilon} k)$ iterations of the decision procedure to determine which part of the space contains the optimal solution within $\varepsilon$ tolerance. Even if $k = 2^n$, $\log_{\varepsilon} k = O(n)$ is polynomial in the size of the input.
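
The iteration count mentioned above comes from a bisection on the threshold $k$ of the decision version. The following is a minimal sketch under stated assumptions: the objective takes integer values in a known range, and the decision oracle `decide(k)` (a hypothetical callable answering "is there a solution of value at least $k$?") is monotone. The toy objective is made up for illustration.

```python
def maximize_with_oracle(decide, lo, hi):
    """Return the largest integer k in [lo, hi] for which decide(k) is true,
    together with the number of oracle calls.
    Assumes decide is monotone (true up to the optimum, false beyond)
    and that decide(lo) is true."""
    calls = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2   # upper middle, so the interval always shrinks
        calls += 1
        if decide(mid):
            lo = mid               # a solution of value >= mid exists
        else:
            hi = mid - 1           # the optimum is strictly below mid
    return lo, calls

# Hypothetical toy problem: maximize f(x) = 7 - (x - 3)^2 over x in [-10, 10]
def f(x):
    return 7 - (x - 3) ** 2

def decide(k):
    return any(f(x) >= k for x in range(-10, 11))

best, calls = maximize_with_oracle(decide, -200, 1000)
```

Each call halves the interval of candidate values, so an interval of width $2^n$ needs about $n$ calls: a polynomial-time decision procedure yields a polynomial-time optimization procedure in this setting.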

  67. Definition. A problem is in class P if there exists an algorithm $C$ that can solve any instance $I$ of this problem in time polynomial in $|I|$. The solution $C(I)$ is guaranteed to be correct, so P ensures that both computing and checking a solution to a problem can be done in $O(|I|^k)$ time. Definition. A problem is in class NP if there exists an algorithm $C$ that can check, for any instance $I$, whether a proposed solution $S$ is really a solution for $I$ in polynomial time. In other words, $C(I, S)$ now returns a boolean value, and runs in polynomial time. NP does not ensure anything more than checkability of a solution in polynomial time. Lemma. If a problem $P$ is in class P, then $P$ is in class NP.
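
To make the checker $C(I, S)$ concrete, here is a minimal sketch for the decision version of CLIQUE: the instance $I$ is a pair (edge list, $k$) and the proposed solution $S$ is a set of vertices. The function name `check_clique` and the small graph are illustrative assumptions, not part of the course material.

```python
from itertools import combinations

def check_clique(edges, k, candidate):
    """Checker C(I, S): I = (edges, k), S = candidate vertex set.
    Returns True iff candidate is a set of k pairwise adjacent vertices.
    Runs in time polynomial in the size of the instance."""
    vertices = set(candidate)
    if len(vertices) != k:
        return False
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set
               for pair in combinations(vertices, 2))

# Hypothetical graph: a, b, c, d are pairwise adjacent; x and y hang off d
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
         ("b", "d"), ("c", "d"), ("d", "x"), ("x", "y")]
```

The checker only validates a proposal; it says nothing about how to find one, which is exactly the gap between NP and P.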

  68. The big question: how about the converse? Does $P \in \mathrm{NP} \Rightarrow P \in \mathrm{P}$ hold or not? No one has been able to answer this question so far. Belonging to NP does not mean that a problem admits no polynomial solution: it could admit one that no one has found yet. "The classes of problems which are respectively known and not known to have good algorithms are of great theoretical interest. [...] I conjecture that there is no good algorithm for the traveling salesman problem. My reasons are the same as for any mathematical conjecture: (1) It is a legitimate mathematical possibility, and (2) I do not know." (Jack Edmonds, 1966)

  69. Let's recast the problem in a more convenient way: the input (instance) of a problem can be seen as a string of bits of length $n$ (see footnote 6); the output of a search algorithm is again a string of bits; the output of a decision algorithm is a "true/false" answer, so a string of 1 bit. A decision problem is equivalent to checking whether a string of bits is correct or not. Language theory offers a convenient framework to express this. Definition. Let $\Sigma = \{0, 1\}$ be the binary alphabet. Denote by $\Sigma^k$ the set of all strings obtained by concatenating exactly $k$ symbols of $\Sigma$, and by $\varepsilon$ the empty string. A language is any subset of $\Sigma^\star = \{\varepsilon\} \cup \Sigma^1 \cup \Sigma^2 \cup \Sigma^3 \cup \dots$ Footnote 6: we do not care what this string means; we just know there is an encoding that can encode our input data on $n$ bits.

  70. Definition. An algorithm $A$ accepts an input string $s$ ⇔ $A(s)$ outputs true: $A(s) = \text{true}$. It rejects it ⇔ $A(s) = \text{false}$. A language $L$ is accepted by $A$ ⇔ $\forall s \in L$, $A(s) = \text{true}$. A language $L$ is decided by $A$ ⇔ it is accepted by $A$ and $\forall s \in \bar{L} = \Sigma^\star \setminus L$, $A(s) = \text{false}$. Moreover, if $A$ runs in polynomial time, then the above definitions extend straightforwardly to "in polynomial time" as well: accepted/rejected/decided in polynomial time. The following theorem redefines P for decision problems: Theorem. A language $L$ is in P ⇔ there exists an algorithm that can decide $L$ in polynomial time. Similarly, since NP is the class of all decision problems/languages whose solutions are verifiable in polynomial time:

  71. Theorem. A language $L$ over $\Sigma$ is in NP if there exists a polynomial time algorithm $A$ and a polynomial $p$ such that: $x \in L \Leftrightarrow \exists c \in \Sigma^{p(|x|)} : A(x, c) = \text{true}$. Any $c \in \Sigma^{p(|x|)}$ satisfying $A(x, c) = \text{true}$ is called a certificate for $x$. A crucial notion in NP-completeness is that of reduction, as defined by Karp: Definition. Let $L_1$ and $L_2$ be two languages. One says $L_1$ is reducible to $L_2$ in polynomial time, written as $L_1 \le_P L_2$, if and only if there exists a polynomial time algorithm $f$ such that $s \in L_1 \Leftrightarrow f(s) \in L_2$.

  72. Karp's reduction is of great interest in the following result: Lemma. Let $L_1$ and $L_2$ be two languages such that $L_1 \le_P L_2$. Then $L_2 \in \mathrm{P} \Rightarrow L_1 \in \mathrm{P}$. Proof. See Fig. 4. Since $L_1 \le_P L_2$, there must exist a reduction function $f$ that can compute the image $f(x) \in L_2$ of any $x \in L_1$ in less than $T_1(x) = c_1 |x|^{u_1}$ time, where $c_1, u_1 > 0$ are constants. Put $y = f(x)$. Since $L_2 \in \mathrm{P}$, there must also exist an algorithm $A$ that can decide whether to accept or reject $f(x)$ in less than $T_2(f(x)) = c_2 |f(x)|^{u_2}$ time. The time needed to accept or reject $x$ as a string of $L_1$ is therefore less than $T_1(x) + T_2(f(x)) = c_1 |x|^{u_1} + c_2 |f(x)|^{u_2}$, which is polynomial in $|x|$ since $|f(x)|$ is polynomial in $|x|$. Therefore, $L_1 \in \mathrm{P}$. ∎
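
The construction in the proof is simply a composition: run the reduction, then the decider for $L_2$. A minimal sketch with two toy languages chosen for illustration only (they are assumptions, not examples from the course): $L_1$ is the set of bit strings with an even number of 1s, $L_2$ the set of bit strings of even length, and $f$ deletes every 0.

```python
def decide_even_length(s):
    """Algorithm A deciding L2 = {bit strings of even length}."""
    return len(s) % 2 == 0

def drop_zeros(s):
    """Reduction f: s has an even number of 1s  <=>  f(s) has even length."""
    return s.replace("0", "")

def decide_via_reduction(x, f, decide_l2):
    """Decider for L1 built from the reduction f and a decider for L2."""
    return decide_l2(f(x))
```

For instance, `decide_via_reduction("1001", drop_zeros, decide_even_length)` maps the input to `"11"` and accepts it. Both steps run in time linear in the input length, so their sum is polynomial, as in the proof.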

  73. [Figure 4: Karp reduction of $L_1$ to $L_2$. The algorithm for $L_1$ passes $x$ through $f$, feeds $f(x)$ to the algorithm for $L_2$, and its final output is true iff $x \in L_1$.] Corollary. Let $L_1$ and $L_2$ be two languages such that $L_1 \le_P L_2$. If $L_1 \notin \mathrm{P}$, then $L_2 \notin \mathrm{P}$ as well. Proof. Take the contrapositive of the claim of Lemma 22. ∎ The corollary tells us that, unless P = NP, if we can reduce a language/problem for which no polynomial time solution is known to our new language/problem, then our new language/problem won't have any polynomial time solution either. Otherwise, this would contradict $\mathrm{P} \ne \mathrm{NP}$.

  74. Another useful result on Karp's reduction is its transitivity: Property. If $A \le_P B$ and $B \le_P C$, then $A \le_P C$ (proof easy and left to the reader). We finally come to the central definitions of this chapter: Definition. A language $L$ is NP-hard if $X \le_P L$ holds for every $X \in \mathrm{NP}$. A language is NP-complete, or belongs to NPC, if it is both NP-hard and in NP. [Figure: diagram of the classes P, NP, NP-hard and NP-complete (NPC).]

  75. Unless P = NP, no problem in NPC can be decided in polynomial time. Since Karp's reduction is the common technique to prove NP-hardness, is there a reference problem proven NP-complete? Theorem (Cook-Levin). Let $F$ be a Boolean expression in conjunctive normal form of order $n$: $F(x_1, \dots, x_n) = \bigwedge_{i=0}^{m-1} (y_{ni+1} \lor y_{ni+2} \lor \dots \lor y_{ni+n})$, where each $y_i$ is either any literal amongst the set $\{x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n\}$ or the false value. The n-SAT problem (or simply SAT) is to assign values to the $x_i$'s so that $F$ is true. The SAT problem is NP-complete.
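
A minimal sketch of what a SAT instance and its evaluation could look like in code. The encoding is an assumption made for illustration: a clause is a list of nonzero integers, where `i` stands for $x_i$ and `-i` for $\bar{x}_i$. The exhaustive search enumerates all $2^n$ assignments, which is precisely the enumeration the theory suspects cannot be avoided in general.

```python
from itertools import product

def evaluate(cnf, assignment):
    """True iff every clause contains at least one true literal.
    assignment maps a variable index (1-based) to a bool."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)

def brute_force_sat(cnf, n):
    """Try all 2^n assignments; return a satisfying one, or None."""
    for values in product([False, True], repeat=n):
        assignment = dict(enumerate(values, start=1))
        if evaluate(cnf, assignment):
            return assignment
    return None

# Hypothetical formula: (x1 or x2) and (not x1 or x2) and (not x2 or x3)
cnf = [[1, 2], [-1, 2], [-2, 3]]
model = brute_force_sat(cnf, 3)
```

Note the asymmetry: `evaluate` is linear in the size of the formula, while `brute_force_sat` is exponential in `n`.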

  76. Reductions. Many (thousands, indeed) problems $L$ can be proven NP-complete by either proving SAT $\le_P L$, or SAT $\le_P X \le_P L$. Below are only a few examples. Theorem. 3-SAT is NP-complete. Proof. 3-SAT $\in$ NP: the certificate is just the assignment $x$ itself, and it takes $3m|x| = O(|x|)$ time units to evaluate it. 3-SAT is NP-hard: it suffices to find a transform that (1) maps any k-CNF, $k > 3$, to a 3-CNF and conversely, (2) runs in polynomial time. Consider a k-CNF $F = C_1 \land C_2 \land \dots \land C_m$. Each $C_p$ is a disjunction of the form $C_p = y_1 \lor y_2 \lor \dots \lor y_k$. Introduce a new variable $z$, and $C_p^{(1)} = y_1 \lor \dots \lor y_{k-2} \lor z$, $C_p^{(2)} = y_{k-1} \lor y_k \lor \bar{z}$. Then:

  77. If $C_p$ is true and both $y_{k-1}$ and $y_k$ are false, we set $z$ to false, and $C_p^{(1)} \land C_p^{(2)}$ is true. If $C_p$ is true and any of $y_{k-1}$, $y_k$ is true, we set $z$ to true and $C_p^{(1)} \land C_p^{(2)}$ is true. If $C_p$ is false, then $C_p^{(1)} \land C_p^{(2)}$ is false too, irrespective of $z$. Each application shortens $C_p^{(1)}$ by one literal, so recursive application of this transform to $C_p^{(1)}$ results in a 3-CNF involving $k - 2$ clauses and $k - 3$ new variables $z_i$. This shows that to any assignment of the $y_i$'s satisfying $C_p$ corresponds an assignment of the $y_i$'s and $z_i$'s satisfying $C_p^{(1)} \land \dots \land C_p^{(k-2)}$. Apply it to each clause of $F$ to conclude that it takes $O(mk)$ time to transform $F$, a polynomial time of the length of the input, as required. ∎
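
The clause-splitting step can be sketched as follows, assuming clauses are lists of nonzero integers (`i` for $x_i$, `-i` for $\bar{x}_i$); the function names are hypothetical. Each round peels off the last two literals into a clause with $\bar{z}$ and leaves a shorter clause ending in $z$. The brute-force `satisfiable` helper is exponential and is there only to check on tiny formulas that satisfiability is preserved.

```python
from itertools import product

def split_clause(clause, next_var):
    """Split one clause into clauses of at most 3 literals.
    next_var is the first unused variable index.
    Returns (list of clauses, updated next_var)."""
    out = []
    while len(clause) > 3:
        z = next_var
        next_var += 1
        out.append([clause[-2], clause[-1], -z])  # C(2) = y_{k-1} or y_k or not z
        clause = clause[:-2] + [z]                # C(1) = y_1 or ... or y_{k-2} or z
    out.append(clause)
    return out, next_var

def to_3cnf(cnf, n):
    """Transform a CNF over variables 1..n; returns (3-CNF, new variable count)."""
    result, next_var = [], n + 1
    for clause in cnf:
        parts, next_var = split_clause(list(clause), next_var)
        result.extend(parts)
    return result, next_var - 1

def satisfiable(cnf, n):
    """Exhaustive check over 2^n assignments (tiny inputs only)."""
    return any(
        all(any(vals[abs(lit) - 1] == (lit > 0) for lit in c) for c in cnf)
        for vals in product([False, True], repeat=n)
    )

# Hypothetical 5-CNF over x1..x5
cnf = [[1, 2, 3, 4, 5], [-1, -2, -3, -4, -5]]
cnf3, n3 = to_3cnf(cnf, 5)
```

The transform touches each literal a constant number of times, so its running time is linear in the size of the input formula.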

  78. CLIQUE: A clique of order $k$ of a graph $G = (V, E)$ is a subgraph $G' = (V' \subseteq V, E' \subseteq E)$ with $|V'| = k$ such that $\forall (x, y) \in V' \times V'$, $x \ne y \Rightarrow (x, y) \in E'$. In a clique, $\forall x \in V'$, $\mathrm{degree}(x) = k - 1$ (why?). [Figure 6: A graph on the vertices a, b, c, d, x, y, z and a clique of order 4 ($\{a, b, c, d\}$).] The CLIQUE problem is to find a clique of maximal order. Theorem. The CLIQUE problem is NP-complete. Proof. Clearly, CLIQUE $\in$ NP. We shall show that 3-SAT $\le_P$ CLIQUE to prove CLIQUE is NP-hard.

  79. Given a 3-CNF formula $F = C_1 \land \dots \land C_k$, we must build, in polynomial time, a graph $G = (V, E)$ such that $F$ is satisfied ⇔ $G$ admits a clique of order $k$. Assume that $C_p = y_p^1 \lor y_p^2 \lor y_p^3$ for any $p$, $y_p^i$ being one of the symbols $x_1, \bar{x}_1, x_2, \bar{x}_2, x_3, \bar{x}_3$ by construction. We build $G$ as follows: $V = \{(p, i) \in [1:k] \times [1:3]\}$ encodes all literals represented by the $y_p^i$'s; $((p, i), (q, j)) \in E \Leftrightarrow p \ne q$ and $y_p^i \ne \bar{y}_q^j$. This clearly runs in $O(k^2)$, a polynomial time of $k$. If $F$ is satisfiable, then at least one of $y_p^1, y_p^2, y_p^3$, say $y_p^l$, is true per clause $C_p$. Each $y_p^l$ encodes a vertex of $G$, and must be linked to at least one of $y_q^1, y_q^2, y_q^3$, whenever $q \ne p$ (see footnote 7). So $\mathrm{degree}(y_p^l) \ge k - 1$, and since this holds for any $p$, $G$ has a clique of order $k$. Footnote 7: check that otherwise, $F$ would not be satisfiable.
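
The graph construction can be sketched as below, assuming clauses are lists of nonzero integers (`i` for $x_i$, `-i` for $\bar{x}_i$) and using 0-based indices for clauses and literal positions. The function names are hypothetical, and `has_clique` is an exponential brute force meant only to sanity-check tiny instances, not part of the reduction.

```python
from itertools import combinations

def sat_to_clique_graph(cnf):
    """Vertex (p, i) stands for literal i of clause p.
    Two vertices are linked iff they come from different clauses
    and their literals are not the negation of each other."""
    vertices = [(p, i) for p, clause in enumerate(cnf)
                for i in range(len(clause))]
    edges = {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[0] != v[0] and cnf[u[0]][u[1]] != -cnf[v[0]][v[1]]
    }
    return vertices, edges

def has_clique(vertices, edges, k):
    """Exhaustive search for k pairwise adjacent vertices (tiny inputs only)."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(group, 2))
        for group in combinations(vertices, k)
    )

# Hypothetical satisfiable 3-CNF with k = 3 clauses
cnf = [[1, 2, 3], [-1, -2, 3], [1, -2, -3]]
vertices, edges = sat_to_clique_graph(cnf)
```

Building the graph inspects every pair of vertices once, which matches the quadratic bound in the number of clauses stated above.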

  80. Conversely, assume $G$ has a clique of order $k$, and choose two of its vertices $((p, i), (q, j))$. Because $p \ne q$ and $y_p^i \ne \bar{y}_q^j$, we are ensured that clauses $C_p$ and $C_q$ independently evaluate to true, provided they include symbols $y_p^i$ and $y_q^j$: the two literals can be set to true together. Since the $k$ vertices of the clique lie in $k$ distinct clauses, every clause receives a true literal. Therefore, $F$ is satisfied too. ∎ VERTEXCOVER. Given a graph $G = (V, E)$, a vertex cover of $G$ is a subset $V'$ of its vertices such that $\forall (x, y) \in E$, $\{x, y\} \cap V' \ne \emptyset$. The VERTEXCOVER problem is to find a vertex cover of $G$ with minimal number of vertices. Theorem. VERTEXCOVER is NP-complete. Proof. VERTEXCOVER is in NP: it takes $O(nk)$ time to mark all edges covered by a certificate of size $k$ in a graph of size $n$, and $O(n)$ to check that all edges are covered. We show VERTEXCOVER $\ge_P$ CLIQUE to prove NP-hardness of VERTEXCOVER. Consider $\bar{G} = (V, \bar{E})$, the complement graph of $G$, in which $\bar{E} = V^2 \setminus E$.
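
A minimal sketch of the two ingredients named above: a certificate checker for VERTEXCOVER and the construction of the complement graph. The function names and the small graph are illustrative assumptions. The final check relies on a standard fact that the text above does not spell out: a vertex set $S$ is a clique of $G$ exactly when $V \setminus S$ is a vertex cover of the complement graph. Self-loops are ignored here, so the complement is taken over pairs of distinct vertices.

```python
from itertools import combinations

def is_vertex_cover(edges, cover):
    """Certificate check: every edge has at least one endpoint in cover."""
    cover = set(cover)
    return all(x in cover or y in cover for x, y in edges)

def complement_edges(vertices, edges):
    """Edges of the complement graph: all pairs of distinct vertices not in E."""
    present = {frozenset(e) for e in edges}
    return [pair for pair in combinations(vertices, 2)
            if frozenset(pair) not in present]

# Hypothetical graph: {a, b, c, d} is a clique, x is attached to d only
vertices = ["a", "b", "c", "d", "x"]
edges = [("a", "b"), ("a", "c"), ("a", "d"),
         ("b", "c"), ("b", "d"), ("c", "d"), ("d", "x")]
co_edges = complement_edges(vertices, edges)
```

Here the clique $\{a, b, c, d\}$ of order 4 corresponds to the cover $\{x\}$ of size $5 - 4 = 1$ in the complement graph.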
