Divide-Conquer-Glue Algorithms
Mergesort and Counting Inversions

Tyler Moore
CS 2123, The University of Tulsa

Some slides created by or adapted from Dr. Kevin Wayne. For more information see http://www.cs.princeton.edu/~wayne/kleinberg-tardos . Some code reused or adapted from Python Algorithms by Magnus Lie Hetland.

Divide-and-conquer paradigm

Divide-and-conquer.
・ Divide up problem into several subproblems.
・ Solve each subproblem recursively.
・ Combine solutions to subproblems into overall solution.

Most common usage.
・ Divide problem of size $n$ into two subproblems of size $n/2$ in linear time.
・ Solve two subproblems recursively.
・ Combine two solutions into overall solution in linear time.

Consequence.
・ Brute force: $\Theta(n^2)$.
・ Divide-and-conquer: $\Theta(n \log n)$.

[Slide epigraph, attributed to Julius Caesar]

5. DIVIDE AND CONQUER
‣ mergesort
‣ counting inversions
‣ closest pair of points
‣ randomized quicksort
‣ median and selection
SECTION 5.1

Sorting problem

Problem. Given a list of $n$ elements from a totally-ordered universe, rearrange them in ascending order.
Sorting applications

Obvious applications.
・ Organize an MP3 library.
・ Display Google PageRank results.
・ List RSS news items in reverse chronological order.

Some problems become easier once elements are sorted.
・ Identify statistical outliers.
・ Binary search in a database.
・ Remove duplicates in a mailing list.

Non-obvious applications.
・ Convex hull.
・ Closest pair of points.
・ Interval scheduling / interval partitioning.
・ Minimum spanning trees (Kruskal's algorithm).
・ Scheduling to minimize maximum lateness or average completion time.
・ ...

Mergesort

・ Recursively sort left half.
・ Recursively sort right half.
・ Merge two halves to make sorted whole.

    input             A L G O R I T H M S
    sort left half    A G L O R I T H M S
    sort right half   A G L O R H I M S T
    merge results     A G H I L M O R S T

Canonical Divide-Conquer-Glue Algorithm

    def divide_and_conquer(S, divide, glue):
        # Base case: a problem of size 0 or 1 is already solved
        if len(S) <= 1: return S
        # Divide: split the problem into two subproblems
        L, R = divide(S)
        # Conquer: solve each subproblem recursively
        A = divide_and_conquer(L, divide, glue)
        B = divide_and_conquer(R, divide, glue)
        # Glue: combine the two partial solutions
        return glue(A, B)

Merging

Goal. Combine two sorted lists $A$ and $B$ into a sorted whole $C$.
・ Scan $A$ and $B$ from left to right.
・ Compare $a_i$ and $b_j$.
・ If $a_i \le b_j$, append $a_i$ to $C$ (no larger than any remaining element in $B$).
・ If $a_i > b_j$, append $b_j$ to $C$ (smaller than every remaining element in $A$).

    sorted list A                  3  7  10  a_i  18
    sorted list B                  2  11  b_j  17  23
    merge to form sorted list C    2  3  7  10  11
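The merging scan described above can be sketched in Python as follows. This is a minimal illustration rather than code from the slides: the function name `merge` and the index variables `i` and `j` are assumptions chosen for the example.

```python
def merge(a, b):
    """Merge two sorted lists a and b into one sorted list c.

    Illustrative sketch; the name and signature are not from the slides.
    """
    c = []
    i = j = 0  # positions of the current elements a_i and b_j
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            # a_i is no larger than any remaining element of b
            c.append(a[i])
            i += 1
        else:
            # b_j is smaller than every remaining element of a
            c.append(b[j])
            j += 1
    # At most one of the two lists still has elements; append them as-is
    c.extend(a[i:])
    c.extend(b[j:])
    return c


print(merge([3, 7, 10, 18], [2, 11, 17, 23]))
# prints [2, 3, 7, 10, 11, 17, 18, 23]
```

Each element is appended exactly once, so the scan performs at most len(a) + len(b) - 1 comparisons, which is the linear-time glue step that mergesort relies on.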
Mergesort in Python

    def mergesort(seq):
        mid = len(seq) // 2                  # Midpoint for division (integer division)
        lft, rgt = seq[:mid], seq[mid:]
        if len(lft) > 1: lft = mergesort(lft)    # Sort by halves
        if len(rgt) > 1: rgt = mergesort(rgt)
        res = []                             # Merge sorted halves
        while lft and rgt:                   # Neither half is empty
            if lft[-1] >= rgt[-1]:           # lft has greatest last value
                res.append(lft.pop())        # Append it
            else:                            # rgt has greatest last value
                res.append(rgt.pop())        # Append it
        res.reverse()                        # Result is backward
        return (lft or rgt) + res            # Also add the remainder

How can we measure the time complexity of recursive algorithms?

・ Measuring the time complexity of iterative algorithms is usually straightforward: count the inputs, check for loops, etc.
・ We know that certain operations can take linear time, constant time, logarithmic time, etc.
・ Running those operations in a loop $n$ times produces a multiplicative factor.
・ But how can we do this for recursive algorithms? With recurrence relations.

Recurrence Relations

Recurrence relations specify the cost of executing recursive functions. Consider mergesort:
1. Linear-time cost to divide the lists
2. Two recursive calls are made, each given half the original input
3. Linear-time cost to merge the resulting lists together

Recurrence: $T(n) = 2\,T(n/2) + \Theta(n)$

Great, but how does this help us estimate the running time?

A useful recurrence relation

Def. $T(n)$ = max number of compares to mergesort a list of size $\le n$.

Note. $T(n)$ is monotone nondecreasing.

Mergesort recurrence.

$$T(n) \le \begin{cases} 0 & \text{if } n = 1 \\ T(\lceil n/2 \rceil) + T(\lfloor n/2 \rfloor) + n & \text{otherwise} \end{cases}$$

Solution. $T(n)$ is $O(n \log_2 n)$.

Assorted proofs. We describe several ways to solve this recurrence. Initially we assume $n$ is a power of 2 and replace $\le$ with $=$.
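Before proving anything, the recurrence can be checked numerically. The sketch below evaluates the mergesort recurrence (with $\le$ replaced by $=$) and compares it against $n \log_2 n$ for powers of 2, and against $n \lceil \log_2 n \rceil$ for other sizes. The helper name `ceil_log2` and the tested ranges are assumptions made for this illustration; a numeric check is evidence, not a proof.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """Mergesort recurrence, taken with equality instead of <=."""
    if n <= 1:
        return 0
    return T((n + 1) // 2) + T(n // 2) + n  # ceil(n/2), floor(n/2), plus n


def ceil_log2(n):
    """Exact integer value of ceil(log2(n)) for n >= 1 (hypothetical helper)."""
    return (n - 1).bit_length()


# For n = 2^k the recurrence gives exactly n * log2(n) = n * k.
for k in range(11):
    n = 2 ** k
    assert T(n) == n * k

# For general n, the values stay within n * ceil(log2(n)) on the tested range.
for n in range(1, 1000):
    assert T(n) <= n * ceil_log2(n)

print(T(8), T(1024))  # prints 24 10240
```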
Divide-and-conquer recurrence: proof by recursion tree

Proposition. If $T(n)$ satisfies the following recurrence, then $T(n) = n \log_2 n$ (assuming $n$ is a power of 2).

$$T(n) = \begin{cases} 0 & \text{if } n = 1 \\ 2\,T(n/2) + n & \text{otherwise} \end{cases}$$

Pf 1. [recursion tree; each level contributes $n$]

    level 0:   1 node   T(n)                 n        = n
    level 1:   2 nodes  T(n/2)       2 (n/2)          = n
    level 2:   4 nodes  T(n/4)       4 (n/4)          = n
    level 3:   8 nodes  T(n/8)       8 (n/8)          = n
       ⋮
    log_2 n levels in total:                 T(n) = n lg n

Proof by induction

Proposition. If $T(n)$ satisfies the recurrence above (assuming $n$ is a power of 2), then $T(n) = n \log_2 n$.

Pf 2. [by induction on $n$]
・ Base case: when $n = 1$, $T(1) = 0$.
・ Inductive hypothesis: assume $T(n) = n \log_2 n$.
・ Goal: show that $T(2n) = 2n \log_2 (2n)$.

$$\begin{aligned} T(2n) &= 2\,T(n) + 2n \\ &= 2n \log_2 n + 2n \\ &= 2n\,(\log_2 (2n) - 1) + 2n \\ &= 2n \log_2 (2n). \end{aligned}$$ ▪

SECTION 5.3: counting inversions

Counting inversions

Music site tries to match your song preferences with others.
・ You rank $n$ songs.
・ Music site consults database to find people with similar tastes.

Similarity metric: number of inversions between two rankings.
・ My rank: $1, 2, \ldots, n$.
・ Your rank: $a_1, a_2, \ldots, a_n$.
・ Songs $i$ and $j$ are inverted if $i < j$, but $a_i > a_j$.

           A  B  C  D  E
    me     1  2  3  4  5
    you    1  3  4  2  5

    2 inversions: 3-2, 4-2

Brute force: check all $\Theta(n^2)$ pairs.
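The brute-force approach can be sketched directly from the definition of an inversion. The function name `count_inversions_brute_force` is an assumption for this example; the input is "your rank" as a list, with "my rank" taken to be 1, 2, …, n.

```python
def count_inversions_brute_force(ranking):
    """Count pairs (i, j) with i < j and ranking[i] > ranking[j].

    Illustrative sketch; checks every pair, so it takes Theta(n^2) time.
    """
    n = len(ranking)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if ranking[i] > ranking[j]:
                count += 1
    return count


print(count_inversions_brute_force([1, 3, 4, 2, 5]))  # prints 2 (pairs 3-2 and 4-2)
```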
Counting inversions: applications

・ Voting theory.
・ Collaborative filtering.
・ Measuring the "sortedness" of an array.
・ Sensitivity analysis of Google's ranking function.
・ Rank aggregation for meta-searching on the Web.
・ Nonparametric statistics (e.g., Kendall's tau distance).

[Figure: title page of "Rank Aggregation Methods for the Web" by Cynthia Dwork, Ravi Kumar, Moni Naor, and D. Sivakumar]

Counting inversions: divide-and-conquer

・ Divide: separate list into two halves $A$ and $B$.
・ Conquer: recursively count inversions in each list.
・ Combine: count inversions $(a, b)$ with $a \in A$ and $b \in B$.
・ Return sum of three counts.

    input                                 1 5 4 8 10 2 6 9 3 7
    count inversions in left half A       1 5 4 8 10        5-4
    count inversions in right half B      2 6 9 3 7         6-3 9-3 9-7
    count inversions (a, b) with a ∈ A and b ∈ B
                                          4-2 4-3 5-2 5-3 8-2 8-3 8-6 8-7 10-2 10-3 10-6 10-7 10-9
    output                                1 + 3 + 13 = 17

Counting inversions: how to combine two subproblems?

Q. How to count inversions $(a, b)$ with $a \in A$ and $b \in B$?
A. Easy if $A$ and $B$ are sorted!

Warmup algorithm.
・ Sort $A$ and $B$.
・ For each element $b \in B$,
  - binary search in $A$ to find how many elements in $A$ are greater than $b$.

    list A      7 10 18 3 14
    list B      17 23 2 11 16
    sort A      3 7 10 14 18
    sort B      2 11 16 17 23
    binary search to count inversions (a, b) with a ∈ A and b ∈ B
                b:      2  11  16  17  23
                count:  5   2   1   1   0

Counting inversions: how to combine two subproblems?

Count inversions $(a, b)$ with $a \in A$ and $b \in B$, assuming $A$ and $B$ are sorted.
・ Scan $A$ and $B$ from left to right.
・ Compare $a_i$ and $b_j$.
・ If $a_i < b_j$, then $a_i$ is not inverted with any element left in $B$.
・ If $a_i > b_j$, then $b_j$ is inverted with every element left in $A$.
・ Append smaller element to sorted list $C$.

    sorted list A                  3  7  10  a_i  18
    sorted list B                  2  11  b_j  17  23
    inversions counted so far      5 (for 2), 2 (for 11)
    merge to form sorted list C    2  3  7  10  11
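Both combine strategies can be sketched in Python. The block below is an illustration under stated assumptions, not the course's code: the names `count_cross_binary_search`, `merge_and_count`, and `sort_and_count` are chosen for this example, and ties are treated as non-inversions (the slides only cover the strict cases $a_i < b_j$ and $a_i > b_j$).

```python
from bisect import bisect_right


def count_cross_binary_search(a, b):
    """Warmup: sort a, then binary search in it for each element of b."""
    a_sorted = sorted(a)
    # bisect_right gives how many elements of a_sorted are <= x,
    # so the remainder are the elements strictly greater than x.
    return sum(len(a_sorted) - bisect_right(a_sorted, x) for x in b)


def merge_and_count(a, b):
    """Merge sorted lists a and b; also count inversions (x, y), x in a, y in b."""
    merged, cross = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            # a_i is not inverted with any element left in b
            merged.append(a[i])
            i += 1
        else:
            # b_j is inverted with every element left in a
            cross += len(a) - i
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged, cross


def sort_and_count(seq):
    """Return (sorted copy of seq, number of inversions in seq)."""
    if len(seq) <= 1:
        return list(seq), 0
    mid = len(seq) // 2
    left, inv_left = sort_and_count(seq[:mid])     # conquer left half
    right, inv_right = sort_and_count(seq[mid:])   # conquer right half
    merged, inv_cross = merge_and_count(left, right)  # combine
    return merged, inv_left + inv_right + inv_cross


print(sort_and_count([1, 5, 4, 8, 10, 2, 6, 9, 3, 7])[1])  # prints 17
```

Because the combine step is a single linear merge, the running time follows the same recurrence as mergesort, $T(n) = 2\,T(n/2) + \Theta(n)$, whereas the warmup's sorting and binary searches cost $\Theta(n \log n)$ at every level of the recursion.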