
Order Statistics, by Carola Wenk (slides courtesy of Charles Leiserson)



  1. CS 5633, Spring 2008: Order Statistics. Carola Wenk. Slides courtesy of Charles Leiserson, with small changes by Carola Wenk. 2/12/08, CS 5633 Analysis of Algorithms.

  2. Order statistics
     Select the $i$th smallest of $n$ elements (the element with rank $i$).
     • $i = 1$: minimum;
     • $i = n$: maximum;
     • $i = \lfloor (n+1)/2 \rfloor$ or $\lceil (n+1)/2 \rceil$: median.
     Naive algorithm: Sort and index the $i$th element. Worst-case running time $= \Theta(n \log n) + \Theta(1) = \Theta(n \log n)$, using merge sort or heapsort (not quicksort).
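
The naive algorithm can be sketched in a few lines of Python. This is only an illustration: the function name `naive_select` is hypothetical, and it relies on Python's built-in `sorted`, which is a merge-based sort with an O(n log n) worst case, in place of the merge sort or heapsort named on the slide.

```python
def naive_select(items, i):
    """Return the element of rank i (1-indexed) by sorting first."""
    if not 1 <= i <= len(items):
        raise ValueError("rank i must satisfy 1 <= i <= n")
    ordered = sorted(items)   # O(n log n) in the worst case
    return ordered[i - 1]     # constant-time indexing


data = [6, 10, 13, 5, 8, 3, 2, 11]
n = len(data)
print(naive_select(data, 1))             # minimum: 2
print(naive_select(data, n))             # maximum: 13
print(naive_select(data, (n + 1) // 2))  # lower median: 6
```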

  3. Randomized divide-and-conquer algorithm

     RAND-SELECT(A, p, q, i)        ⊳ ith smallest of A[p . . q]
         if p = q then return A[p]
         r ← RAND-PARTITION(A, p, q)
         k ← r - p + 1              ⊳ k = rank(A[r])
         if i = k then return A[r]
         if i < k
             then return RAND-SELECT(A, p, r - 1, i)
             else return RAND-SELECT(A, r + 1, q, i - k)

     [Figure: A[p . . q] after partitioning around A[r]. The k elements A[p . . r] are ≤ A[r]; the elements A[r+1 . . q] are ≥ A[r].]
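
A minimal Python sketch of the pseudocode above. The slide does not show the body of RAND-PARTITION, so the partition loop here is an assumption (swap a uniformly random element to the front, then sweep once). Indices `p` and `q` are 0-based and inclusive, while the rank `i` is 1-based as on the slide.

```python
import random


def rand_partition(a, p, q):
    """Partition a[p..q] around a random pivot; return the pivot's final index."""
    s = random.randint(p, q)          # random pivot position (inclusive bounds)
    a[p], a[s] = a[s], a[p]
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:                 # grow the "<= pivot" region
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]           # place the pivot between the two regions
    return i


def rand_select(a, p, q, i):
    """Return the i-th smallest element of a[p..q]; rearranges a in place."""
    if p == q:
        return a[p]
    r = rand_partition(a, p, q)
    k = r - p + 1                     # k = rank of a[r] within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i)
    return rand_select(a, r + 1, q, i - k)


data = [6, 10, 13, 5, 8, 3, 2, 11]
print(rand_select(list(data), 0, len(data) - 1, 7))  # 11
```

The call copies `data` first because the sketch reorders its input, just as the in-place pseudocode does.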

  4. Example
     Select the $i = 7$th smallest:
         6 10 13 5 8 3 2 11      (pivot: 6)
     Partition:
         2 5 3 6 8 13 10 11      ($k = 4$)
     Select the $7 - 4 = 3$rd smallest recursively in the upper part, 8 13 10 11.
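
The example can be replayed in Python. The sketch below assumes the partition routine takes the first element as the pivot and sweeps left to right; that routine is not shown on the slide, but it reproduces the arrangement given there. The name `partition_first` is hypothetical.

```python
def partition_first(a, p, q):
    """Partition a[p..q] around a[p]; return the pivot's final index."""
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


a = [6, 10, 13, 5, 8, 3, 2, 11]
r = partition_first(a, 0, len(a) - 1)
k = r + 1                          # rank of the pivot (p = 0 here)
print(a)                           # [2, 5, 3, 6, 8, 13, 10, 11]
print(k)                           # 4

upper = a[r + 1:]                  # [8, 13, 10, 11]
print(sorted(upper)[7 - k - 1])    # 3rd smallest of the upper part: 11
```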

  5. Intuition for analysis
     (All our analyses today assume that all elements are distinct.)
     Lucky: $T(n) = T(9n/10) + \Theta(n)$. Here $n^{\log_{10/9} 1} = n^0 = 1$, so CASE 3 of the master method gives $T(n) = \Theta(n)$.
     Unlucky: $T(n) = T(n-1) + \Theta(n) = \Theta(n^2)$ (arithmetic series). Worse than sorting!
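
Both cases can also be checked by unrolling the recurrences. This is a sketch under the assumption (not stated on the slide) that the $\Theta(n)$ term is exactly $an$ for some constant $a > 0$ and that the base-case cost is negligible:

```latex
\begin{align*}
\text{Lucky:}\quad T(n) &\le an + a\,\tfrac{9}{10}\,n + a\left(\tfrac{9}{10}\right)^{2} n + \cdots
  \le an \sum_{j=0}^{\infty} \left(\tfrac{9}{10}\right)^{j} = 10\,an = \Theta(n), \\
\text{Unlucky:}\quad T(n) &= an + a(n-1) + \cdots + a \cdot 1
  = a\,\frac{n(n+1)}{2} = \Theta(n^{2}).
\end{align*}
```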

  6. Analysis of expected time
     The analysis follows that of randomized quicksort, but it's a little different.
     Let $T(n)$ = the random variable for the running time of RAND-SELECT on an input of size $n$, assuming random numbers are independent.
     For $k = 0, 1, \ldots, n-1$, define the indicator random variable
     $$X_k = \begin{cases} 1 & \text{if PARTITION generates a } k : n-k-1 \text{ split}, \\ 0 & \text{otherwise}. \end{cases}$$

  7. Analysis (continued)
     To obtain an upper bound, assume that the $i$th element always falls in the larger side of the partition:
     $$T(n) = \begin{cases} T(\max\{0, n-1\}) + \Theta(n) & \text{if } 0 : n-1 \text{ split}, \\ T(\max\{1, n-2\}) + \Theta(n) & \text{if } 1 : n-2 \text{ split}, \\ \quad\vdots \\ T(\max\{n-1, 0\}) + \Theta(n) & \text{if } n-1 : 0 \text{ split} \end{cases}$$
     $$= \sum_{k=0}^{n-1} X_k \bigl( T(\max\{k, n-k-1\}) + \Theta(n) \bigr) \le 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr).$$

  8-12. Calculating expectation
     $$\begin{aligned}
     E[T(n)] &\le E\Bigl[\, 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr) \Bigr] && \text{Take expectations of both sides.} \\
     &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + \Theta(n) \bigr) \bigr] && \text{Linearity of expectation.} \\
     &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\bigl[ T(k) + \Theta(n) \bigr] && \text{Independence of } X_k \text{ from other random choices.} \\
     &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} \Theta(n) && \text{Linearity of expectation; } E[X_k] = 1/n. \\
     &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n).
     \end{aligned}$$

  13. Hairy recurrence
     (But not quite as hairy as the quicksort one.)
     $$E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)$$
     Prove: $E[T(n)] \le cn$ for constant $c > 0$.
     • The constant $c$ can be chosen large enough so that $E[T(n)] \le cn$ for the base cases.
     Use fact: $\sum_{k=\lfloor n/2 \rfloor}^{n-1} k \le \frac{3}{8} n^2$ (exercise).
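
The fact left as an exercise can be sanity-checked numerically before proving it. The sketch below is not a proof; it simply evaluates the sum for small $n$ using exact integer arithmetic. The function names are hypothetical.

```python
def upper_half_sum(n):
    """Sum of k for k = floor(n/2), ..., n - 1."""
    return sum(range(n // 2, n))


def fact_holds(n):
    """Check sum <= (3/8) n^2, multiplied through by 8 to stay in integers."""
    return 8 * upper_half_sum(n) <= 3 * n * n


print(all(fact_holds(n) for n in range(1, 2001)))  # True
```

Working the sum out by hand gives $\frac{3}{8}n^2 - \frac{n}{4}$ for even $n$ and $\frac{3}{8}(n^2 - 1)$ for odd $n$, both of which are at most $\frac{3}{8}n^2$.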

  14-17. Substitution method
     $$\begin{aligned}
     E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n) && \text{Substitute inductive hypothesis.} \\
     &\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + \Theta(n) && \text{Use fact.} \\
     &= cn - \left( \frac{cn}{4} - \Theta(n) \right) && \text{Express as desired minus residual.} \\
     &\le cn,
     \end{aligned}$$
     if $c$ is chosen large enough so that $cn/4$ dominates the $\Theta(n)$.
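
The bound can also be observed numerically. The sketch below makes two assumptions that are not on the slides: the $\Theta(n)$ term is taken to be exactly $n$, and the recurrence is taken with equality starting from a value of 0 at $n = 0$. Under those assumptions $cn/4 \ge n$ holds for $c = 4$, so the substitution argument predicts a bound of $4n$.

```python
def recurrence_values(n_max):
    """t[n] = (2/n) * (t[floor(n/2)] + ... + t[n-1]) + n, with t[0] = 0."""
    t = [0.0] * (n_max + 1)
    prefix = [0.0] * (n_max + 1)      # prefix[m] = t[0] + ... + t[m-1]
    for n in range(1, n_max + 1):
        prefix[n] = prefix[n - 1] + t[n - 1]
        window = prefix[n] - prefix[n // 2]   # t[floor(n/2)] + ... + t[n-1]
        t[n] = 2.0 / n * window + n
    return t


t = recurrence_values(1000)
print(all(t[n] <= 4 * n for n in range(1, 1001)))  # True
```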

  18. Summary of randomized order-statistic selection
     • Works fast: linear expected time.
     • Excellent algorithm in practice.
     • But the worst case is very bad: $\Theta(n^2)$.
     Q. Is there an algorithm that runs in linear time in the worst case?
     A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
     IDEA: Generate a good pivot recursively.

  19. Worst-case linear-time order statistics

     SELECT(i, n)
     1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
     2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
     3. Partition around the pivot x. Let k = rank(x).
     4. if i = k then return x
        elseif i < k
            then recursively SELECT the ith smallest element in the lower part
            else recursively SELECT the (i-k)th smallest element in the upper part
     (Step 4 is the same as in RAND-SELECT.)
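
The four steps can be sketched in Python as follows. This is a simplified illustration rather than the slides' in-place version: it assumes all elements are distinct, partitions by building new lists, sorts each group of 5 to find its median, and handles inputs of at most 5 elements by sorting directly. The base-case threshold and the function name `select` are assumptions.

```python
def select(items, i):
    """Return the i-th smallest (1-indexed) of a list of distinct elements."""
    if not 1 <= i <= len(items):
        raise ValueError("rank i must satisfy 1 <= i <= n")
    if len(items) <= 5:                       # small base case, solved by rote
        return sorted(items)[i - 1]

    # Step 1: medians of the floor(n/5) full groups of 5.
    full = len(items) - len(items) % 5
    medians = [sorted(items[j:j + 5])[2] for j in range(0, full, 5)]

    # Step 2: recursively select the median of the group medians as pivot x.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x; k = rank(x).
    lower = [v for v in items if v < x]
    upper = [v for v in items if v > x]
    k = len(lower) + 1

    # Step 4: return x or recurse into the side that contains rank i.
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)


print(select([6, 10, 13, 5, 8, 3, 2, 11], 7))  # 11
```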

  20-23. Choosing the pivot
     [Figure: the n elements arranged in groups of 5, with labels "lesser" and "greater" and the pivot x marked.]
     1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
     2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.

  24. Developing the recurrence

     SELECT(i, n)    (total cost: T(n))
     1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote. Cost: Θ(n)
     2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot. Cost: T(n/5)
     3. Partition around the pivot x. Let k = rank(x). Cost: Θ(n)
     4. if i = k then return x
        elseif i < k
            then recursively SELECT the ith smallest element in the lower part
            else recursively SELECT the (i-k)th smallest element in the upper part
        Cost: T(?)

  25-27. Analysis
     (Assume all elements are distinct.)
     [Figure: the groups of 5 with labels "lesser" and "greater" and the pivot x marked.]
     At least half the group medians are $\le x$, which is at least $\lfloor \lfloor n/5 \rfloor / 2 \rfloor = \lfloor n/10 \rfloor$ group medians.
     • Therefore, at least $3 \lfloor n/10 \rfloor$ elements are $\le x$.
     • Similarly, at least $3 \lfloor n/10 \rfloor$ elements are $\ge x$.
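
The counting argument can be checked empirically. In the sketch below the pivot is found by sorting the group medians and taking the lower median, which is a shortcut for checking purposes only (the algorithm itself finds it with a recursive SELECT). Inputs are random lists of distinct integers, and the helper names are hypothetical.

```python
import random


def median_of_medians_pivot(items):
    """Lower median of the medians of the floor(n/5) full groups of 5."""
    full = len(items) - len(items) % 5
    medians = sorted(sorted(items[j:j + 5])[2] for j in range(0, full, 5))
    return medians[(len(medians) - 1) // 2]


def bound_holds(items):
    """Check that at least 3*floor(n/10) elements lie on each side of x."""
    n = len(items)
    x = median_of_medians_pivot(items)
    at_most_x = sum(1 for v in items if v <= x)
    at_least_x = sum(1 for v in items if v >= x)
    return at_most_x >= 3 * (n // 10) and at_least_x >= 3 * (n // 10)


random.seed(0)
ok = all(
    bound_holds(random.sample(range(10 * n), n))
    for n in range(5, 301)
    for _ in range(20)
)
print(ok)  # True
```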
