Two Constant-Factor-Optimal Realizations of Adaptive Heapsort

Stefan Edelkamp 1), Amr Elmasry 2), Jyrki Katajainen 2)
1) Universität Bremen
2) Københavns Universitet

These slides and all our programs are available via my home page http://www.diku.dk/~jyrki

© Performance Engineering Laboratory. International Workshop on Combinatorial Algorithms, 21 June 2011

Adaptive sorting

• A sorting algorithm is adaptive with respect to a measure of disorder if it sorts all input sequences, but performs particularly well on those that have a low amount of disorder.
• The running time of such an algorithm is measured as a function of the length of the input, n, and the amount of disorder. Hence, the running time varies between O(n) and O(n lg n), depending on the amount of disorder.
• The algorithm should be adaptive without knowing the amount of disorder beforehand.

1. Which of the two has more order?

☐ ⟨1, 3, 2, 7, 5, 4, 6⟩
☐ ⟨7, 6, 1, 5, 2, 4, 3⟩

Some measures of disorder

Let $\langle x_1, x_2, \ldots, x_n \rangle$ be a sequence of n elements. For simplicity, assume that all elements are distinct.

measure   definition
Osc       $\sum_{j=1}^{n-1} \bigl|\{\, i \mid 1 \le i \le n \text{ and } \min\{x_j, x_{j+1}\} < x_i < \max\{x_j, x_{j+1}\} \,\}\bigr|$
Inv       $\bigl|\{\, (i, j) \mid 1 \le i < j \le n \text{ and } x_i > x_j \,\}\bigr|$
Max       the maximum distance an element is from its correct position
Runs      $\bigl|\{\, i \mid 1 \le i < n \text{ and } x_i > x_{i+1} \,\}\bigr| + 1$

What is the amount of disorder in a sequence of length n that is in reversed sorted order?
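
The four measures can be evaluated directly from their definitions. The following is a minimal sketch for experimenting with small inputs, not part of the original slides; the function names (`osc`, `inv`, `max_dist`, `runs`) are illustrative, the code assumes distinct elements as the slide does, and it favours clarity over speed (Osc and Inv are computed in quadratic time).

```python
def osc(xs):
    """Osc: for each adjacent pair, count the elements lying strictly between them."""
    total = 0
    for j in range(len(xs) - 1):
        lo, hi = min(xs[j], xs[j + 1]), max(xs[j], xs[j + 1])
        total += sum(1 for x in xs if lo < x < hi)
    return total


def inv(xs):
    """Inv: number of pairs i < j with xs[i] > xs[j]."""
    n = len(xs)
    return sum(1 for i in range(n) for j in range(i + 1, n) if xs[i] > xs[j])


def max_dist(xs):
    """Max: largest distance between an element's position and its sorted position."""
    rank = {x: r for r, x in enumerate(sorted(xs))}  # assumes distinct elements
    return max((abs(i - rank[x]) for i, x in enumerate(xs)), default=0)


def runs(xs):
    """Runs: number of descents (xs[i] > xs[i + 1]) plus one."""
    return sum(1 for i in range(len(xs) - 1) if xs[i] > xs[i + 1]) + 1
```

For the reversed sequence ⟨7, 6, ..., 1⟩ these functions give Osc = 0, Inv = 21, Max = 6 and Runs = 7, which shows that the measures can disagree sharply about how disordered one and the same input is.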

Optimality

measure   asymptotically optimal   constant-factor-optimal    reference
          (running time)           (# element comparisons)
Osc       O(n lg(Osc/n))           ≤ n lg(Osc/n) + O(n)       [Levcopoulos & Petersson 1993]
Inv       O(n lg(Inv/n))           ≤ n lg(Inv/n) + O(n)       [Guibas, McCreight, Plass & Roberts 1977]
Max       O(n lg(Max))             ≤ n lg(Max) + O(n)         [Estivill-Castro & Wood 1989]
Runs      O(n lg(Runs))            ≤ n lg(Runs) + O(n)        [Mannila 1985]

Natural mergesort is an example of an adaptive sorting algorithm that is constant-factor-optimal with respect to Runs. [Knuth 1973, Section 5.2.4]
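
To make the natural mergesort remark concrete, here is a minimal sketch of the idea: split the input into maximal ascending runs, then merge neighbouring runs round by round. It is an illustration written for these notes, not the formulation in Knuth's book; the names `natural_mergesort` and `merge` are assumptions. Run detection costs fewer than n comparisons and each of the roughly lg(Runs) merge rounds costs fewer than n, which is where the dependence on Runs comes from.

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list (stable)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if b[j] < a[i]:
            out.append(b[j])
            j += 1
        else:
            out.append(a[i])
            i += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def natural_mergesort(xs):
    """Sort by merging the maximal non-decreasing runs already present in xs."""
    if not xs:
        return []
    # Pass 1: cut the input into maximal non-decreasing runs.
    runs = [[xs[0]]]
    for x in xs[1:]:
        if x >= runs[-1][-1]:
            runs[-1].append(x)
        else:
            runs.append([x])
    # Pass 2: merge neighbouring runs pairwise until a single run remains.
    while len(runs) > 1:
        merged = [merge(runs[i], runs[i + 1]) for i in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:
            merged.append(runs[-1])  # odd run out is carried to the next round
        runs = merged
    return runs[0]
```

On an already sorted input the first pass finds a single run and no merging happens at all.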

Local insertionsort

input: sequence ⟨x_1, x_2, ..., x_n⟩ of n elements

Construct an empty finger tree F
hint ← 0
for i ∈ {1, 2, ..., n}
    hint ← F.insert(x_i, hint)
for j ∈ {1, 2, ..., n}
    x_j ← F.extract-min()

Idea: Jump over only a few elements in insert; the cost of insert is O(lg ∆), where ∆ is the jump distance. [Guibas, McCreight, Plass & Roberts 1977]
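
The finger-search idea can be tried out without a finger tree. The sketch below is an assumption-laden stand-in, not the data structure from the slide: a sorted Python list replaces the finger tree, `hint` is the index of the previous insertion, and `finger_search` (a hypothetical helper) gallops outward from the hint before finishing with a binary search. This keeps the number of element comparisons per insert at O(lg ∆), but `list.insert` still moves elements, so the running time bound of the real algorithm is not reproduced.

```python
from bisect import bisect_right


def finger_search(a, x, hint):
    """Return the insertion index for x in sorted list a, galloping out from hint."""
    n = len(a)
    if n == 0:
        return 0
    hint = min(hint, n - 1)
    if a[hint] <= x:
        # Gallop to the right until an element greater than x (or the end) is found.
        lo, hi, step = hint, hint + 1, 1
        while hi < n and a[hi] <= x:
            lo = hi
            step *= 2
            hi = min(hi + step, n)
        return bisect_right(a, x, lo, hi)
    # a[hint] > x: gallop to the left until an element <= x (or the start) is found.
    hi, lo, step = hint, hint - 1, 1
    while lo >= 0 and a[lo] > x:
        hi = lo
        step *= 2
        lo = max(lo - step, -1)
    return bisect_right(a, x, lo + 1, hi)


def local_insertionsort(xs):
    """Insert each element near the previous insertion point (list-based stand-in)."""
    sorted_part = []
    hint = 0  # index at which the previous element was inserted
    for x in xs:
        pos = finger_search(sorted_part, x, hint)
        sorted_part.insert(pos, x)
        hint = pos
    # With a sorted list, the extract-min loop of the slide is just a left-to-right read.
    return sorted_part
```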

2. Is local insertionsort easy to implement?

☐ Yes   ☐ No

3. Is local insertionsort optimal?

☐ Yes   ☐ No

4. Is local insertionsort fast in practice?

☐ Yes   ☐ No

Problem

Theory: Local insertionsort is asymptotically optimal with respect to Osc, Inv, Max, and Runs. [Guibas, McCreight, Plass & Roberts 1977] [Mannila 1985] [Katajainen, Levcopoulos & Petersson 1989]

Practice: Only a few publicly available implementations of finger trees exist: one in the Haskell core libraries, one in OCaml, and a C# implementation published in 2008. [http://en.wikipedia.org/wiki/Finger_tree]

Adaptive heapsort

input: sequence ⟨x_1, x_2, ..., x_n⟩ of n elements

Construct an empty Cartesian tree C
hint ← 0
for i ∈ {1, 2, ..., n}
    hint ← C.insert(x_i, hint)
Construct an empty priority queue Q
Q.insert(C.minimum())
for j ∈ {1, 2, ..., n}
    x_j ← Q.extract-min()
    Let Y be the set of children x_j has in C
    for each y ∈ Y
        Q.insert(y)

[Figure: a Cartesian tree whose root is the minimum x_i, with the left subtree built from x_1..x_{i−1} and the right subtree built from x_{i+1}..x_n]

Idea: Keep Q small. [Levcopoulos & Petersson 1993]
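
The pseudocode above can be turned into a short runnable sketch under a few assumptions. The Cartesian tree is built with the usual stack of nodes on the rightmost path (the top of the stack plays the role of `hint`), and Python's `heapq` binary heap stands in for the priority queue Q, so this version does not attain β = 1; the weak heap and weak queue of the talk are not reproduced here. The function name `adaptive_heapsort` and the array-based tree layout are illustrative choices.

```python
import heapq


def adaptive_heapsort(xs):
    """Sort xs via a Cartesian tree plus a priority queue of candidate minima."""
    n = len(xs)
    if n == 0:
        return []
    left = [-1] * n   # left[i]: index of the left child of node i, or -1
    right = [-1] * n  # right[i]: index of the right child of node i, or -1
    stack = []        # indices on the rightmost path, root at the bottom
    for i in range(n):
        last = -1
        # Pop every node on the rightmost path that is larger than xs[i].
        while stack and xs[stack[-1]] > xs[i]:
            last = stack.pop()
        left[i] = last             # the popped chain hangs below the new node
        if stack:
            right[stack[-1]] = i   # the new node is the new rightmost child
        stack.append(i)
    root = stack[0]

    out = []
    queue = [(xs[root], root)]     # Q holds only nodes whose parent is already output
    while queue:
        value, i = heapq.heappop(queue)
        out.append(value)
        for child in (left[i], right[i]):
            if child != -1:
                heapq.heappush(queue, (xs[child], child))
    return out
```

On a sorted input the Cartesian tree degenerates to a single right path, so the queue never holds more than one element; this is the effect the slide summarizes as keeping Q small.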

Theoretical race

For priority queue Q, the number of element comparisons performed is bounded by βn lg(Osc/n) + O(n).

Q                                      β     reference
binary heap                            3     [Levcopoulos & Petersson 1993]
  with combined extract-min + insert   2.5
binomial queue                         2     [folklore]
weak heap                              2     [folklore]
  with combined extract-min + insert   1.5
multipartite priority queue            1     [Elmasry, Jensen & Katajainen 2008]

Goal: Achieve constant-factor optimality, i.e. β = 1, while also ensuring practicality!

Our contributions

Weak heap: insert: O(1) amortized time; extract-min: O(lg n) worst-case time including at most lg n + O(1) element comparisons

Weak queue: insert: O(1) amortized time; extract-min: O(lg n) worst-case time including at most lg n + O(1) element comparisons

Adaptive heapsort: constant-factor-optimal with respect to Osc, Inv, Max, and Runs

Idea: Temporarily store the inserted elements in a buffer and, once it is full, move its elements to the main structure using an efficient bulk-insertion procedure.

Array-based solution: Weak heap

n: # elements
k: # elements in the buffer

[a_i | i ∈ [0..n−1]], a_i element
[r_i | i ∈ [0..n−k−1]], r_i ∈ {0, 1}
[b_i | i ∈ [0..k−1]] ≡ [a_i | i ∈ [n−k..n−1]]
min heap ≡ a_0, if n > 0
min buffer ≡ b_0, if k > 0

• Root a_0 has no left child
• Leaves at the last two levels
• Parent of a_i: a_⌊i/2⌋
• Left child of a_i: a_{2i + r_i}
• Right child of a_i: a_{2i + 1 − r_i}
• Weak-heap order: a_i ≯ a_j for every a_j in the right subtree of a_i

[Figure: an example weak heap with array indices 0..12, drawn both as a tree and as an array, with the min-heap and min-buffer positions marked]
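
The index arithmetic above is easiest to follow in code. The following sketch implements a plain array-based weak heap with reverse bits r_i and uses it to sort; it deliberately leaves out the buffer and the bulk insertion that are the contribution of the talk, so it illustrates only the layout and the weak-heap order. The names `weak_heapsort`, `join` and `distinguished_ancestor` are illustrative, and the construction and extract-min steps follow the standard textbook weak-heap procedure rather than the authors' code.

```python
def weak_heapsort(xs):
    """Sort with an array-based weak heap (no buffer, no bulk insertion)."""
    a = list(xs)
    n = len(a)
    r = [0] * n  # reverse bits: left child of i is 2i + r[i], right is 2i + 1 - r[i]

    def distinguished_ancestor(j):
        # Climb while j is a left child; the parent reached next is the ancestor
        # whose right subtree contains j.
        while (j & 1) == r[j >> 1]:
            j >>= 1
        return j >> 1

    def join(i, j):
        # One comparison: make a[i] no larger than a[j], where i is the
        # distinguished ancestor of j; flipping r[j] swaps j's two subtrees.
        if a[j] < a[i]:
            a[i], a[j] = a[j], a[i]
            r[j] ^= 1

    # Construction: n - 1 comparisons, processing nodes from last to first.
    for j in range(n - 1, 0, -1):
        join(distinguished_ancestor(j), j)

    out = []
    for m in range(n - 1, 0, -1):
        out.append(a[0])   # the root holds the minimum
        a[0] = a[m]        # move the last element to the root; heap is now a[0..m-1]
        if m > 1:
            # Walk down the left spine below node 1, then join bottom-up with the root.
            x = 1
            while 2 * x + r[x] < m:
                x = 2 * x + r[x]
            while x > 0:
                join(0, x)
                x >>= 1
    if n:
        out.append(a[0])
    return out
```

Only the nodes on the left spine below node 1 have the root as their distinguished ancestor, which is why each extraction needs about lg n comparisons.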

Bulk insertion in a weak heap

Algorithmic ideas
• Make the buffer part of the heap when k = ⌈lg n⌉.
• Fix the heap bottom up, level by level.
  I) Find the distinguished ancestors for the levels with more than two nodes.
  II) Traverse the two remaining paths to the root.

Analysis
• The number of nodes involved is at most 2k + 2⌈lg n⌉.
• One element comparison per node.
• Hence at most 4 comparisons per element.
  I) At most (2k + o(k))/2^j of the nodes need j ancestor checks, where j ≥ 1.
  II) On the two paths, at most 2⌈lg n⌉ ancestor checks.
• Hence the amortized cost per element is O(1).

[Figure: nodes at level i up to position ℓ and their parents at level i − 1 up to position ⌊ℓ/2⌋ + 1]

Pointer-based solution: Weak queue

n: # elements
k: # elements in the buffer

Buffer: a singly-linked list
Prefix-minimum pointers: roots
Perfect weak heaps: left-child and right-child pointers for each node

insert: mimics an increment for a binary counter
extract-min: borrow-based

[Figure: an example weak queue holding 11 = 1011₂ elements as a collection of perfect weak heaps, with the min-queue and min-buffer pointers marked]

Bulk insertion in a weak queue

Algorithmic ideas
• Flush the buffer out of its elements when k = ⌈lg n⌉.
• Perform normal inserts without updating the prefix-minimum pointers.
• Update the prefix-minimum pointers once.

Analysis
• Give 1 € for each root.
• Give 1 € for each insert.
• Money at the roots pays the linkings of two heaps of the same size; money that is not used for linkings pays the pointer updates.
• Hence at most 2 comparisons per element.

5. Is adaptive heapsort practical?

☐ Yes   ☐ No