Weak Heaps and Friends: Recent Developments

  1. Weak Heaps and Friends: Recent Developments
     Stefan Edelkamp (1), Amr Elmasry (2), Jyrki Katajainen (3, 4), Armin Weiß (5)
     1) University of Bremen 2) Alexandria University 3) University of Copenhagen 4) Jyrki Katajainen and Company 5) University of Stuttgart
     These are Stefan's slides for his invited talk at IWOCA 2013.
     © Performance Engineering Laboratory. IWOCA 2013: Rouen, France

  2. (Complete) Weak Heap

  3. Array-Based Representation
     Array a of elements (indices 0 to 10): 8, 12, 27, 10, 47, 49, 53, 46, 75, 80, 26
     Array r of bits (1's in cyan in the figure)
     For the node at index i holding a_i: parent at ⌊i/2⌋, left child at 2i + r_i, right child at 2i + 1 − r_i
     [Figure: the weak heap drawn as a tree with the array indices next to the nodes.]
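The index rules on this slide can be tried out in a few lines. What follows is a minimal sketch, not the authors' code: it assumes a min-ordered weak heap, the usual "distinguished ancestor" definition (the parent if the node is a right child, otherwise the distinguished ancestor of the parent), and the standard right-to-left construction by joins. The names `d_ancestor`, `construct`, and `is_weak_heap` are chosen here for illustration.

```python
def d_ancestor(j, r):
    """Index of the distinguished ancestor of node j (requires j >= 1)."""
    # With children of i at 2i + r[i] (left) and 2i + 1 - r[i] (right),
    # j is a left child exactly when (j & 1) == r[j >> 1]. Climb past
    # left-child links; the first right-child link ends the walk.
    while (j & 1) == r[j >> 1]:
        j >>= 1
    return j >> 1


def construct(values):
    """Build a weak heap; returns (a, r, number of element comparisons)."""
    a = list(values)
    r = [0] * len(a)          # r[0] stays 0: the root has only a right child
    comparisons = 0
    for j in range(len(a) - 1, 0, -1):
        i = d_ancestor(j, r)
        comparisons += 1      # exactly one comparison per join
        if a[j] < a[i]:
            a[i], a[j] = a[j], a[i]
            r[j] ^= 1         # swap the roles of j's two subtrees
    return a, r, comparisons


def is_weak_heap(a, r):
    """Every element is at least as large as its distinguished ancestor."""
    return all(a[d_ancestor(j, r)] <= a[j] for j in range(1, len(a)))
```

Calling `construct` on the eleven elements of the slide, in any order, performs one comparison per join and leaves the minimum at index 0; `is_weak_heap` then checks the ordering through the same index arithmetic.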

  4. Why Weak Heaps?
     Worst-case number of element comparisons:

     Data structure               construct   minimum   insert    extract-min
     binary heap [Flo64,Wil64]    2n          0         ⌈lg n⌉    2⌈lg n⌉
     weak heap [Dut93]            n − 1       0         ⌈lg n⌉    ⌈lg n⌉

     Repeated insertions [IWOCA-12, JDA-13]. Operation sequence: insert^n
     [Plot: number of element comparisons per n (0 to 22) against n (logarithmic scale, 10^3 to 10^7) for binary heap, weak heap, and weak queue.]

  5. Structure of the Talk: Research Questions
     What is the "best" heap-construction algorithm?
     What is the "best" sorting algorithm?
     What is the "best" priority queue?

  6. What is the best in-place heap-construction algorithm?
     Best ∼ in terms of element comparisons and practical running time
     In-place ∼ Θ(1) extra words

  7. Some Options

                                       Element comparisons
     Inventor            Abbreviation  Worst      Average    Extra space
     Floyd               alg. F        2n         ∼1.88n     Θ(1) words
     Gonnet & Munro      alg. GM       ∼1.625n    ∼1.625n    Θ(n) words
     McDiarmid & Reed    alg. MR       2n         ∼1.52n     Θ(n) bits
     Li & Reed           lower bound   ∼1.37n     ∼1.37n     Ω(1) words

     Average-case results assume that the input is a random permutation of n distinct elements.

  8. Building Binary Heaps
     Weak heap: lower bound n − 1 (element comparisons)
     Weak heap → binary heap: ∼0.625n [IWOCA-12, MFCS-12]
       → 1.625n heap construction, n bits (worst case)
     Bottom trees:
       → 1.625n in-place heap construction (worst case)
       [→ 1.52n in-place heap construction (average case)]

  9. Weak Heap → Binary Heap
     GM: Build a binary heap in two phases:
       1) Construct a heap-ordered binomial tree
       2) Convert this tree into a binary heap
     Alternative: a complete weak heap → number of element comparisons
       C(8) = 1, C(2^k) = 2·C(2^(k−1)) + k − 1
     For n = 2^k ≥ 8, the solution of this relation is C(n) = 5/8 · n − lg n − 1
     Alternative: a navigation pile → fewer element moves
       elements (indices 0 to 7): 10, 75, 26, 8, 46, 12, 80, 75
       navigation bits: 011 | 1101 | 0111
     [Figure: tournament tree with heights 0 to 3 marked.]
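The closed form for the conversion cost can be checked against the recurrence directly. The snippet below is only a sanity check written for this transcript; the function names are illustrative.

```python
def conversion_cost(n):
    """C(n) from the recurrence C(8) = 1, C(2^k) = 2*C(2^(k-1)) + k - 1."""
    if n == 8:
        return 1
    k = n.bit_length() - 1            # n = 2^k
    return 2 * conversion_cost(n // 2) + k - 1


def closed_form(n):
    """C(n) = 5/8 * n - lg n - 1 for n = 2^k >= 8."""
    k = n.bit_length() - 1
    return 5 * n // 8 - k - 1         # 5n/8 is an integer for n >= 8


def recurrence_matches(max_k=20):
    """True if both expressions agree for n = 2^3, ..., 2^max_k."""
    return all(conversion_cost(1 << k) == closed_form(1 << k)
               for k in range(3, max_k + 1))
```

For example, both give C(16) = 2·1 + 3 = 5 and C(32) = 2·5 + 4 = 14. Adding the n − 1 comparisons of the weak-heap construction gives the ∼1.625n total quoted for GM.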

  10. Bottom-Tree Conversion
      Bottom trees: all complete binary trees of size m = 2^(⌊lg lg n⌋+1) − 1
      1. Convert all bottom trees to bottom heaps
      2. Ensure heap order at upper levels by using Floyd's sift-down procedure
      3. Optimize element moves by handling binary micro trees of size 7 differently
      • Elements involved in all bottom-heap constructions ≤ n → 1.625n element comparisons
      • At most ⌈n/2^(h+1)⌉ nodes of height h → o(n) element comparisons at the levels above the bottom trees
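The o(n) claim for the upper levels can be illustrated numerically. This is a rough sketch under two assumptions that are not spelled out on the slide: a bottom tree of size 2^(⌊lg lg n⌋+1) − 1 has height t = ⌊lg lg n⌋, and a sift-down started at height h costs at most 2h element comparisons. The function name is hypothetical.

```python
import math


def upper_levels_per_element(n):
    """Bound on sift-down comparisons above the bottom trees, divided by n."""
    t = math.floor(math.log2(math.log2(n)))   # assumed bottom-tree height
    max_height = n.bit_length() - 1           # floor(lg n)
    total = 0
    for h in range(t + 1, max_height + 1):
        nodes = -(-n // 2 ** (h + 1))         # ceil(n / 2^(h+1)) nodes of height h
        total += nodes * 2 * h                # at most 2h comparisons per sift-down
    return total / n
```

The sum is dominated by its first term, so the bound behaves like n · (t + 2) / 2^t, which is O(n lg lg n / lg n). Evaluating the function for n = 2^10, 2^20, and 2^40 shows the per-element value shrinking, though slowly.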

  11. Experimental Setup and Summary
      Random permutations of n distinct int(eger)s for different (small, medium, large, and very large) problem sizes
      Programs tuned to construct binary heaps of size 2^k − 1
      • GM showed acceptable practical performance
      • the number of element comparisons and element moves was larger for in-situ GM than for in-situ MR
      • in-situ GM was faster than in-situ MR
      • but both were beaten by F and its BKS variant

  12. Element Comparisons

      n           std     F       BKS     in-situ GM   in-situ MR
      2^10 − 1    1.64    1.86    1.86    1.74         1.52
      2^15 − 1    1.64    1.88    1.88    1.65         1.54
      2^20 − 1    1.64    1.88    1.88    1.63         1.53
      2^25 − 1    1.65    1.88    1.88    1.63         1.53

      std: Bottom-up heap construction (make_heap, Floyd, Wegener)
      BKS: Improved version of Floyd's algorithm (Bojesen et al. [JEA-00])

  13. Execution Times

      n           std     F       BKS     in-situ GM   in-situ MR
      2^10 − 1    22.3    14.6    17.1    21.3         26.2
      2^15 − 1    22.2    14.6    17.4    23.0         24.4
      2^20 − 1    29.3    21.9    17.8    22.9         23.6
      2^25 − 1    29.8    21.7    17.5    22.9         23.6

      std: Bottom-up heap construction (make_heap, Floyd, Wegener)
      BKS: Improved version of Floyd's algorithm (Bojesen et al. [JEA-00])

  14. What is the best constant-factor-optimal in-situ/adaptive sorting algorithm?
      Best ∼ in terms of element comparisons and practical running time
      In-situ ∼ Θ(lg n) extra words
      Adaptive ∼ with respect to inversions

  15. Sequential Sorting
      Lower bound: lg n! = n lg n − n/ln 2 + O(lg n), where 1/ln 2 ≈ 1.4426
      • Worst case: n lg n + 0.1n [Dutton 1993, BIT]
      • Best case/index sorting: n lg n − 0.9n [STACS-00, JEA-02]
      • QuickWeakHeapsort: n lg n + 0.2n on average, in-place [JEA-02]
      • Optimal adaptive sorting: n lg(Inv(n)/n) + O(n) worst case, two options [IWOCA-11, JDA-12]
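The lower-bound formula is easy to check numerically. The sketch below uses Python's `math.lgamma` to evaluate lg n! and compares it with the two leading terms; by Stirling's approximation the remaining O(lg n) term is close to ½ lg(2πn). The function names are illustrative.

```python
import math


def lg_factorial(n):
    """lg(n!) computed via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)


def leading_terms(n):
    """n lg n - n / ln 2, the two leading terms of the lower bound."""
    return n * math.log2(n) - n / math.log(2)


def remainder(n):
    """What is left of lg(n!) after the leading terms; grows like O(lg n)."""
    return lg_factorial(n) - leading_terms(n)
```

For n = 2^20 the remainder comes to roughly eleven comparisons in total, which is negligible next to the linear term −1.44n that the following slides use as the yardstick.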

  16. Constant-Factor-Optimal Algorithms
      Worst, Average, Observed: coefficient c when the number of element comparisons is written as n lg n + c·n

      Algorithm                 Space      Time         Worst    Average   Observed
      Lower bound               O(1)       Ω(n lg n)    -1.44    -1.44
      BUHeapsort [Weg93]        O(1)       O(n lg n)    ω(1)     –         [0.35,0.39]
      WeakHeapsort [Dut93]      O(n/w)     O(n lg n)    0.09     –         [-0.46,-0.42]
      RWeakHeapsort [ES02]      O(n)       O(n lg n)    -0.91    -0.91     -0.91
      Mergesort [Knu73]         O(n)       O(n lg n)    -0.91    -1.26     –
      EWeakHeapsort             O(n)       O(n lg n)    -0.91    -1.26     –
      Insertionsort [Knu73]     O(1)       O(n^2)       -0.91    -1.38     –
      MergeInsertion [Knu73]    O(n)       O(n^2)       -1.32    -1.3999   [-1.43,-1.41]
      InPlaceMergesort [R92]    O(1)       O(n lg n)    -1.32    –         –
      QuickHeapsort [DW13]      O(1)       O(n lg n)    ω(1)     -0.03     ≈ 0.20
                                O(n/w)     O(n lg n)    ω(1)     -0.99     ≈ -1.24
      QuickMergesort (IS)       O(lg n)    O(n lg n)    -0.32    -1.38     –
      QuickMergesort            O(1)       O(n lg n)    -0.32    -1.26     [-1.29,-1.27]
      QuickMergesort (MI)       O(lg n)    O(n lg n)    -0.32    -1.3999   [-1.41,-1.40]

  17. Idea of QuickXsort
      As in Quicksort, the array is partitioned into the elements greater than and less than some pivot element.
      Then one part of the array is sorted by some algorithm X and the other part is sorted recursively.
      The advantage of this procedure is that, if X is a black box, the part of the array that is not currently being sorted may be used as temporary space in QuickXsort, which yields an in-situ variant of X.
      By taking a sample of Θ(√n) elements when selecting the pivot, QuickXsort performs, on average, the same number of element comparisons as X up to an o(n) lower-order term.
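To make the idea concrete, here is a small sketch with X = Mergesort. It is an illustration under simplifying assumptions, not the algorithm evaluated in the talk: the pivot is a median of three rather than a median of a Θ(√n) sample, the smaller part is always the one handed to X (which keeps the buffer argument trivial but is not the comparison-optimal rule), and a cutoff of 16 with insertion sort finishes small ranges. All function names are hypothetical.

```python
def _insertion_sort(a, lo, hi):
    for i in range(lo + 1, hi):
        x, j = a[i], i
        while j > lo and x < a[j - 1]:
            a[j] = a[j - 1]
            j -= 1
        a[j] = x


def _merge_sort(a, lo, hi, buf):
    """Sort a[lo:hi] using a[buf : buf + (hi - lo) // 2] as scratch space.

    Only swaps touch the scratch area, so its elements are permuted
    but never lost.
    """
    if hi - lo < 2:
        return
    mid = (lo + hi) // 2
    _merge_sort(a, lo, mid, buf)
    _merge_sort(a, mid, hi, buf)
    n1 = mid - lo
    for t in range(n1):                      # move the left run into the buffer
        a[lo + t], a[buf + t] = a[buf + t], a[lo + t]
    i, j, k = buf, mid, lo
    while i < buf + n1 and j < hi:           # merge back by swapping
        if a[j] < a[i]:
            a[k], a[j] = a[j], a[k]
            j += 1
        else:
            a[k], a[i] = a[i], a[k]
            i += 1
        k += 1
    while i < buf + n1:                      # rest of the left run
        a[k], a[i] = a[i], a[k]
        i += 1
        k += 1


def quick_merge_sort(a):
    """Sort the list a in place; no auxiliary array is allocated."""
    lo, hi = 0, len(a)
    while hi - lo > 16:
        mid = (lo + hi) // 2
        p = sorted((lo, mid, hi - 1), key=lambda t: a[t])[1]   # median of three
        a[p], a[hi - 1] = a[hi - 1], a[p]
        pivot, s = a[hi - 1], lo
        for t in range(lo, hi - 1):          # Lomuto partition
            if a[t] < pivot:
                a[s], a[t] = a[t], a[s]
                s += 1
        a[s], a[hi - 1] = a[hi - 1], a[s]    # pivot is now in its final position s
        if s - lo <= hi - s - 1:
            _merge_sort(a, lo, s, s + 1)     # X sorts the left part, buffer on the right
            lo = s + 1
        else:
            _merge_sort(a, s + 1, hi, lo)    # X sorts the right part, buffer on the left
            hi = s
    _insertion_sort(a, lo, hi)
```

The part used as buffer is still unsorted when X runs, so scrambling it costs nothing; it is sorted afterwards by the next round of the loop.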

  18. Results for Small Datasets
      [Plot: Small-Scale Comparison Experiment. Number of element comparisons − n log n, per n, against n (logarithmic scale, 2^10 to 2^16) for Lower Bound, Insertionsort, Merge Insertion Improved, and Merge Insertion.]
      [Plot: Small-Scale Runtime Experiment. Execution time per (#elements)^2 [µs] against n (logarithmic scale, 2^10 to 2^16) for Insertionsort, Merge Insertion Improved, and Merge Insertion.]

  19. Results for Large Datasets
      [Plot: Large-Scale Comparison Experiment. Number of element comparisons − n log n, per n, against n (logarithmic scale, 2^10 to 2^22) for Quicksort Median Sqrt, STL Introsort (out of range), STL Mergesort, QuickMergesort (MI) Median Sqrt, QuickMergesort Median 3, QuickMergesort Median Sqrt, QuickWeakHeapsort Median Sqrt, and Lower Bound.]
      [Plot: Large-Scale Runtime Experiment. Execution time per element [µs] against n (logarithmic scale, 2^12 to 2^22) for Quicksort Median Sqrt, STL Introsort, STL Mergesort, QuickMergesort (MI) Median Sqrt, QuickMergesort Median 3, QuickMergesort Median Sqrt, and QuickWeakHeapsort Median Sqrt.]
