Complexity of the Adaptive ShiversSort Algorithm and of its sibling TimSort


  1. Complexity of the Adaptive ShiversSort Algorithm and of its sibling TimSort Vincent Jugé LIGM – Université Paris-Est Marne-la-Vallée, ESIEE, ENPC & CNRS 18/03/2019 V. Jugé Complexity of the Adaptive ShiversSort Algorithm

  2. Contents: (1) Efficient Merge Sorts, (2) TimSort, (3) Adaptive ShiversSort.

  3. Sorting data. Input: 0 1 4 3 1 5 4 3 2 2 0 2. Output: 0 0 1 1 2 2 2 3 3 4 4 5.

  4. Sorting data – in a stable manner. Equal values keep their input order; subscripts number the occurrences of each value. Input: $0_1\ 1_1\ 4_1\ 3_1\ 1_2\ 5_1\ 4_2\ 3_2\ 2_1\ 2_2\ 0_2\ 2_3$. Output: $0_1\ 0_2\ 1_1\ 1_2\ 2_1\ 2_2\ 2_3\ 3_1\ 3_2\ 4_1\ 4_2\ 5_1$.

  5. Sorting data – in a stable manner (continued). MergeSort has a worst-case time complexity of $O(n \log n)$. Can we do better?

  6. Sorting data – in a stable manner (continued). No! Proof: there are $n!$ possible reorderings, and each element comparison gives one bit of information. Thus $\log_2(n!) \sim n \log_2(n)$ comparisons are required in the worst case.

  7. Sorting data – in a stable manner (continued). The slide is stamped "END OF TALK!".
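The counting argument above can be checked numerically. The following is a minimal sketch, not taken from the talk: the function names `min_comparisons` and `lower_bound_ratio` are hypothetical, and the code only evaluates $\lceil \log_2(n!) \rceil$ and its ratio to $n \log_2(n)$.

```python
import math

def min_comparisons(n: int) -> int:
    """Smallest c such that 2**c >= n!, i.e. ceil(log2(n!)).

    With c comparisons a sort can distinguish at most 2**c orderings,
    so at least this many comparisons are needed in the worst case.
    """
    # For any integer x >= 1, ceil(log2(x)) equals (x - 1).bit_length().
    return (math.factorial(n) - 1).bit_length()

def lower_bound_ratio(n: int) -> float:
    """Ratio log2(n!) / (n * log2(n)), which tends to 1 as n grows."""
    log2_factorial = math.lgamma(n + 1) / math.log(2)
    return log2_factorial / (n * math.log2(n))

if __name__ == "__main__":
    print(min_comparisons(12))
    for n in (10, 1_000, 1_000_000):
        print(n, lower_bound_ratio(n))
```

The ratio approaches 1 slowly, because $\log_2(n!)$ differs from $n \log_2(n)$ by a term of order $n$.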

  8. Can we never do better? In some cases, we should: on the already sorted input 0 1 2 3 4 5 6 7 8 9 10 11, the output 0 1 2 3 4 5 6 7 8 9 10 11 is the input itself.
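One way to observe this effect is to count the comparisons made by a run-aware sort on sorted and on shuffled data. The sketch below assumes Python's built-in `sorted`, whose CPython implementation is derived from TimSort. The wrapper class `Counted` and the helper `count_comparisons` are hypothetical names, and the exact counts depend on the interpreter version.

```python
import random

class Counted:
    """Wraps a value and counts every `<` comparison performed on it."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.calls += 1
        return self.value < other.value

def count_comparisons(values) -> int:
    """Number of `<` comparisons made by sorted() on the given values."""
    Counted.calls = 0
    sorted(Counted(v) for v in values)
    return Counted.calls

if __name__ == "__main__":
    n = 1000
    already_sorted = list(range(n))
    shuffled = already_sorted[:]
    random.Random(0).shuffle(shuffled)
    print("sorted input:  ", count_comparisons(already_sorted))
    print("shuffled input:", count_comparisons(shuffled))
```

On a sorted input the whole array is a single run, so a linear number of comparisons suffices, whereas the shuffled input needs on the order of $n \log_2(n)$.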

  9. Let us do better! Example array: 0 1 1 4 3 0 2 2 3 4 5 2. Step 1: chunk your data into non-decreasing runs.

  10. Let us do better! (continued). The example array splits into 4 runs of lengths 4, 1, 6 and 1: (0 1 1 4), (3), (0 2 2 3 4 5) and (2). Step 2: new parameters: the number of runs $\rho$ and their lengths $r_1, \ldots, r_\rho$.

  11. Let us do better! (continued). A further parameter is the run-length entropy $H = \sum_{i=1}^{\rho} (r_i / n) \log_2(n / r_i)$, which satisfies $H \le \log_2(\rho) \le \log_2(n)$.

  12. Let us do better! (continued). Theorem [2,5]: some stable algorithm has a worst-case time complexity of $O(n + nH)$.

  13. Let us do better! (continued). Theorem [2,5], with the algorithm named: TimSort has a worst-case time complexity of $O(n + nH)$.

  14. Let us do better! (continued). We cannot do better than $\Omega(n + nH)$! [2] Reading the whole input requires time $\Omega(n)$. There are $X$ possible reorderings, with $X \ge 2^{1-\rho} \binom{n}{r_1 \ldots r_\rho} \ge 2^{nH/2}$, so at least $\log_2(X) \ge nH/2$ comparisons are needed.
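The run decomposition and the entropy $H$ can be computed directly on the example array. This is a minimal sketch with hypothetical helper names (`run_lengths`, `run_length_entropy`, `multinomial`). It only uses non-decreasing runs, as on the slides, and it checks the counting inequality on this single instance rather than proving it.

```python
import math

def run_lengths(a):
    """Lengths of the maximal non-decreasing runs of a, from left to right."""
    if not a:
        return []
    lengths = [1]
    for prev, cur in zip(a, a[1:]):
        if cur >= prev:
            lengths[-1] += 1   # the current run continues
        else:
            lengths.append(1)  # a descent starts a new run
    return lengths

def run_length_entropy(lengths):
    """H = sum over runs of (r_i / n) * log2(n / r_i), with n = sum of r_i."""
    n = sum(lengths)
    return sum((r / n) * math.log2(n / r) for r in lengths)

def multinomial(lengths):
    """Multinomial coefficient n! / (r_1! * ... * r_rho!)."""
    result, total = 1, 0
    for r in lengths:
        total += r
        result *= math.comb(total, r)
    return result

if __name__ == "__main__":
    a = [0, 1, 1, 4, 3, 0, 2, 2, 3, 4, 5, 2]
    r = run_lengths(a)
    n, rho, h = sum(r), len(r), run_length_entropy(r)
    print("run lengths:", r)
    print("H =", h, " log2(rho) =", math.log2(rho), " log2(n) =", math.log2(n))
    # Instance check of 2^(1 - rho) * multinomial >= 2^(n H / 2).
    print(2 ** (1 - rho) * multinomial(r), ">=", 2 ** (n * h / 2))
```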

  15. Contents: (1) Efficient Merge Sorts, (2) TimSort, (3) Adaptive ShiversSort.

  16. A brief history of TimSort. (Figure: a timeline running from 2001 to 2019, on which the milestones listed on the following slides are marked.)

  17. A brief history of TimSort (continued). Milestone 1: invented by Tim Peters [1].

  18. A brief history of TimSort (continued). Milestone 2: standard algorithm in Python.

  19. A brief history of TimSort (continued). Milestone 3: standard algorithm for non-primitive arrays in Android, Java and Octave.

  20. A brief history of TimSort (continued). Milestone 4: first worst-case complexity analysis [4]: TimSort works in time $O(n \log n)$.

  21. A brief history of TimSort (continued). Milestone 5: refined worst-case analysis [5]: TimSort works in time $O(n + nH)$.

  22. A brief history of TimSort (continued). Bugs uncovered in the Python and Java implementations [3,5].

  23. The principles of TimSort and of adaptive ShiversSort (1/2). Algorithm based on merging adjacent runs. Example: the adjacent runs (0 1 1 4) and (3) merge into the run (0 1 1 3 4).

  24. The principles of TimSort and of adaptive ShiversSort (1/2, continued). Let $k$ and $\ell$ be the lengths of the two runs being merged. (1) Run merging algorithm: standard, plus many optimizations. It uses time $O(k + \ell)$ and memory $O(\min(k, \ell))$. Merge cost: $k + \ell$.

  25. The principles of TimSort and of adaptive ShiversSort (1/2, continued). Runs are represented by their lengths alone: (0 1 1 4)(3) ≡ 4 1 and (0 1 1 3 4) ≡ 5. (2) Policy for choosing runs to merge: depends on run lengths only.

  26. The principles of TimSort and of adaptive ShiversSort (1/2, continued). (3) Complexity analysis: evaluate the total merge cost; forget array values and only work with run lengths.
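The merge step described above can be sketched as follows. This is a plain stable merge without the "many optimizations" mentioned on the slide, and it is an illustration rather than the routine used by any actual TimSort implementation. The function name `merge_adjacent` and its index convention are assumptions. It buffers only the shorter run, which gives the $O(\min(k, \ell))$ memory bound, and returns the merge cost $k + \ell$.

```python
def merge_adjacent(a, lo, mid, hi):
    """Stably merge the sorted runs a[lo:mid] and a[mid:hi] in place.

    Only the shorter run is copied to a buffer. Returns the merge cost k + l.
    """
    k, l = mid - lo, hi - mid
    if k <= l:
        tmp = a[lo:mid]                # buffer of size k (left run)
        i, j, out = 0, mid, lo
        while i < k and j < hi:
            if a[j] < tmp[i]:          # strict test: ties go to the left run
                a[out] = a[j]
                j += 1
            else:
                a[out] = tmp[i]
                i += 1
            out += 1
        a[out:out + (k - i)] = tmp[i:] # leftover buffered elements
    else:
        tmp = a[mid:hi]                # buffer of size l (right run)
        i, j, out = mid - 1, l - 1, hi - 1
        while i >= lo and j >= 0:      # merge from the right end
            if tmp[j] < a[i]:          # strict test keeps equal left items first
                a[out] = a[i]
                i -= 1
            else:
                a[out] = tmp[j]
                j -= 1
            out -= 1
        a[lo:lo + j + 1] = tmp[:j + 1] # leftover buffered elements
    return k + l

if __name__ == "__main__":
    a = [0, 1, 1, 4, 3]                # runs (0 1 1 4) and (3)
    cost = merge_adjacent(a, 0, 4, 5)
    print(a, cost)
```

Because the merge cost depends only on $k$ and $\ell$, the total cost of a sort can be evaluated from the run lengths alone, which is what the complexity analysis exploits.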

  27. Some results about merge costs. Best-case merge costs: every algorithm has a best-case merge cost of at least $nH$ [2,8+]. Worst-case merge costs: every algorithm has a worst-case merge cost of at least $nH + 2n$ [8]. (Figure: a merge-cost axis with marks at 0, $nH$ and $nH + 2n$.)
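To make the notion of total merge cost concrete, the sketch below evaluates a merge policy on run lengths only and compares its cost with $nH$. The policy used here (always merge the adjacent pair with the smallest combined length) is a hypothetical stand-in chosen for brevity. It is neither TimSort's policy nor that of adaptive ShiversSort, and the function names are assumptions.

```python
import math

def run_length_entropy(lengths):
    """H = sum over runs of (r_i / n) * log2(n / r_i), with n = sum of r_i."""
    n = sum(lengths)
    return sum((r / n) * math.log2(n / r) for r in lengths)

def greedy_merge_cost(lengths):
    """Total merge cost of an illustrative policy working on run lengths only.

    At each step, merge the adjacent pair of runs with the smallest combined
    length (leftmost pair on ties) and pay that combined length.
    """
    runs = list(lengths)
    total = 0
    while len(runs) > 1:
        i = min(range(len(runs) - 1), key=lambda j: runs[j] + runs[j + 1])
        merged = runs[i] + runs[i + 1]
        runs[i:i + 2] = [merged]       # replace the two runs by their merge
        total += merged
    return total

if __name__ == "__main__":
    lengths = [4, 1, 6, 1]             # run lengths of the example array
    n = sum(lengths)
    h = run_length_entropy(lengths)
    print("merge cost:", greedy_merge_cost(lengths))
    print("n H       :", n * h)
    print("n H + 2 n :", n * h + 2 * n)
```

On this instance the merges are 4+1, then 6+1, then 5+7, and the resulting cost lies above $nH$, in line with the best-case bound quoted on the slide.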
