Algorithm Engineering (aka. How to Write Fast Code)
CS260 – Lecture 6
Yan Gu
I/O Algorithms and Parallel Samplesort

CS260: Algorithm Engineering, Lecture 6
• Review of Samplesort
• Semisort
• Course Policy

Sample-sort outline
Analogous to multiway quicksort
1. Split input array into $\sqrt{N}$ contiguous subarrays of size $\sqrt{N}$. Sort subarrays recursively (sequentially)
[Figure: the input array divided into $\sqrt{N}$ subarrays of size $\sqrt{N}$, each sorted]

Sample-sort outline
2. Choose $\sqrt{N} - 1$ "good" pivots $p_1 \le p_2 \le \cdots \le p_{\sqrt{N}-1}$
3. Distribute subarrays into buckets, according to pivots
[Figure: sorted subarrays of size ≈ $\sqrt{N}$ scattered into Bucket 1, Bucket 2, …, Bucket $\sqrt{N}$, with the pivots $p_1 \le p_2 \le \cdots \le p_{\sqrt{N}-1}$ as bucket boundaries]

Sample-sort outline
4. Recursively sort the buckets
5. Copy concatenated buckets back to input array
[Figure: Bucket 1, Bucket 2, …, separated by the pivots, concatenated into the sorted array]
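
The five steps above can be sketched sequentially in Python. This is only an illustration under stated assumptions, not the lecture's implementation: the function name `samplesort`, the base-case size `base`, and the oversampling factor `oversample` are hypothetical choices, pivots are picked from a sorted random sample, and everything runs on one thread where the real algorithm would work in parallel.

```python
import bisect
import math
import random


def samplesort(a, base=16, oversample=4, rng=None):
    """Return a sorted copy of `a` (sequential sketch of the outline)."""
    rng = rng or random.Random(0)
    n = len(a)
    if n <= base:
        return sorted(a)
    m = math.isqrt(n)  # about sqrt(N) subarrays of size about sqrt(N)

    # Step 1: split into contiguous subarrays and sort each recursively.
    # (Sorted subarrays matter for the parallel distribution step; this
    # sequential sketch does not exploit the order.)
    subarrays = [samplesort(a[i:i + m], base, oversample, rng)
                 for i in range(0, n, m)]

    # Step 2: choose m - 1 pivots, evenly spaced in a sorted random sample.
    sample = sorted(rng.choices(a, k=oversample * m))
    pivots = [sample[oversample * j] for j in range(1, m)]

    # Step 3: distribute every element into the bucket its pivots select.
    buckets = [[] for _ in range(m)]
    for sub in subarrays:
        for x in sub:
            buckets[bisect.bisect_left(pivots, x)].append(x)

    # Steps 4 and 5: sort each bucket recursively and concatenate.
    out = []
    for b in buckets:
        # Guard: if a bucket did not shrink (e.g. all keys equal), stop recursing.
        out.extend(sorted(b) if len(b) == n
                   else samplesort(b, base, oversample, rng))
    return out
```

Because bucket `j` only receives elements greater than pivot `j - 1` and at most pivot `j`, concatenating the sorted buckets yields a sorted array.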


What is semisort?
• Input:
  • An array of records with associated keys
  • Assume keys can be hashed to the range $[n^k]$
• Goal:
  • All records with equal keys should be adjacent
  • Different keys are not necessarily sorted
  • Records with equal keys do not need to be sorted by their values
Input:
  key    45 12 45 61 28 61 61 45 28 45
  value   2  5  3  9  5  9  8  1  7  5
One valid output:
  key    12 61 61 61 45 45 45 45 28 28
  value   5  8  9  9  2  5  1  3  7  5
Another valid output:
  key    45 45 45 45 12 61 61 61 28 28
  value   2  5  1  3  5  8  9  9  7  5
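
To make the definition concrete, here is a minimal sequential sketch in Python. The names `semisort` and `is_semisorted` are hypothetical, and grouping through a dictionary is just one simple way to satisfy the goal; it is not the parallel algorithm discussed later in the lecture.

```python
from collections import defaultdict


def semisort(records):
    """Group (key, value) records so that equal keys are adjacent."""
    groups = defaultdict(list)  # key -> its records, in arrival order
    for key, value in records:
        groups[key].append((key, value))
    # Groups come out in order of first appearance, not in sorted key order.
    return [rec for group in groups.values() for rec in group]


def is_semisorted(records):
    """Check the semisort goal: each key forms one contiguous run."""
    finished = set()
    prev = object()  # sentinel that equals no real key
    for key, _ in records:
        if key != prev:
            if key in finished:
                return False  # this key already had an earlier run
            finished.add(key)
            prev = key
    return True


keys = [45, 12, 45, 61, 28, 61, 61, 45, 28, 45]
values = [2, 5, 3, 9, 5, 9, 8, 1, 7, 5]
result = semisort(list(zip(keys, values)))
print([k for k, _ in result])  # [45, 45, 45, 45, 12, 61, 61, 61, 28, 28]
```

On the slide's input the key order matches one of the valid outputs shown above; the values inside each group may appear in a different order, which the definition allows.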

Semisort is one of the most useful primitives in parallel algorithms
• Parallel In-Place Algorithms: Theory and Practice
• Julienne: A Framework for Parallel Graph Algorithms using Work-efficient Bucketing
• Semi-Asymmetric Parallel Graph Algorithms for NVRAMs
• Efficient BVH Construction via Approximate Agglomerative Clustering
• Theoretically-Efficient and Practical Parallel DBSCAN

Why is semisort so useful? (even if you have not seen it before)
• Semisorting can be done by sorting, but it can be faster because it has fewer restrictions
• Theoretically, it can be done in $O(n)$ work rather than $O(n \log n)$ work
• Can be used to implement counting / integer sort
  • Integer sort: given $n$ key-value pairs with keys in range $[1, \ldots, n]$, query the KV-pairs with a certain key
  • Counting sort: given $n$ key-value pairs with keys in range $[1, \ldots, n]$, query the number of KV-pairs with a certain key
• In the database community, this is called the GroupBy operator

Why is semisort so useful?
[Figure: each key (37, …, 58, …, 92, …) points to a linked list of its values; the new pair (key 92, value 56) is added to the list for key 92]

Attempts – Sequentially: Pre-allocated array
[Figure: each key (37, …, 58, …, 92, …) points to a pre-allocated array of its values; the new pair (key 92, value 56) is written into the array for key 92]
• Problem: need to pre-count the number of occurrences of each key
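
The pre-allocated-array attempt can be sketched as a two-pass sequential routine in Python. The function name `group_into_preallocated` is hypothetical, and the dictionaries stand in for whatever key table an actual implementation would use. The first pass is exactly the pre-counting that the slide identifies as the problem.

```python
def group_into_preallocated(records):
    """Place (key, value) records into one array, one contiguous slot range per key."""
    # Pass 1: pre-count how many records each key has.
    counts = {}
    for key, _ in records:
        counts[key] = counts.get(key, 0) + 1

    # Reserve a contiguous range of the output for every key.
    start = {}
    total = 0
    for key, c in counts.items():
        start[key] = total
        total += c

    # Pass 2: write each record into the next free slot of its key's range.
    out = [None] * len(records)
    cursor = dict(start)
    for key, value in records:
        out[cursor[key]] = (key, value)
        cursor[key] += 1
    return out, start, counts
```

After the call, the records for `key` occupy `out[start[key] : start[key] + counts[key]]`.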

Another use case for semisort
• Generate adjacency array for a graph
Edge list:         (3,5) (1,7) (2,3) (3,6) (5,4) (3,7) (1,6)
Sorted edge list:  (3,5) (3,7) (3,6) (5,4) (1,6) (1,7) (2,3)
[Figure: the graph on vertices 1 to 7]
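
As a rough illustration of this use case, the Python sketch below groups the slide's edge list by source vertex and then records, for each source, the slice of a flat neighbor array that holds its neighbors. The names `semisort_by_source` and `adjacency_array` are hypothetical, and the dictionary-based grouping is a sequential stand-in for a parallel semisort.

```python
from collections import defaultdict


def semisort_by_source(edges):
    """Group directed edges (u, v) so that edges with the same u are adjacent."""
    groups = defaultdict(list)
    for u, v in edges:
        groups[u].append((u, v))
    return [e for g in groups.values() for e in g]


def adjacency_array(edges):
    """Return (neighbors, ranges): ranges[u] = (lo, hi) slices neighbors for u."""
    grouped = semisort_by_source(edges)
    neighbors = [v for _, v in grouped]
    ranges = {}
    for i, (u, _) in enumerate(grouped):
        lo = ranges[u][0] if u in ranges else i
        ranges[u] = (lo, i + 1)  # extend the run of u up to position i
    return neighbors, ranges


edges = [(3, 5), (1, 7), (2, 3), (3, 6), (5, 4), (3, 7), (1, 6)]
neighbors, ranges = adjacency_array(edges)
lo, hi = ranges[3]
print(neighbors[lo:hi])  # neighbors of vertex 3: [5, 6, 7]
```

Only adjacency of equal sources is needed here, which is why a semisort is enough and a full sort by vertex id is not required.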

Why is semisort hard?
  key    45 45 45 45 12 61 61 61 28 28
  value   1  5  3  2  5  8  9  9  7  5
• There can be many duplicate keys
  • Heavy keys
• Or, there can be almost no duplicate keys
  • Light keys

Implement integer sort using semisort
  key    45 45 45 45 12 61 61 61 28 28
  value   1  5  3  2  5  8  9  9  7  5
• Input: $n$ KV-pairs with keys in $[n]$
• Step 1: hash the keys (i.e., for $(k_i, v_i)$, generate $h_i = \mathrm{hash}(k_i)$)
• Step 2: semisort $(h_i, (k_i, v_i))$, and resolve conflicts
• Step 3: get the pointer for each key $k_i$
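
The three steps might look as follows in Python. Everything named here is an assumption made for illustration: `toy_hash` is a deliberately tiny hash (so that collisions actually occur), a stable comparison sort on the hash stands in for the semisort, and conflicts between different keys with the same hash are resolved by comparing the keys themselves.

```python
def toy_hash(key, hash_range=8):
    """Hypothetical hash with a very small range, to force collisions."""
    return (key * 2654435761) % hash_range


def integer_sort_index(records, n):
    """Group records by key and return (grouped, start, count) for keys in [0, n]."""
    # Step 1: hash the keys.
    tagged = [(toy_hash(k), k, v) for k, v in records]
    # Step 2: group by hash (a sort is used as a stand-in for semisort);
    # keys that collide on the hash are separated by comparing the key.
    tagged.sort(key=lambda item: (item[0], item[1]))
    grouped = [(k, v) for _, k, v in tagged]
    # Step 3: record a pointer (start index) and a count for each key.
    start = [None] * (n + 1)
    count = [0] * (n + 1)
    for i, (k, _) in enumerate(grouped):
        if start[k] is None:
            start[k] = i
        count[k] += 1
    return grouped, start, count


keys = [45, 45, 45, 45, 12, 61, 61, 61, 28, 28]
values = [1, 5, 3, 2, 5, 8, 9, 9, 7, 5]
grouped, start, count = integer_sort_index(list(zip(keys, values)), n=64)
print(grouped[start[45]:start[45] + count[45]])
# [(45, 1), (45, 5), (45, 3), (45, 2)]
```

With `start` and `count` in hand, the integer-sort query (all pairs with a given key) is a slice of `grouped`, and the counting-sort query is a lookup in `count`.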

The Top-Down Parallel Semisort Algorithm

The main goal: estimate key counts
• And tell the heavy keys from the light ones. How? Sampling!
• A key that appears more than $n/t$ times is called a heavy key
• Otherwise, it is called a light key
• We can treat the two kinds separately

The algorithm
• Take $t \log n$ samples and sort them
• Keys with more than $\log n$ appearances in the sample are marked as heavy keys; the others are light keys
• Each heavy key gets its own bucket, and another $t$ buckets hold the light keys, each corresponding to a key range of size $n^k/t$
• The input keys are hashed into $[n^k]$
• In total we have no more than $2t$ buckets, since $t \log n$ samples can contain fewer than $t$ keys that each appear more than $\log n$ times
• The rest of the algorithm is pretty similar to samplesort

Phase 1: Sampling and sorting
1. Select a sample set $S$ of $t \log n$ keys
2. Sort $S$
[Figure: input keys → (sampling) → $S$ → (sorting) → sorted samples 5 5 5 8 8 8 8 8 11 17 17 17 …… → (counting)]

Phase 2: Bucket Construction
Sorted samples: 5 5 5 8 8 8 8 8 11 17 17 17 ……
Counting & filtering split the sampled keys into:
• Heavy keys: 8, 20, 65, …
• Light keys, by range: 0-15 → 5, 11; 16-31 → 17, 21, 26, 31; ...

At the end of Phase 2
• In total we have no more than $2t$ buckets
  • $t$ of them are for light keys
• Then we construct a hash table for the heavy keys
• Now we know which bucket each KV-pair $(k_i, v_i)$ goes to:
  • If $k_i$ is found in the hash table, assign it to the associated heavy bucket
  • Otherwise, it goes to the light bucket based on the range of $k_i$
• The rest of the algorithm is almost identical to samplesort
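
Phases 1 and 2 can be sketched in Python as below. This is a sequential illustration under assumptions: the function name `build_buckets` and its parameters are hypothetical, the keys are assumed to be already hashed into `[0, key_range)`, the input is assumed non-empty, and $\log n$ is taken as the base-2 logarithm rounded up.

```python
import math
import random
from collections import Counter


def build_buckets(hashed_keys, key_range, t, rng):
    """Return (heavy, bucket_of, num_buckets) for keys in [0, key_range)."""
    n = len(hashed_keys)
    log_n = max(1, math.ceil(math.log2(n)))

    # Phase 1: take t * log n samples and sort them (sorting mirrors the
    # slide; Counter below would also work on an unsorted sample).
    sample = sorted(rng.choices(hashed_keys, k=t * log_n))

    # Phase 2: count sampled keys; more than log n appearances means heavy.
    counts = Counter(sample)
    heavy_keys = [key for key, c in counts.items() if c > log_n]
    heavy = {key: b for b, key in enumerate(heavy_keys)}  # heavy-key hash table

    # The remaining t buckets split the key range evenly among light keys.
    width = math.ceil(key_range / t)

    def bucket_of(key):
        if key in heavy:
            return heavy[key]              # the key's own heavy bucket
        return len(heavy) + key // width   # light bucket chosen by key range

    return heavy, bucket_of, len(heavy) + t
```

Because fewer than $t$ keys can each exceed $\log n$ appearances among $t \log n$ samples, `num_buckets` stays below $2t$.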

Sample-sort outline
Analogous to multiway quicksort
1. Split input array into $N/t$ contiguous subarrays of size $t$. Sort subarrays recursively (sequentially)
[Figure: the input array of size $N$ divided into $N/t$ subarrays, each of size ≈ $t$]

Sample-sort outline
2. Distribute subarrays into buckets
[Figure: the elements of each subarray scattered into Bucket 1, Bucket 2, …]

Sample-sort outline
3. Recursively sort the buckets (only for the light buckets)
4. Copy concatenated buckets back to input array

Difference 2: subarrays are not sorted
• For simplicity, assume $n = 16$, and the input is
  [1, 2, 3, 4, 1, 1, 3, 3, 1, 2, 2, 4, 1, 2, 4, 4]
• First, get the count for each subarray in each bucket
  [1, 1, 1, 1, 2, 0, 2, 0, 1, 2, 0, 1, 1, 1, 0, 2]
• Then, transpose the array and scan to compute the offsets
  [1, 2, 1, 1, 1, 0, 2, 1, 1, 2, 0, 0, 1, 0, 1, 2]
  [0, 1, 3, 4, 5, 6, 6, 8, 9, 10, 12, 12, 12, 13, 13, 14]
• Lastly, move each element to the corresponding bucket
  [∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅]
  [1, ∅, ∅, ∅, ∅, 2, ∅, ∅, ∅, 3, ∅, ∅, 4, ∅, ∅, ∅]
  [1, 1, 1, ∅, ∅, 2, ∅, ∅, ∅, 3, 3, 3, 4, ∅, ∅, ∅]
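
The worked example above can be reproduced with a short Python sketch. The function name `distribute` and its parameters are hypothetical; the sketch assumes subarrays of size 4 and that key `x` belongs to bucket `x - 1`, which is what the slide's numbers imply. The loops run sequentially here, but each subarray only touches its own counts and its own output slots, which is what allows the subarrays to be processed in parallel.

```python
from itertools import accumulate


def distribute(a, subarray_size, num_buckets, bucket_of):
    """Count per (subarray, bucket), transpose, scan, then scatter."""
    subarrays = [a[i:i + subarray_size] for i in range(0, len(a), subarray_size)]
    num_sub = len(subarrays)

    # Count how many elements of each subarray fall into each bucket.
    counts = [[0] * num_buckets for _ in range(num_sub)]
    for s, sub in enumerate(subarrays):
        for x in sub:
            counts[s][bucket_of(x)] += 1

    # Transpose to bucket-major order, then exclusive prefix sum (scan).
    transposed = [counts[s][b] for b in range(num_buckets) for s in range(num_sub)]
    offsets = [0] + list(accumulate(transposed))[:-1]

    # Scatter: subarray s writes bucket-b elements starting at its own offset.
    out = [None] * len(a)
    for s, sub in enumerate(subarrays):
        cursor = [offsets[b * num_sub + s] for b in range(num_buckets)]
        for x in sub:
            b = bucket_of(x)
            out[cursor[b]] = x
            cursor[b] += 1
    return counts, transposed, offsets, out


a = [1, 2, 3, 4, 1, 1, 3, 3, 1, 2, 2, 4, 1, 2, 4, 4]
counts, transposed, offsets, out = distribute(a, 4, 4, lambda x: x - 1)
print(offsets)  # [0, 1, 3, 4, 5, 6, 6, 8, 9, 10, 12, 12, 12, 13, 13, 14]
print(out)      # [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4]
```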
