

  1. A Parallel Compact Hash Table Alfons Laarman & Steven van der Vegt

  2. Overview: Research Motivation, Background, Contribution

  3-7. Introduction
     ◮ Hash tables are fundamental data structures
     ◮ Compact hash tables: memory-efficient hash tables
     ◮ Useful in, e.g., model checking, planning, BDDs and tree tables
     ◮ Problem: no concurrent implementation of compact hash tables exists
     ◮ Our contribution: a scalable lockless algorithm for compact hashing

  8. Goals
     ◮ A parallel compact hash table
     ◮ Scalable
     ◮ Fast: lockless
     ◮ Memory efficient: no pointers (otherwise we lose the benefits of compact hashing)
     ◮ Focus on findOrPut
        ◮ Already sufficient for model checking (a monotonically growing data set)
        ◮ Subsumes the individual find and put operations

  9. Overview: Research Motivation, Background, Contribution

  10. Hashing Revisited
     ◮ A hash table stores a subset of a key universe U in a table T of buckets; typically |U| ≫ |T|
     ◮ Multiple keys can be mapped onto the same bucket
     ◮ The full key is stored in T to resolve collisions
     ◮ Several collision-resolution algorithms are possible, e.g. linear probing (sketched below)
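
     As a point of reference for the rest of the deck, here is a minimal sketch of findOrPut on an
     ordinary open-addressing table with linear probing (not yet the compact table of this talk). It is
     sequential, and the table size, the EMPTY marker and the hash function are illustrative assumptions.

     #include <stdbool.h>
     #include <stdint.h>

     #define TABLE_SIZE (1u << 20)        /* illustrative: 2^20 buckets (power of two)        */
     #define EMPTY      0u                /* assumption: 0 is never a valid key               */

     static uint64_t table[TABLE_SIZE];   /* the full key is stored to resolve collisions     */

     static inline uint64_t hash(uint64_t k) {
         return (k * 0x9E3779B97F4A7C15ull) >> 44;   /* maps into [0, TABLE_SIZE)             */
     }

     /* Returns true if k was already present, false if it has just been inserted.
        Assumes the table never fills up completely. */
     bool find_or_put(uint64_t k) {
         uint64_t i = hash(k);
         for (;;) {
             if (table[i] == k)                       /* already in the table                 */
                 return true;
             if (table[i] == EMPTY) {                 /* free bucket: claim it                */
                 table[i] = k;
                 return false;
             }
             i = (i + 1) & (TABLE_SIZE - 1);          /* linear probing: try the next bucket  */
         }
     }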

  11. Hashing Revisited - Example
     Figure: Example of an open addressing hash table. Keys such as John Smith, Lisa Smith,
     Sandra Dee, Ted Baker and Sam Doe are hashed to buckets 000-255; each occupied bucket
     stores the full key together with its phone-number value.

  12. Introduction to Compact Hash Tables
     ◮ If, however, |U| ≤ |T|, we only need a bit array (and a perfect hash function)
     ◮ What if |U| is just slightly bigger than |T|? Cleary tables:
        1. Maintain order in T
        2. Add three bits to the buckets in T
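
     The |U| ≤ |T| case above can be made concrete: with a perfect hash (here simply the identity),
     membership only needs one bit per possible key and no key has to be stored at all. The universe
     size below is an illustrative assumption.

     #include <stdbool.h>
     #include <stdint.h>

     #define UNIVERSE (1u << 24)               /* illustrative: 2^24 possible keys            */

     static uint64_t bits[UNIVERSE / 64];      /* one bit per element of the key universe     */

     /* With a perfect hash, find-or-put degenerates to a single bit test-and-set. */
     bool bitset_find_or_put(uint32_t key) {
         uint64_t mask    = 1ull << (key & 63);
         bool     present = (bits[key >> 6] & mask) != 0;
         bits[key >> 6] |= mask;
         return present;
     }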

  13. Introduction to BLP
     Let K be the set of possible keys and h the hash function that computes the indices:
     h : K → {0 .. M−1}, with the property that for K1, K2 ∈ K, K1 ≤ K2 iff h(K1) ≤ h(K2).
     ◮ All keys are stored in ascending order
     ◮ There cannot be empty locations between a key's original hash location and its actual
       storage position
     ◮ All keys sharing the same initial hash location form one continuous group
     ◮ Groups can grow together, forming clusters of groups
     ◮ Bidirectional linear probing (BLP): probing is possible in both directions
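
     The slide only states the monotonicity requirement on h. One common way to obtain such an
     order-preserving hash, sketched below for 32-bit keys and a table of 2^20 buckets (both
     illustrative choices, not necessarily those of the paper), is to take the top bits of the key;
     the remaining low-order bits are then all a compact table still has to store.

     #include <stdint.h>

     #define KEY_BITS   32
     #define TABLE_BITS 20                       /* M = 2^20 buckets                          */

     /* Order-preserving: k1 <= k2 implies h(k1) <= h(k2). */
     static inline uint32_t h(uint32_t k) {
         return k >> (KEY_BITS - TABLE_BITS);    /* the key's top TABLE_BITS bits             */
     }

     /* The remainder that a compact (Cleary) bucket stores instead of the full key. */
     static inline uint32_t rem(uint32_t k) {
         return k & ((1u << (KEY_BITS - TABLE_BITS)) - 1);
     }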

  14. Introduction to BLP - Insert Example
     Inserting k into table T in 5 steps (a simplified sketch follows below):
     1. Determine the index: i ← h(k)
     2. Determine the probing direction: T[h(k)] > k ? right : left
     3. Search for an empty bucket
     4. Insert k into the empty bucket
     5. Swap the bucket into its correct place
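
     A much-simplified sketch of these five steps: it always probes to the right for a free bucket and
     then swaps the new key leftwards until the cluster is in ascending order again. The real BLP
     algorithm also probes to the left and maintains the Cleary administration bits; both are omitted
     here, and T, h and the table size are assumptions carried over from the previous sketches.

     #include <stdint.h>

     #define SIZE  (1u << 20)
     #define EMPTY 0u                        /* assumption: 0 is never a valid key             */

     extern uint32_t T[SIZE];                /* table kept in ascending order within clusters  */
     uint32_t h(uint32_t k);                 /* order-preserving hash from the previous sketch */

     /* Assumes k is not yet present and the table does not run out of space on the right. */
     void blp_insert_simplified(uint32_t k) {
         uint32_t i = h(k);                  /* step 1: determine the index                    */
         uint32_t j = i;                     /* (step 2, the probing direction, is fixed here) */
         while (T[j] != EMPTY)               /* step 3: search for an empty bucket             */
             j++;
         T[j] = k;                           /* step 4: insert k into the empty bucket         */
         while (j > i && T[j - 1] > k) {     /* step 5: swap k into its correct, ordered place */
             T[j]     = T[j - 1];
             T[j - 1] = k;
             j--;
         }
     }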

  15. Cleary Table
     Cleary administration bits (a possible bucket layout is sketched below):
     ◮ Virgin: set on a bucket if its location is the initial hash location of some key in the table
     ◮ Change: set at the beginning of a group with the same initial hash location
     ◮ Occupied: set if the bucket contains a key
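
     A sketch of what one Cleary bucket could look like in C, with the three administration bits next
     to the stored key remainder; the field widths are illustrative assumptions, not the layout of the
     paper's implementation.

     /* One bucket of a Cleary table: a key remainder plus the three administration bits. */
     typedef struct {
         unsigned remainder : 29;  /* low-order key bits; the bucket index supplies the rest      */
         unsigned occupied  :  1;  /* set if the bucket contains a key                            */
         unsigned change    :  1;  /* set at the beginning of a group with the same home bucket   */
         unsigned virgin    :  1;  /* set if this index is the initial hash location of some key  */
     } cleary_bucket;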

  16. Cleary Table - Example
     Figure: Example of a partially filled Cleary table with 4 groups.

  17. Overview: Research Motivation, Background, Contribution

  18. Requirements for Parallelizing
     We need a write-exclusive locking mechanism that
     ◮ Scales well
     ◮ Is memory efficient

  19-24. Locking Mechanism
     Properties:
     ◮ 1 bit per bucket
     ◮ CAS(a, b, c): compare-and-swap (if a == b then a ← c)
     Locking steps (a CAS-based sketch follows below):
     1. Search for both the left and the right bucket of the cluster
     2. Lock these buckets
     3. If one of these locks fails → unlock and start over
     4. Perform the exclusive actions (read, write)
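
     A minimal sketch of the one-bit-per-bucket lock described above, using C11 atomics; placing the
     lock bit in the top bit of a 64-bit bucket word is an assumption for illustration.

     #include <stdatomic.h>
     #include <stdbool.h>
     #include <stdint.h>

     #define LOCK_BIT (1ull << 63)              /* assumed: top bit of the bucket word is the lock */

     /* Try to set the lock bit with a single CAS; fails if the bucket is already locked. */
     bool try_lock(_Atomic uint64_t *bucket) {
         uint64_t old = atomic_load(bucket);
         if (old & LOCK_BIT)
             return false;                      /* another thread holds the lock                   */
         return atomic_compare_exchange_strong(bucket, &old, old | LOCK_BIT);
     }

     void unlock(_Atomic uint64_t *bucket) {
         atomic_fetch_and(bucket, ~LOCK_BIT);   /* clear the lock bit                              */
     }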

  25. Dynamic Region-Based Locking
     left  ← CL-LEFT(h)
     right ← CL-RIGHT(h)
     if ¬TRY-LOCK(T[left]) then
         RESTART
     if ¬TRY-LOCK(T[right]) then
         UNLOCK(T[left])
         RESTART
     if FIND(k) then                     ⊲ exclusive read
         UNLOCK(T[left], T[right])
         return FOUND
     PUT(k)                              ⊲ exclusive write
     UNLOCK(T[left], T[right])
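
     The pseudocode above can be fleshed out into a C sketch. CL-LEFT, CL-RIGHT, FIND and PUT are kept
     abstract (cl_left, cl_right, find and put below are hypothetical helpers), try_lock/unlock are the
     CAS-based routines from the previous sketch, and RESTART becomes a retry loop.

     #include <stdatomic.h>
     #include <stdbool.h>
     #include <stdint.h>

     extern _Atomic uint64_t T[];                 /* the shared bucket array                        */
     uint64_t cl_left(uint64_t h);                /* index of the leftmost bucket of the cluster    */
     uint64_t cl_right(uint64_t h);               /* index of the rightmost bucket of the cluster   */
     bool     find(uint64_t k);                   /* sequential lookup inside the locked region     */
     void     put(uint64_t k);                    /* sequential insert inside the locked region     */
     bool     try_lock(_Atomic uint64_t *bucket);
     void     unlock(_Atomic uint64_t *bucket);

     /* Dynamic region-based locking: lock both cluster ends, then read/write exclusively.
        Assumes the left and right buckets are distinct. Returns true iff k was already present. */
     bool drbl_find_or_put(uint64_t k, uint64_t h) {
         for (;;) {                               /* RESTART = retry the whole attempt              */
             uint64_t left  = cl_left(h);
             uint64_t right = cl_right(h);
             if (!try_lock(&T[left]))
                 continue;
             if (!try_lock(&T[right])) {
                 unlock(&T[left]);
                 continue;
             }
             bool found = find(k);                /* exclusive read                                 */
             if (!found)
                 put(k);                          /* exclusive write                                */
             unlock(&T[left]);
             unlock(&T[right]);
             return found;
         }
     }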

  26. Benchmarks - Speedup
     Figure: Speedups of BLP, RBL, LHT and PCT with r/w ratios 0:1, 3:1 and 9:1
     (x-axis: 1-16 cores; y-axis: speedup, with an ideal-speedup reference line).

  27. Benchmarks - Runtime
     Figure: 16-core runtimes of BLP, RBL, LHT and PCT with r/w ratios 0:1, 3:1 and 9:1
     (x-axis: load factor, 10%-100%; y-axis: normalized runtime).

  28. Results
     ◮ PCT performs very well with only inserts (0:1)
     ◮ PCT's performance drops once the load factor rises above 85%
     ◮ With a high proportion of reads (9:1), BLP eventually becomes faster than LHT
     ◮ Region-based locking with OS locks is very slow, as the RBL results show
     ◮ The scalability of both PCT and BLP is good
     ◮ r/w ratio: read/write exclusion on whole clusters takes a toll; the higher load factors
       (when clusters are large) show that there is room for improvement

  29. Conclusion
     ◮ We have realized a parallel Cleary table with high performance and scalability up to load
       factors of 90%; since the compression ratio of compact hash tables can be high, this is
       acceptable
     ◮ Future work: allow concurrent reads in the Cleary table to improve its scalability even more
