SIMD Vectorized Hashing for Grouped Aggregation




  1. SIMD Vectorized Hashing for Grouped Aggregation. Bala Gurumurthy, David Broneske, Marcus Pinnecke, Gabriel Campero Durand and Gunter Saake. (Footer: OVGU Presentation, 16.05.2017)

  5. Grouped Aggregation (slide footer: SIMD Vectorized Hashing for Grouped Aggregation, 04.09.2018, Bala Gurumurthy, David Broneske, Marcus Pinnecke, Gabriel Campero Durand and Gunter Saake)
     ● Commonly-used and time-consuming operation
     ● All input must be consumed for single output
     ● Faster input processing = higher throughput
     ● Improving underlying technique improves efficiency
     (Based on analysis by Boncz et al. [1])

  8. SIMD Capability in Modern Processors
     ● SIMD: Single Instruction, Multiple Data
     ● Allows vectorized execution in modern processors
     ● Reduces overall execution time of an operation
     ● SIMD has been shown to increase throughput by orders of magnitude for DBMS operations [2] [3]
     [Figure: SIMD-accelerated selection [3]]
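
To make "single instruction, multiple data" concrete, the following is a minimal sketch of a selection that compares four 32-bit values per instruction. It is not taken from the slides or from [3]: the function name `select_equal`, the choice of SSE2 intrinsics, and the scalar fallback are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Returns the positions i where values[i] == needle.
// Hypothetical helper for illustration; not part of the original deck.
std::vector<std::size_t> select_equal(const std::vector<std::int32_t>& values,
                                      std::int32_t needle) {
    std::vector<std::size_t> hits;
    std::size_t i = 0;
#if defined(__SSE2__)
    const __m128i needle_vec = _mm_set1_epi32(needle);  // broadcast to 4 lanes
    for (; i + 4 <= values.size(); i += 4) {
        // Load 4 values and compare all of them with one instruction.
        __m128i v = _mm_loadu_si128(
            reinterpret_cast<const __m128i*>(&values[i]));
        __m128i eq = _mm_cmpeq_epi32(v, needle_vec);
        // One bit per lane: bit k is set if lane k matched.
        int mask = _mm_movemask_ps(_mm_castsi128_ps(eq));
        for (int lane = 0; lane < 4; ++lane) {
            if (mask & (1 << lane)) hits.push_back(i + lane);
        }
    }
#endif
    // Scalar tail; also the complete fallback when SSE2 is unavailable.
    for (; i < values.size(); ++i) {
        if (values[i] == needle) hits.push_back(i);
    }
    return hits;
}
```

The scalar loop performs one comparison per value, while the vector loop performs one comparison per four values; wider registers would raise that ratio further.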

  10. SIMD for Grouped Aggregation
     ● SIMD acceleration of hashing techniques improves throughput
     [Figure: (image) + Group-By = High throughput]
     How to incorporate SIMD for Grouped Aggregation? What is the impact of SIMD?

  14. Hash Based Aggregation
     ● Grouped aggregation is commonly implemented using hashing techniques
     ● Separates groups into buckets
     ● Aggregation is done within each bucket

  20. Hash Based Aggregation: Example
     [Animation; labels: Input: 3, hash function h(x), Key, Aggregate]
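
The idea of hashing each key to its group and aggregating in place can be sketched as follows. This is only an illustration: it uses the C++ standard library's `std::unordered_map` in place of the hash table shown in the slides, and the SUM aggregate and the name `group_sum` are assumptions.

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Computes SUM(value) GROUP BY key over (key, value) rows.
// Hypothetical helper for illustration; the slides do not prescribe it.
std::unordered_map<std::int32_t, std::int64_t> group_sum(
    const std::vector<std::pair<std::int32_t, std::int64_t>>& rows) {
    std::unordered_map<std::int32_t, std::int64_t> groups;
    for (const auto& row : rows) {
        // operator[] hashes the key and finds its entry, creating a
        // zero-initialised aggregate on first sight of the key.
        groups[row.first] += row.second;
    }
    return groups;
}
```

For the rows (3, 10), (1, 5), (3, 7), both rows with key 3 land in the same entry, so the result holds two groups. Every input row has to be consumed before any group's aggregate is final.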

  26. Collision: Increasing Complexity
     ● Not all keys have a unique location
     ● Two keys might hash to the same slot (keys 1 and 11 in the figure)
     ● Hash table must be probed for an alternative location (# of probes in the figure: 4)
     Probing is time consuming
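
A minimal sketch of scalar linear probing shows where the probe count comes from. The keys 1 and 11 and the count of 4 probes appear on the slide; everything else is an assumption for illustration: the hash function h(x) = x mod 10, the occupants of slots 2 and 3, the SUM aggregate, and the names `LinearProbingTable` and `kEmpty`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Marks a free slot, so -1 cannot be used as a key (assumption).
constexpr std::int32_t kEmpty = -1;

// Fixed-size open-addressing table with linear probing (illustrative).
struct LinearProbingTable {
    std::vector<std::int32_t> keys;
    std::vector<std::int64_t> sums;

    explicit LinearProbingTable(std::size_t capacity)
        : keys(capacity, kEmpty), sums(capacity, 0) {}

    // Assumed hash function: h(x) = x mod capacity.
    std::size_t hash(std::int32_t key) const {
        return static_cast<std::size_t>(key) % keys.size();
    }

    // Adds value to the aggregate of key.
    // Returns the number of slots inspected, or 0 if the table is full.
    std::size_t add(std::int32_t key, std::int64_t value) {
        assert(key != kEmpty);
        std::size_t slot = hash(key);
        for (std::size_t probes = 1; probes <= keys.size(); ++probes) {
            if (keys[slot] == kEmpty) keys[slot] = key;  // claim a free slot
            if (keys[slot] == key) {
                sums[slot] += value;
                return probes;
            }
            slot = (slot + 1) % keys.size();  // collision: try the next slot
        }
        return 0;
    }
};
```

With a capacity of 10 and keys 1, 2 and 3 already stored, key 11 also hashes to slot 1 and has to inspect slots 1, 2, 3 and 4 one after another before it finds a free location, which gives 4 probes.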

  28. SIMD for Probing
     ● Multiple slots are probed at once using SIMD
     ● Reduces overall number of probes (# of probes in the figure: 1)
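
One way to probe several slots per step is to treat every four consecutive slots as a group and compare the whole group against the key with a single vector instruction. The sketch below is an assumption about how this could look, not the authors' implementation: the group layout, the SSE2 intrinsics with a scalar fallback, the hash function h(x) = x mod capacity, and the names `SimdProbingTable`, `match_mask` and `kEmpty` are all hypothetical. It also assumes that keys are never deleted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

constexpr std::int32_t kEmpty = -1;  // free-slot marker; -1 cannot be a key
constexpr std::size_t kLanes = 4;    // 32-bit keys per 128-bit vector

// Bit k of the result is set if group[k] == needle, for k in 0..3.
inline int match_mask(const std::int32_t* group, std::int32_t needle) {
#if defined(__SSE2__)
    __m128i slots = _mm_loadu_si128(reinterpret_cast<const __m128i*>(group));
    __m128i eq = _mm_cmpeq_epi32(slots, _mm_set1_epi32(needle));
    return _mm_movemask_ps(_mm_castsi128_ps(eq));
#else
    int mask = 0;  // scalar fallback with the same result
    for (std::size_t k = 0; k < kLanes; ++k)
        if (group[k] == needle) mask |= 1 << k;
    return mask;
#endif
}

// Index of the lowest set bit; mask must be non-zero.
inline std::size_t lowest_set_bit(int mask) {
    std::size_t k = 0;
    while (((mask >> k) & 1) == 0) ++k;
    return k;
}

// Fixed-size open-addressing table probed one 4-slot group at a time.
struct SimdProbingTable {
    std::vector<std::int32_t> keys;
    std::vector<std::int64_t> sums;

    explicit SimdProbingTable(std::size_t groups)
        : keys(groups * kLanes, kEmpty), sums(groups * kLanes, 0) {
        assert(groups > 0);
    }

    // Adds value to the aggregate of key.
    // Returns the number of groups inspected, or 0 if the table is full.
    std::size_t add(std::int32_t key, std::int64_t value) {
        assert(key != kEmpty);
        const std::size_t groups = keys.size() / kLanes;
        // Assumed hash: h(x) = x mod capacity, then the group of that slot.
        std::size_t g = (static_cast<std::size_t>(key) % keys.size()) / kLanes;
        for (std::size_t probes = 1; probes <= groups; ++probes) {
            const std::int32_t* group = &keys[g * kLanes];
            int hit = match_mask(group, key);
            if (hit == 0) {
                // Key is not in this group: claim its first free slot, if any.
                int free_mask = match_mask(group, kEmpty);
                if (free_mask != 0) {
                    std::size_t lane = lowest_set_bit(free_mask);
                    keys[g * kLanes + lane] = key;
                    hit = 1 << lane;
                }
            }
            if (hit != 0) {
                sums[g * kLanes + lowest_set_bit(hit)] += value;
                return probes;
            }
            g = (g + 1) % groups;  // group is full of other keys
        }
        return 0;
    }
};
```

With two groups (8 slots) and keys 1, 2 and 3 stored in the first group, key 11 maps to the same group and finds the remaining free slot with one vector probe instead of four scalar ones. A further probe is only needed once a whole group is occupied by other keys.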

  32. SIMD Accelerated Hash Probing
     ● Each hashing technique has its own collision resolution mechanism
     ● We use open-addressing hashing techniques
     ● Have constant hash table size
     ● Suitable for SIMD
