Hardware-Sensitive Scan Operator Variants for Compiled Selection Pipelines
David Broneske, Andreas Meister, Gunter Saake
Databases and Software Engineering (DBSE), University of Magdeburg
Introduction: Query Compilation
Example query plan: γ sum(A*B) over ⋈ lo_orderdate = d_datekey, with inputs σ d_year=1993 (Dates) and σ lo_discount …, lo_quantity (Lineorder)
Bandwidth-bound -> compute-bound
Possibility for code optimizations
Motivating Examples

Branching (List. 1: Branching scan for)
    for (int i = 0; i < input_size; ++i) {
        if (col[i] < pred)
            agg += agg_col[i];
    }

Predicated
    for (int i = 0; i < input_size; ++i) {
        agg += agg_col[i] * (col[i] < pred);
    }

SIMD [ZR02]
    for (int i = 0; i < simd_size; ++i) {
        mask = SIMD_COMP(simd_col[i], pred);
        if (mask) {
            for (int j = 0; j < SIMD_LENGTH; ++j) {
                if ((mask >> j) & 1)
                    agg += agg_col[i * SIMD_LENGTH + j];  // element j of SIMD register i
            }
        }
    }

[Figure a) Single Predicate: response time in ms over selectivity (0 to 1) for Branching Scan, SIMD Scan, Predicated Scan, and Predicated SIMD Scan]
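The three listings above can be tried out as compilable functions. The following is a minimal sketch, not the authors' pipeline code: the function names, the 32-bit column type, and the use of SSE2 intrinsics for the comparison are assumptions made for illustration. Both columns are assumed to have the same length. On machines without SSE2 the SIMD function falls back to the scalar loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Branching scan: add agg_col[i] for every row with col[i] < pred.
int64_t branching_scan(const std::vector<int32_t>& col,
                       const std::vector<int32_t>& agg_col, int32_t pred) {
    int64_t agg = 0;
    for (std::size_t i = 0; i < col.size(); ++i) {
        if (col[i] < pred) agg += agg_col[i];
    }
    return agg;
}

// Predicated scan: the comparison result (0 or 1) is multiplied in,
// so the loop body contains no data-dependent branch.
int64_t predicated_scan(const std::vector<int32_t>& col,
                        const std::vector<int32_t>& agg_col, int32_t pred) {
    int64_t agg = 0;
    for (std::size_t i = 0; i < col.size(); ++i) {
        agg += static_cast<int64_t>(agg_col[i]) * (col[i] < pred);
    }
    return agg;
}

// SIMD scan: compare four values at once, then branch on the bit mask.
int64_t simd_scan(const std::vector<int32_t>& col,
                  const std::vector<int32_t>& agg_col, int32_t pred) {
    int64_t agg = 0;
    std::size_t i = 0;
#if defined(__SSE2__)
    const __m128i pred_vec = _mm_set1_epi32(pred);
    for (; i + 4 <= col.size(); i += 4) {
        const __m128i v =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(&col[i]));
        // One bit per 32-bit lane: bit j is set if col[i + j] < pred.
        const int mask =
            _mm_movemask_ps(_mm_castsi128_ps(_mm_cmplt_epi32(v, pred_vec)));
        if (mask) {
            for (int j = 0; j < 4; ++j) {
                if ((mask >> j) & 1) agg += agg_col[i + j];
            }
        }
    }
#endif
    // Scalar tail (or the whole input when SSE2 is unavailable).
    for (; i < col.size(); ++i) {
        if (col[i] < pred) agg += agg_col[i];
    }
    return agg;
}
```

All three functions return the same sum for the same input; they differ only in how the predicate result steers the aggregation.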
Motivating Examples
[Figure: response time in ms over selectivity (0 to 1) for Branching Scan, SIMD Scan, Predicated Scan, and Predicated SIMD Scan. Panels: a) Single Predicate; b) Query Q1 (8 aggregates, 1 filter predicate); c) Query Q6 (1 aggregate, 3 filter predicates)]
When to use which scan variant?
Evaluation Setup
Evaluation Criteria
    Number of predicates
    Number of aggregates inside loop
Workload & Machine
    TPC-H LineItem table, SF 10
    Intel Xeon E5-2630 v3 with SSE4.2
Variants
    Branching vs. Predication
    Scalar vs. SIMD
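A setup like this needs a way to sweep selectivity and measure response time. The sketch below shows one possible harness and is not the authors' benchmark: it uses synthetic uniform data rather than the TPC-H LineItem table, and kDomain, make_column, pred_for_selectivity, and time_ms are hypothetical names introduced for illustration.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical value domain: column values lie in [0, kDomain).
constexpr int32_t kDomain = 1000;

// Fill a column with pseudo-random values from [0, kDomain).
std::vector<int32_t> make_column(std::size_t n, uint32_t seed) {
    std::mt19937 rng(seed);
    std::vector<int32_t> col(n);
    for (auto& v : col) v = static_cast<int32_t>(rng() % kDomain);
    return col;
}

// For roughly uniform data, "col[i] < pred" selects about
// `selectivity` of all rows when pred = selectivity * kDomain.
int32_t pred_for_selectivity(double selectivity) {
    return static_cast<int32_t>(std::lround(selectivity * kDomain));
}

// Wall-clock time of one call to fn, in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A measurement loop would call time_ms once per scan variant and per selectivity step, ideally repeating each run several times and writing the result to a variable that the compiler cannot optimize away.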
Number of Predicates
[Figure: time in ms over selectivity of P1 (0 to 1) and number of predicates, one panel each for Branching Scan, SIMD Scan, Predicated Scan, and SIMD Predicated Scan]
Results:
    For one predicate, SIMD does not pay off
    The more predicates, the better SIMD performs
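To make the "number of predicates" dimension concrete, here is a minimal scalar sketch of a conjunctive scan over several columns. It is an illustration under assumptions, not the generated pipeline code: the names and the column-per-predicate layout are hypothetical, and a query compiler would emit the predicates inline instead of looping over them at run time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Column = std::vector<int32_t>;

// Branching: stop evaluating predicates for a row as soon as one fails.
int64_t branching_scan_multi(const std::vector<Column>& cols,
                             const std::vector<int32_t>& preds,
                             const Column& agg_col) {
    int64_t agg = 0;
    for (std::size_t i = 0; i < agg_col.size(); ++i) {
        bool pass = true;
        for (std::size_t p = 0; p < cols.size() && pass; ++p) {
            pass = cols[p][i] < preds[p];
        }
        if (pass) agg += agg_col[i];
    }
    return agg;
}

// Predicated: evaluate every predicate for every row and combine
// the 0/1 results with a bitwise AND instead of branching.
int64_t predicated_scan_multi(const std::vector<Column>& cols,
                              const std::vector<int32_t>& preds,
                              const Column& agg_col) {
    int64_t agg = 0;
    for (std::size_t i = 0; i < agg_col.size(); ++i) {
        int32_t pass = 1;
        for (std::size_t p = 0; p < cols.size(); ++p) {
            pass &= (cols[p][i] < preds[p]);
        }
        agg += static_cast<int64_t>(agg_col[i]) * pass;
    }
    return agg;
}
```

The predicated variant does a fixed amount of comparison work per row that grows with the number of predicates, which is the kind of regular work that SIMD comparisons can process several values at a time.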
Work Inside the Loop
[Figure: time in ms over selectivity of P1 (0 to 1) and number of aggregates, one panel each for Branching Scan, SIMD Scan, Predicated Scan, and SIMD Predicated Scan]
Results:
    More aggregates, less impact of branch misprediction
    The more aggregates, the better branching scans perform for low selectivity
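The effect of more work inside the loop can be seen in a small scalar sketch with several aggregates. This is an assumption-laden illustration, not the evaluated code: the Aggregates struct, the three chosen aggregates, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Aggregates {
    int64_t sum_a = 0;
    int64_t sum_b = 0;
    int64_t count = 0;
};

// Branching: the aggregate updates run only for qualifying rows,
// so at low selectivity most of the loop body is skipped.
Aggregates branching_multi_agg(const std::vector<int32_t>& col,
                               const std::vector<int32_t>& a,
                               const std::vector<int32_t>& b,
                               int32_t pred) {
    Aggregates r;
    for (std::size_t i = 0; i < col.size(); ++i) {
        if (col[i] < pred) {
            r.sum_a += a[i];
            r.sum_b += b[i];
            ++r.count;
        }
    }
    return r;
}

// Predicated: every aggregate is updated for every row, with the
// 0/1 predicate result as a multiplier.
Aggregates predicated_multi_agg(const std::vector<int32_t>& col,
                                const std::vector<int32_t>& a,
                                const std::vector<int32_t>& b,
                                int32_t pred) {
    Aggregates r;
    for (std::size_t i = 0; i < col.size(); ++i) {
        const int64_t m = col[i] < pred;
        r.sum_a += a[i] * m;
        r.sum_b += b[i] * m;
        r.count += m;
    }
    return r;
}
```

In the predicated version the per-row cost grows with each additional aggregate regardless of selectivity, whereas the branching version pays that cost only for rows that pass the filter.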
Decision Trees

Number of Predicates
    selectivity < 0.05
        #predicates < 4: Branching Scan
        #predicates >= 4: SIMD Branching
    selectivity >= 0.05
        #predicates < 2: Predicated Scan
        #predicates >= 2: SIMD Predicated

Number of Aggregates
    selectivity < 0.1
        #aggregates < 6
            selectivity < 0.05: SIMD Branching
            selectivity >= 0.05: SIMD Predicated
        #aggregates >= 6: SIMD Branching
    selectivity >= 0.1: SIMD Predicated
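Decision trees of this kind translate directly into a variant-selection function that a query compiler could call before generating code. The sketch below is one possible encoding and rests on an assumed reading of the slide: in the predicate tree, selectivity is split at 0.05 and the number of predicates at 4 (low selectivity) or 2 (higher selectivity); in the aggregate tree, selectivity is split at 0.1, then the number of aggregates at 6, then selectivity at 0.05. The enum and function names are hypothetical.

```cpp
enum class ScanVariant {
    BranchingScan,
    SimdBranching,
    PredicatedScan,
    SimdPredicated
};

// Tree "Number of Predicates" (assumed reading of the slide).
ScanVariant choose_by_predicates(double selectivity, int num_predicates) {
    if (selectivity < 0.05) {
        return num_predicates < 4 ? ScanVariant::BranchingScan
                                  : ScanVariant::SimdBranching;
    }
    return num_predicates < 2 ? ScanVariant::PredicatedScan
                              : ScanVariant::SimdPredicated;
}

// Tree "Number of Aggregates" (assumed reading of the slide).
ScanVariant choose_by_aggregates(double selectivity, int num_aggregates) {
    if (selectivity >= 0.1) return ScanVariant::SimdPredicated;
    if (num_aggregates >= 6) return ScanVariant::SimdBranching;
    return selectivity < 0.05 ? ScanVariant::SimdBranching
                              : ScanVariant::SimdPredicated;
}
```

The thresholds were measured on one machine, so a different CPU would likely need its own calibrated values.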
Conclusion
    Increasing number of aggregates slows down predicated variants
    SIMD outperforms scalar variants for several predicates
    Pipeline code for filter-&-aggregate pipelines [1]
    Decision trees as a result of our evaluation in the paper
Future Work
    Hash table put / probe (joins, groupings)
    Automatic calibration for query compilation
[1] http://git.iti.cs.ovgu.de/dbronesk/BTW-Pipeline-Variants