…and looks almost the same as multicore-parallelism 1 input := Load ("input") // Single Column: val 2 ids := Range (input) 3 partitionSize := Constant (1024) 4 partitionIDs := Divide (ids, partitionSize) 5 positions := Partition (partitionIDs) 6 inputWPart := Zip (input, partition) 7 partInput := Scatter (inputWPart, positions) 8 pSum := FoldSum (partInput.val, partInput.partition) backend performs 9 totalSum := FoldSum (pSum) simple static analysis 1 input := Load ("input") // Single Column: val 2 ids := Range (input) 3 laneCount := Constant (2) 4 partitionIDs := Modulo (ids, laneCount) 5 positions := Partition (partitionIDs) 6 inputWPart := Zip (input, partition) 7 partInput := Scatter (inputWPart, positions) 8 pSum := FoldSum (partInput.val, partInput.partition) 9 totalSum := FoldSum (pSum) 10
A portable high-performance database kernel
The Voodoo query processing system SQL "Liberated" from MonetDB Similar to code generation example Logical Plan Optimizer Unified Algebra Our Contributions Intermediate Language OpenCL OS/Hardware/…
Let’s extend our example SELECT SUM(l_quantity) FROM lineitem WHERE l_shipdate> 5
MonetDB generates a logical plan… sum SELECT SUM(l_quantity) select FROM lineitem WHERE l_shipdate> 5 lineitem
…we compile the logical plan to Voodoo… tmp1 = Load(.lineitem.l_quantity) sum tmp2 = Load(.lineitem.l_shipdate) tmp3.val = Range(728659,tmp2,0) tmp4.val = Greater(tmp2,.l_shipdate,tmp3,.val) select tmp5.val = Range(0,tmp4,1) tmp6 = Zip(.fold,tmp5,.val,.value,tmp4,.val) tmp7.val = FoldSelect(tmp6,.fold,.value) lineitem tmp8 = Gather(tmp1,tmp7,.val) tmp14.val = Range(0,tmp8,0) tmp15 = Zip(.fold,tmp14,.val,.value,tmp8,.val) tmp16.L1 = FoldSum(tmp15,.fold,.value) Return(tmp16)
…i.e., a dataflow program Const .lineitem.l_shipdate Greater sum Range .lineitem.l_quantity FoldSelect select Gather lineitem Constant FoldSum
Generating "undergraduate" C from Voodoo C Const Const .lineitem.l_shipdate .lineitem.l_shipdate pdate Greater Greater for(size_t i = 0; i < sizeof(l_shipdate); i++) l_shipdate[i] > 5 if( ) Range Range l_quantity[i] result += .lineitem.l_quantity .lineitem.l_quantity tity F FoldSelect FoldSelect Gather Gather Constant Constant FoldSum FoldSum
The “graduate" version .lineitem.l_shipdate Const Greater extern size_t inputSize; Range void fragment1(int* tmp, int* s_date, int* quantity) { for(size_t i = 0; i < grainsize; i++) .lineitem.l_quantity FoldSelect if(s_date[pId * grainsize + i] > 5) out[pId] += quantity[pId * grainsize + i] Gather } void fragment2(int* out, int* tmp) { Range Constant for(size_t i = 0; i < inputSize / grainsize; i++) output[0] += tmp[i] Divide } FoldSum Multicore-partitioned fold control-incompatible folds Constant → hierarchical aggregation FoldSum
Voodoo outperforms MonetDB (TPC-H subset, SF10) ′ 2000 Monet Voodoo 1500 Time in ms 1000 500 0 1 4 5 6 7 8 9 10 11 12 14 15 19 20 TPC - H Query an old acquaintance
Voodoo is competitive with Hyper on CPU (selected TPC-H queries, SF10) 700 HyPeR Voodoo 591 600 500 Time in ms 420 400 279 300 222 191 200 162 151 158 155 154 120 103 96 76 85 76 73 74 100 63 64 56 47 38 45 30 42 18 25 0 1 4 5 6 7 8 9 10 11 12 14 15 19 20 TPC-H Query
Voodoo is competitive with Hyper on CPU (selected TPC-H queries, SF10) Generally Competitive 700 HyPeR Voodoo 591 600 500 Time in ms 420 400 279 300 222 191 200 162 151 158 155 154 120 103 96 76 85 76 73 74 100 63 64 56 47 38 45 30 42 18 25 0 1 4 5 6 7 8 9 10 11 12 14 15 19 20 TPC-H Query
Voodoo is competitive with Hyper on CPU (selected TPC-H queries, SF10) Often faster 700 HyPeR Voodoo 591 600 500 Time in ms 420 400 279 300 222 191 200 162 151 158 155 154 120 103 96 76 85 76 73 74 100 63 64 56 47 38 45 30 42 18 25 0 1 4 5 6 7 8 9 10 11 12 14 15 19 20 TPC-H Query
Voodoo is competitive with Hyper on CPU (selected TPC-H queries, SF10) Sometimes slower 700 HyPeR Voodoo 591 600 500 Time in ms 420 400 279 300 222 191 200 162 151 158 155 154 120 103 96 76 85 76 73 74 100 63 64 56 47 38 45 30 42 18 25 0 1 4 5 6 7 8 9 10 11 12 14 15 19 20 TPC-H Query
Voodoo is competitive with Hyper on CPU (selected TPC-H queries, SF10) Q9 is an interesting case where tuning is important 700 HyPeR Voodoo 591 600 500 Time in ms 420 400 279 300 222 191 200 162 151 158 155 154 120 103 96 76 85 76 73 74 100 63 64 56 47 38 45 30 42 18 25 0 1 4 5 6 7 8 9 10 11 12 14 15 19 20 TPC-H Query
VOODOO AS A TUNING TOOL
Selective Foreign-Key Joins join select sum(part.price) from lineitem, part where discount > $x and lineitem.partkey_fk = part.partkey_pk select items that were sold at a discount greater than x look up their price from the parts table and sum it
Selective Foreign-Key Joins lookup/Foreign-Key Join join join select sum(part.price) from lineitem, part where discount > $x select part and lineitem.partkey_fk = part.partkey_pk lineitem primary key, i.e., unique
Selective Foreign-Key Joins lineitem part price partkey discount 0 .80 .1 2 .93 .01 0 .10 .06 3 .75 .07 2 .5 2 .25 Foreign-Key Index 1 .0 3 .15 1 .03 select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Selective Foreign-Key Joins read key/pointer resolve pointer/ select read value price >.05 partkey discount 0 .80 .1 ✔ 2 .93 .01 ✘ 0 .10 .06 ✔ 3 .75 .07 ✔ 2 .5 ✔ 2 .25 ✔ 1 .0 ✘ 3 .15 ✔ 1 .03 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Selective Foreign-Key Joins price discount partkey 0 .80 ✔ 2 .93 ✘ 0 .10 ✔ 3 .75 ✔ 2 ✔ 2 ✔ 1 ✘ 3 ✔ 1 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Selective Foreign-Key Joins price discount partkey 0 .80 ✔ 2 .93 ✘ 0 .10 ✔ 3 .75 ✔ 2 ✔ 2 ✔ 1 ✘ 3 ✔ 1 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Branching Foreign-Key Joins .lineitem.l_discount Range select sum(price) Selection Key-Lookup from lineitem, part Greater where discount > $1 and partkey = partkey for(size_t i = 0; i < lineitemsize; i++) Range if(discount[i] > $1) result += price[partkey[i]]; Zip .lineitem.lineitem_part FoldSelect .part.p_retailprice Gather 0.25 0.20 Time in seconds Gather 0.15 Range 0.10 0.05 Zip 0.00 20 40 60 80 100 FoldSum Selectivity return
Predicated Foreign-Key Joins for(size_t i = 0; i < lineitemsize; i++) if(discount[i] > $1) result += price[partkey[i]];
Predicated Foreign-Key Joins for(size_t i = 0; i < lineitemsize; i++) result += (discount[i] > $1) * price[partkey[i]];
Selective Foreign-Key Joins price discount partkey 0 .80 ✔ 2 .93 ✘ 0 .10 ✔ 3 .75 ✔ 2 ✔ 2 ✔ 1 ✘ 3 ✔ 1 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Predicated Foreign-Key Joins price discount partkey 0 .80 ✔ 2 .93 ✘ 0 .10 ✔ 3 .75 ✔ 2 ✔ 2 ✔ 1 ✘ 3 ✔ 1 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Predicated Foreign-Key Joins .lineitem.l_discount Range .lineitem.l_discount Greater .part.p_retailprice .lineitem.lineitem_part Range Range Zip Gather Greater FoldSum( .lineitem.lineitem_part FoldSelect Gather( Range Multiply Gather( .part.p_retailprice Gather Zip FoldSelect(x),y),z)) Gather =>FoldSum( FoldSum Multiply(x, Range Multiply return Gather(y,z))) Zip FoldSum return
Selective Foreign-Key Joins select sum(price) from lineitem, part where discount > $1 and partkey = partkey .lineitem.l_discount for(size_t i = 0; i < lineitemsize; i++) .part.p_retailprice .lineitem.lineitem_part Range result += (price[i] > $1) * p_discount[partkey[i]]; Gather Greater Range Multiply Zip FoldSum return
Selective Foreign-Key Joins select sum(price) from lineitem, part where discount > $1 and partkey = partkey .lineitem.l_discount for(size_t i = 0; i < lineitemsize; i++) .part.p_retailprice .lineitem.lineitem_part Range result += (price[i] > $1) * p_discount[partkey[i]]; Gather Greater 0.25 Range Multiply 0.20 Time in seconds 0.15 Zip 0.10 Branching FoldSum Predicated Aggregation 0.05 return 0.00 20 40 60 80 100 Selectivity
Double-predicated Foreign-Key Joins .lineitem.l_discount .lineitem.l_discount Range .part.p_retailprice .lineitem.lineitem_part Range .lineitem.lineitem_part Greater .part.p_retailprice Multiply Gather Greater FoldSum( Multiply( Range Multiply Gather Gather( Zip x,y),z)) Range Multiply =>FoldSum( FoldSum Zip Multiply( return Gather( FoldSum Multiply(x,z),y),z)) return
Predicated Foreign-Key Joins price discount partkey 0 .80 ✔ 2 .93 ✘ 0 .10 ✔ 3 .75 ✔ 2 ✔ 2 ✔ 1 ✘ 3 ✔ 1 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Predicated Foreign-Key Joins price discount partkey 0 .80 ✔ 2 .93 ✘ 0 .10 ✔ 3 .75 ✔ 2 ✔ 2 ✔ 1 ✘ 3 ✔ 1 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Double-predicated Foreign-Key Joins price discount partkey 0 .80 ✔ 2 .93 ✘ 0 .10 ✔ 3 .75 ✔ 2 ✔ 2 ✔ 1 ✘ 3 ✔ 1 ✘ select sum(price) from lineitem, part where discount > $1 and lineitem.partkey = part.partkey
Double-predicated Foreign-Key Joins select sum(price) .lineitem.l_discount from lineitem, part where discount > $1 Range and partkey = partkey for(size_t i = 0; i < lineitemsize; i++) .lineitem.lineitem_part Greater result += (discount[i] > $1) * price[partkey[i]]; .part.p_retailprice Multiply Gather Range Multiply Zip FoldSum return
Double-predicated Foreign-Key Joins select sum(price) .lineitem.l_discount from lineitem, part where discount > $1 Range and partkey = partkey for(size_t i = 0; i < lineitemsize; i++) .lineitem.lineitem_part Greater result += (discount[i] > $1) * price[partkey[i] * (discount[i] > $1)]; .part.p_retailprice Multiply Gather 0.25 0.20 Range Multiply Time in seconds 0.15 Zip 0.10 Branching Predicated Aggregation FoldSum 0.05 Predicated Lookups 0.00 return 20 40 60 80 100 Selectivity
Double-predicated Foreign-Key Joins 0.25 0.20 Time in seconds 0.15 0.10 Branching Predicated Aggregation 0.05 Predicated Lookups 0.00 20 40 60 80 100 Selectivity
Voodoo Wrap SQL • Fast and expressive like C Logical Plan • Optimizable like relational algebra Voodoo Optimizer • Portable to different devices Executable • Enables serendipitous discovery of new optimizations OS/Hardware/…
Quo Vadis
FUTURE WORK SQL Relational Algebra Optimizer Voodoo Executable OS/Hardware/…
FUTURE WORK Domain Streams Specific Tensorflow Arrays Graphs SQL Weld Relational Algebra Optimizer Voodoo Executable OS/Hardware/… [VLDB 2018]
QUO VADIS SQL Relational Algebra Optimizer Voodoo Executable OS/Hardware/…
Plugging holes in the Design Space Remember these? 700 HyPeR Voodoo 591 600 500 Time in ms 420 400 279 300 222 191 200 162 151 158 155 154 120 103 96 76 85 76 73 74 100 63 64 56 47 38 45 30 42 18 25 0 1 4 5 6 7 8 9 10 11 12 14 15 19 20 TPC-H Query
Plugging holes in the Design Space • Massively Parallel Top-K [SIGMOD 2018] • Incremental Window Computation [ongoing] • Dynamic Load Balancing on GPUs [future]
Auto-Generating Databases 0.25 0.20 Time in seconds 0.15 0.10 Branching Predicated Aggregation 0.05 Predicated Lookups 0.00 20 40 60 80 100 Selectivity
Auto-Generating Databases 0.25 0.20 Time in seconds 0.15 0.10 Branching Predicated Aggregation 0.05 Predicated Lookups 0.00 0 20 40 60 80 100 Selectivity in %
Auto-Generating Databases • Hardware-conscious modelling & tuning [ongoing]
Auto-Generating Databases DBMS DBMS Hardware Model Indexing Processing Indexing Processing … Voodoo Program Hardware Hardware Self-driving Synthesis Self-tuning CPU CPU DBMS DBMS Core 1 Core 1 Core 2 Core 2 Registers Registers Registers Registers L1 Cache L1 Cache TLB TLB L1 Cache L1 Cache TLB TLB Indexing Indexing L2 Cache L2 Cache L2 Cache L2 Cache Last Level (L3) Cache Last Level (L3) Cache Processing Processing on die on die off die off die Memory Memory Hardware Hardware Self-generating
Auto-Generating Databases • Hardware-conscious modelling & tuning [ongoing] • Auto-Generating Databases [future]
Wrapping up
Thanks to Collaborators Sam Madden Matei Zaharia Mike Stonebraker Shoumik Palkar Anil Shanbhag Malte Schwarzkopf
A new DB research hub — Visit us! London Eye Westminster Hyde Park Buckingham Palace Imperial College
Recruiting • Interns • PhDs • Postdocs
Recommend
More recommend