Making Cost-Based Query Optimization Asymmetry-Aware Daniel Bausch, Ilia Petrov, and Alejandro Buchmann {bausch, petrov, buchmann}@dvs.tu-darmstadt.de 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 1
Asymmetry in new storage devices Writing to Flash memory is slower than reading from it This also applies to emerging non-volatile memories (PCM, etc.) Small writes to random locations on Flash are even more slow Random reads from Flash are only 1 3 slower than sequential reads 1 1 on Intel X25-E using full command queue 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 2
Cost-Based Query Optimization estimation of run-time before real execution (e.g. in PostgreSQL) model comprised of functions like c ( seqscan ) = c s � R � p + q R ,0 + (˙ c cpu + ˙ q R ) � R � t I/O cost CPU cost order statistics and plan A cost value A select plan B cost value B “best” query planner cost model . . . . plan . . cost value Z plan Z configuration 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 3
Splitting Parameters c sr sequential page read accesses c s sequential page accesses c sw sequential page write accesses c rr random page read accesses c r random page accesses c rw random page write accesses 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 4
Cost functions for “pure load” algorithms cost function kind original replacement sequential scan read only c s c sr index scan read only c s c sr c r c rr bitmap scan read only c s c sr c r c rr TID scan read only c r c rr e materialization write only c s c sw re-scan read only c s c sr 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 5
Cost function of sort algorithm startup � 3 4 c s + 1 � c io ( sort ) = 2 � S � p ⌈ log m n ⌉ 4 c r blktrace stats write read s r s r external sort of unordered data external sort of ordered data sort-merge join � c sw + c rw � � + ⌈ log m n ⌉ c sr + c rr � c io ( sort ) rw = � S � p c sw + ⌈ log m n ⌉ − 1 2 2 startup 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 6
Cost function of hash join startup � � c io ( hashjoin ) = � P i � p c s + � P i � p + 2 � P o � p c s blktrace stats write read s r s r hash join c io ( hashjoin ) rw = � P i � p c rw + � P i � p c sr + � P o � p ( c rw + c sr ) startup 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 7
System and Load MB system with tight memory configuration 4000 ⇒ simulate data-intensive systems under high load application class benchmark based on TPC-H specs 3000 2000 1000 0 obj C L N O P PS R S iCn iLcd iLo iLoq iLp iLrd iLsd iLs iLsp iNr iOc iOod iPSp iPSs iSn iC iN iO iP iPS iR iS RAM e sb w 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 8
Calibration order statistics and plan A cost value A select plan B cost value B “best” query planner cost model . . . . plan . . cost value Z plan Z configuration calibration cost value, based on simulated annealing [Kirk1983], execution time accepts inferior configuration at calibration decreasing probability cache modify by multiplication with logarithmic normally distributed random variable cooling cycle of 100 iterations, restarted 100 times 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 9
Found “Optimal” Settings SSD HDD original model asymmetric model original model asymmetric model c sr = 1.00000 c sr = 1.00000 c s = 1.00000 c s = 1.00000 c sw = 49.91840 c sw = 110.21139 c rr = 5.62724 c rr = 19.25494 c r = 6.77405 c r = 29.04790 c rw = 19.08421 c rw = 20.18467 c cpu = 0.00121 ˙ c cpu = ˙ 0.00003 c cpu = ˙ 0.00280 c cpu = ˙ 0.00082 ˙ ˙ ˙ ˙ ˆ ˆ ˆ ˆ c cpu = 0.03658 c cpu = 0.01608 c cpu = 0.03718 c cpu = 0.00045 c op = 0.00016 c op = 0.00008 c op = 0.00004 c op = 0.00119 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 10
2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 11 Comparison total time [s] 1000 1500 2000 2500 500 0 960637766 2888465430 2988248363 orig. model / calibrated config orig. model / initial config 109994110 2090918519 80136562 742878636 3137579638 3972888868 TPC−H random seed 47261513 644961076 1231071980 SSD 3962584638 4133947234 1671380912 asym. model / calibrated config asym. model / initial config 747421973 4204648090 3759550337 1960789542 3615766274 1073327142 970312735 2473950213 1712708155 3744046782 10000 15000 20000 25000 30000 5000 0 960637766 2888465430 TPC−H random seed 2988248363 109994110 2090918519 HDD 80136562 742878636 3137579638 3972888868 47261513
Individual Query Speed-Up 200% 153% 140% 100% speed−up 50% 48% 36% 22% 16% 14% 10% −17% −10% −74% 7% 7% −1% 5% 5% −4% 5% −1% −1% −1% −2% −1% −1% −4% −0% −1% 4% 3% 3% 3% 4% 2% 2% 2% 2% 2% 2% 0% 0% 0% 0% 2609% 549% faster with asym. model on SSD faster with asym. model on HDD faster with orig. model on SSD faster with orig. model on HDD −100% 01 02 03 04 05 06 07 08 09 10 11 12 13 14 16 17 18 19 20 21 22 tot TPC−H query 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 12
Discussion – Cause for Peaks ⇒ Query 9 700 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 600 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 500 ● ● ● ● ● time of query 9 [s] ● ● ● ● ● ● ● ● ● ● ● ● ● 400 300 ● ● ● ● ● ● ● 200 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 100 orig_default orig_pgcalib acm_default acm_pgcalib 0 960637766 2888465430 2988248363 109994110 2090918519 80136562 742878636 3137579638 3972888868 47261513 644961076 1231071980 3962584638 4133947234 1671380912 747421973 4204648090 3759550337 1960789542 3615766274 1073327142 970312735 2473950213 1712708155 3744046782 seed 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 13
Discussion – Query 9 – the slow plan Nested Loop 509541.375 using pk_orders on orders Nested Loop Index Scan orders.o_orderkey=lineitem.l_orderkey 102642.716 0.209*1934051=404216.659 partsupp.ps_partkey=lineitem.l_partkey using pk_supplier on supplier Merge Join Index Scan Join Filter: lineitem.l_suppkey=partsupp.ps_suppkey supplier.s_suppkey=lineitem.l_suppkey 88687.896 0.006*1934051=11604.306 Index Scan using i_ps_partkey on partsupp Materialize 2986.769 79808.663 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 14
Discussion – Query 9 – the fast plan Nested Loop 68231.426 lineitem.l_suppkey=supplier.s_suppkey AND using pk_orders on orders Hash Join Index Scan lineitem.l_partkey=partsupp.ps_partkey orders.o_orderkey=lineitem.l_orderkey 54756.688 0.008*1468202=11745.616 Seq Scan on lineitem Hash (1 Batch) 13467.469 6460.738 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 15
Discussion – Biggest Improvement ⇒ Query 21 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 400 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● time of query 21 [s] 300 200 100 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● orig_default orig_pgcalib acm_default acm_pgcalib 0 960637766 2888465430 2988248363 109994110 2090918519 80136562 742878636 3137579638 3972888868 47261513 644961076 1231071980 3962584638 4133947234 1671380912 747421973 4204648090 3759550337 1960789542 3615766274 1073327142 970312735 2473950213 1712708155 3744046782 seed 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 16
Recommend
More recommend