Making Fast Databases FASTER @andy_pavlo Yale University Columbia University April 2012
Fast + Cheap
Legacy Systems: CPU cycle breakdown for TPC-C NewOrder. Real Work 12.3%, Buffer Pool 29.6%, Latching 10.2%, Locking 18.7%, Logging 21.1%, B-Tree Keys 8.1%. OLTP Through the Looking Glass, and What We Found There SIGMOD 2008
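As a back-of-the-envelope check on the breakdown above, the sketch below sums the six components and computes the share of cycles that is not real work. The speed-up bound is an Amdahl-style assumption (all overhead removed, real work unchanged), not a figure from the talk.

```python
# CPU-cycle shares for TPC-C NewOrder as listed on the slide (percent).
breakdown = {
    "Real Work": 12.3,
    "Buffer Pool": 29.6,
    "Latching": 10.2,
    "Locking": 18.7,
    "Logging": 21.1,
    "B-Tree Keys": 8.1,
}

total = sum(breakdown.values())
overhead = total - breakdown["Real Work"]

# Assumption (not from the slides): if every overhead component could be
# removed and the real work stayed the same, this ratio bounds the speed-up.
max_speedup = total / breakdown["Real Work"]

print(f"total={total:.1f}%  overhead={overhead:.1f}%  bound={max_speedup:.1f}x")
```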
OLTP Transactions Fast Repetitive Small
Main Memory • Parallel • Shared-Nothing Transaction Processing H-Store: A High-Performance, Distributed Main Memory Transaction Processing System VLDB vol. 1, issue 2, 2008
Stored Procedure Procedure Name Execution Input Parameters Client Application Database Cluster Database Cluster
TPC-C NewOrder: throughput (txn/s) vs. number of partitions (4 to 64), No Distributed Txns vs. 20% Distributed Txns
Optimization #1: Partition database to reduce the number of distributed txns. Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems SIGMOD 2012
CUSTOMER
c_id  c_w_id  c_last   …
1001  5       RZA      -
1002  3       GZA      -
1003  12      Raekwon  -
1004  5       Deck     -
1005  6       Killah   -
1006  7       ODB      -
ORDERS
o_id   o_c_id  o_w_id  …
78703  1004    5       -
78704  1002    3       -
78705  1006    7       -
78706  1005    6       -
78707  1005    6       -
78708  1003    12      -
Partitions (×3): CUSTOMER, ORDERS
ITEM
i_id    i_name  i_price  …
603514  XXX     23.99    -
267923  XXX     19.99    -
475386  XXX     14.99    -
578945  XXX     9.98     -
476348  XXX     103.49   -
784285  XXX     69.99    -
Partitions (×3): CUSTOMER, ORDERS, ITEM
CUSTOMER
c_id  c_w_id  c_last   …
1001  5       RZA      -
1002  3       GZA      -
1003  12      Raekwon  -
1004  5       Deck     -
1005  6       Killah   -
1006  7       ODB      -
Partitions (×3): CUSTOMER, ORDERS, ITEM
NewOrder (5, “Method Man”, 1234) Client Application CUSTOMER CUSTOMER CUSTOMER ORDERS ORDERS ORDERS ITEM ITEM ITEM
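The placement idea behind these slides can be sketched as follows. This is a minimal illustration, not H-Store's actual routing code: the modulo hash on the warehouse id, the three-partition count (taken from the diagram), and reading the first NewOrder argument as a warehouse id are all assumptions. The rows come from the CUSTOMER and ORDERS tables shown earlier.

```python
NUM_PARTITIONS = 3  # assumption: the slides draw three partitions

def partition_for(w_id, num_partitions=NUM_PARTITIONS):
    """Hypothetical routing rule: hash the warehouse id with a modulo."""
    return w_id % num_partitions

# (c_id, c_w_id, c_last) and (o_id, o_c_id, o_w_id) rows from the slides.
customers = [(1001, 5, "RZA"), (1002, 3, "GZA"), (1003, 12, "Raekwon"),
             (1004, 5, "Deck"), (1005, 6, "Killah"), (1006, 7, "ODB")]
orders = [(78703, 1004, 5), (78704, 1002, 3), (78705, 1006, 7),
          (78706, 1005, 6), (78707, 1005, 6), (78708, 1003, 12)]

# Partitioning both tables on their warehouse-id column keeps each order
# on the same partition as the customer who placed it.
customer_partition = {c_id: partition_for(c_w_id) for c_id, c_w_id, _ in customers}
order_partition = {o_id: partition_for(o_w_id) for o_id, _, o_w_id in orders}

colocated = all(order_partition[o_id] == customer_partition[o_c_id]
                for o_id, o_c_id, _ in orders)

# Assumption: the first NewOrder argument (5) is the warehouse id, so the
# request can be sent to a single partition without a distributed txn.
target = partition_for(5)
print(colocated, target)
```

With ITEM replicated on every partition, as the diagram suggests, a NewOrder request whose items all belong to the same warehouse never needs to leave its target partition.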
Inputs: Schema (DDL), Workload. Large-Neighborhood Search Algorithm with DTxn Estimator and Skew Estimator. … Partitions (×4): CUSTOMER, ORDERS, ITEM
Large-Neighborhood Search. Inputs: Schema (DDL), Workload. Steps: Initial Design, Relaxation, Local Search, Restart
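The four steps named on this slide (initial design, relaxation, local search, restart) can be sketched on a toy problem. Everything concrete here is an assumption made for illustration: the candidate columns, the four-transaction workload, and the cost function, which simply counts distributed transactions and stands in for the DTxn and skew estimators mentioned in the talk.

```python
import itertools
import random

# Hypothetical candidate partitioning choices per table.
CANDIDATES = {
    "CUSTOMER": ["c_id", "c_w_id"],
    "ORDERS": ["o_id", "o_c_id", "o_w_id"],
    "ITEM": ["i_id", "REPLICATE"],
}

# Hypothetical workload: for each txn, the choices per table that keep
# that txn on a single partition.
WORKLOAD = [
    {"CUSTOMER": {"c_w_id"}, "ORDERS": {"o_w_id"}, "ITEM": {"REPLICATE"}},
    {"CUSTOMER": {"c_w_id"}, "ORDERS": {"o_w_id"}},
    {"ITEM": {"i_id", "REPLICATE"}},
    {"CUSTOMER": {"c_id", "c_w_id"}},
]

def cost(design):
    """Toy cost: number of txns that would be distributed under `design`."""
    return sum(any(design[t] not in ok for t, ok in txn.items())
               for txn in WORKLOAD)

def local_search(design, relaxed):
    """Exhaustively re-decide the relaxed tables; keep strict improvements."""
    best, best_cost = dict(design), cost(design)
    for choice in itertools.product(*(CANDIDATES[t] for t in relaxed)):
        trial = {**design, **dict(zip(relaxed, choice))}
        if cost(trial) < best_cost:
            best, best_cost = trial, cost(trial)
    return best

def large_neighborhood_search(rounds=50, relax_size=2, patience=5, seed=0):
    rng = random.Random(seed)
    tables = sorted(CANDIDATES)
    current = {t: CANDIDATES[t][0] for t in tables}    # initial design
    best, stale = dict(current), 0
    for _ in range(rounds):
        relaxed = rng.sample(tables, relax_size)       # relaxation
        improved = local_search(current, relaxed)      # local search
        stale = 0 if cost(improved) < cost(current) else stale + 1
        current = improved
        if cost(current) < cost(best):
            best = dict(current)
        if stale >= patience:                          # restart
            current = {t: rng.choice(CANDIDATES[t]) for t in tables}
            stale = 0
    return best

initial = {t: cols[0] for t, cols in CANDIDATES.items()}
design = large_neighborhood_search()
print(design, cost(design))
```

Relaxing only a subset of tables per round keeps each local search small; the restart step is what lets the search escape a design that no two-table change can improve.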
Throughput (txn/s): Horticulture vs. State-of-the-Art, x-axis 4, 8, 16, 32, 64. TATP +88%, TPC-C +16%, TPC-C Skewed +183%
Search Times (% Single-Partitioned Transactions): TATP, SEATS, TPC-C, TPC-C Skewed, AuctionMark, TPC-E
Undo Log Client Application Database Cluster Database Cluster
Optimization #2: Predict what txns will do before they execute. On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems VLDB, vol. 5, issue 2, October 2011
» Partitions Touched? » Undo Log? » Done with Partitions? Client Application Database Cluster Database Cluster
Current State: Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] GetWarehouse: SELECT * FROM WAREHOUSE WHERE W_ID = ?
Estimated Execution Path Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] Transaction Estimate: Confidence Coefficient: 0.96 Best Partition: 0 Partitions Accessed: { 0 } Use Undo Logging: Yes
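One way to picture an estimated execution path with a confidence coefficient is a first-order transition table learned from past executions. The slides do not spell out the model, so treat this purely as an assumed illustration: the trace data, the query names other than GetWarehouse, and the rule "follow the most frequent next query and multiply the transition probabilities" are all hypothetical.

```python
from collections import Counter, defaultdict

def learn(traces):
    """Count query-to-query transitions over past executions."""
    counts = defaultdict(Counter)
    for trace in traces:
        path = ["BEGIN", *trace, "COMMIT"]
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    return counts

def estimate(counts, start="BEGIN", end="COMMIT", max_steps=50):
    """Follow the most likely transition at each step.

    Confidence is the product of the chosen transition probabilities.
    """
    path, confidence, state = [], 1.0, start
    for _ in range(max_steps):
        if state == end:
            break
        following, seen = counts[state].most_common(1)[0]
        confidence *= seen / sum(counts[state].values())
        path.append(following)
        state = following
    return path, confidence

# Hypothetical history: three executions check stock, one skips it.
history = [
    ["GetWarehouse", "CheckStock", "InsertOrder"],
    ["GetWarehouse", "CheckStock", "InsertOrder"],
    ["GetWarehouse", "CheckStock", "InsertOrder"],
    ["GetWarehouse", "InsertOrder"],
]

path, confidence = estimate(learn(history))
print(path, confidence)
```

A scheduler could act on such an estimate only when the confidence is high, for example when deciding which partitions to lock or whether undo logging is needed, and fall back to conservative behavior otherwise.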
Throughput (txn/s): Houdini vs. Assume Single-Partitioned, x-axis 4, 8, 16, 32, 64. TATP +57%, TPC-C +126%, AuctionMark +117%
Prediction Overhead TATP TPC-C AuctionMark
Conclusion: Achieving fast performance is more than just using only RAM. Future Work: Reduce distributed txn overhead through creative scheduling.
h-store hstore.cs.brown.edu github.com/apavlo/h-store
Help is Available +1-212-939-7064 Graduate Student Abuse Hotline Available 24/7 Collect Calls Accepted