
Making Fast Databases FASTER, @andy_pavlo, Yale University, Columbia University, April 2012


  1. Making Fast Databases FASTER @andy_pavlo Yale University Columbia University April 2012

  2. Fast + Cheap

  3. Legacy Systems. TPC-C NewOrder, breakdown of CPU cycles: Real Work 12.3%, Buffer Pool 29.6%, Latching 10.2%, Locking 18.7%, Logging 21.1%, B-Tree Keys 8.1%. Source: OLTP Through the Looking Glass, and What We Found There, SIGMOD 2008

  4. OLTP Transactions Fast Repetitive Small

  5. Main Memory • Parallel • Shared-Nothing Transaction Processing H-Store: A High-Performance, Distributed Main Memory Transaction Processing System VLDB vol. 1, issue 2, 2008

  6. Stored Procedure Execution: Procedure Name, Input Parameters. Client Application, Database Cluster

  7. TPC-C NewOrder [Chart: throughput (txn/s, 0 to 150,000) vs. number of partitions (4 to 64); series: No Distributed Txns, 20% Distributed Txns]

  8. Optimization #1: Partition database to reduce the number of distributed txns. Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems SIGMOD 2012

  9. CUSTOMER (c_id, c_w_id, c_last, …): (1001, 5, RZA), (1002, 3, GZA), (1003, 12, Raekwon), (1004, 5, Deck), (1005, 6, Killah), (1006, 7, ODB)
     ORDERS (o_id, o_c_id, o_w_id, …): (78703, 1004, 5), (78704, 1002, 3), (78705, 1006, 7), (78706, 1005, 6), (78707, 1005, 6), (78708, 1003, 12)
     [Diagram labels: CUSTOMER and ORDERS, each repeated three times]

  10. ITEM (i_id, i_name, i_price, …): (603514, XXX, 23.99), (267923, XXX, 19.99), (475386, XXX, 14.99), (578945, XXX, 9.98), (476348, XXX, 103.49), (784285, XXX, 69.99)
      [Diagram labels: CUSTOMER, ORDERS, and ITEM, each repeated three times]

  11. CUSTOMER (c_id, c_w_id, c_last, …): (1001, 5, RZA), (1002, 3, GZA), (1003, 12, Raekwon), (1004, 5, Deck), (1005, 6, Killah), (1006, 7, ODB)
      [Diagram labels: CUSTOMER, ORDERS, and ITEM, each repeated three times]

  12. NewOrder(5, “Method Man”, 1234), sent by the Client Application. [Diagram labels: CUSTOMER, ORDERS, and ITEM, each repeated three times]

  13. Inputs: Schema (DDL), Workload. Components: DTxn Estimator, Skew Estimator, Large-Neighborhood Search Algorithm … [Diagram labels: CUSTOMER, ORDERS, and ITEM, each repeated four times]

  14. Large-Neighborhood Search. Inputs: Schema (DDL), Workload. Stages: Initial Design, Relaxation, Local Search, Restart

  15. Large-Neighborhood Search. Inputs: Schema (DDL), Workload. Stages: Initial Design, Relaxation, Local Search, Restart

  16. Throughput (txn/s): Horticulture vs. State-of-the-Art. [Charts, x-axis ticks 4, 8, 16, 32, 64; annotations: TATP +88%, TPC-C +16%, TPC-C Skewed +183%]

  17. Search Times. [Charts: % Single-Partitioned Transactions for TATP, SEATS, TPC-C, TPC-C Skewed, AuctionMark, TPC-E]

  18. Undo Log. Client Application, Database Cluster

  19. Optimization #2: Predict what txns will do before they execute. On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems VLDB, vol. 5, issue 2, October 2011

  20. » Partitions Touched? » Undo Log? » Done with Partitions? Client Application, Database Cluster

  21. Current State: Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] GetWarehouse: SELECT * FROM WAREHOUSE WHERE W_ID = ?

  22. Estimated Execution Path Input Parameters: w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] Transaction Estimate: Confidence Coefficient: 0.96 Best Partition: 0 Partitions Accessed: { 0 } Use Undo Logging: Yes

  23. Throughput (txn/s): Houdini vs. Assume Single-Partitioned. [Charts, x-axis ticks 4, 8, 16, 32, 64; annotations: TATP +57%, TPC-C +126%, AuctionMark +117%]

  24. Prediction Overhead TATP TPC-C AuctionMark

  25. Conclusion: Achieving fast performance takes more than just using only RAM. Future Work: Reduce distributed txn overhead through creative scheduling.

  26. h-store hstore.cs.brown.edu github.com/apavlo/h-store

  27. Help is Available +1-212-939-7064 Graduate Student Abuse Hotline Available 24/7 Collect Calls Accepted
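
Slides 8 to 12 describe partitioning the database so that fewer transactions are distributed. The sketch below illustrates that idea with the CUSTOMER and ORDERS rows from slide 9. The placement rule (warehouse id modulo the partition count), the choice of three partitions, and the function names `partition_of`, `place_rows`, and `partitions_touched` are assumptions made for illustration; they are not taken from H-Store or the talk.

```python
from typing import Dict, List, Set, Tuple

NUM_PARTITIONS = 3  # assumption for this sketch

# Rows copied from slide 9: (c_id, c_w_id, c_last) and (o_id, o_c_id, o_w_id).
CUSTOMER: List[Tuple[int, int, str]] = [
    (1001, 5, "RZA"), (1002, 3, "GZA"), (1003, 12, "Raekwon"),
    (1004, 5, "Deck"), (1005, 6, "Killah"), (1006, 7, "ODB"),
]
ORDERS: List[Tuple[int, int, int]] = [
    (78703, 1004, 5), (78704, 1002, 3), (78705, 1006, 7),
    (78706, 1005, 6), (78707, 1005, 6), (78708, 1003, 12),
]


def partition_of(w_id: int, n: int = NUM_PARTITIONS) -> int:
    """Hypothetical placement rule: map a warehouse id onto a partition."""
    return w_id % n


def place_rows(rows: List[tuple], w_col: int, n: int = NUM_PARTITIONS) -> Dict[int, List[tuple]]:
    """Group rows by the partition of their warehouse-id column."""
    parts: Dict[int, List[tuple]] = {p: [] for p in range(n)}
    for row in rows:
        parts[partition_of(row[w_col], n)].append(row)
    return parts


def partitions_touched(w_id: int, item_w_ids: List[int], n: int = NUM_PARTITIONS) -> Set[int]:
    """Partitions a NewOrder-style request would visit under this placement."""
    return {partition_of(w, n) for w in [w_id, *item_w_ids]}


# Partitioning both tables on the warehouse id keeps every order
# on the same partition as the customer who placed it.
customer_home = {c_id: partition_of(c_w_id) for c_id, c_w_id, _ in CUSTOMER}
colocated = all(
    partition_of(o_w_id) == customer_home[o_c_id] for _, o_c_id, o_w_id in ORDERS
)

local_txn = partitions_touched(5, [5, 5])   # all items from the home warehouse
remote_txn = partitions_touched(5, [5, 7])  # one item from another warehouse
```

Under this placement a request whose items all come from its own warehouse visits one partition, while a request that pulls an item from another warehouse may visit two and becomes a distributed transaction.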
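
Slides 14 and 15 name the stages of the large-neighborhood search: Initial Design, Relaxation, Local Search, Restart. The following is a generic sketch of that loop on a toy problem, not the Horticulture implementation. The candidate columns, the `toy_cost` function, the `REPLICATE` option, and all parameter names are hypothetical; the real tool estimates cost from the schema and workload.

```python
import itertools
import random
from typing import Callable, Dict, List, Tuple

Design = Dict[str, str]  # table name -> chosen partitioning option


def large_neighborhood_search(
    options: Dict[str, List[str]],
    cost: Callable[[Design], float],
    initial: Design,
    relax_size: int = 2,
    rounds: int = 50,
    seed: int = 0,
) -> Tuple[Design, float]:
    rng = random.Random(seed)
    best, best_cost = dict(initial), cost(initial)   # Initial Design
    tables = sorted(options)
    for _ in range(rounds):                          # Restart
        # Relaxation: free a random subset of tables, keep the rest fixed.
        relaxed = rng.sample(tables, min(relax_size, len(tables)))
        # Local Search: try every combination of options for the freed tables.
        for combo in itertools.product(*(options[t] for t in relaxed)):
            candidate = dict(best)
            candidate.update(zip(relaxed, combo))
            candidate_cost = cost(candidate)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
    return best, best_cost


# Hypothetical search space built from the column names on slides 9 and 10.
OPTIONS = {
    "CUSTOMER": ["c_id", "c_w_id"],
    "ORDERS": ["o_id", "o_c_id", "o_w_id"],
    "ITEM": ["i_id", "REPLICATE"],
}


def toy_cost(design: Design) -> float:
    """Made-up cost: one penalty per table not placed by warehouse (ITEM: not replicated)."""
    wanted = {"CUSTOMER": "c_w_id", "ORDERS": "o_w_id", "ITEM": "REPLICATE"}
    return float(sum(design[t] != wanted[t] for t in design))


start = {"CUSTOMER": "c_id", "ORDERS": "o_id", "ITEM": "i_id"}
design, design_cost = large_neighborhood_search(OPTIONS, toy_cost, start)
```

Relaxing only a few tables per round keeps each local search small, which is the usual reason to prefer this scheme over enumerating every full design.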
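
Slides 20 to 22 describe estimating, before execution, which partitions a transaction touches and whether it needs an undo log. The sketch below shows one way such an estimate could drive an execution plan, using the values from slide 22. The `ExecutionPlan` type, the `min_confidence` threshold, and the conservative fallback (lock every partition, keep undo logging on) are assumptions for illustration and are not described in the talk.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class TransactionEstimate:
    """Fields mirror the estimate shown on slide 22."""
    confidence: float
    best_partition: int
    partitions_accessed: FrozenSet[int]
    use_undo_logging: bool


@dataclass(frozen=True)
class ExecutionPlan:
    """Hypothetical plan derived from an estimate."""
    base_partition: int
    locked_partitions: FrozenSet[int]
    undo_logging: bool
    single_partitioned: bool


def plan_execution(
    est: TransactionEstimate,
    all_partitions: Iterable[int],
    min_confidence: float = 0.9,  # assumed threshold, not from the talk
) -> ExecutionPlan:
    if est.confidence < min_confidence:
        # Assumed fallback: trust nothing, lock everything, keep the undo log.
        locked = frozenset(all_partitions)
        return ExecutionPlan(est.best_partition, locked, True, len(locked) == 1)
    # Confident estimate: lock only the predicted partitions.
    return ExecutionPlan(
        est.best_partition,
        est.partitions_accessed,
        est.use_undo_logging,
        len(est.partitions_accessed) == 1,
    )


# Values from slide 22.
slide_estimate = TransactionEstimate(0.96, 0, frozenset({0}), True)
confident_plan = plan_execution(slide_estimate, range(4))

# A made-up low-confidence estimate to exercise the fallback path.
shaky_plan = plan_execution(
    TransactionEstimate(0.50, 0, frozenset({0}), False), range(4)
)
```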
