MY DATABASE SYSTEM IS THE ONLY THING I CAN TRUST @ANDY_PAVLO
Thirty Years Ago…
INTERACTIVE TRANSACTIONS
SMALL # OF CPU CORES
SMALL MEMORY SIZES
TPC-C BENCHMARK APPLICATION NewOrder Transaction
TPC-C BENCHMARK
[Figure: throughput (TXN/SEC, axis 0 to 20,000) vs. CPU CORES (up to 12) for MySQL and Postgres]
TRADITIONAL DBMS
[Figure: breakdown chart. Labels: BUFFER POOL, LOCKING, RECOVERY, REAL WORK. Values in extraction order: 30%, 28%, 30%, 12%]
OLTP THROUGH THE LOOKING GLASS, AND WHAT WE FOUND THERE. SIGMOD, pp. 981-992, 2008.
HARDWARE UPGRADE
REPLICATION
DISTRIBUTED CACHE
SHARDING MIDDLEWARE
NOSQL
HOW TO SCALE UP WITHOUT GIVING UP TRANSACTIONS?
Distributed Main Memory Transaction Processing System
H-STORE: A HIGH-PERFORMANCE, DISTRIBUTED MAIN MEMORY TRANSACTION PROCESSING SYSTEM. Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.
DISK ORIENTED -> MAIN MEMORY STORAGE
CONCURRENT EXECUTION -> SERIAL EXECUTION
HEAVYWEIGHT RECOVERY -> COMPACT LOGGING
PARTITIONS
SINGLE-THREADED EXECUTION ENGINES
Procedure Name
Input Parameters
STORED PROCEDURE
VoteCount: SELECT COUNT(*) FROM votes WHERE phone_num = ?;
InsertVote: INSERT INTO votes VALUES (?, ?, ?);
run(phoneNum, contestantId, currentTime) {
    result = execute(VoteCount, phoneNum);
    if (result > MAX_VOTES) {
        return (ERROR);
    }
    execute(InsertVote, phoneNum, contestantId, currentTime);
    return (SUCCESS);
}
[Diagram labels: Transaction, Execution, Transaction Result, PARTITIONS, SINGLE-THREADED EXECUTION ENGINES, CMD LOG, SNAPSHOTS]
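The stored procedure on this slide can be tried out with a small runnable sketch. The version below is an illustration only: it uses Python's built-in sqlite3 module rather than H-Store, and the table schema, the `make_db` helper, and the value of `MAX_VOTES` are assumptions that do not come from the slide. It keeps the slide's `result > MAX_VOTES` comparison as written.

```python
import sqlite3

MAX_VOTES = 2  # assumed limit; the slide does not give a value

# The two parameterized queries named on the slide.
VOTE_COUNT = "SELECT COUNT(*) FROM votes WHERE phone_num = ?"
INSERT_VOTE = "INSERT INTO votes VALUES (?, ?, ?)"


def make_db():
    """Create an in-memory database with an assumed three-column votes table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE votes (phone_num INTEGER, contestant_id INTEGER, created INTEGER)"
    )
    return conn


def run(conn, phone_num, contestant_id, current_time):
    """Mirror the slide's run(): count this phone's votes, then insert if allowed."""
    (count,) = conn.execute(VOTE_COUNT, (phone_num,)).fetchone()
    if count > MAX_VOTES:  # same comparison as the slide
        return "ERROR"
    conn.execute(INSERT_VOTE, (phone_num, contestant_id, current_time))
    conn.commit()
    return "SUCCESS"
```

Because the whole procedure is known to the system up front, one call such as `run(conn, 5551234, 1, 0)` is a complete transaction: there is no client round trip between the count and the insert.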
Transaction Execution CMD LOG
Transaction Result SNAPSHOTS
TPC-C BENCHMARK
[Figure: throughput (TXN/SEC, axis 0 to 20,000) vs. CPU CORES (up to 12) for MySQL, Postgres, and H-Store]
DISTRIBUTED TRANSACTIONS
TPC-C BENCHMARK
[Figure: H-Store throughput (TXN/SEC, axis 0 to 40,000) vs. NODES (up to 4)]
DISTRIBUTED TRANSACTIONS
[Diagram label: Query Count]
KNOW WHAT TRANSACTIONS WILL DO BEFORE THEY START
BUT PEOPLE ALWAYS GIVE ME BAD ADVICE
DON’T GET INVOLVED WITH COMPUTERS. YOU’LL NEVER MAKE ANY MONEY.
DON’T GET A PHD. EVERYONE WILL THINK YOU ARE A JERK.
THE DATABASE SYSTEM ALWAYS HAS MORE INFORMATION
DO USE MACHINE LEARNING TO PREDICT TRANSACTION BEHAVIOR.
ON PREDICTIVE MODELING FOR OPTIMIZING TRANSACTION EXECUTION IN PARALLEL OLTP SYSTEMS. Proc. VLDB Endow., vol. 5, iss. 2, pp. 85-96, 2011.
PREDICTIVE MODELS
[Diagram: sample transaction traces feed a pipeline of components]
Sample trace:
    SELECT * FROM WAREHOUSE WHERE W_ID = 10;
    SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID = 9;
    INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES (10, 9, 12345,…);
    ⋮
Components: Model Generator, Feature Clusterer, Classifier
Outputs: Decision Tree, Markov Models
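As a rough illustration of the Markov-model idea, the sketch below builds first-order transition probabilities from past query sequences and walks the most likely path for a new transaction. It is a simplification made for this example, not the model from the cited paper: states here are bare query names, and the names `GetDistrict` and `InsertOrder`, the `BEGIN`/`COMMIT` markers, and both function names are assumptions.

```python
from collections import Counter, defaultdict


def build_markov_model(traces):
    """Estimate transition probabilities from query-name sequences (one per past txn)."""
    counts = defaultdict(Counter)
    for trace in traces:
        path = ["BEGIN", *trace, "COMMIT"]
        for cur, nxt in zip(path, path[1:]):
            counts[cur][nxt] += 1
    # Normalize raw counts into probabilities per source state.
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


def most_likely_path(model, start="BEGIN", end="COMMIT", max_steps=20):
    """Follow the highest-probability edge from start until end is reached."""
    path, state = [], start
    for _ in range(max_steps):
        if state == end or state not in model:
            break
        state = max(model[state], key=model[state].get)
        if state != end:
            path.append(state)
    return path


traces = [
    ["GetWarehouse", "GetDistrict", "InsertOrder"],
    ["GetWarehouse", "GetDistrict", "InsertOrder"],
    ["GetWarehouse", "GetDistrict", "InsertOrder"],
    ["GetWarehouse", "InsertOrder"],
]
model = build_markov_model(traces)
predicted = most_likely_path(model)
```

With these four made-up traces, `model["GetWarehouse"]["GetDistrict"]` is 0.75 and the predicted path is the three-query sequence, which is the kind of estimate a scheduler could use before the transaction starts.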
DISTRIBUTED TRANSACTIONS
TPC-C BENCHMARK
[Figure: throughput (TXN/SEC, axis 0 to 25,000) vs. NODES (up to 4). Series: OPTIMAL, Naïve, Houdini]
TPC-C BENCHMARK
[Figure: throughput (TXN/SEC, axis 0 to 60,000) vs. NODES (up to 4). Series: Naïve, Houdini]
DISTRIBUTED TRANSACTIONS
SP1 - Waiting for Query Result
SP2 - Waiting for Query Request
SP3 - Two-Phase Commit
TRANSACTION STALL POINTS
[Figure: time breakdown for BASE PARTITION and REMOTE PARTITION. Legend: SP1 - Waiting for Query Result, SP2 - Waiting for Query Request, SP3 - Two-Phase Commit, Real Work. Values in extraction order: 18%, 5%, 73%, 45%, 37%, 22%]
DO SOMETHING USEFUL WHEN STALLED
DON’T BE SURPRISED IF YOU & KB DON’T LAST THROUGH GRAD SCHOOL.
DON’T BE STAN’S STUDENT IF YOU GO TO BROWN.
DO USE MACHINE LEARNING TO SCHEDULE SPECULATIVE TASKS.
THE ART OF SPECULATIVE EXECUTION. In Progress (August 2013)
SERIALIZABLE SCHEDULE
[Diagram: Distributed Transaction, stalled (Zzzz …), with two Single-Partition Transactions]
SERIALIZABLE SCHEDULE
[Diagram: Distributed Transaction, stalled (Zzzz …), with two Speculative Transactions and a VERIFY step]
SPECULATIVE TRANSACTIONS
Transaction Queue
Speculation Candidate: WRITE X
Distributed Transaction: READ X, READ X
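The WRITE X versus READ X case on this slide suggests a conflict test when choosing what to run during a stall. The sketch below is one conservative way to express that test with explicit read and write sets. It is an assumption made for illustration, not the mechanism of the in-progress work, and the `Txn` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    """A transaction summarized by the items it is expected to read and write."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(candidate, dtxn):
    """True if the candidate touches data in a way that could reorder with dtxn."""
    return bool(
        candidate.writes & (dtxn.reads | dtxn.writes)  # write-read or write-write
        or candidate.reads & dtxn.writes               # read of dtxn's pending write
    )


def pick_speculation_candidates(queue, dtxn):
    """Keep queue order, dropping transactions that conflict with the stalled dtxn."""
    return [t for t in queue if not conflicts(t, dtxn)]


# The slide's case: the distributed transaction reads X; candidate A writes X.
dtxn = Txn("dist", reads=frozenset({"X"}))
queue = [
    Txn("A", writes=frozenset({"X"})),
    Txn("B", writes=frozenset({"Y"})),
    Txn("C", reads=frozenset({"X"})),
]
runnable = [t.name for t in pick_speculation_candidates(queue, dtxn)]
```

Here A is rejected because it would overwrite a value the distributed transaction reads twice, while B and C can run in the stall window.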
SPECULATIVE QUERIES
QueryY: SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;
[Diagram: Distributed Transaction]
Transaction Parameters: w_id=0, i_w_ids=[1,0], i_ids=[1001,1002]
GetWarehouse: SELECT * FROM WAREHOUSE WHERE W_ID = ?;
CheckStock: SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;
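The parameters on this slide are enough to estimate which partitions the transaction will touch before it runs. The sketch below assumes that WAREHOUSE and STOCK rows are partitioned by warehouse id with a simple modulo scheme over two partitions. The partitioning function, `NUM_PARTITIONS`, and the function names are all assumptions for illustration.

```python
NUM_PARTITIONS = 2  # assumed cluster size; not given on the slide


def partition_for_warehouse(w_id, num_partitions=NUM_PARTITIONS):
    """Assumed placement rule: warehouse id modulo the number of partitions."""
    return w_id % num_partitions


def neworder_partitions(w_id, i_w_ids):
    """Partitions touched by GetWarehouse (w_id) and by each CheckStock (i_w_ids)."""
    touched = {partition_for_warehouse(w_id)}
    touched.update(partition_for_warehouse(w) for w in i_w_ids)
    return touched


def is_distributed(w_id, i_w_ids):
    """A transaction is distributed if it needs more than one partition."""
    return len(neworder_partitions(w_id, i_w_ids)) > 1


# Parameters from the slide: one item is supplied by warehouse 1, one by warehouse 0.
touched = neworder_partitions(0, [1, 0])
distributed = is_distributed(0, [1, 0])
```

Under these assumptions the slide's parameters touch both partitions, so the CheckStock query against warehouse 1 is a candidate to be issued early, while a call with `i_w_ids=[0, 0]` would stay on a single partition.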
VERIFICATION
[Diagram: a Distributed Transaction and Speculative Transactions, each shown as a sequence of queries. Query labels in extraction order: Query1 Query2 Query3 Query1 Query3 Query3 Query3 Query1 Query2 Query3]
TPC-C BENCHMARK
[Figure: throughput (TXN/SEC, axis 0 to 50,000) vs. NODES (up to 4). Series: None, Spec Queries, Spec Txns, All]
Optimize Single-Partition Execution
H-STORE: A HIGH-PERFORMANCE, DISTRIBUTED MAIN MEMORY TRANSACTION PROCESSING SYSTEM. Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.
Minimize Distributed Transactions
SKEW-AWARE AUTOMATIC DATABASE PARTITIONING IN SHARED-NOTHING, PARALLEL OLTP SYSTEMS. Proceedings of SIGMOD, 2012.
Identify Distributed Transactions
ON PREDICTIVE MODELING FOR OPTIMIZING TRANSACTION EXECUTION IN PARALLEL OLTP SYSTEMS. Proc. VLDB Endow., vol. 5, pp. 85-96, 2011.
Utilize Transaction Stalls
THE ART OF SPECULATIVE EXECUTION. In Progress (August 2013)
FUTURE WORK
H-STORE S-STORE N-STORE
DON’T MESS IT UP WITH KB.