11 TWO - PH ASE LO CKIN G (M V2PL) TXN-ID READ-CNT BEGIN-TS END-TS Thread #1 ∞ READ(A) A 1 0 0 1 1 T id =10 ∞ B 1 0 0 1 10 1 WRITE(B) Txns use the tuple's read- If txn-id is zero, then the If both txn-id and read-cnt cnt field as SHARED lock. txn acquires the SHARED are zero, then txn acquires Use txn-id and read-cnt lock by incrementing the the EXCLUSIVE lock by together as EXCLUSIVE read-cnt field. setting both of them. lock. 15-721 (Spring 2020)
11 TWO - PH ASE LO CKIN G (M V2PL) TXN-ID READ-CNT BEGIN-TS END-TS Thread #1 ∞ READ(A) A 1 0 0 1 1 T id =10 ∞ B 1 0 0 1 10 1 ∞ B 2 0 10 10 WRITE(B) Txns use the tuple's read- If txn-id is zero, then the If both txn-id and read-cnt cnt field as SHARED lock. txn acquires the SHARED are zero, then txn acquires Use txn-id and read-cnt lock by incrementing the the EXCLUSIVE lock by together as EXCLUSIVE read-cnt field. setting both of them. lock. 15-721 (Spring 2020)
11 TWO - PH ASE LO CKIN G (M V2PL) TXN-ID READ-CNT BEGIN-TS END-TS Thread #1 ∞ READ(A) A 1 0 0 1 1 T id =10 ∞ B 1 0 0 1 10 1 10 ∞ B 2 0 10 10 WRITE(B) Txns use the tuple's read- If txn-id is zero, then the If both txn-id and read-cnt cnt field as SHARED lock. txn acquires the SHARED are zero, then txn acquires Use txn-id and read-cnt lock by incrementing the the EXCLUSIVE lock by together as EXCLUSIVE read-cnt field. setting both of them. lock. 15-721 (Spring 2020)
11 TWO - PH ASE LO CKIN G (M V2PL) TXN-ID READ-CNT BEGIN-TS END-TS Thread #1 ∞ READ(A) A 1 0 0 1 T id =10 ∞ B 1 0 0 1 10 ∞ B 2 0 10 10 0 WRITE(B) Txns use the tuple's read- If txn-id is zero, then the If both txn-id and read-cnt cnt field as SHARED lock. txn acquires the SHARED are zero, then txn acquires Use txn-id and read-cnt lock by incrementing the the EXCLUSIVE lock by together as EXCLUSIVE read-cnt field. setting both of them. lock. 15-721 (Spring 2020)
12 O BSERVATIO N Thread #1 TXN-ID READ-TS BEGIN-TS END-TS T id =2 31 -1 ∞ WRITE(A) A 1 0 - 99999 If the DBMS reaches the max value for its timestamps, it will have to wrap around and restart at one. This will make all previous versions be in the "future" from new transactions. 15-721 (Spring 2020)
12 O BSERVATIO N Thread #1 TXN-ID READ-TS BEGIN-TS END-TS T id =2 31 -1 ∞ WRITE(A) A 1 0 - 99999 2 31 -1 2 31 -1 ∞ A 2 2 31 -1 2 31 -1 0 - If the DBMS reaches the max value for its timestamps, it will have to wrap around and restart at one. This will make all previous versions be in the "future" from new transactions. 15-721 (Spring 2020)
12 O BSERVATIO N Thread #1 TXN-ID READ-TS BEGIN-TS END-TS T id =2 31 -1 ∞ WRITE(A) A 1 0 - 99999 2 31 -1 ∞ A 2 2 31 -1 0 - If the DBMS reaches the max value for its timestamps, it will have to wrap around and restart at one. This will make all previous versions be in the "future" from new transactions. 15-721 (Spring 2020)
12 O BSERVATIO N Thread #1 TXN-ID READ-TS BEGIN-TS END-TS T id =2 31 -1 ∞ A 1 0 - 99999 2 31 -1 ∞ Thread #2 A 2 2 31 -1 0 - T id =1 WRITE(A) If the DBMS reaches the max value for its timestamps, it will have to wrap around and restart at one. This will make all previous versions be in the "future" from new transactions. 15-721 (Spring 2020)
12 O BSERVATIO N Thread #1 TXN-ID READ-TS BEGIN-TS END-TS T id =2 31 -1 ∞ A 1 0 - 99999 2 31 -1 ∞ Thread #2 A 2 1 1 2 31 -1 0 - ∞ T id =1 - A 3 0 1 1 WRITE(A) If the DBMS reaches the max value for its timestamps, it will have to wrap around and restart at one. This will make all previous versions be in the "future" from new transactions. 15-721 (Spring 2020)
12 O BSERVATIO N Thread #1 TXN-ID READ-TS BEGIN-TS END-TS T id =2 31 -1 ∞ A 1 0 - 99999 2 31 -1 ∞ Thread #2 A 2 1 2 31 -1 0 - ∞ T id =1 - A 3 0 1 If the DBMS reaches the max value for its timestamps, it will have to wrap around and restart at one. This will make all previous versions be in the "future" from new transactions. 15-721 (Spring 2020)
13 PO STGRES TXN ID WRAPARO UN D Set a flag in each tuple header that says that it is "frozen" in the past. Any new txn id will always be newer than a frozen version. Runs the vacuum before the system gets close to this upper limit. Otherwise it must stop accepting new commands when the system gets close to the max txn id. 15-721 (Spring 2020)
14 VERSIO N STO RAGE The DBMS uses the tuples’ pointer field to create a latch-free version chain per logical tuple. → This allows the DBMS to find the version that is visible to a particular txn at runtime. → Indexes always point to the “head” of the chain. Different storage schemes determine where/what to store for each version. 15-721 (Spring 2020)
15 VERSIO N STO RAGE Approach #1: Append-Only Storage → New versions are appended to the same table space. Approach #2: Time-Travel Storage → Old versions are copied to separate table space. Approach #3: Delta Storage → The original values of the modified attributes are copied into a separate delta record space. 15-721 (Spring 2020)
16 APPEN D- O N LY STO RAGE Main Table All the physical versions of a logical tuple are stored in the same table VALUE POINTER space. The versions are mixed A 0 $111 together. Ø A 1 $222 Ø B 1 $10 On every update, append a new version of the tuple into an empty space in the table. 15-721 (Spring 2020)
16 APPEN D- O N LY STO RAGE Main Table All the physical versions of a logical tuple are stored in the same table VALUE POINTER space. The versions are mixed A 0 $111 together. Ø A 1 $222 Ø B 1 $10 On every update, append a new $333 Ø A 2 version of the tuple into an empty space in the table. 15-721 (Spring 2020)
16 APPEN D- O N LY STO RAGE Main Table All the physical versions of a logical tuple are stored in the same table VALUE POINTER space. The versions are mixed A 0 $111 together. A 1 $222 Ø B 1 $10 On every update, append a new $333 Ø A 2 version of the tuple into an empty space in the table. 15-721 (Spring 2020)
17 VERSIO N CH AIN O RDERIN G Approach #1: Oldest-to-Newest (O2N) → Append every new version to end of the chain. → Must traverse chain on look-ups. Approach #2: Newest-to-Oldest (N2O) → Must update index pointers for every new version. → Don’t have to traverse chain on look ups. The ordering of the chain has different performance trade-offs. 15-721 (Spring 2020)
18 TIM E- TRAVEL STO RAGE Main Table Time-Travel Table VALUE POINTER VALUE POINTER A 2 $222 A 1 $111 Ø B 1 $10 On every update, copy the current version to the time- travel table. Update pointers. 15-721 (Spring 2020)
18 TIM E- TRAVEL STO RAGE Main Table Time-Travel Table VALUE POINTER VALUE POINTER A 2 $222 A 1 $111 Ø B 1 $10 A 2 $222 On every update, copy the current version to the time- travel table. Update pointers. 15-721 (Spring 2020)
18 TIM E- TRAVEL STO RAGE Main Table Time-Travel Table VALUE POINTER VALUE POINTER A 2 $222 A 1 $111 Ø B 1 $10 A 2 $222 On every update, copy the Overwrite master version in current version to the time- the main table and update travel table. Update pointers. pointers. 15-721 (Spring 2020)
18 TIM E- TRAVEL STO RAGE Main Table Time-Travel Table VALUE POINTER VALUE POINTER A 2 A 3 $333 $222 A 1 $111 Ø B 1 $10 A 2 $222 On every update, copy the Overwrite master version in current version to the time- the main table and update travel table. Update pointers. pointers. 15-721 (Spring 2020)
18 TIM E- TRAVEL STO RAGE Main Table Time-Travel Table VALUE POINTER VALUE POINTER A 2 A 3 $333 $222 A 1 $111 Ø B 1 $10 A 2 $222 On every update, copy the Overwrite master version in current version to the time- the main table and update travel table. Update pointers. pointers. 15-721 (Spring 2020)
18 TIM E- TRAVEL STO RAGE Main Table Time-Travel Table VALUE POINTER VALUE POINTER A 2 A 3 $333 $222 A 1 $111 Ø B 1 $10 A 2 $222 On every update, copy the Overwrite master version in current version to the time- the main table and update travel table. Update pointers. pointers. 15-721 (Spring 2020)
19 DELTA STO RAGE Main Table Delta Storage Segment VALUE POINTER A 1 $111 B 1 $10 On every update, copy only the values that were modified to the delta storage and overwrite the master version. 15-721 (Spring 2020)
19 DELTA STO RAGE Main Table Delta Storage Segment VALUE POINTER DELTA POINTER A 1 (VALUE→$111) A 1 $111 Ø B 1 $10 On every update, copy only the values that were modified to the delta storage and overwrite the master version. 15-721 (Spring 2020)
19 DELTA STO RAGE Main Table Delta Storage Segment VALUE POINTER DELTA POINTER A 1 (VALUE→$111) A 2 A 1 $222 $111 Ø B 1 $10 On every update, copy only the values that were modified to the delta storage and overwrite the master version. 15-721 (Spring 2020)
19 DELTA STO RAGE Main Table Delta Storage Segment VALUE POINTER DELTA POINTER A 1 (VALUE→$111) A 2 A 1 $222 $111 Ø A 2 (VALUE→$222) B 1 $10 On every update, copy only the values that were modified to the delta storage and overwrite the master version. 15-721 (Spring 2020)
19 DELTA STO RAGE Main Table Delta Storage Segment VALUE POINTER DELTA POINTER A 1 (VALUE→$111) A 3 A 2 A 1 $333 $222 $111 Ø A 2 (VALUE→$222) B 1 $10 On every update, copy only Txns can recreate old the values that were modified versions by applying the delta to the delta storage and in reverse order. overwrite the master version. 15-721 (Spring 2020)
20 N O N- IN LIN E ATTRIBUTES Main Table Variable-Length Data MY_LONG_STRING INT_VAL STR_VAL A 1 A 1 $100 Reuse pointers to variable- length pool for values that do not change between versions. 15-721 (Spring 2020)
20 N O N- IN LIN E ATTRIBUTES Main Table Variable-Length Data MY_LONG_STRING INT_VAL STR_VAL A 1 A 1 $100 MY_LONG_STRING A 2 $90 Reuse pointers to variable- length pool for values that do not change between versions. 15-721 (Spring 2020)
20 N O N- IN LIN E ATTRIBUTES Main Table Variable-Length Data Refs=1 MY_LONG_STRING INT_VAL STR_VAL A 1 A 1 $100 A 2 $90 Reuse pointers to variable- Requires reference counters length pool for values that do to know when it is safe to not change between versions. free memory. Unable to relocate memory easily. 15-721 (Spring 2020)
20 N O N- IN LIN E ATTRIBUTES Main Table Variable-Length Data Refs=2 Refs=1 MY_LONG_STRING INT_VAL STR_VAL A 1 A 1 $100 A 2 $90 Reuse pointers to variable- Requires reference counters length pool for values that do to know when it is safe to not change between versions. free memory. Unable to relocate memory easily. 15-721 (Spring 2020)
21 GARBAGE CO LLECTIO N The DBMS needs to remove reclaimable physical versions from the database over time. → No active txn in the DBMS can “see” that version (SI). → The version was created by an aborted txn. Three additional design decisions: → How to look for expired versions? → How to decide when it is safe to reclaim memory? → Where to look for expired versions? 15-721 (Spring 2020)
22 GARBAGE CO LLECTIO N Approach #1: Tuple-level → Find old versions by examining tuples directly. → Background Vacuuming vs. Cooperative Cleaning Approach #2: Transaction-level → Txns keep track of their old versions so the DBMS does not have to scan tuples to determine visibility. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 Vacuum BEGIN-TS END-TS T id =12 A 100 1 9 Thread #2 B 100 1 9 T id =25 B 101 10 20 Background Vacuuming: Separate thread(s) periodically scan the table and look for reclaimable versions. Works with any storage. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 Vacuum BEGIN-TS END-TS T id =12 A 100 1 9 Thread #2 B 100 1 9 T id =25 B 101 10 20 Background Vacuuming: Separate thread(s) periodically scan the table and look for reclaimable versions. Works with any storage. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 Vacuum BEGIN-TS END-TS T id =12 A 100 1 9 Thread #2 B 100 1 9 T id =25 B 101 10 20 Background Vacuuming: Separate thread(s) periodically scan the table and look for reclaimable versions. Works with any storage. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 Dirty Block BitMap Vacuum BEGIN-TS END-TS T id =12 A 100 1 9 Thread #2 B 100 1 9 T id =25 B 101 10 20 Background Vacuuming: Separate thread(s) periodically scan the table and look for reclaimable versions. Works with any storage. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 Dirty Block BitMap Vacuum BEGIN-TS END-TS T id =12 A 100 1 9 Thread #2 B 100 1 9 T id =25 B 101 10 20 Background Vacuuming: Separate thread(s) periodically scan the table and look for reclaimable versions. Works with any storage. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 Dirty Block BitMap Vacuum BEGIN-TS END-TS T id =12 A 100 1 9 Thread #2 B 100 1 9 T id =25 B 101 10 20 Background Vacuuming: Separate thread(s) periodically scan the table and look for reclaimable versions. Works with any storage. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 T id =12 A 0 A 1 A 2 A 3 INDEX Thread #2 B 0 B 1 B 2 B 3 T id =25 Background Vacuuming: Cooperative Cleaning: Separate thread(s) periodically Worker threads identify scan the table and look for reclaimable versions as they reclaimable versions. Works traverse version chain. Only with any storage. works with O2N. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 GET(A) T id =12 A 0 A 1 A 2 A 3 INDEX Thread #2 B 0 B 1 B 2 B 3 T id =25 Background Vacuuming: Cooperative Cleaning: Separate thread(s) periodically Worker threads identify scan the table and look for reclaimable versions as they reclaimable versions. Works traverse version chain. Only with any storage. works with O2N. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 GET(A) T id =12 A 0 A 1 A 2 A 3 INDEX Thread #2 B 0 B 1 B 2 B 3 T id =25 Background Vacuuming: Cooperative Cleaning: Separate thread(s) periodically Worker threads identify scan the table and look for reclaimable versions as they reclaimable versions. Works traverse version chain. Only with any storage. works with O2N. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 X GET(A) T id =12 A 0 A 1 A 2 A 3 INDEX Thread #2 B 0 B 1 B 2 B 3 T id =25 Background Vacuuming: Cooperative Cleaning: Separate thread(s) periodically Worker threads identify scan the table and look for reclaimable versions as they reclaimable versions. Works traverse version chain. Only with any storage. works with O2N. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 X X GET(A) T id =12 A 0 A 1 A 2 A 3 INDEX Thread #2 B 0 B 1 B 2 B 3 T id =25 Background Vacuuming: Cooperative Cleaning: Separate thread(s) periodically Worker threads identify scan the table and look for reclaimable versions as they reclaimable versions. Works traverse version chain. Only with any storage. works with O2N. 15-721 (Spring 2020)
23 TUPLE- LEVEL GC Thread #1 GET(A) T id =12 A 2 A 3 INDEX Thread #2 B 0 B 1 B 2 B 3 T id =25 Background Vacuuming: Cooperative Cleaning: Separate thread(s) periodically Worker threads identify scan the table and look for reclaimable versions as they reclaimable versions. Works traverse version chain. Only with any storage. works with O2N. 15-721 (Spring 2020)
24 TRAN SACTIO N - LEVEL GC Each txn keeps track of its read/write set. The DBMS determines when all versions created by a finished txn are no longer visible. May still require multiple threads to reclaim the memory fast enough for the workload. 15-721 (Spring 2020)
25 IN DEX M AN AGEM EN T PKey indexes always point to version chain head. → How often the DBMS must update the pkey index depends on whether the system creates new versions when a tuple is updated. → If a txn updates a tuple’s pkey attribute(s), then this is treated as a DELETE followed by an INSERT . Secondary indexes are more complicated… 15-721 (Spring 2020)
25 IN DEX M AN AGEM EN T PKey indexes always point to version chain head. → How often the DBMS must update the pkey index depends on whether the system creates new versions when a tuple is updated. → If a txn updates a tuple’s pkey attribute(s), then this is treated as a DELETE followed by an INSERT . Secondary indexes are more complicated… 15-721 (Spring 2020)
26 SECO N DARY IN DEXES Approach #1: Logical Pointers → Use a fixed identifier per tuple that does not change. → Requires an extra indirection layer. → Primary Key vs. Tuple Id Approach #2: Physical Pointers → Use the physical address to the version chain head. 15-721 (Spring 2020)
27 IN DEX PO IN TERS GET(A) PRIMARY INDEX SECONDARY INDEX Physical Address Append-Only A 4 A 3 A 2 A 1 Newest-to-Oldest 15-721 (Spring 2020)
27 IN DEX PO IN TERS GET(A) PRIMARY INDEX SECONDARY INDEX Physical Address Append-Only A 4 A 3 A 2 A 1 Newest-to-Oldest 15-721 (Spring 2020)
27 IN DEX PO IN TERS GET(A) PRIMARY INDEX SECONDARY INDEX SECONDARY INDEX SECONDARY INDEX SECONDARY INDEX Append-Only A 4 A 3 A 2 A 1 Newest-to-Oldest 15-721 (Spring 2020)
27 IN DEX PO IN TERS GET(A) PRIMARY INDEX SECONDARY INDEX Primary Key Physical Address Append-Only A 4 A 3 A 2 A 1 Newest-to-Oldest 15-721 (Spring 2020)
27 IN DEX PO IN TERS GET(A) PRIMARY INDEX SECONDARY INDEX TupleId TupleId → Address Physical Address Append-Only A 4 A 3 A 2 A 1 Newest-to-Oldest 15-721 (Spring 2020)
28 M VCC IN DEXES MVCC DBMS indexes (usually) do not store version information about tuples with their keys. → Exception: Index-organized tables (e.g., MySQL) Every index must support duplicate keys from different snapshots: → The same key may point to different logical tuples in different snapshots. 15-721 (Spring 2020)
29 M VCC DUPLICATE KEY PRO BLEM Thread #1 Index Begin @ 10 READ(A) BEGIN-TS END-TS POINTER ∞ A 1 1 Ø 15-721 (Spring 2020)
29 M VCC DUPLICATE KEY PRO BLEM Thread #1 Index Begin @ 10 READ(A) Thread #2 Begin @ 20 UPDATE(A) BEGIN-TS END-TS POINTER ∞ A 1 1 Ø 15-721 (Spring 2020)
29 M VCC DUPLICATE KEY PRO BLEM Thread #1 Index Begin @ 10 READ(A) Thread #2 Begin @ 20 UPDATE(A) BEGIN-TS END-TS POINTER ∞ A 1 1 20 Ø ∞ A 2 20 Ø 15-721 (Spring 2020)
29 M VCC DUPLICATE KEY PRO BLEM Thread #1 Index Begin @ 10 READ(A) Thread #2 Begin @ 20 UPDATE(A) DELETE(A) BEGIN-TS END-TS POINTER ∞ A 1 1 20 Ø ∞ A 2 20 Ø 15-721 (Spring 2020)
29 M VCC DUPLICATE KEY PRO BLEM Thread #1 Index Begin @ 10 READ(A) Thread #2 Begin @ 20 Commit @ 25 UPDATE(A) DELETE(A) BEGIN-TS END-TS POINTER ∞ A 1 1 20 25 Ø ∞ A 2 20 25 25 Ø 15-721 (Spring 2020)
29 M VCC DUPLICATE KEY PRO BLEM Thread #1 Index Begin @ 10 READ(A) Thread #2 Begin @ 20 Commit @ 25 UPDATE(A) DELETE(A) BEGIN-TS END-TS POINTER ∞ Thread #3 A 1 1 20 25 Ø ∞ Begin @ 30 A 2 20 25 25 Ø ∞ A 1 30 Ø INSERT(A) 15-721 (Spring 2020)
29 M VCC DUPLICATE KEY PRO BLEM Thread #1 Index Begin @ 10 READ(A) READ(A) Thread #2 Begin @ 20 Commit @ 25 UPDATE(A) DELETE(A) BEGIN-TS END-TS POINTER ∞ Thread #3 A 1 1 20 25 Ø ∞ Begin @ 30 A 2 20 25 25 Ø ∞ A 1 30 Ø INSERT(A) 15-721 (Spring 2020)
30 M VCC IN DEXES Each index's underlying data structure must support the storage of non-unique keys. Use additional execution logic to perform conditional inserts for pkey / unique indexes. → Atomically check whether the key exists and then insert. Workers may get back multiple entries for a single fetch. They then must follow the pointers to find the proper physical version. 15-721 (Spring 2020)
31 M VCC EVALUATIO N PAPER We implemented all the design decisions in the Peloton DBMS as part of 15-721 in Spring 2016. Two categories of experiments: → Evaluate each of the design decisions in isolation to determine their trade-offs. → Compare configurations of real-world MVCC systems. AN EMPIRICAL EVALUATION OF IN- MEMORY MULTI- VERSION CONCURRENCY CONTROL VLDB 2017 15-721 (Spring 2020)
32 M VCC DESIGN DECISIO N S CC Protocol: Inconclusive results… Version Storage: Deltas Garbage Collection: Tuple-Level Vacuuming Indexes: Logical Pointers 15-721 (Spring 2020)
33 M VCC CO N FIGURATIO N EVALUATIO N Protocol Version Storage Garbage Collection Indexes Oracle MV2PL Delta Vacuum Logical Postgres MV-2PL/MV-TO Append-Only Vacuum Physical MySQL-InnoDB MV-2PL Delta Vacuum Logical HYRISE MV-OCC Append-Only – Physical Hekaton MV-OCC Append-Only Cooperative Physical MemSQL MV-OCC Append-Only Vacuum Physical SAP HANA MV-2PL Time-travel Hybrid Logical NuoDB MV-2PL Append-Only Vacuum Logical HyPer MV-OCC Delta Txn-level Logical CMU's TBD MV-OCC Delta Txn-level Logical 15-721 (Spring 2020)
34 M VCC CO N FIGURATIO N EVALUATIO N Database: TPC-C Benchmark (40 Warehouses) Processor: 4 sockets, 10 cores per socket Oracle/MySQL 100 Throughput (txn/sec) NuoDB 75 HyPer HYRISE 50 MemSQL 25 HANA HEKATON 0 0 8 16 24 32 40 Postgres # Threads 15-721 (Spring 2020)
34 M VCC CO N FIGURATIO N EVALUATIO N Database: TPC-C Benchmark (40 Warehouses) Processor: 4 sockets, 10 cores per socket Oracle/MySQL 100 Throughput (txn/sec) NuoDB 75 HyPer HYRISE 50 MemSQL 25 HANA HEKATON 0 0 8 16 24 32 40 Postgres # Threads 15-721 (Spring 2020)
100 PRO J ECT # 1 Identify bottlenecks in the DBMS's sequential scan implementation using profiling tools and refactor the system to remove it. This project is meant to teach you how to work in a highly concurrent system. 15-721 (Spring 2020)
Recommend
More recommend