High Performance Transactions via Early Write Visibility Jose Faleiro Daniel Abadi Joseph Hellerstein
Serializability is our gold standard Developers focus on individual transaction correctness System ensures correctness under concurrency Developers can focus on application logic
Elephant in the room: Serializability is the exception Weak isolation is the norm
Serializability in practice… is snapshot isolation is not the default
Non-modular applications: changing anything changes everything
Non-modular applications: changing anything changes everything Silent data corruption
Non-modular applications: changing anything changes everything Silent data corruption Security bugs
“The hacker discovered that if you place several withdrawals all in practically the same instant, they will get processed at more or less the same time. This will result in a negative balance, but valid insertions into the database… ’’
(Second!) Elephant in the room: Very little progress towards addressing the gap
The real hurdle is recoverability mechanism
Recoverability + isolation Strong isolation mechanisms have limited mileage Due to: Recoverability mechanisms Isolation level specifications
Recoverability + isolation Limitation is independent of isolation level implementation
Recoverability + isolation Limitation is independent of isolation level implementation Includes all modern concurrency control protocols based on 2PL, OCC, MVCC, Timestamp ordering
This talk State-of-the-art recoverability mechanisms fundamentally limit strong isolation levels New recoverability mechanism based on deterministic execution
Recoverability Committed transactions must read committed data Required of popular isolation levels: Read committed, snapshot isolation, repeatable read, serializable
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl item = items_tbl[item_id item_id] if if item.count item.count == 0: == 0: Abort() Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort() item.count item.count -= 1 = 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert bills.insert(cust_id cust_id, , item_id item_id, , item.price) item.price history.insert history.insert(cust_id cust_id, , item_id item_id)
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort()
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort() Purchase(item_id, cust_id): item.count -= 1 item = items_tbl[item_id] if item.count == 0: Abort()
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort() Purchase(item_id, cust_id): item.count -= 1 item = items_tbl[item_id] if item.count == 0: Abort()
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: Abort() Strawmen: item.count -= 1 Wait until commit Limited throughput bills.insert(cust_id, item_id, Expose writes immediately item.price) Cascaded rollbacks history.insert(cust_id, item_id)
Recoverability Purchase(item_id, cust_id): mechanisms item = items_tbl[item_id] if item.count == 0: State-of-the-art: Group commit Abort() item.count -= 1 Wait until end of execution , not commit bills.insert(cust_id, Readers “share fate” with writers item_id, item.price) Durable write latency does not history.insert(cust_id, limit throughput item_id) Cascaded rollbacks restricted to failures
State-of-the-art: Group commit
Writes made visible at the Purchase(item_id, cust_id): item = items_tbl[item_id] end of txn’s execution if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) Write visibility delay history.insert(cust_id, item_id)
Serializability: Purchase(item_id, cust_id): Conflicting txns must wait item = items_tbl[item_id] if item.count == 0: Abort() Read Committed: Conflicting txns can read old values item.count -= 1 bills.insert(cust_id, item_id, item.price) Write visibility delay history.insert(cust_id, item_id)
Serializable Purchase(item_id, cust_id): Purchase(item_id, cust_id): Purchase(item_id, cust_id): item = items_tbl[item_id] item = items_tbl[item_id] item = items_tbl[item_id] if item.count == 0: if item.count == 0: if item.count == 0: Abort() Abort() Abort() item.count -= 1 item.count -= 1 item.count -= 1 bills.insert(cust_id, bills.insert(cust_id, bills.insert(cust_id, item_id, item_id, item_id, item.price) item.price) item.price) history.insert(cust_id, history.insert(cust_id, history.insert(cust_id, item_id) item_id) item_id)
Read committed Item Read Purchase(item_id, cust_id): Purchase(item_id, cust_id): Purchase(item_id, cust_id): item = items_tbl[item_id] item = items_tbl[item_id] item = items_tbl[item_id] if item.count == 0: if item.count == 0: if item.count == 0: Abort() Abort() Abort() item.count -= 1 item.count -= 1 item.count -= 1 bills.insert(cust_id, bills.insert(cust_id, bills.insert(cust_id, item_id, item_id, item_id, item.price) item.price) item.price) history.insert(cust_id, history.insert(cust_id, history.insert(cust_id, item_id) item_id) item_id)
Impact of delayed write visibility
Impact of delayed write visibility Workload: 10 read-modify-write txns 1 hot, 9 cold Vary position of hot update
Impact of delayed write visibility Workload: 10 read-modify-write txns Cold update 1 hot, 9 cold Cold update Vary position of hot update . . . Cold update Cold update Hot update Hot update
Impact of delayed write visibility Workload: 10 read-modify-write txns Cold update 1 hot, 9 cold Cold update Vary position of hot update . . . Cold update Cold update Write visibility delay: 0 Hot update Hot update
Impact of delayed write visibility Workload: 10 read-modify-write txns Cold update 1 hot, 9 cold Cold update Vary position of hot update . . . Hot update Hot update Cold update Write visibility delay: 2 Cold update
Impact of delayed write visibility Workload: 10 read-modify-write txns Hot Hot update update 1 hot, 9 cold Cold update Vary position of hot update Cold update . . . Write visibility delay: 9 Cold update Cold update
Impact of delayed write visibility 400 K Read Committed Throughput (txns/sec) Serializable 300 K 200 K 100 K 0 K 0 1 2 3 4 5 6 7 8 9 Write visibility delay
Impact of delayed write visibility 30% drop in Read committed vs 3x drop in Serializable
Evolution of transaction processing Recoverability Concurrency control Metaphor credit: Bill Thies
Evolution of transaction processing Recoverability Concurrency control Metaphor credit: Bill Thies
Why delayed write visibility? Database systems have the flexibility to arbitrarily abort transactions
Why delayed write visibility? Failures Deadlocks Abort statements Validation errors Constraint violations Resource constraints
Why delayed write visibility? System induced State induced Failures Deadlocks Abort statements Validation errors Constraint violations Resource constraints
Purchase(item_id, cust_id): item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, Write visibility delay item.price) history.insert(cust_id, item_id)
Why delayed write visibility System induced State induced Failures Deadlocks Abort statements Validation errors Constraint violations Resource constraints
Purchase(item_id, cust_id): Early write visibility item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Purchase(item_id, cust_id): Early write visibility item = items_tbl[item_id] if item.count == 0: Abort() item.count -= 1 Item’s count update is visible here bills.insert(cust_id, item_id, item.price) history.insert(cust_id, item_id)
Recommend
More recommend