
Demystifying Distributed Transactions with the Fairness-Isolation-Throughput Tradeoff
Jose Faleiro, Yale University

Distributed txns: Too expensive?
     • Several popular databases eschew distributed txns


  1. Get rid of distributed transactions?
     “In retrospect I think that [not supporting distributed transactions] was a mistake. … a lot of people did want distributed transactions, and so they hand-rolled their own protocols, sometimes incorrectly, and it would have been better to build it into the infrastructure.” - Jeff Dean


  4. Why are distributed txns expensive?
     Suppose T1 observes T0’s writes.
     T0: Read R, Read B, ..., Write R, ..., Write R, Write B
     T1: Read B, ..., Write B
     • WAIT! Distributed transactions entail unavoidable coordination
     • T1 must wait for the commit protocol to finish

  8. Why are distributed txns expensive?
     • Mechanisms for atomicity and isolation overlap in time
         • Atomicity: Distributed coordination
         • Isolation: Wait during distributed coordination
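The cost of this overlap can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only: the function name and the latency figures (0.1 ms of execution, 2 ms for the commit protocol) are assumptions, not measurements from the talk. It bounds the throughput of transactions that all conflict on one record, since each must wait for the previous one to release it.

```python
def max_conflicting_throughput(exec_ms: float, commit_ms: float, overlapped: bool) -> float:
    """Upper bound (txns/sec) for transactions that all conflict on one record.

    If isolation and atomicity overlap, a conflicting txn waits for the
    previous txn's execution AND its commit protocol. If they are decoupled,
    it waits for the execution only.
    """
    hold_ms = exec_ms + commit_ms if overlapped else exec_ms
    return 1000.0 / hold_ms


# Hypothetical latencies, chosen only to show the shape of the effect.
coupled_tps = max_conflicting_throughput(exec_ms=0.1, commit_ms=2.0, overlapped=True)
decoupled_tps = max_conflicting_throughput(exec_ms=0.1, commit_ms=2.0, overlapped=False)
print(f"coupled: {coupled_tps:.0f} txns/s, decoupled: {decoupled_tps:.0f} txns/s")
```

Under these assumed numbers the commit protocol, not the transaction logic, dominates the time a conflicting transaction spends waiting.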

  9. Fairness-Isolation-Throughput tradeoff
     • Any system implementing distributed transactions can get at most two of these three properties
     • Three classes of systems
         • Fairness-Isolation
         • Isolation-Throughput
         • Throughput-Fairness

  10. FIT Tradeoff
     • Poor performance of distributed transactions is attributable to two fundamental issues
         • Expensive commit protocol (required due to atomicity)
         • Waiting (required for isolation)
     • Commit protocol and waiting overlap in time
     • Space characterized by how to separate commit protocol from waiting

  11. Intuition
     • Badness results from overlapping commit with isolation
     • To avoid impact of coordination, separate the two

  12. Option 1: Weaken isolation
     • Allow conflicting txns to execute without observing each other’s writes
     • Implementable without making txns wait for each other
     • Susceptible to concurrency bugs
     • Transactions execute against potentially stale state
     • E.g., RAMP transactions

  13. Option 2: Re-order coordination
     • Move coordination outside of transaction boundaries
     • Amortize coordination across several transactions
     • Compromises fairness because we penalize certain txns to benefit overall throughput
     • E.g., Calvin, G-Store

  14. FIT Tradeoff
     • Fairness-Isolation
         • Give up throughput
         • Atomicity and isolation mechanisms overlap
     • Fairness-Throughput
         • Give up isolation
     • Isolation-Throughput
         • Give up fairness
         • Atomicity and isolation mechanisms are decoupled

  17. Weak Isolation Example
     • Read Atomic Multi-Partition (RAMP) transactions
     • Decouples concurrent transactions
     • Research system
     • Appeared in SIGMOD 2014
     • Peter Bailis et al. from UC Berkeley

  18. RAMP transactions
     • Three partitions: R, B, G

  19. RAMP transactions
     T1: R0, G0, B0 = Read R G B
         R1, G1, B1 = f(R0, G0, B0)
         Write R1 G1 B1

  23. RAMP transactions
     • T1 runs a commit protocol
     • Commit protocol represents coordination required for atomicity

  25. RAMP transactions
     • Partitions now hold: R, R_T1 | B, B_T1 | G, G_T1

  26. RAMP transactions
     T1: R0, G0, B0 = Read R G B
         R1, G1, B1 = f(R0, G0, B0)
         Write R1 G1 B1
     T2: R0, G0, B0 = Read R G B
         R1, G1, B1 = f(R0, G0, B0)
         Write R1 G1 B1

  29. RAMP transactions
     • T1 and T2 read the same snapshot

  31. RAMP transactions
     • Commit T1
     • Partitions now hold: R, R_T1 | B, B_T1 | G, G_T1

  33. RAMP transactions
     • Commit T2
     • Partitions now hold: R, R_T1, R_T2 | B, B_T1, B_T2 | G, G_T1, G_T2

  34. RAMP transactions
     • T1 and T2 don’t see each other’s writes

  35. RAMP transactions
     • State must be merged in some app-dependent manner
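The walk-through above can be replayed in a few lines of code. This is a toy model of the behavior shown on the slides, not the RAMP algorithm itself (it has no version metadata and no second-round reads). The `Partition` class, the helper names, and the two update functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One partition: a base value plus versions tagged by the writing txn."""
    base: int
    versions: dict = field(default_factory=dict)  # txn id -> written value

    def read_snapshot(self) -> int:
        # Readers see the base value, never another in-flight txn's version.
        return self.base

    def install(self, txn_id: str, value: int) -> None:
        self.versions[txn_id] = value


store = {name: Partition(base=0) for name in ("R", "G", "B")}


def read_all() -> dict:
    return {name: part.read_snapshot() for name, part in store.items()}


def compute(snapshot: dict, f) -> dict:
    # Corresponds to "R1, G1, B1 = f(R0, G0, B0)" on the slides.
    return {name: f(value) for name, value in snapshot.items()}


def commit(txn_id: str, writes: dict) -> None:
    for name, value in writes.items():
        store[name].install(txn_id, value)


# T1 and T2 read the same snapshot before either one commits.
snap1, snap2 = read_all(), read_all()
writes1 = compute(snap1, lambda v: v + 1)
writes2 = compute(snap2, lambda v: v + 10)

commit("T1", writes1)
commit("T2", writes2)

# Each partition now holds the base value plus one version per txn.
print(store["R"])
```

Neither transaction waited for the other, and T2's version was computed from the original snapshot rather than from T1's write. Both versions sit side by side until the application reconciles them.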

  36. RAMP transactions
     • Decouples execution of concurrent txns
         • “Synchronization independence”
     • Great for scalability
         • More resources means more throughput
     • Weak isolation
         • Diverged state must be reconciled and merged
         • Cannot enforce important class of constraints
     • Callouts added in later builds of this slide:
         • Txns never block others
         • Extra development effort
         • Irrelevant for many applications
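A small example may help show why merging diverged state is application-specific and why some constraints cannot be enforced this way. The bank-balance scenario, the `withdraw` check, and the delta-based merge rule are assumptions made for illustration; they do not come from the talk.

```python
def withdraw(snapshot_balance: int, amount: int) -> int:
    """Check the constraint (balance >= 0) against the txn's own snapshot."""
    if snapshot_balance - amount < 0:
        raise ValueError("insufficient funds")
    return snapshot_balance - amount


def merge_deltas(base: int, versions: dict) -> int:
    """One possible app-specific merge: apply every txn's delta to the base."""
    return base + sum(value - base for value in versions.values())


base = 100

# T1 and T2 both read the same snapshot, so both local checks pass.
versions = {
    "T1": withdraw(base, 80),
    "T2": withdraw(base, 70),
}

merged = merge_deltas(base, versions)
print(versions, merged)
```

Each withdrawal is valid against the snapshot it read, yet the merged balance is negative. Without making one transaction observe the other's write, a "never below zero" rule of this kind cannot be guaranteed.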


  42. Re-ordered coordination example
     • Move distributed coordination outside txn boundaries
         • Amortize its cost across several txns
     • Guarantee isolation
         • Conflicts still induce waiting
         • But txns don’t wait for distributed coordination
     • By re-ordering coordination, some txns are penalized
         • Unfairly delay txns to benefit overall throughput
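The fairness cost of amortizing coordination can be sketched with a simple batching model. This is an assumed toy model, not a description of how Calvin or G-Store schedule work: coordination runs once at each epoch boundary, and every transaction that arrived during the epoch starts only after it finishes. The function name and all timings are hypothetical.

```python
import math


def start_delays(arrivals_ms, epoch_ms, coord_ms):
    """Delay before each txn may start, if coordination runs once per epoch."""
    delays = []
    for arrival in arrivals_ms:
        # Next epoch boundary at or after the arrival time.
        boundary = math.ceil(arrival / epoch_ms) * epoch_ms
        delays.append(boundary + coord_ms - arrival)
    return delays


arrivals = [1, 5, 9]   # three txns arriving within one 10 ms epoch
delays = start_delays(arrivals, epoch_ms=10, coord_ms=2)
coord_per_txn = 2 / len(arrivals)

print(delays, coord_per_txn)
```

The coordination cost is shared by the whole batch, but the transaction that arrived first is delayed the longest. That uneven delay is the fairness given up in exchange for throughput.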

  43. G-Store
     • Built for workloads with temporal locality
         • E.g., multi-player games
     • Research system
     • Appeared in SoCC 2010
     • Sudipto Das et al. from UC Santa Barbara
     • Built as a transaction layer on top of HBase

  44. G-Store
     • Supports txns on “KeyGroups”
     • Diagram: four partitions R, B, Y, G
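One way to picture a KeyGroup is as a temporary transfer of key ownership to a single node, so that later transactions on the group need no distributed coordination. The sketch below assumes that reading of the design. The `KeyGroupStore` class, its method names, and the node names are hypothetical and are not G-Store's API.

```python
class KeyGroupStore:
    """Toy model: grouping keys moves their ownership to the leader's node."""

    def __init__(self, placement: dict):
        self.home = dict(placement)    # key -> node that normally owns it
        self.owner = dict(placement)   # key -> node that owns it right now
        self.groups = {}               # group id -> set of member keys
        self.grouped = set()           # keys currently in some group

    def create_group(self, group_id: str, leader_key: str, follower_keys) -> str:
        members = set(follower_keys) | {leader_key}
        if members & self.grouped:
            raise RuntimeError("a key already belongs to another group")
        leader_node = self.home[leader_key]
        # The coordination cost is paid here, once, outside any transaction.
        for key in members:
            self.owner[key] = leader_node
        self.groups[group_id] = members
        self.grouped |= members
        return leader_node

    def delete_group(self, group_id: str) -> None:
        for key in self.groups.pop(group_id):
            self.owner[key] = self.home[key]   # hand ownership back
            self.grouped.discard(key)

    def nodes_touched(self, keys) -> set:
        return {self.owner[key] for key in keys}


kg = KeyGroupStore({"R": "n1", "B": "n2", "Y": "n3", "G": "n4"})
txn_keys = ["R", "B", "Y"]

before = kg.nodes_touched(txn_keys)
leader = kg.create_group("g1", leader_key="R", follower_keys=["B", "Y"])
during = kg.nodes_touched(txn_keys)
kg.delete_group("g1")
after = kg.nodes_touched(txn_keys)

print(before, during, after)
```

While the group exists, a transaction over R, B, and Y touches one node instead of three. Transactions that need those keys before the group is formed must wait for it, which matches the fairness penalty described for re-ordered coordination.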
