Serializable Snapshot Isolation in PostgreSQL Dan Ports Kevin - PowerPoint PPT Presentation

Serializable Snapshot Isolation in PostgreSQL Dan Ports Kevin Grittner University of Washington Wisconsin Supreme Court MIT Tuesday, August 28, 2012

For years, PostgreSQL’s “SERIALIZABLE” mode did not provide true serializability • instead: snapshot isolation – allows anomalies PostgreSQL 9.1: Serializable Snapshot Isolation • based on recent research [Cahill, SIGMOD ’08] • first implementation in a production DB release & first in a purely-snapshot DB Tuesday, August 28, 2012

This talk.... • Motivation: Why serializability? Why did we choose SSI? • Review of snapshot isolation and SSI • Implementation challenges & optimizations • Performance Tuesday, August 28, 2012

Serializability vs. Performance Two perspectives: • Serializability is important for correctness • simplifies development; don’t need to worry about race conditions • Serializability is too expensive to use • locking restricts concurrency; use weaker isolation levels instead Tuesday, August 28, 2012

Serializability vs. Performance (in PostgreSQL) PostgreSQL offered snapshot isolation instead • better performance than 2-phase locking “readers don’t block writers, writers don’t block readers” • but doesn’t guarantee serializability! Snapshot isolation isn’t enough for some users • complex databases with strict integrity requirements, e.g. Wisconsin Court System Tuesday, August 28, 2012

Serializability vs. Performance (in PostgreSQL) PostgreSQL offered snapshot isolation instead • better performance than 2-phase locking “readers don’t block writers, writers don’t block readers” • but doesn’t guarantee serializability! Snapshot isolation isn’t enough for some users • complex databases with strict integrity requirements, e.g. Wisconsin Court System Serializable Snapshot Isolation offered true serializability with performance benefits of snapshot isolation! Tuesday, August 28, 2012

Serializable Snapshot Isolation SSI approach: • run transactions using snapshot isolation • detect conflicts between transactions at runtime; abort transactions to prevent anomalies Appealing for performance reasons • aborts less common than blocking under 2PL • readers still don’t block writers! [Cahill et al. Serializable Isolation for Snapshot Databases, SIGMOD ’08] Tuesday, August 28, 2012

SSI in PostgreSQL Available in PostgreSQL 9.1; first production implementation Contributions: new implementation techniques • Detecting conflicts in a purely-snapshot DB • Limiting memory usage • Read-only transaction optimizations • Integration with other PostgreSQL features Tuesday, August 28, 2012

Outline • Motivation • Review of snapshot isolation and SSI • Implementation challenges & optimizations • Performance • Conclusions Tuesday, August 28, 2012

Goal: ensure at least one guard always on-duty guard on-duty? uty? Alice y Bob y Tuesday, August 28, 2012

BEGIN Goal: ensure at least one SELECT count(*) guard always on-duty FROM guards WHERE on-duty = y if > 1 { UPDATE guards SET on-duty = n WHERE guard = x } guard on-duty? uty? COMMIT Alice y Bob y Tuesday, August 28, 2012

guard on-duty? uty? Alice y Bob y Tuesday, August 28, 2012

BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] guard on-duty? uty? Alice y Bob y Tuesday, August 28, 2012

BEGIN BEGIN SELECT count(*) SELECT count(*) FROM guards FROM guard WHERE on-duty = y WHERE on-duty = y [result = 2] [result = 2] guard on-duty? uty? Alice y Bob y Tuesday, August 28, 2012

BEGIN BEGIN SELECT count(*) SELECT count(*) FROM guards FROM guard WHERE on-duty = y WHERE on-duty = y [result = 2] [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’ } COMMIT guard on-duty? uty? Alice y Bob y Tuesday, August 28, 2012

BEGIN BEGIN SELECT count(*) SELECT count(*) FROM guards FROM guard WHERE on-duty = y WHERE on-duty = y [result = 2] [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’ } COMMIT guard on-duty? uty? n Alice y Bob y Tuesday, August 28, 2012

BEGIN BEGIN SELECT count(*) SELECT count(*) FROM guards FROM guard WHERE on-duty = y WHERE on-duty = y [result = 2] [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’ } if > 1 { COMMIT UPDATE guards SET on-duty = n WHERE guards = ‘Bob’ } guard on-duty? uty? COMMIT n Alice y Bob y Tuesday, August 28, 2012

BEGIN BEGIN SELECT count(*) SELECT count(*) FROM guards FROM guard WHERE on-duty = y WHERE on-duty = y [result = 2] [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’ } if > 1 { COMMIT UPDATE guards SET on-duty = n WHERE guards = ‘Bob’ } guard on-duty? uty? COMMIT n Alice y Bob y n Tuesday, August 28, 2012

BEGIN BEGIN SELECT count(*) SELECT count(*) rw-conflict: FROM guards FROM guard T1 didn’t see WHERE on-duty = y WHERE on-duty = y T2’s UPDATE [result = 2] [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’ } if > 1 { COMMIT UPDATE guards SET on-duty = n WHERE guards = ‘Bob’ } guard on-duty? uty? COMMIT n Alice y Bob y n Tuesday, August 28, 2012

BEGIN BEGIN SELECT count(*) SELECT count(*) rw-conflict: FROM guards FROM guard T1 didn’t see WHERE on-duty = y WHERE on-duty = y T2’s UPDATE [result = 2] [result = 2] rw-conflict: if > 1 { T2 didn’t see UPDATE guards T1’s UPDATE SET on-duty = n WHERE guard = ‘Alice’ } if > 1 { COMMIT UPDATE guards SET on-duty = n WHERE guards = ‘Bob’ } guard on-duty? uty? COMMIT n Alice y Bob y n Tuesday, August 28, 2012

SSI Approach Detect these rw-conflicts and maintain a conflict graph Serializability theory: each anomaly involves two adjacent rw-conflict edges • if found, abort some involved transaction • note: can have false positives Tuesday, August 28, 2012

T1 rw-conflict: rw-conflict: T1 didn’t see T2 didn’t see T2’s UPDATE T1’s UPDATE T2 two adjacent edges: T1 -> T2 and T2 -> T1 Tuesday, August 28, 2012

T1 rw-conflict: rw-conflict: T1 didn’t see T2 didn’t see T2’s UPDATE T1’s UPDATE X T2 two adjacent edges: T1 -> T2 and T2 -> T1 Tuesday, August 28, 2012

T1 ERROR: could not serialize access due to rw-conflict: rw-conflict: read/write dependencies among transactions T1 didn’t see T2 didn’t see HINT: The transaction might succeed if retried. T2’s UPDATE T1’s UPDATE X T2 two adjacent edges: T1 -> T2 and T2 -> T1 Tuesday, August 28, 2012

SSI in PostgreSQL Implementation challenges: • Detecting conflicts in a purely-snapshot DB • requires new lock manager • Reining in potentially-unbounded memory usage Tuesday, August 28, 2012

Detecting Conflicts How to detect when an update conflicts with a previous read? Previous SSI implementations: reuse read locks from existing lock mgr But... • PostgreSQL didn’t have read locks! • ...let alone predicate locks Tuesday, August 28, 2012

SSI Lock Manager Needed to build a new lock manager to track read dependencies • Uses multigranularity locks, index-range locks • Doesn’t block, just flags conflicts => no deadlocks • Locks need to persist past transaction commit Tuesday, August 28, 2012

Memory Usage Need to keep track of transaction readsets + conflict graph • not just active transactions; also committed ones that ran concurrently • one long-running transaction can cause memory usage to grow without bound Could exhaust shared memory space (esp. in PostgreSQL) Tuesday, August 28, 2012

Read-Only Transactions Many long-running transactions are read-only; optimize for these Safe snapshots: cases where r/o transactions can never be a part of an anomaly • can then run using regular SI w/o SSI overhead • but: can only detect once all concurrent r/w transactions complete Deferrable transactions: delay execution to ensure safe snapshot Tuesday, August 28, 2012

Graceful Degradation What if we still run out of memory? Don’t want to refuse to accept new transactions Instead: keep less information (tradeoff: more false positives) • keep less state about committed transactions • deduplicate readsets: “read by some committed transaction” Tuesday, August 28, 2012

Performance TPC-C-derived benchmark; modified to have SI anomalies Varied fraction of r/o and r/w transactions Compared PostgreSQL 9.1’s SSI against SI, and an implementation of S2PL Tuesday, August 28, 2012

Serializable Snapshot Isolation in PostgreSQL Dan Ports Kevin - PowerPoint PPT Presentation

Serializable Snapshot Isolation in PostgreSQL Dan Ports Kevin Grittner University of Washington Wisconsin Supreme Court MIT Tuesday, August 28, 2012 For years, PostgreSQLs SERIALIZABLE mode did not provide true serializability

Serializable Snapshot Isolation Making ISOLATION LEVEL SERIALIZABLE Provide Serializable

Corporate Presentation June, 2018 Group Snapshot July 00, 2017 Group Snapshot Group Snapshot -

PostgreSQL Who, What, When, Where, Why, How? 1 QUIS? Who's involved with PostgreSQL? Core

Hacking PostgreSQL Stephen Frost Crunchy Data stephen@crunchydata.com FOSDEM 2019 February 3,

Hacking PostgreSQL Stephen Frost Crunchy Data stephen@crunchydata.com PGConf.EU 2018 October

PostgreSQL SQL-MED Ibrar Ahmed Senior Software Engineer @ Percona PostgreSQL Consultant What?

Breaking PostgreSQL at Scale. Christophe Pettus PostgreSQL Experts pgDay Paris 2019

PostgreSQL Provider The PostgreSQL provider gives the ability to deploy and congure resources

Look It Up: Practical PostgreSQL Indexing Christophe Pettus PostgreSQL Experts

PostgreSQL Replication Christophe Pettus PostgreSQL Experts PerconaLive, April 25, 2018

Securing PostgreSQL Christophe Pettus PostgreSQL Experts, Inc. PGDay FOSDEM 2018 Greetings!

Hosted PostgreSQL: An Objective Look Christophe Pettus PostgreSQL Experts, Inc. FOSDEM PGDay

PostgreSQL on FreeBSD Some news, observations and speculation Thomas Munro, BSDCan 2020

PostgreSQL for developers Dimitri Fontaine PostgreSQL Major Contributor P O S T G R E S Q L M A

Distributed PostgreSQL Santa Clara, California | April 23th 25th, 2018 Simon Riggs CTO,

Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems By Thomas

Dec 05, 2006 Yinghua Wu Where are we? After learning all the foundation of modern

Covariant Quantum Error Correction R e f e r e n c e f r a m e e n c o d i n g , s y m m e t r

Tor: a quick overview Roger Dingledine The Tor Project https://torproject.org/ 1 What is Tor?

Alice and Bob want to communicate securely Achieve confidentiality and

Alice goes floating Frank Mittelbach TUG 2016, Toronto, Canada, July 2016 /Alice goes floating

Algorithms for Big Data (IX) Chihao Zhang Shanghai Jiao Tong University Nov. 15, 2019

Overview Key exchange Session vs. interchange keys Classical, public key methods

Natural Language Processing Anoop Sarkar anoopsarkar.github.io/nlp-class Simon Fraser University