Transactional Concurrency Control

Transactions: ACID Properties

“Full-blown” transactions guarantee four intertwined properties:
• Atomicity. Transactions can never “partly commit”; their updates are applied “all or nothing”.
  The system guarantees this using logging, shadowing, distributed commit.
• Consistency. Each transaction T transitions the dataset from one semantically consistent state to another.
  The application guarantees this by correctly marking transaction boundaries.
• Independence/Isolation. All updates by T1 are either entirely visible to T2, or are not visible at all.
  Guaranteed through locking or timestamp-based concurrency control.
• Durability. Updates made by T are “never” lost once T commits.
  The system guarantees this by writing updates to stable storage.

Isolation and Serializability

Isolation/Independence means that actions are serializable.
A schedule for a group of transactions is serializable iff its effect is the same as if they had executed in some serial order.
Obvious approach: execute them in a serial order (slow).
Transactions may be interleaved for concurrency, but this requirement constrains the allowable schedules:
T1 and T2 may be arbitrarily interleaved only if there are no conflicts among their operations.
• A transaction must not affect another that commits before it.
• Intermediate effects of T are invisible to other transactions unless/until T commits, at which point they become visible.

Some Examples of Conflicts

A conflict exists when two transactions access the same item, and at least one of the accesses is a write.
1. lost update problem
  T: transfer $100 from A to C: R(A) W(A) R(C) W(C)
  S: transfer $100 from B to C: R(B) W(B) R(C) W(C)
2. inconsistent retrievals problem (dirty reads violate consistency)
  T: transfer $100 from A to C: R(A) W(A) R(C) W(C)
  S: compute total balance for A and C: R(A) R(C)
3. nonrepeatable reads
  T: transfer $100 from A to C: R(A) W(A) R(C) W(C)
  S: check balance and withdraw $100 from A: R(A) R(A) W(A)

Serializable Schedules

A schedule is a partial ordering of operations for a set of transactions {T,S,...}, such that:
• The operations of each transaction execute serially.
• The schedule specifies an order for conflicting operations.
Any two schedules for {T,S,...} that order the conflicting operations in the same way are equivalent.
A schedule for {T,S,...} is serializable if it is equivalent to some serial schedule on {T,S,...}.
There may be other serializable schedules on {T,S,...} that do not meet this condition, but if we enforce this condition we are safe.
Conflict serializability: detect conflicting operations and enforce a serial-equivalent order.

Legal Interleaved Schedules: Examples

T < S
1. avoid lost update problem
  T: transfer $100 from A to C: R(A) W(A) R(C) W(C)
  S: transfer $100 from B to C: R(B) W(B) R(C) W(C)
2. avoid inconsistent retrievals problem
  T: transfer $100 from A to C: R(A) W(A) R(C) W(C)
  S: compute total balance for A and C: R(A) R(C)
3. avoid nonrepeatable reads
  T: transfer $100 from A to C: R(A) W(A) R(C) W(C)
  S: check balance and withdraw $100 from A: R(A) R(A) W(A)
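The equivalence condition for schedules (two schedules are equivalent if they order every pair of conflicting operations the same way) can be checked by brute force for small examples. The following is a minimal sketch, not a production algorithm: the tuple encoding of operations, the function names, and the two concrete interleavings of the lost-update example are assumptions made for illustration. It also assumes each schedule already preserves the internal order of every transaction.

```python
from itertools import permutations

# Assumed encoding: an operation is (transaction, action, item), e.g. ("T", "R", "A").

def tag(schedule):
    """Number each operation within its own transaction so that repeated
    operations such as R(A) R(A) stay distinguishable."""
    counts, tagged = {}, []
    for txn, action, item in schedule:
        k = counts.get(txn, 0)
        counts[txn] = k + 1
        tagged.append((txn, k, action, item))
    return tagged

def conflicts(op1, op2):
    """Different transactions, same item, at least one write."""
    return op1[0] != op2[0] and op1[3] == op2[3] and "W" in (op1[2], op2[2])

def conflict_order(schedule):
    """Set of (earlier, later) pairs of conflicting operations."""
    ops = tag(schedule)
    return {(a, b) for i, a in enumerate(ops) for b in ops[i + 1:] if conflicts(a, b)}

def serial(transactions, order):
    """Serial schedule that runs whole transactions in the given order."""
    return [(t, a, x) for t in order for (a, x) in transactions[t]]

def is_conflict_serializable(schedule, transactions):
    """True if some serial order orders all conflicting pairs the same way."""
    target = conflict_order(schedule)
    return any(conflict_order(serial(transactions, order)) == target
               for order in permutations(transactions))

# Lost-update example: T moves $100 from A to C, S moves $100 from B to C.
txns = {
    "T": [("R", "A"), ("W", "A"), ("R", "C"), ("W", "C")],
    "S": [("R", "B"), ("W", "B"), ("R", "C"), ("W", "C")],
}

# Hypothetical legal interleaving: T finishes with C before S touches it (T < S).
legal = [("T", "R", "A"), ("T", "W", "A"), ("S", "R", "B"), ("S", "W", "B"),
         ("T", "R", "C"), ("T", "W", "C"), ("S", "R", "C"), ("S", "W", "C")]

# Hypothetical bad interleaving: both read C before either writes it.
lost_update = [("T", "R", "A"), ("T", "W", "A"), ("S", "R", "B"), ("S", "W", "B"),
               ("T", "R", "C"), ("S", "R", "C"), ("T", "W", "C"), ("S", "W", "C")]

print(is_conflict_serializable(legal, txns))
print(is_conflict_serializable(lost_update, txns))
```

Trying every serial order is exponential in the number of transactions, so this only suits hand-sized examples.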
Defining the Legal Schedules

1. To be serializable, the conflicting operations of T and S must be ordered as if either T or S had executed first.
  We only care about the conflicting operations: everything else will take care of itself.
2. Suppose T and S conflict over some shared item(s) x.
3. In a serial schedule, T’s operations on x would appear before S’s, or vice versa, for every shared item x.
  As it turns out, this is true for all the operations, but again, we only care about the conflicting ones.
4. A legal (conflict-serializable) interleaved schedule of T and S must exhibit the same property.
  Either T or S “wins” in the race to x; serializability dictates that the “winner take all”.

The Graph Test for Serializability

To determine if a schedule is serializable, make a directed graph:
• Add a node for each committed transaction.
• Add an arc from T to S if any equivalent serial schedule must order T before S.
  T must commit before S iff the schedule orders some operation of T before some operation of S.
  The schedule only defines such an order for conflicting operations...
  ...so this means that a pair of accesses from T and S conflict over some item x, and the schedule says T “wins” the race to x.
• The schedule is conflict-serializable if the graph has no cycles. (winner take all)

[Figure: schedule graph with nodes T and S and arcs labeled A and C.]

The Graph Test: Example

Consider two transactions T and S:
  T: transfer $100 from A to C: R(A) W(A) R(C) W(C)
  S: compute total balance for A and C: R(A) R(C)

[Figure: two interleavings of T and S, each with its schedule graph on nodes T and S and arcs labeled A and C. In one, S’s total balance gains $100; in the other, S’s total balance loses $100.]

Transactional Concurrency Control

Three basic ways to ensure a serial-equivalent order on conflicts, plus a hybrid:
• Option 1, execute transactions serially.
  “single shot” transactions
• Option 2, pessimistic concurrency control: block T until transactions with conflicting operations are done.
  use locks for mutual exclusion
  two-phase locking (2PL) required for strict isolation
• Option 3, optimistic concurrency control: proceed as if no conflicts will occur, and recover if constraints are violated.
  Repair the damage by rolling back (aborting) one of the conflicting transactions.
• Option 4, hybrid timestamp ordering using versions.

Pessimistic Concurrency Control

Pessimistic concurrency control uses locking to prevent illegal conflict orderings.
  avoid/reduce expensive rollbacks
• Well-formed: acquire lock before accessing each data item.
  Concurrent transactions T and S race for locks on conflicting data items (say x and y)....
  Locks are often implicit, e.g., on first access to a data object/page.
• No acquires after release: hold all locks at least until all needed locks have been acquired (2PL).
  growing phase vs. shrinking phase
• Problem: possible deadlock.
  prevention vs. detection and recovery

Why 2PL?

If transactions are well-formed, then an arc from T to S in the schedule graph indicates that T beat S to some lock.
  Neither could access the shared item x without holding its lock.
  Read the arc as “T holds a resource needed by S”.
2PL guarantees that the “winning” transaction T holds all its locks at some point during its execution.
  Thus 2PL guarantees that T “won the race” for all the locks...
  ...or else a deadlock would have resulted.

[Figure: T: R(A) W(A) R(C) W(C) and S: R(A) R(C), with a schedule graph on nodes T and S and arcs labeled A and C.]
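The schedule graph used in the graph test and in the 2PL argument above can be built mechanically from a schedule. The sketch below is one possible way to do it, under stated assumptions: operations are encoded as (transaction, action, item) tuples, every transaction in the schedule is treated as committed, and the three concrete interleavings of T and S are hypothetical ones chosen to reproduce the “gains $100” and “loses $100” outcomes plus one legal schedule. The function names are illustrative only.

```python
def precedence_graph(schedule):
    """Add an arc T -> S for every conflicting pair in which T's operation
    comes first. Assumes every transaction in the schedule has committed."""
    arcs = {}
    for i, (t1, a1, x1) in enumerate(schedule):
        arcs.setdefault(t1, set())
        for t2, a2, x2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (a1, a2):
                arcs[t1].add(t2)
    return arcs

def has_cycle(arcs):
    """Depth-first search; reaching a node that is still on the current
    path (GRAY) means the graph has a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in arcs}

    def visit(node):
        color[node] = GRAY
        for nxt in arcs[node]:
            if color[nxt] == GRAY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in arcs)

def is_conflict_serializable(schedule):
    """The graph test: conflict-serializable iff the graph is acyclic."""
    return not has_cycle(precedence_graph(schedule))

# T: transfer $100 from A to C.  S: compute total balance for A and C.

# Hypothetical interleaving: S reads A before the debit and C after the
# credit, so its total gains $100 (arcs S -> T on A and T -> S on C).
gains = [("S", "R", "A"), ("T", "R", "A"), ("T", "W", "A"),
         ("T", "R", "C"), ("T", "W", "C"), ("S", "R", "C")]

# Hypothetical interleaving: S reads A after the debit and C before the
# credit, so its total loses $100 (arcs T -> S on A and S -> T on C).
loses = [("T", "R", "A"), ("T", "W", "A"), ("S", "R", "A"),
         ("S", "R", "C"), ("T", "R", "C"), ("T", "W", "C")]

# Hypothetical legal interleaving: T wins the race to both A and C.
legal = [("T", "R", "A"), ("T", "W", "A"), ("S", "R", "A"),
         ("T", "R", "C"), ("T", "W", "C"), ("S", "R", "C")]

for name, schedule in [("gains", gains), ("loses", loses), ("legal", legal)]:
    print(name, precedence_graph(schedule), is_conflict_serializable(schedule))
```

In the two illegal schedules each transaction wins the race to one item, which is exactly the split outcome that “winner take all” rules out.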