cse 132c database system implementation
play

CSE 132C Database System Implementation Arun Kumar Topic 6: - PowerPoint PPT Presentation

CSE 132C Database System Implementation Arun Kumar Topic 6: Transaction Management Chapter 16 of Cow Book Slide ACKs: Jignesh Patel 1 Transaction Management Motivation and Basics The ACID Properties Transaction Schedules


  1. CSE 132C 
 Database System Implementation Arun Kumar Topic 6: Transaction Management Chapter 16 of Cow Book Slide ACKs: Jignesh Patel 1

  2. Transaction Management Motivation and Basics ❖ The ACID Properties Transaction Schedules ❖ Conflicts and Other Issues Concurrency Control ❖ SQL/RDBMS Features 2

  3. Motivation ❖ An RDBMS does not exist in a vacuum! ❖ It manages structured data at scale for an application ❖ Banking, insurance, finance, retail, telecom, etc. ❖ Workload could involve a mix of reads and writes ❖ Application determines schema of databases in RDBMS, integrity constraints, instance content, physical design, etc. Application-independence is a key benefit of RDBMSs! Q: But at what level does an application deal with an RDBMS? 3

  4. Motivation ❖ An SQL query is often too fine-grained for an application ❖ In most real-world applications, a “logical unit of work” could be coarser / high-level: May need multiple DDL + DML queries together May involve reading and/or writing (update/delete/insert) May need to check for integrity constraints ❖ A “transaction” is a sequence of operations on the database that captures one logical unit of work for an application (NB: Not really specific to the relational data model!) 4

  5. Motivating Example ❖ A logical unit of work in a bank’s DBMS: “Transfer $5000 from Checking to Savings account” ❖ A sequence of fine-grained operations on the database: C is Checking balance S is Savings balance One logical unit of work 5

  6. Transaction Management Overview ❖ Transaction : DBMS abstraction to capture a sequence of database operations that form one logical unit of work ❖ Transaction Manager : Software module in a DBMS that manages transactions ❖ Benefits of having the “transaction” abstraction: Enables us to reason about concurrent queries (e.g., DMA, multi-core parallelism) Enables us to reason about recovery from crashes 6

  7. Transaction Management Overview Q: What operations does a transaction (txn) contain? ❖ Txn Manager understands only “ Read ” (R) and “ Write ” (W) ❖ Granularity can vary: item, tuple, relation, etc. ❖ All SQL queries mapped to a sequence of R/W! ❖ Three additional special operations: “ Begin ”, “ Commit ”, “ Abort ” 7

  8. Motivating Example Application’s unit of work The abstract transaction R (C) W (C) C 8000 C 3000 C 3000 C 8000 R (S) S 7000 S 2000 S 7000 S 2000 W (S) RAM Disk 8

  9. Seems to work fine. What is the point of a “transaction” abstraction then? 9

  10. What if this happens? Application’s unit of work The abstract transaction R (C) W (C) C 8000 C 3000 C 3000 C 8000 R (S) S 7000 S 2000 S 2000 W (S) RAM Disk OS crashes! Your $5000 has vanished! ☹ 10

  11. Transaction Commits Application’s unit of work The abstract transaction Begin R (C) W (C) C 8000 C 3000 C 3000 C 8000 R (S) S 2000 S 7000 S 7000 S 2000 W (S) RAM Disk Commit Tells the DBMS that the transaction has finished successfully and all of its changes need to be persisted (on disk)! 11

  12. Transaction Aborts Application’s unit of work The abstract transaction Begin R (C) W (C) C -1000 C 4000 C 4000 Abort S 2000 RAM Disk Tells the DBMS that the transaction failed (for some reason) and it should be terminated without persisting any of its changes! 12

  13. Moral of the Story ❖ A transaction must satisfy “ all or nothing ” property Application’s “logical unit of work” is indivisible Either all operations get done or none of them get done ❖ The database state must remain consistent Application’s invariants define what is consistent A transaction is assumed to get the database from one consistent state to another (inconsistency ok in between) C=8000 C=3000 C=3000, S=2000 S=2000 S=7000 C=8000, S=7000 13

  14. Transaction Management Motivation and Basics ❖ The ACID Properties Transaction Schedules ❖ Conflicts and Other Issues Concurrency Control ❖ SQL/RDBMS Features 14

  15. The ACID Properties ❖ Transaction management should ensure 4 key properties ❖ Atomicity Techniques: Logging and Recovery A transaction should be indivisible /“all or nothing” ❖ Consistency App semantics; Techniques: Schedules Database should not become inconsistent in the end ❖ Isolation Techniques: Concurrency Control A transaction should not worry about or interact with other concurrent transactions on the DBMS ❖ Durability Techniques: Logging and Recovery All changes of a “Committed” transaction must persist 15

  16. Transaction Management Motivation and Basics ❖ The ACID Properties Transaction Schedules ❖ Conflicts and Other Issues Concurrency Control ❖ SQL/RDBMS Features 16

  17. Transaction Schedules T1 T2 Begin Begin ❖ Tells us how to interleave T1 T2 R T1 (A) concurrent transactions Begin Begin W T1 (A) ❖ Benefits of interleaving: R T1 (A) R T2 (A) R T2 (A) Lets us exploits disk-CPU W T1 (A) W T2 (A) Time W T2 (A) and multi-core parallelism R T1 (B) R T2 (B) R T1 (B) Helps avoid “ starvation ” of W T1 (B) W T2 (B) W T1 (B) shorter transactions Commit Commit Commit Temporal order of ops in a R T2 (B) transaction must be preserved! W T2 (B) A “ schedule ” Commit 17

  18. Transaction Schedules T1 T2 Begin Begin Begin R T1 (A) Begin R T2 (A) ❖ Complete Schedule R T1 (A) W T1 (A) W T2 (A) Each transaction ends with either W T1 (A) R T1 (B) R T2 (B) a Commit or an Abort R T2 (A) W T1 (B) W T2 (B) ❖ Serial Schedule W T2 (A) Commit Commit No interleaving of ops from R T1 (B) Begin Begin different transactions R T2 (A) R T1 (A) W T1 (B) Any serial schedule is W T2 (A) W T1 (A) Abort Commit R T2 (B) R T2 (B) R T1 (B) considered “ acceptable ”, even if they end up with W T2 (B) W T2 (B) W T1 (B) different database states! Commit Commit Commit 18

  19. Transaction Schedules T1 T2 Begin Begin Q: What is wrong with this schedule? R T1 (A) R T2 (A) The update of A W T2 (A) by T2 is lost! W T1 (A) R T1 (B) “Bad” schedules like this could lead W T1 (B) to an inconsistent state! Commit R T2 (B) Q: What is a “ acceptable ” schedule? W T2 (B) Commit 19

  20. Transaction Schedules: Serializability ❖ Equivalence of Schedules: Two schedules are equivalent iff they lead the database to the same end state irrespective of its start state ❖ Serializable Schedule: A schedule that is equivalent to some complete serial schedule (for now, assume only Committed transactions; Aborts will be discussed later) Only a serializable schedule is an “ acceptable ” schedule! 20

  21. Transaction Schedules T1 T2 Begin Begin R T1 (A) Q: Is this schedule serializable? W T1 (A) R T2 (A) W T2 (A) R T1 (B) W T1 (B) Commit R T2 (B) W T2 (B) Commit 21

  22. Transaction Schedules T1 T2 Begin Begin R T1 (A) Q: Is this schedule serializable? W T1 (A) R T2 (A) R T1 (B) W T1 (B) W T2 (A) R T2 (B) W T2 (B) Commit Commit 22

  23. Transaction Schedules T1 T2 Begin Begin R T1 (A) Q: What about this one from before? R T2 (A) W T2 (A) W T1 (A) R T1 (B) W T1 (B) The update of A Commit by T2 is lost! R T2 (B) This is called a “ conflict ” W T2 (B) Commit 23

  24. Transaction Management Motivation and Basics ❖ The ACID Properties Transaction Schedules ❖ Conflicts and Other Issues Concurrency Control ❖ SQL/RDBMS Features 24

  25. Kinds of Transaction Conflicts ❖ WW Conflict (Overwriting Uncommitted Data) ❖ WR Conflict (Reading Uncommitted Data aka “Dirty” Read) ❖ RW Conflict (Unrepeatable Reads) 25

  26. WW Conflict T1 T2 Begin Begin (Overwriting Uncommitted Data) R T1 (A) R T2 (A) T1 overwrites T2’s update W T2 (A) without reading new A W T1 (A) R T1 (B) This schedule is not serializable! W T1 (B) Commit If a transaction writes an item without R T2 (B) reading it, it is called a “ Blind Write ” W T2 (B) Commit 26

  27. WR Conflict T1 T2 Begin Begin (Reading Uncommitted Data / Dirty Read) R T1 (A) W T1 (A) T1’s writes of A and B may be related; R T2 (A) T2 may read inconsistent database! W T2 (A) R T2 (B) W T2 (B) This schedule is not serializable! Commit R T1 (B) W T1 (B) Commit 27

  28. RW(R) Conflict T1 T2 Begin Begin (Unrepeatable Reads) R T1 (A) W T1 (C) R T2 (A) The two reads of A by T1 W T2 (A) R T2 (B) may yield different values W T2 (B) Commit This schedule is not serializable! R T1 (A) W T1 (D) Commit 28

  29. Review Question T2 T1 Begin Begin R T1 (A) Q: Is this a serializable schedule? W T1 (A) R T2 (A) Q: Does it have any conflicts? R T1 (B) W T1 (B) W T2 (A) R T2 (B) W T2 (B) Commit Commit 29

  30. Okay, what about Aborted transactions? 30

  31. Transactions with Aborts ❖ Serializability: only worry about Commited transactions and pretend as if Aborted transactions did not even happen! ❖ To make the above “illusion” possible, 2 new issues to deal with if Aborted transactions present: ❖ How to “undo” the effects of an Aborted transaction? All changes made by it should be undone Use Logging and Recovery (Later) ❖ What if some other transactions got “affected” by it? Must undo all affected transactions as well! 31

  32. Cascading Aborts T1 T2 Begin Abort of T1 leads to Abort of T2! Begin R T1 (A) W T1 (A) T2’s Commit will be replaced with R T2 (A) an Abort by the Txn Manager! W T2 (A) Abort Commit Abort 32

Recommend


More recommend