21 Database Recovery Intro to Database Systems Andy Pavlo AP AP 15-445/15-645 Computer Science Carnegie Mellon University Fall 2019
2 CRASH RECOVERY Recovery algorithms are techniques to ensure database consistency, transaction atomicity, and durability despite failures. Recovery algorithms have two parts: → Actions during normal txn processing to ensure that the DBMS can recover from a failure. → Actions after a failure to recover the database to a state Today that ensures atomicity, consistency, and durability. CMU 15-445/645 (Fall 2019)
3 ARIES A lgorithms for R ecovery and I solation E xploiting S emantics Developed at IBM Research in early 1990s for the DB2 DBMS. Not all systems implement ARIES exactly as defined in this paper but they're close enough. CMU 15-445/645 (Fall 2019)
4 ARIES M AIN IDEAS Write-Ahead Logging: → Any change is recorded in log on stable storage before the database change is written to disk. → Must use STEAL + NO-FORCE buffer pool policies. Repeating History During Redo: → On restart, retrace actions and restore database to exact state before crash. Logging Changes During Undo: → Record undo actions to log to ensure action is not repeated in the event of repeated failures. CMU 15-445/645 (Fall 2019)
5 Log Sequence Numbers Normal Commit & Abort Operations Fuzzy Checkpointing Recovery Algorithm CMU 15-445/645 (Fall 2019)
6 WAL RECO RDS We need to extend our log record format from last class to include additional info. Every log record now includes a globally unique log sequence number (LSN). Various components in the system keep track of LSNs that pertain to them… CMU 15-445/645 (Fall 2019)
7 LO G SEQ UEN CE N UM BERS Name Where Definition Memory Last LSN in log on disk flushedLSN page x Newest update to page x pageLSN Oldest update to page x page x recLSN since it was last flushed T i Latest record of txn T i lastLSN Disk LSN of latest checkpoint MasterRecord CMU 15-445/645 (Fall 2019)
8 WRITIN G LO G RECO RDS Each data page contains a pageLSN . → The LSN of the most recent update to that page. System keeps track of flushedLSN . → The max LSN flushed so far. Before page x can be written to disk, we must flush log at least to the point where: → pageLSN x ≤ flushedLSN CMU 15-445/645 (Fall 2019)
9 WRITIN G LO G RECO RDS Log Sequence Numbers Log Sequence Numbers WAL ? WAL (Tail) 001 :<T 1 BEGIN > 002 :<T 1 , A, 1, 2> 017 :<T 5 BEGIN > 003 :<T 1 COMMIT > 004 :<T 2 BEGIN > 018 :<T 5 , A, 9, 8> 005 :<T 2 , A, 2, 3> 006 :<T 3 BEGIN > 019 :<T 5 , B, 5, 1> 007 :< CHECKPOINT > 008 :<T 2 COMMIT > 020 :<T 5 COMMIT > 009 :<T 3 , A, 3, 4> 010 :<T 4 BEGIN > ⋮ 011 :<T 4 , X, 5, 6> 012 :<T 4 , Y, 9, 7> ? 013 :<T 3 , B, 4, 2> 014 :<T 3 COMMIT > 015 :<T 4 , B, 2, 3> 016 :<T 4 , C, 1, 2> Buffer Pool pageLSN recLSN pageLSN recLSN A=9 B=5 C=2 A=9 B=5 C=2 Not safe to unpin because Safe to unpin because MasterRecord pageLSN ≤ flushedLSN pageLSN > flushedLSN flushedLSN Database CMU 15-445/645 (Fall 2019)
10 WRITIN G LO G RECO RDS All log records have an LSN . Update the pageLSN every time a txn modifies a record in the page. Update the flushedLSN in memory every time the DBMS writes out the WAL buffer to disk. CMU 15-445/645 (Fall 2019)
11 N O RM AL EXECUTIO N Each txn invokes a sequence of reads and writes, followed by commit or abort. Assumptions in this lecture: → All log records fit within a single page. → Disk writes are atomic. → Single-versioned tuples with Strict 2PL. → STEAL + NO-FORCE buffer management with WAL. CMU 15-445/645 (Fall 2019)
12 TRAN SACTIO N CO M M IT Write COMMIT record to log. All log records up to txn’s COMMIT record are flushed to disk. → Note that log flushes are sequential, synchronous writes to disk. → Many log records per log page. When the commit succeeds, write a special TXN- END record to log. → This does not need to be flushed immediately. CMU 15-445/645 (Fall 2019)
13 TRAN SACTIO N CO M M IT We can trim the in-memory log up to flushedLSN WAL WAL (Tail) 001 :<T 1 BEGIN > 002 :<T 1 , A, 1, 2> 012 :<T 4 BEGIN > 003 :<T 1 COMMIT > 004 :<T 2 BEGIN > 013 :<T 4 , A, 9, 8> 005 :<T 2 , A, 2, 3> 006 :<T 3 BEGIN > 014 :<T 4 , B, 5, 1> 007 :< CHECKPOINT > 008 :<T 2 COMMIT > 015 :<T 4 COMMIT > 009 :<T 3 , A, 3, 4> 010 :<T 3 , B, 4, 2> ⋮ 011 :<T 3 , COMMIT > 012 :<T 4 BEGIN > 099 :<T 4 TXN-END > 013 :<T 4 , A, 9, 8> 014 :<T 4 , B, 5, 1> 015 :<T 4 COMMIT > Buffer Pool pageLSN recLSN pageLSN recLSN A=9 B=5 C=2 A=9 B=5 C=2 MasterRecord flushedLSN Database flushedLSN = 015 CMU 15-445/645 (Fall 2019)
15 TRAN SACTIO N ABO RT Aborting a txn is actually a special case of the ARIES undo operation applied to only one transaction. We need to add another field to our log records: → prevLSN : The previous LSN for the txn. → This maintains a linked-list for each txn that makes it easy to walk through its records. CMU 15-445/645 (Fall 2019)
16 TRAN SACTIO N ABO RT LSN | prevLSN WAL WAL (Tail) 012|nil :<T 4 BEGIN > Important: Need to record what 013|012 :<T 4 , A, 9, 8> steps we took to undo the txn. 014|013 :<T 4 , B, 5, 1> 015|014 :<T 4 ABORT > ??? 099|098 :<T 4 TXN-END > Buffer Pool pageLSN recLSN pageLSN recLSN A=9 B=5 C=2 A=9 B=5 C=2 MasterRecord flushedLSN Database CMU 15-445/645 (Fall 2019)
17 CO M PEN SATIO N LO G RECO RDS A CLR describes the actions taken to undo the actions of a previous update record. It has all the fields of an update log record plus the undoNext pointer (the next-to-be-undone LSN). CLRs are added to log like any other record. CMU 15-445/645 (Fall 2019)
18 TRAN SACTIO N ABO RT CLR EXAM PLE LSN prevLSN TxnId Type Object Before After UndoNext 001 nil T 1 BEGIN - - - - 002 001 T 1 UPDATE A 30 40 - TIM E ⋮ 011 002 T 1 ABORT - - - - CMU 15-445/645 (Fall 2019)
19 TRAN SACTIO N ABO RT CLR EXAM PLE LSN prevLSN TxnId Type Object Before After UndoNext 001 nil T 1 BEGIN - - - - 002 001 T 1 UPDATE A 30 40 - TIM E ⋮ 011 002 T 1 ABORT - - - - ⋮ 026 011 T 1 CLR-002 A 40 30 001 The LSN of the next log record to be undone. CMU 15-445/645 (Fall 2019)
20 TRAN SACTIO N ABO RT CLR EXAM PLE LSN prevLSN TxnId Type Object Before After UndoNext 001 nil T 1 BEGIN - - - - 002 001 T 1 UPDATE A 30 40 - TIM E ⋮ 011 002 T 1 ABORT - - - - ⋮ 026 011 T 1 CLR-002 A 40 30 001 027 026 T 1 TXN-END - - - nil CMU 15-445/645 (Fall 2019)
21 ABO RT ALGO RITH M First write an ABORT record to log for the txn. Then play back the txn's updates in reverse order. For each update record: → Write a CLR entry to the log. → Restore old value. At end, write a TXN-END log record. Notice: CLRs never need to be undone. CMU 15-445/645 (Fall 2019)
22 Log Sequence Numbers Normal Commit & Abort Operations Fuzzy Checkpointing Recovery Algorithm CMU 15-445/645 (Fall 2019)
23 N O N- FUZZY CH ECKPO IN TS The DBMS halts everything when it takes a checkpoint to ensure a consistent snapshot: → Halt the start of any new txns. → Wait until all active txns finish executing. → Flushes dirty pages on disk. This is obviously bad… CMU 15-445/645 (Fall 2019)
24 SLIGH TLY BETTER CH ECKPO IN TS Pause modifying txns while the DBMS takes the checkpoint. Transaction Checkpoint → Prevent queries from acquiring write latch Page #1 Page #1 on table/index pages. Page #2 → Don't have to wait until all txns finish before taking the checkpoint. Page #3 Page #3 We must record internal state as of Page #1 Page #2 Page #3 the beginning of the checkpoint. → Active Transaction Table (ATT) → Dirty Page Table (DPT) CMU 15-445/645 (Fall 2019)
25 ACTIVE TRAN SACTIO N TABLE One entry per currently active txn. → txnId : Unique txn identifier. → status : The current "mode" of the txn. → lastLSN : Most recent LSN created by txn. Entry removed when txn commits or aborts. Txn Status Codes: → R → Running → C → Committing → U → Candidate for Undo CMU 15-445/645 (Fall 2019)
26 DIRTY PAGE TABLE Keep track of which pages in the buffer pool contain changes from uncommitted transactions. One entry per dirty page in the buffer pool: → recLSN : The LSN of the log record that first caused the page to be dirty. CMU 15-445/645 (Fall 2019)
27 SLIGH TLY BETTER CH ECKPO IN TS WAL <T 1 BEGIN > At the first checkpoint, T 2 is still <T 2 BEGIN > <T 1 , A →P 11 , 100, 120> running and there are two dirty pages <T 1 COMMIT > ( P 11 , P 22 ). <T 2 , C →P 22 , 100, 120> < CHECKPOINT ATT={T 2 }, At the second checkpoint, T 3 is active DPT={P 11 ,P 22 } > <T 3 START > and there are two dirty pages ( P 11 , P 33 ). <T 2 , A →P 11 , 120, 130> <T 2 COMMIT > <T 3 , B →P 33 , 200, 400> This still is not ideal because the DBMS < CHECKPOINT must stall txns during checkpoint… ATT={T 3 }, DPT={P 11 ,P 33 } > <T 3 , B →P 33 , 400, 600> CMU 15-445/645 (Fall 2019)
Recommend
More recommend