crash recovery

CS 525: Advanced Database Organization. 13: Failure and Recovery. Boris Glavic



  1. CS 525: Advanced Database Organization. 13: Failure and Recovery. Boris Glavic. Slides: adapted from a course taught by Hector Garcia-Molina, Stanford InfoLab. (CS 525 Notes 13 - Failure and Recovery)

     Now
     • Crash recovery

     Correctness (informally)
     • If we stop running transactions, DB left consistent
     • Each transaction sees a consistent DB

     How can constraints be violated?
     • Transaction bug
     • DBMS bug
     • Hardware failure, e.g., disk crash alters balance of account
     • Data sharing, e.g.:
       T1: give 10% raise to programmers
       T2: change programmers ⇒ systems analysts

     Recovery
     • First order of business: Failure Model

     Events
     • Desired
     • Undesired
       – Expected
       – Unexpected

  2. Events
     • Desired events: see product manuals….
     • Undesired expected events: System crash
       – memory lost
       – cpu halts, resets
       that's it!!
     • Undesired Unexpected: Everything else!

     Our failure model
     [Figure: processor (CPU), memory (M), disk (D)]

     Undesired Unexpected: Everything else! Examples:
     • Disk data is lost
     • Memory lost without CPU halt
     • CPU implodes wiping out universe….

     Is this model reasonable?
     Approach: Add low level checks + redundancy to increase the probability that the model holds. E.g.,
     • Replicate disk storage (stable store)
     • Memory parity
     • CPU checks

     Second order of business: Storage hierarchy
     [Figure: item x in memory (DB buffer) and item x on disk]

  3. Operations:
     • Input (x): block containing x → memory
     • Output (x): block containing x → disk
     • Read (x,t): do input(x) if necessary; t ← value of x in block
     • Write (x,t): do input(x) if necessary; value of x in block ← t

     Key problem: Unfinished transaction
     Example. Constraint: A=B
     T1: A ← A × 2
         B ← B × 2

     T1: Read (A,t); t ← t × 2
         Write (A,t);
         Read (B,t); t ← t × 2
         Write (B,t);
         Output (A);
         Output (B);

     State                                         memory          disk
     Before T1 runs                                A: 8,  B: 8     A: 8,  B: 8
     After both writes, before any output          A: 16, B: 16    A: 8,  B: 8
     Failure after Output (A), before Output (B)   A: 16, B: 16    A: 16, B: 8

     After the failure the disk holds A = 16 and B = 8, so the constraint A=B is violated.
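The four operations and the failure scenario above can be traced with a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: each item sits in its own block, and the names `Storage`, `run_t1`, and `demo` are made up for illustration.

```python
class Storage:
    """Toy storage hierarchy: a volatile buffer in front of a disk."""

    def __init__(self, disk):
        self.disk = dict(disk)  # survives a crash
        self.memory = {}        # buffer contents, lost on a crash

    def input(self, x):
        # Input(x): block containing x -> memory
        self.memory[x] = self.disk[x]

    def output(self, x):
        # Output(x): block containing x -> disk
        self.disk[x] = self.memory[x]

    def read(self, x):
        # Read(x,t): do input(x) if necessary, then return the buffered value
        if x not in self.memory:
            self.input(x)
        return self.memory[x]

    def write(self, x, t):
        # Write(x,t): do input(x) if necessary, then change only the buffer
        if x not in self.memory:
            self.input(x)
        self.memory[x] = t

    def crash(self):
        # System crash: memory is lost, the disk keeps whatever was output
        self.memory.clear()


def run_t1(db, crash_before_output_b=False):
    """T1 doubles A and B; optionally crash between the two outputs."""
    t = db.read("A")
    db.write("A", t * 2)
    t = db.read("B")
    db.write("B", t * 2)
    db.output("A")
    if crash_before_output_b:
        db.crash()
        return
    db.output("B")


def demo():
    ok = Storage({"A": 8, "B": 8})
    run_t1(ok)
    broken = Storage({"A": 8, "B": 8})
    run_t1(broken, crash_before_output_b=True)
    return ok.disk, broken.disk
```

Calling `demo()` returns the disk contents of a complete run and of the interrupted run; in the interrupted run A has been doubled on disk while B has not, which is the inconsistent state shown in the last row of the table.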

  4. How to restore consistent state after crash?
     • Desired state after recovery:
       – Changes of committed transactions are reflected on disk
       – Changes of unfinished transactions are not reflected on disk
     • After crash we need to
       – Undo changes of unfinished transactions that have been written to disk
       – Redo changes of finished transactions that have not been written to disk
     • We need to either
       – Store additional data to be able to Undo/Redo
       – Avoid ending up in situations where we need to Undo/Redo

     • Need atomicity:
       – execute all actions of a transaction or none at all

     Example
     T1: Read (A,t); t ← t × 2
         Write (A,t);
         Read (B,t); t ← t × 2
         Write (B,t);
         Output (A);
         failure!
         Output (B);
     Disk after the failure: A: 16, B: 8
     T1 is unfinished -> need to undo the write to A to recover to a consistent state

     Logging
     • After crash need to
       – Undo
       – Redo
     • We need to know
       – Which operations have been executed
       – Which operations are reflected on disk
     • -> Log upfront what is to be done

     Buffer Replacement Revisited
     • Now we are interested in knowing how buffer replacement influences recovery!

  5. Buffer Replacement Revisited
     • Steal: all pages with fix count = 0 are replacement candidates
       – Smaller buffer requirements
     • No steal: pages that have been modified by an active transaction -> not considered for replacement
       – No need to undo operations of unfinished transactions after failure
     • Force: pages modified by a transaction are flushed to disk at the end of the transaction
       – No redo required
     • No force: modified (dirty) pages are allowed to remain in the buffer after the end of the transaction
       – Fewer repeated writes of the same page

     Effects of Buffer Replacement

                 force               no force
     no steal    No Undo, No Redo    No Undo, Redo
     steal       Undo, No Redo       Undo, Redo

     Schedules and Recovery
     • Are there certain schedules that are easy/hard/impossible to recover from?

     Recoverable Schedules
     • We should never have to roll back an already committed transaction (D in ACID)
     • Recoverable schedules require that
       – A transaction does not commit before every transaction that it has read from has committed
       – A transaction T reads from another transaction T’ if it reads an item X that has last been written by T’ and T’ has not aborted before the read

     T1 = w1(X),c1
     T2 = r2(X),w2(X),c2
     Recoverable Schedule:    S1 = w1(X),r2(X),w2(X),c1,c2
     Nonrecoverable Schedule: S2 = w1(X),r2(X),w2(X),c2,c1
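The definition of a recoverable schedule can be checked mechanically. The following is a minimal sketch, not part of the original slides: the string format (operations separated by commas, as in S1 and S2) and the function names `parse`, `reads_from`, and `is_recoverable` are assumptions made for illustration.

```python
import re

# One operation: r/w with an item, e.g. "w1(X)", or a commit/abort, e.g. "c1".
_OP = re.compile(r"([rw])(\d+)\((\w+)\)|([ca])(\d+)")


def parse(schedule):
    """Turn 'w1(X),r2(X),c1' into [('w', 1, 'X'), ('r', 2, 'X'), ('c', 1, None)]."""
    ops = []
    for token in schedule.replace(" ", "").split(","):
        m = _OP.fullmatch(token)
        if m is None:
            raise ValueError(f"cannot parse operation {token!r}")
        if m.group(1):
            ops.append((m.group(1), int(m.group(2)), m.group(3)))
        else:
            ops.append((m.group(4), int(m.group(5)), None))
    return ops


def reads_from(ops):
    """Return the set of (reader, writer) pairs.

    The writer is the last transaction that wrote the item and had not
    aborted before the read; writers that aborted earlier are skipped.
    """
    writers = {}   # item -> transactions that wrote it, oldest first
    aborted = set()
    pairs = set()
    for kind, txn, item in ops:
        if kind == "w":
            writers.setdefault(item, []).append(txn)
        elif kind == "a":
            aborted.add(txn)
        elif kind == "r":
            live = [w for w in writers.get(item, []) if w not in aborted]
            if live and live[-1] != txn:
                pairs.add((txn, live[-1]))
    return pairs


def is_recoverable(schedule):
    """True if no transaction commits before every transaction it read from."""
    ops = parse(schedule)
    commit_pos = {txn: i for i, (kind, txn, _) in enumerate(ops) if kind == "c"}
    for reader, writer in reads_from(ops):
        if reader not in commit_pos:
            continue  # a reader that never commits cannot violate the rule
        if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
            return False
    return True
```

Applied to the two example schedules, `is_recoverable("w1(X),r2(X),w2(X),c1,c2")` holds while `is_recoverable("w1(X),r2(X),w2(X),c2,c1")` does not, because T2 reads X from T1 and commits first.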

  6. Cascading Abort
     • Transaction T has written an item that is later read by T’, and T aborts after that
       – We also have to abort T’ because the value it read is no longer valid
       – This is called a cascading abort
       – Cascading aborts are complex and should be avoided
     S = … w1(X) … r2(X) … a1

     Cascadeless Schedules
     • Cascadeless schedules guarantee that there are no cascading aborts
       – Transactions only read values written by already committed transactions

     T1 = w1(X),c1
     T2 = r2(X),w2(X),c2
     Cascadeless Schedule:    S1 = w1(X),c1,r2(X),w2(X),c2
     Recoverable Schedule:    S2 = w1(X),r2(X),w2(X),c1,c2
     Nonrecoverable Schedule: S3 = w1(X),r2(X),w2(X),c2,c1

     Consider what happens if T1 aborts!
     T1 = w1(X),a1
     T2 = r2(X),w2(X),c2
     Cascadeless Schedule:    S1 = w1(X),a1,r2(X),w2(X),c2
     Recoverable Schedule:    S2 = w1(X),r2(X),w2(X),a1,a2
     Nonrecoverable Schedule: S3 = w1(X),r2(X),w2(X),c2,a1

     Strict Schedules
     • Strict schedules guarantee that to undo the effect of a transaction we simply have to undo each of its writes
       – Transactions neither read nor write items written by uncommitted transactions
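The three definitions can be combined into one pass over a schedule that reports the narrowest class it falls into. This is a sketch under assumptions, not the course's own code: the schedule string format follows the examples above, the function names `parse` and `classify` are hypothetical, and writes of a transaction that has aborted are treated as already undone.

```python
import re

_OP = re.compile(r"([rw])(\d+)\((\w+)\)|([ca])(\d+)")


def parse(schedule):
    """Turn 'w1(X),c1' into [('w', 1, 'X'), ('c', 1, None)]."""
    ops = []
    for token in schedule.replace(" ", "").split(","):
        m = _OP.fullmatch(token)
        if m is None:
            raise ValueError(f"cannot parse operation {token!r}")
        if m.group(1):
            ops.append((m.group(1), int(m.group(2)), m.group(3)))
        else:
            ops.append((m.group(4), int(m.group(5)), None))
    return ops


def classify(schedule):
    """Return 'strict', 'cascadeless', 'recoverable' or 'nonrecoverable'."""
    committed, aborted = set(), set()
    writers = {}     # item -> transactions that wrote it, oldest first
    read_from = []   # (reader, writer) pairs observed so far
    strict = cascadeless = recoverable = True
    for kind, txn, item in parse(schedule):
        if kind == "c":
            # Recoverable: everything txn read from must have committed by now.
            if any(r == txn and w not in committed for r, w in read_from):
                recoverable = False
            committed.add(txn)
        elif kind == "a":
            aborted.add(txn)
        else:
            # Last writer of the item whose write has not been undone by an abort.
            live = [w for w in writers.get(item, []) if w not in aborted]
            last = live[-1] if live else None
            if last is not None and last != txn:
                if kind == "r":
                    read_from.append((txn, last))
                if last not in committed:
                    strict = False            # read or write over uncommitted data
                    if kind == "r":
                        cascadeless = False   # read of uncommitted data
            if kind == "w":
                writers.setdefault(item, []).append(txn)
    if strict:
        return "strict"
    if cascadeless:
        return "cascadeless"
    return "recoverable" if recoverable else "nonrecoverable"
```

Because the function returns the narrowest class, the serial schedule S1 is reported as `"strict"`; every strict schedule is also cascadeless. A schedule such as `w1(X),w2(X),c1,c2` is reported as `"cascadeless"` only, since T2 overwrites X before T1 has committed but never reads it.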

  7. Compare Classes
     ST ⊂ CL ⊂ RC ⊂ ALL
     [Figure: nested sets. All schedules ⊃ Recoverable schedules ⊃ Cascadeless schedules ⊃ Strict schedules]

     Logging and Recovery
     • We now discuss approaches for logging and how to use them in recovery

     One solution: undo logging (immediate modification)
     due to: Hansel and Gretel, 782 AD
     • Improved in 784 AD to durable undo logging

     Undo logging (Immediate modification)
     Constraint: A=B
     T1: Read (A,t); t ← t × 2
         Write (A,t);
         Read (B,t); t ← t × 2
         Write (B,t);
         Output (A);
         Output (B);
     memory: A: 8, B: 8
     disk:   A: 8, B: 8
     log:    (empty)
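The slides in this excerpt stop before stating the rules of undo logging, so the following is only a sketch of the usual textbook scheme, assumed rather than taken from these slides: the old value of an item is logged before the item is changed, the commit record is written only after all of the transaction's outputs, and recovery scans the log backwards restoring old values of unfinished transactions. The record layout and the names `run_t1_with_undo_log` and `recover` are hypothetical, and the Python list standing in for the log is assumed to be stable storage.

```python
def run_t1_with_undo_log(disk, log, crash_after_output_a=False):
    """Run T1 (double A and B) while writing undo records to the log."""
    memory = dict(disk)                          # Input(A), Input(B)
    log.append(("start", "T1"))
    for item in ("A", "B"):
        old = memory[item]
        log.append(("update", "T1", item, old))  # old value logged before the write
        memory[item] = old * 2                   # Write(item, t) changes memory only
    disk["A"] = memory["A"]                      # Output(A)
    if crash_after_output_a:
        return                                   # crash: memory lost, no commit record
    disk["B"] = memory["B"]                      # Output(B)
    log.append(("commit", "T1"))                 # commit only after all outputs


def recover(disk, log):
    """Undo every transaction that has no commit or abort record in the log."""
    started = {rec[1] for rec in log if rec[0] == "start"}
    finished = {rec[1] for rec in log if rec[0] in ("commit", "abort")}
    unfinished = started - finished
    # Scan backwards so the oldest value of each item is restored last.
    for rec in reversed(list(log)):
        if rec[0] == "update" and rec[1] in unfinished:
            _, _, item, old = rec
            disk[item] = old
    for txn in sorted(unfinished):
        log.append(("abort", txn))               # mark the undo as complete


def demo():
    disk, log = {"A": 8, "B": 8}, []
    run_t1_with_undo_log(disk, log, crash_after_output_a=True)
    after_crash = dict(disk)
    recover(disk, log)
    return after_crash, disk, log
```

In `demo()`, the crash leaves A doubled and B unchanged on disk; `recover` then finds T1 without a commit record and restores the logged old values, so the constraint A=B holds again.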
