Recovery Techniques

CS2550, Panos K. Chrysanthis – University of Pittsburgh

The System Failure Problem

Because blocks are buffered in main memory, it is possible that the stable database:
  - contains values written by uncommitted transactions;
  - does not contain values written by committed transactions.

Recovery protocols implement two actions:
  - Undo action: required for atomicity. Undoes all updates on stable storage made by an uncommitted transaction.
  - Redo action: required for durability. Redoes the updates (on stable storage) of committed transactions.

Recovery Techniques and Assumptions

  - Undo/Redo Algorithm
  - Undo/No-Redo
  - No-Undo/Redo (also called logging with deferred updates)
  - No-Undo/No-Redo (also called shadowing)

Failure Types

Program Failures
  - logical errors
  - bad input
  - unavailable data
  - resource limits
  - user cancellation

System Failures
  - computer hardware malfunction
  - bugs in the O.S.
  - power failures
  - operator error
Failure Types

Media Failures
  - disk head crash
  - data transfer error
  - disk controller failure

Unrecoverable errors
  - failure to make archive dumps
  - destruction of archives

Centralized DBMS

[Figure: layered architecture of a centralized DBMS, top to bottom]
  - Transactions T1, T2, ..., Tn issue {Start, Read(x), Write(x), Commit, Abort} to the Transaction Manager.
  - The Transaction Manager passes {Start, Read(x), Write(x), Commit, Abort} to the Scheduler.
  - Actions of the Scheduler: 1. Execution, 2. Reject, 3. Delay.
  - The Scheduler passes {Start, Read(x), Write(x), Commit, Abort} to the Data Manager, which consists of the Recovery Manager and the Cache Manager.
  - The Recovery Manager issues {Flush(x), Fetch(x), Fix(x), Unfix(x), Write(x)} to the Cache Manager.
  - The Cache Manager maintains the Database Buffer and the Log Buffer, and accesses stable storage through DiskRead(x,a,b) and DiskWrite(x,a,b).
  - Stable Database and Catalog.
  - Temporary Log. Support: Transaction UNDO, Global UNDO, Partial REDO.
  - Archive Log and Archive Database. Support: Global REDO.

Recovery Techniques and Assumptions

All the techniques assume the following:
  - Failures are detectable.
  - Write operations are atomic (i.e., each executes either in its entirety or not at all).
      - If this is not the case, we consider this failure as media failure.
  - The scheduler sends operations to the DM in an order which produces executions that are correct and strict.
  - No media failure.
  - The granularity of the Writes that the DM processes is the same as that of the atomic Write supported by the disk hardware.

Buffer Management

  - The goal of the Cache or Buffer Manager (BM) is to maximize the likelihood that a block of data needed by a transaction is in main memory.
  - The main memory is partitioned into buffer blocks (or simply blocks).
  - The size of a buffer is equal to the disk block size.

[Figure: main memory holds buffer blocks Buffer0 to Buffer4, three of which contain X2, X5 and X3; the disk holds disk blocks X0 to X5.]
Buffer Management

  - If no more buffers are available, the BM must replace one of the buffer blocks (writing the block back to disk if it has been updated).
      - Least Recently Used (LRU),
      - Least Frequently Used (LFU), etc.
  - Concurrency and recovery are two other factors affecting the replacement algorithm.
  - Buffer Management Operations: Read, Write, Fetch, Flush, Force-Write, Fix, Unfix

Buffer Management Table

  page Id   dirty bit   fix count   buffer number
  x         0           0           0
  y         1           2           1
  z         0           1           2

Buffer Management Operations

Fix(pid, flags) (also called Pin)
  - A fixed page will not be replaced until the page is unfixed.
  - It may involve up to two I/Os: one to write out a dirty page and another to read in the requested page.

Unfix(pid, flags) (also called Unpin)
  - Decrements by one the counter that indicates the number of transactions that have fixed a page.
  - If this counter is 0, the page is made available for swapping, e.g., it is placed at the tail of the LRU queue.

Touch(pid, flags)
  - Sets the flags of the page in the buffer table without unfixing the page.

Possible flags:
  - make the page dirty
  - flush the page
  - fix the page without reading it from disk, etc.
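The Fix/Unfix/Touch operations and the buffer table above can be sketched as follows. This is a minimal illustration rather than the course's implementation: the class name `BufferManager`, the `io_count` counter, the dictionary standing in for the disk, and the simplified signatures (no `flags` argument; `touch` takes the new value and marks the page dirty) are all assumptions made for the example. Replacement is LRU over unfixed pages only.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager: LRU replacement restricted to unfixed pages."""

    def __init__(self, capacity, disk):
        self.capacity = capacity      # number of buffer blocks
        self.disk = disk              # dict: page id -> value (stands in for the stable database)
        self.table = OrderedDict()    # page id -> entry; front = least recently used
        self.io_count = 0             # hypothetical counter, only to make I/O visible

    def fix(self, pid):
        """Pin a page, reading it from disk if it is not buffered."""
        if pid not in self.table:
            if len(self.table) >= self.capacity:
                self._replace()       # may cost one write I/O
            self.table[pid] = {"value": self.disk[pid], "dirty": False, "fix_count": 0}
            self.io_count += 1        # one read I/O
        entry = self.table[pid]
        entry["fix_count"] += 1
        self.table.move_to_end(pid)   # most recently used
        return entry

    def unfix(self, pid):
        """Unpin a page; at fix count 0 it becomes a replacement candidate."""
        entry = self.table[pid]
        if entry["fix_count"] == 0:
            raise ValueError("page is not fixed")
        entry["fix_count"] -= 1
        if entry["fix_count"] == 0:
            self.table.move_to_end(pid)   # tail of the LRU queue

    def touch(self, pid, value):
        """Update a buffered page and set its dirty bit without unfixing it."""
        entry = self.table[pid]
        entry["value"] = value
        entry["dirty"] = True

    def _replace(self):
        """Evict the least recently used unfixed page, writing it back if dirty."""
        victim = next((p for p, e in self.table.items() if e["fix_count"] == 0), None)
        if victim is None:
            raise RuntimeError("all buffers are fixed")
        entry = self.table.pop(victim)
        if entry["dirty"]:
            self.disk[victim] = entry["value"]
            self.io_count += 1        # one write I/O
```

Fixing a page that is not buffered while the victim is dirty costs two I/Os here (one write, one read), which matches the upper bound stated for Fix.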
Stable Database

  - The stable database is the state of the database in stable storage.
  - There are two ways to update a data item x in stable storage (propagation strategies):
      - In-Place Updating: update x in place (i.e., overwrite x).
        [Figure: disk block Di is read into page Pi, and Pi is written back to the same block Di.]
      - Shadowing: write the new value of x in a copy (older versions are called shadow copies).
        [Figure: disk block Di is read into page Pi, and Pi is written to a different block Dj.]
        In this case, there must be a directory in stable storage to tell where each item is.
        [Figure: stable storage holds "value of X", "value of Y" and "new Y". Directory Copy A and Directory Copy B each have entries for X and Y; the copies differ in which version of Y they reference.]

Logging

  - A log is a sequence of records which represent all modifications to the database. Log records may describe either physical changes or logical database operations.
  - A physical log contains information about the actual values of data items written by transactions:
      - state before change (before image)
      - state after change (after image)
      - transition causing the change
  - A logical log represents higher level operations; e.g., insert this key in that index.
  - The order in which updates appear is the same as the order in which they actually occurred.
  - The precise way history is represented in the log depends on the technique followed by the recovery manager.
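The shadowing strategy described above can be illustrated with a small sketch. It is only a model under stated assumptions: the class `ShadowStore` and its methods are hypothetical names, stable storage is a Python list of blocks, and replacing the `directory` reference stands in for the atomic switch from one directory copy to the other.

```python
class ShadowStore:
    """Toy model of shadowing: new values go to fresh blocks, never in place."""

    def __init__(self, items):
        self.blocks = list(items.values())                      # stable storage blocks
        self.directory = {k: i for i, k in enumerate(items)}    # current directory copy

    def get(self, key):
        """Read an item through the current directory."""
        return self.blocks[self.directory[key]]

    def prepare(self, changes):
        """Write new versions to new blocks and return the updated directory copy."""
        new_directory = dict(self.directory)
        for key, value in changes.items():
            self.blocks.append(value)               # old block remains as the shadow copy
            new_directory[key] = len(self.blocks) - 1
        return new_directory

    def commit(self, new_directory):
        """Install the new directory (assumed to be a single atomic step)."""
        self.directory = new_directory
```

Until `commit` is called, readers still see the old value through the old directory, so discarding the prepared directory is all that is needed to abandon the update.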
Log Records

For the moment, we will assume that a log record may be one of the following types:
  - Start Record: [Ti, start]
  - Commit Record: [Ti, commit]
  - Abort Record: [Ti, abort]

Update Record for physical state logging at page level
  - [Ti, x, b, a]
      Ti: the id of the transaction that performed a Write operation on x
      x: the id of data item x
      b: before image of x
      a: after image of x
  - Assuming Strict Executions, the before image of x for Ti can be found in the record of the transaction Tj that wrote into x before Ti:
      [Tj, x, b]
      [Ti, x, a]

Update Record for physical transition logging at page level
  - [Ti, x, b, d]
  - d is the difference between the before and after images
  - d = before ⊗ after

Logical Logging on the Record Level

Simply record the operation and its arguments:
  - [Ti, Op, Inv-op, Arg]
      Op = {Insert, Delete, Update} [REDO]
      Inv-op = inverse operation [UNDO]
      Arg = arguments
  => It is not possible in all models to automatically generate the inverse; e.g., the network model.
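Physical transition logging, as described above, stores a difference d instead of two full images. The sketch below assumes that ⊗ denotes bitwise XOR, a common choice because the same d then serves for both undo and redo; the helper name `xor_bytes` and the sample page contents are made up for the illustration.

```python
def xor_bytes(p, q):
    """Bitwise XOR of two equal-length byte strings."""
    if len(p) != len(q):
        raise ValueError("images must have the same length")
    return bytes(a ^ b for a, b in zip(p, q))


before = b"balance=100"          # hypothetical before image of x
after = b"balance=250"           # hypothetical after image of x

d = xor_bytes(before, after)     # transition logged for this update

redone = xor_bytes(before, d)    # applying d to the before image yields the after image
undone = xor_bytes(after, d)     # applying d to the after image yields the before image
```

Because XOR is its own inverse, recovery must know which state the page is currently in before applying d; applying it twice returns the page to where it started.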
Undoing and Redoing Writes

UNDO Rule (WAL, Write Ahead Logging principle)
  - Scenario: T writes x; then T aborts or the system crashes.
  - If x was transferred to disk, then we need the before image of x to undo this update.
  - Thus, when x is updated by T, the DM should first store the before image of x in the log on stable storage and then x itself in the stable database.

REDO Rule
  - Scenario: T writes x; T commits; then the system crashes.
  - If x was not transferred to disk, at restart time we need the after image of x to redo T's update.
  - Thus, the DM should not commit a transaction T until the after image of each data item written by T is in stable storage.

BM Table for Buffered Log

  page Id   dirty bit   fix count   block LSN   buffer number
  x         0           0           812         0
  y         1           1           10          1
  z         0           1           123         2

The Undo rule is: before the BM replaces a block, it should flush all log entries whose LSN is less than or equal to the LSN recorded on this buffer block.

Restarts

Restart: consult the log and for each transaction Ti do the following:
  - Redo the updates of Ti if there is a commit record of Ti in the log.
  - Undo the updates of Ti if there is no such record in the log (i.e., Ti had been aborted or it was active when the system crashed).
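The restart procedure can be sketched as below. This is a simplified model, not the algorithm as presented in the course: log records are plain tuples in the shapes [Ti, start], [Ti, commit], [Ti, abort] and [Ti, x, b, a], the stable database is a dictionary, and the function name `restart` is hypothetical. The sketch undoes uncommitted writes in a backward scan before redoing committed writes in a forward scan, which relies on the strict-execution assumption.

```python
def restart(log, stable_db):
    """Bring stable_db to a state that reflects exactly the committed transactions."""
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}

    # Undo pass: scan backward, restoring before images of writes
    # by transactions that have no commit record.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _tid, x, before, _after = rec
            stable_db[x] = before

    # Redo pass: scan forward, reinstalling after images of writes
    # by committed transactions.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _tid, x, _before, after = rec
            stable_db[x] = after

    return stable_db


# Example: T1 committed but its write to x never reached disk;
# T2 was active at the crash and its write to y did reach disk.
log = [
    ("T1", "start"),
    ("T1", "x", 0, 1),
    ("T2", "start"),
    ("T2", "y", 0, 7),
    ("T1", "commit"),
]
db = restart(log, {"x": 0, "y": 7})
```

Both passes write absolute images, so running `restart` again after a crash during recovery produces the same result.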