Gordon CPS 352 - Crash Recovery


CS352 Lecture - Crash Recovery

Last revised 11/16/06

Objectives:

1. To introduce the use of a log with deferred update
2. To introduce the use of a log with immediate update
3. To introduce shadow paging

Materials:

1. PROJECTABLES showing Shadow Paging

I. Introduction

   A. One of the most important functions of a DBMS is ensuring the integrity of the data in the face of various unpredictable events such as power failures, hardware failures, software failures etc. In fact, the uniform protection of data that a good DBMS provides may be one of its greatest advantages over traditional file processing.

   B. In general, it is necessary to protect data against several kinds of sources of corruption:

      1. Logical errors in the incoming data, which cause an operation to be aborted before it is completed. (Also in this category is the possibility that an interactive system may allow a user to abort an operation he has begun before it completes.)

      2. Failure of a transaction to complete execution due to issues related to concurrency (e.g. rollback, deadlock).

      3. System crashes that shut down the system unexpectedly. These can arise from:

         a) Power failures.
         b) Hardware failures - e.g. a chip going bad in the CPU.
         c) Software failures - e.g. operating system crashes.
         d) Network communication failures (due to many possible causes).
         e) Human error - an operator pressing a wrong button or issuing a command that crashes the system.

         If a system crash occurs in the middle of updating the database, the writing of the new data may be only partially completed, resulting in corrupted data.

      4. Hardware failures that damage the media storing data. Of these, the most potentially catastrophic is a head crash, in which a disk drive head comes into contact with the surface of the disk - effectively destroying all the data on the platter.

      5. External catastrophes such as fire, flood, etc.

   C. In general, data is stored in three types of storage, each with its own degree of security against loss:

      1. Data in VOLATILE STORAGE - e.g. the main memory of the computer - is subject to loss at any time due to any kind of system failure. In particular, power failures, most hardware failures, and many software crashes will cause data in volatile storage to be lost.

      2. Data in NON-VOLATILE STORAGE - e.g. disk and tape - is much more secure. Data in non-volatile storage is generally not lost unless there is a power failure while it is being written, a catastrophic failure of the storage device itself (e.g. a head crash on a disk), or an external catastrophe such as fire or flooding. In this regard, tape is much less vulnerable than disk.

      3. Conceptually, STABLE STORAGE is storage that is immune to any kind of loss. While no storage medium is totally immune to destruction of data, stable storage may be approximated in one of two ways:

         a) The use of certain kinds of storage media, such as write-once laser disks. Data written on such media is immune to virtually any possible source of damage short of the destruction of the disk itself.

         b) The writing of the same data on more than one non-volatile medium, so that if one is damaged the other(s) will still retain the data intact.

            (1) Use of on-site mirroring - e.g. through RAID.
            (2) Use of off-site mirroring (remote backup), which protects against physical dangers as well as system errors/crashes.

   D. Actually, the protection of data involves two different sorts of measures.

      1. Regular system backups are an essential part of protecting data against loss. Backups are designed to prevent loss of data due to physical damage to non-volatile storage media (e.g. head crashes on the disk), and also provide some protection against inadvertent erasure of data that is still needed.

      2. Measures taken to recover from aborted operations and system crashes.

         a) Note that backup is particularly designed to protect against loss of data due to damage to non-volatile storage media - a relatively rare occurrence. Since any work done since the last backup is not saved, other measures need to be taken to allow a fast restoration to normal operation after an aborted operation or a system crash. (It is not generally acceptable for a late afternoon power failure to be able to destroy all work done since the daily backup was run in the morning!)

         b) It is these measures we focus on today. The measures we will discuss seek to allow a rapid restoration of the system to a consistent state after an aborted operation or a crash that does NOT result in damage to non-volatile storage media (i.e. only the contents of volatile storage are lost).

      3. The concept of a transaction will be at the heart of our discussion. In particular, the measures we will discuss are related to ensuring the durability property of a transaction. We will assume that each transaction is assigned a unique identification (e.g. a serial number), and that some record of incoming transactions is kept. We must ensure the durability of every transaction that committed before the crash, and must also deal with transactions that were in process at the time of the crash.

   E. Crucial to many schemes for guaranteeing the consistency of a database is the notion of a processing LOG, stored in "stable" storage. During the processing of each transaction, a series of entries is made in the log. Each entry includes the transaction's serial number, an entry type, and possibly other data.

      1. When the transaction begins, a start transaction entry is made in the log.

      2. Appropriate entries are made in the log to record changes that the transaction makes to the database (more on these later).

      3. One of two types of entry is made in the log when the transaction completes:

         a) A COMMIT entry indicates that the transaction completed successfully, so that the durability of all its changes to the database should be preserved.

         b) An ABORT entry would indicate that the transaction failed for some internal reason (logical error in the data or user abort), so that none of its changes to the database should be allowed to remain.

         c) If a transaction was in process (but not finished) when a system crash occurs, then neither of these entries will appear in the log. This implies that:

            (1) No changes that the transaction has already made to the database should be allowed to remain.
            (2) If possible, the transaction can be restarted from the beginning after the database is restored to a consistent state.

      4. We suggested that the log should be maintained in stable storage. Actually, this
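
The log entries described in section E can be illustrated with a small sketch. This is not taken from the lecture: the names `EntryType`, `LogEntry`, and `classify_transactions` are hypothetical, the log is just an in-memory list rather than anything resembling stable storage, and the content of the change entries is a placeholder because the notes defer those details. The sketch only shows how a scan of the log after a crash could tell committed, aborted, and unfinished transactions apart.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EntryType(Enum):
    START = "start"
    UPDATE = "update"   # placeholder for the change entries discussed later
    COMMIT = "commit"
    ABORT = "abort"


@dataclass(frozen=True)
class LogEntry:
    txn_id: int                  # the transaction's serial number
    entry_type: EntryType
    data: Optional[str] = None   # "possibly other data" (illustrative only)


def classify_transactions(log):
    """Scan the log in order and return each transaction's final status."""
    status = {}
    for entry in log:
        if entry.entry_type is EntryType.START:
            # No COMMIT or ABORT seen yet: treat as unfinished for now.
            status[entry.txn_id] = "incomplete"
        elif entry.entry_type is EntryType.COMMIT:
            status[entry.txn_id] = "committed"
        elif entry.entry_type is EntryType.ABORT:
            status[entry.txn_id] = "aborted"
    return status


# Hypothetical log contents at the moment of a crash.
log = [
    LogEntry(1, EntryType.START),
    LogEntry(1, EntryType.UPDATE, "A: 100 -> 50"),
    LogEntry(2, EntryType.START),
    LogEntry(1, EntryType.COMMIT),
    LogEntry(3, EntryType.START),
    LogEntry(2, EntryType.ABORT),
    LogEntry(3, EntryType.UPDATE, "B: 7 -> 8"),
    # crash occurs here: transaction 3 has neither COMMIT nor ABORT
]

result = classify_transactions(log)
print(result)  # {1: 'committed', 2: 'aborted', 3: 'incomplete'}
```

In this example, transaction 1's changes must be preserved, transaction 2's changes must not remain, and transaction 3 falls under case 3c): its changes must not remain, and it may be restarted once the database is back in a consistent state.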

