Transactions (or units of work) against a database can be interrupted unexpectedly. If a failure occurs before all of the changes that are part of the unit of work are completed and committed, the database is left in an inconsistent and unusable state. Crash recovery is the process by which the database is moved back to a consistent and usable state. This is done by rolling back incomplete transactions and completing committed transactions that were still in memory when the crash occurred. When a database is in a consistent and usable state, it has attained what is known as a "point of consistency".
A transaction failure results from a severe error or condition that causes the database or the database manager to end abnormally. Units of work that were only partially completed, or whose changes had not been flushed to disk at the time of failure, leave the database in an inconsistent state. Following a transaction failure, the database must be recovered. Conditions that can result in transaction failure include:
- A power failure on the machine, causing the database manager and the database partitions on it to go down
- A hardware failure such as memory corruption, or disk, CPU, or network failure
- A serious operating system error that causes DB2® to go down
- An application terminating abnormally
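In computer science, Algorithms for Recovery and Isolation Exploiting Semantics, or ARIES, is a recovery algorithm designed to work with a no-force, steal database approach; it is used by IBM DB2, Microsoft SQL Server and many other database systems. No-force means that modified pages do not have to be written to disk when a transaction commits; steal means that pages modified by uncommitted transactions may be written to disk.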
The actual recovery process consists of three passes:
- Analysis. The recovery subsystem determines the earliest log record from which the next pass must start. It also scans the log forward from the checkpoint record to construct a snapshot of what the system looked like at the instant of the crash.
- Redo. Starting at the earliest LSN determined in the Analysis pass, the log is read forward and each update is redone.
- Undo. The log is scanned backward and updates corresponding to loser transactions (transactions that had not committed when the crash occurred) are undone.
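The three passes can be illustrated with a heavily simplified sketch. It is not the full ARIES algorithm: it assumes no checkpoints (analysis scans the whole log), whole-page values instead of byte-level changes, and it omits the compensation log records that ARIES writes during undo. The names `LogRecord` and `recover` and the layout of `disk` are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    txn: str
    kind: str                      # "update" or "commit"
    page: Optional[str] = None
    before: Optional[int] = None   # value before the update (used by undo)
    after: Optional[int] = None    # value after the update (used by redo)


def recover(log, disk):
    """Simplified three-pass recovery.

    disk maps page -> (value, page_lsn) and is modified in place.
    Returns the set of loser transactions.
    """
    # Analysis: find loser transactions and the earliest LSN that dirtied each page.
    losers = {}   # txn -> LSN of its last update record
    dirty = {}    # page -> LSN of the first record that modified it
    for rec in log:
        if rec.kind == "commit":
            losers.pop(rec.txn, None)
        elif rec.kind == "update":
            losers[rec.txn] = rec.lsn
            dirty.setdefault(rec.page, rec.lsn)
    redo_start = min(dirty.values(), default=None)

    # Redo: reapply every update the on-disk page has not seen yet.
    if redo_start is not None:
        for rec in log:
            if rec.kind != "update" or rec.lsn < redo_start:
                continue
            _, page_lsn = disk.get(rec.page, (None, 0))
            if page_lsn < rec.lsn:
                disk[rec.page] = (rec.after, rec.lsn)

    # Undo: scan backward and restore before-images written by losers.
    for rec in reversed(log):
        if rec.kind == "update" and rec.txn in losers:
            disk[rec.page] = (rec.before, disk[rec.page][1])
    return set(losers)


# Hypothetical crash scenario: T1 committed, T2 was still running.
log = [
    LogRecord(1, "T1", "update", "A", before=0, after=10),
    LogRecord(2, "T2", "update", "B", before=0, after=20),
    LogRecord(3, "T1", "commit"),
    LogRecord(4, "T2", "update", "A", before=10, after=30),
]
# Page B reached disk before the crash (steal); T1's change to A did not (no-force).
disk = {"A": (0, 0), "B": (20, 2)}
loser_txns = recover(log, disk)
```

In this scenario the redo pass reapplies the updates to page A, and the undo pass then removes both of T2's changes, leaving only the committed work of T1.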
It is clear from this description of ARIES that the following features are required for a log manager:
- Ability to write log records. The log manager should maintain a log tail in main memory and write log records to it. The log tail should be written to stable storage on demand or when the log tail gets full. Implicit in this requirement is the fact that the log tail can become full halfway through the writing of a log record. It also means that a log record can be longer than a page.
- Ability to wrap around. The log is typically maintained on a separate disk. When the log reaches the end of the disk, it wraps around to the beginning.
- Ability to store and retrieve the master log record. The master log record is stored separately in stable storage, possibly on a different duplex-disk.
- Ability to read log records given an LSN. Also, the ability to scan the log forward from a given LSN to the end of log. Implicit in this requirement is that the log manager should be able to detect the end of the log and distinguish the end of the log from a valid log record's beginning.
- Ability to create a log. In actual practice, this will require setting up a duplex-disk for the log, a duplex-disk for the master log record, and a raw device interface to read and write the disks, bypassing the operating system.
- Ability to maintain the log tail. This requires some sort of shared memory because the log tail is common to all transactions accessing the database that the log corresponds to. Mutual exclusion of log writes and reads has to be taken care of.
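A few of these requirements can be sketched in a toy, single-process log manager. This is only an illustration under stated assumptions: stable storage is simulated with an in-memory `bytearray`, an LSN is taken to be the byte offset of a record, each record carries a 4-byte length prefix, and wraparound, duplexing, and mutual exclusion are left out. The class name `LogManager` and all of its methods are hypothetical.

```python
import struct


class LogManager:
    HEADER = struct.Struct("<I")   # 4-byte little-endian payload length

    def __init__(self, tail_capacity=64):
        self.stable = bytearray()        # stands in for the log disk
        self.tail = bytearray()          # in-memory log tail
        self.tail_capacity = tail_capacity
        self.master_lsn = None           # stands in for the master log record

    def append(self, payload: bytes) -> int:
        """Write a record to the log tail and return its LSN (byte offset)."""
        lsn = len(self.stable) + len(self.tail)
        self.tail += self.HEADER.pack(len(payload)) + payload
        if len(self.tail) >= self.tail_capacity:
            self.flush()                 # tail is full: force it to stable storage
        return lsn

    def flush(self):
        """Move the whole log tail to stable storage (flush on demand)."""
        self.stable += self.tail
        self.tail.clear()

    def read(self, lsn):
        """Return (payload, next_lsn), or None if lsn is at or past the end of the stable log."""
        body_start = lsn + self.HEADER.size
        if body_start > len(self.stable):
            return None
        (length,) = self.HEADER.unpack_from(self.stable, lsn)
        if body_start + length > len(self.stable):
            return None                  # incomplete record: treat as end of log
        return bytes(self.stable[body_start:body_start + length]), body_start + length

    def scan(self, lsn=0):
        """Yield (lsn, payload) pairs forward from lsn to the end of the stable log."""
        while (item := self.read(lsn)) is not None:
            payload, next_lsn = item
            yield lsn, payload
            lsn = next_lsn


log_mgr = LogManager(tail_capacity=32)
first = log_mgr.append(b"T1 update A")
second = log_mgr.append(b"T1 commit")
log_mgr.flush()
log_mgr.master_lsn = first               # e.g. remember where recovery should start
records = list(log_mgr.scan(log_mgr.master_lsn))
```

Using the byte offset as the LSN makes reading a record by LSN a direct seek, and the length prefix lets a forward scan stop at the first incomplete record.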
Write Ahead Logging (WAL)
- Write ahead logging (WAL) is a method to ensure that if a data page is modified on disk, we have a log record for it on disk.
- To accomplish this, the log is always written to disk ahead of the data.
Before writing any modified data page from memory to disk, first flush all the log records currently in memory, including the record describing what was changed in this data page. Only after the log is written can the data page be written to disk.
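The write-ahead rule can be expressed as a small check in the code path that writes a page to disk. The sketch below is hypothetical (`WalBuffer` and its fields are invented for illustration) and makes one assumption beyond the text: instead of flushing every log record in memory, it flushes only up to the LSN of the last record that modified the page, which is enough to satisfy the rule.

```python
class WalBuffer:
    def __init__(self):
        self.log_tail = []      # log records still in memory: (lsn, page, value)
        self.log_disk = []      # log records already on disk
        self.flushed_lsn = 0    # highest LSN known to be on disk
        self.next_lsn = 1
        self.pages = {}         # in-memory pages: page -> (value, page_lsn)
        self.disk_pages = {}    # pages on disk

    def update(self, page, value):
        """Log the change first (in memory), then modify the in-memory page."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_tail.append((lsn, page, value))
        self.pages[page] = (value, lsn)
        return lsn

    def flush_log(self, up_to_lsn):
        """Move log records with lsn <= up_to_lsn from memory to disk."""
        keep = []
        for rec in self.log_tail:
            if rec[0] <= up_to_lsn:
                self.log_disk.append(rec)
            else:
                keep.append(rec)
        self.log_tail = keep
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)

    def write_page(self, page):
        """Write a data page to disk, enforcing the write-ahead rule."""
        value, page_lsn = self.pages[page]
        if page_lsn > self.flushed_lsn:
            self.flush_log(page_lsn)     # the log goes to disk before the data
        self.disk_pages[page] = (value, page_lsn)


buf = WalBuffer()
buf.update("A", 10)
buf.update("B", 20)
buf.write_page("A")   # forces the log record for A to disk before the page itself
```

After `write_page("A")`, the log record for page A is on disk together with the page, while the record for page B is still in the in-memory tail because page B has not been written.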
Recovery from a crash
- The ARIES family of algorithms provides safe recovery from a crash.
- Often recovery occurs after a catastrophic event that causes loss of all state information.
- To recover, we must find out the state of the database just before the crash, based on the portion of the log that is on disk. The first step of recovery is the “analysis step”.
- Without checkpoints, the analysis step would have to read the log from the beginning all the way to its end to find all transactions that have ended and all transactions that were still in progress.
- To simplify analysis, we can take periodic snapshots of the database state called checkpoints.
- The analysis then starts from the latest checkpoint.
- Based on the analysis, we find two things:
- All pages modified by committed transactions that may not have been written to disk. All these changes must be redone. If force is used, there is no need to REDO.
- All transactions that were still executing at the time of the crash. The changes made by these transactions must be undone. If steal is not used, there is no need to UNDO.
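The analysis step and the effect of the force and steal policies can be sketched as follows. The log format (plain tuples), the `analyze` function, and the contents of the checkpoint record are assumptions made for this example: the checkpoint is taken to list the transactions active at that moment and to imply that all earlier changes are already on disk, which is simpler than the fuzzy checkpoints ARIES uses.

```python
def analyze(log, force=False, steal=True):
    """Return (redo_txns, undo_txns) for a log given as a list of tuples.

    Record shapes assumed here:
      ("begin", txn), ("update", txn, page), ("commit", txn),
      ("checkpoint", [txns active at checkpoint time])
    """
    # Locate the latest checkpoint; analysis starts right after it.
    start, active = 0, set()
    for i, rec in enumerate(log):
        if rec[0] == "checkpoint":
            start, active = i + 1, set(rec[1])

    committed = set()
    for rec in log[start:]:
        kind, txn = rec[0], rec[1]
        if kind == "begin":
            active.add(txn)
        elif kind == "commit":
            active.discard(txn)
            committed.add(txn)

    # With force, committed changes are already on disk: nothing to redo.
    redo_txns = set() if force else committed
    # Without steal, uncommitted changes never reached disk: nothing to undo.
    undo_txns = active if steal else set()
    return redo_txns, undo_txns


log = [
    ("begin", "T1"), ("update", "T1", "A"), ("commit", "T1"),
    ("begin", "T2"), ("update", "T2", "B"),
    ("checkpoint", ["T2"]),
    ("begin", "T3"), ("update", "T3", "C"), ("commit", "T3"),
    ("update", "T2", "D"),
]
redo_txns, undo_txns = analyze(log)
```

With the default no-force, steal policy, T3 (committed after the checkpoint) must be redone and T2 (still running at the crash) must be undone; T1 finished before the checkpoint and needs no work.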