Database Recovery Mechanisms Using Transaction Logs
Log-Based Recovery in DBMS
Log-based recovery is a mechanism used in DBMS to restore the database to a consistent state after a system crash or failure. The system maintains a log file stored on stable storage, recording every transaction’s operations before applying them to the database. Each log entry contains the transaction ID, data item name, old value, and new value. Using these logs, the DBMS can reconstruct the database by either undoing incomplete transactions or redoing completed ones.
Major Recovery Operations
Two major operations are used during recovery:
- UNDO: Restores the database to its previous state by reversing operations of uncommitted transactions. The log entry format is <T, X, old_value, new_value>, and the system restores old_value.
- REDO: Reapplies the operations of committed transactions to ensure durability after a crash. The system uses the new_value from the log entry.
Recovery Example
Transaction T1 updates X from 10 → 20, and T2 updates Y from 50 → 70.
Log Records:
- <T1, X, 10, 20>
- <T2, Y, 50, 70>
- <Commit T1>
If a crash occurs before T2 commits, recovery performs:
- ✔ Redo T1 → write X = 20
- ✔ Undo T2 → write Y = 50
Thus, log-based recovery preserves both consistency and durability: redoing T1 keeps the committed update to X, and undoing T2 removes the partial update to Y.
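The example can be traced with a short sketch. It assumes an in-memory list of tuples standing in for the log and a dictionary standing in for the database. The `recover` function and the tuple layout are illustrative choices, not the interface of any particular DBMS.

```python
# Hypothetical in-memory model: update records are (txn, item, old, new),
# commit records are ("COMMIT", txn).
def recover(log, db):
    """Redo committed transactions, then undo uncommitted ones."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    updates = [rec for rec in log if rec[0] != "COMMIT"]

    # REDO pass: forward scan, reapply new values of committed transactions.
    for txn, item, old, new in updates:
        if txn in committed:
            db[item] = new

    # UNDO pass: backward scan, restore old values of uncommitted transactions.
    for txn, item, old, new in reversed(updates):
        if txn not in committed:
            db[item] = old
    return db


log = [("T1", "X", 10, 20), ("T2", "Y", 50, 70), ("COMMIT", "T1")]
# Assumed disk state at the crash: T1's write was lost, T2's write got through.
db_after_crash = {"X": 10, "Y": 70}
print(recover(log, db_after_crash))  # {'X': 20, 'Y': 50}
```

Because both passes write absolute values rather than increments, running `recover` a second time leaves the result unchanged, which matters if the system crashes again during recovery.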
Traditional Recovery Techniques
Traditional recovery techniques ensure database consistency after failures such as system crashes, transaction errors, or media failures. These techniques include Deferred Update, Immediate Update, Checkpoints, and Shadow Paging.
Deferred Update Technique
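• Changes are stored temporarily and written to the database only after the transaction commits. Since no update reaches the database before commit, UNDO recovery is unnecessary; REDO may still be needed if a crash occurs after commit but before all changes are written.
Example: If T1 updates A=10→30 but the system crashes before T1 commits, no changes are applied and the database still holds A=10.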
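A minimal sketch of the idea, assuming a dictionary as the database and a per-transaction buffer of pending writes. The class and method names are hypothetical, and the log that a real system would keep for REDO is left out for brevity.

```python
class DeferredUpdateDB:
    """Toy deferred-update store: writes are buffered until commit."""

    def __init__(self, data):
        self.data = dict(data)   # stands in for the on-disk database
        self.pending = {}        # txn -> {item: new_value}, volatile

    def write(self, txn, item, value):
        # Nothing touches self.data before commit.
        self.pending.setdefault(txn, {})[item] = value

    def commit(self, txn):
        # The buffered writes are applied only now.
        self.data.update(self.pending.pop(txn, {}))

    def crash(self):
        # Volatile buffers are lost; the database itself is untouched.
        self.pending.clear()


db = DeferredUpdateDB({"A": 10})
db.write("T1", "A", 30)
db.crash()            # T1 never committed
print(db.data["A"])   # 10, so there is nothing to undo
```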
Immediate Update Technique
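• Changes may be written to the database before the transaction commits, but the corresponding log record is always written first so that recovery remains possible. Both UNDO and REDO may be required: UNDO for updates of transactions that never committed, and REDO for committed updates that had not yet reached disk.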
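The log-first ordering can be sketched as follows, assuming an in-memory list as the log and a dictionary as the database. All names are hypothetical, and only the UNDO half is shown.

```python
class ImmediateUpdateDB:
    """Toy immediate-update store that logs before it writes."""

    def __init__(self, data):
        self.data = dict(data)  # stands in for the on-disk database
        self.log = []           # stands in for the log on stable storage

    def write(self, txn, item, value):
        # Log record first (old and new value), then the in-place update.
        self.log.append((txn, item, self.data.get(item), value))
        self.data[item] = value

    def commit(self, txn):
        self.log.append(("COMMIT", txn))

    def undo_uncommitted(self):
        # Backward scan: restore old values written by uncommitted transactions.
        committed = {rec[1] for rec in self.log if rec[0] == "COMMIT"}
        for rec in reversed(self.log):
            if rec[0] != "COMMIT" and rec[0] not in committed:
                _, item, old, _ = rec
                self.data[item] = old


db = ImmediateUpdateDB({"A": 10})
db.write("T1", "A", 30)   # the database already holds A = 30
db.undo_uncommitted()     # crash before commit: the old value is restored
print(db.data["A"])       # 10
```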
Shadow Paging
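• Instead of using logs, the system keeps two versions of the database: the original (shadow) copy, which is never modified, and a working copy that receives the transaction's updates. Upon commit, a pointer is switched so that the working copy becomes the new database; on failure, the working copy is discarded and the shadow copy remains valid. This eliminates logging, but copying pages increases storage and I/O overhead.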
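A minimal sketch of the pointer switch, assuming the two versions are represented as page tables over a shared page store, which is a common way to avoid copying the whole database. The class and its methods are hypothetical.

```python
class ShadowPagedDB:
    """Toy shadow paging: a page table maps page numbers to physical pages."""

    def __init__(self, pages):
        self.store = dict(enumerate(pages))       # physical page id -> contents
        self.shadow_table = list(self.store)      # committed page table
        self.current_table = list(self.shadow_table)
        self.next_id = len(self.store)

    def write(self, page_no, contents):
        # Copy-on-write: new contents go to a fresh physical page and only
        # the current table is repointed; the shadow page stays intact.
        self.store[self.next_id] = contents
        self.current_table[page_no] = self.next_id
        self.next_id += 1

    def read(self, page_no, committed=False):
        table = self.shadow_table if committed else self.current_table
        return self.store[table[page_no]]

    def commit(self):
        # The pointer switch: the current table becomes the new shadow.
        self.shadow_table = list(self.current_table)

    def abort(self):
        # Drop the working table; the shadow pages were never modified.
        self.current_table = list(self.shadow_table)


db = ShadowPagedDB(["a", "b"])
db.write(1, "B")
print(db.read(1), db.read(1, committed=True))  # B b
db.abort()
print(db.read(1))                              # b
```

The sketch never reclaims physical pages that are no longer referenced, which hints at one source of the overhead mentioned above.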
Checkpointing in Traditional Recovery
• A checkpoint marks a point at which the updates of all transactions committed so far have been written to disk. During recovery, log records of transactions that committed before the last checkpoint can be skipped; only transactions that were active at the checkpoint or started after it need to be examined.
Traditional recovery techniques aim to ensure atomicity and durability with minimal performance overhead. They form the foundation of the more advanced recovery systems used in distributed DBMS and real-time applications.
The Role of Checkpointing
A checkpoint is a recorded moment during database execution at which all committed transactions have their updated values permanently saved to stable storage. It acts as a recovery marker, so the DBMS does not need to examine the entire log file after a crash, which shortens recovery time significantly.
Checkpointing Tasks
During checkpointing, the system performs three tasks:
- Writes all dirty buffers to disk.
- Records active transactions in a checkpoint log entry.
- Saves the checkpoint marker to stable storage.
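The three tasks can be sketched on a toy model, assuming a buffer pool of `(value, dirty)` pairs, a dictionary as the disk, and a list as the log. The function name and record layout are hypothetical.

```python
def take_checkpoint(buffer_pool, disk, log, active_txns):
    """Perform the three checkpoint tasks on in-memory stand-ins."""
    # Task 1: write all dirty buffers to disk and mark them clean.
    for page, (value, dirty) in list(buffer_pool.items()):
        if dirty:
            disk[page] = value
            buffer_pool[page] = (value, False)

    # Tasks 2 and 3: record the active transactions in a checkpoint entry
    # and append it to the log. A real system would also force the log to
    # stable storage at this point.
    log.append(("CHECKPOINT", tuple(sorted(active_txns))))


buffer_pool = {"X": (20, True), "Y": (50, False)}
disk = {"X": 10, "Y": 50}
log = []
take_checkpoint(buffer_pool, disk, log, {"T2"})
print(disk)  # {'X': 20, 'Y': 50}
print(log)   # [('CHECKPOINT', ('T2',))]
```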
Distributed Checkpointing
In distributed environments, checkpoints are essential because transactions may span multiple systems. To maintain consistency, distributed checkpoint algorithms such as Coordinated Checkpointing, Uncoordinated Checkpointing, and Communication-Induced Checkpointing are used.
Coordinated Checkpoint Example
Assume T1 executes across Site A and Site B. To ensure consistent recovery, both sites synchronize before writing checkpoints.
In case of failure:
- Logs after the last checkpoint are scanned.
- REDO is applied to transactions that committed after the checkpoint.
- UNDO is applied only to transactions that were still incomplete at the time of the failure.
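Coordinated checkpoints reduce recovery overhead and, because every site restarts from a mutually consistent state, prevent rollback from propagating across sites (the domino effect) in a distributed DBMS.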
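The recovery steps listed above can be sketched for a single site. The sketch assumes update records of the form `(txn, item, old, new)` plus `("COMMIT", txn)` and `("CHECKPOINT", active_txns)` records; all names are hypothetical. One detail goes beyond the list: a transaction that was active at the checkpoint and never committed may have updates logged before the checkpoint, so the UNDO pass in this sketch walks back past it for those transactions only.

```python
def is_update(rec):
    return rec[0] not in ("COMMIT", "CHECKPOINT")


def recover_from_checkpoint(log, db):
    """Scan from the last checkpoint; redo committed, undo incomplete."""
    # Locate the last checkpoint and the transactions active at that moment.
    start, active = 0, set()
    for i, rec in enumerate(log):
        if rec[0] == "CHECKPOINT":
            start, active = i, set(rec[1])

    tail = log[start:]
    committed = {rec[1] for rec in tail if rec[0] == "COMMIT"}

    # REDO: forward over the tail, committed transactions only.
    for rec in tail:
        if is_update(rec) and rec[0] in committed:
            db[rec[1]] = rec[3]

    # UNDO: transactions active at the checkpoint or seen after it
    # that never committed, scanned backward.
    incomplete = (active | {r[0] for r in tail if is_update(r)}) - committed
    for rec in reversed(log):
        if is_update(rec) and rec[0] in incomplete:
            db[rec[1]] = rec[2]
    return db


log = [
    ("T0", "Z", 1, 2), ("COMMIT", "T0"),   # finished before the checkpoint
    ("T2", "Y", 50, 70),
    ("CHECKPOINT", ("T2",)),
    ("T1", "X", 10, 20), ("COMMIT", "T1"),
]
print(recover_from_checkpoint(log, {"Z": 2, "Y": 70, "X": 10}))
# {'Z': 2, 'Y': 50, 'X': 20}
```

T0 is never examined because it committed before the checkpoint, which is where the saving in recovery time comes from.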
