1 Supplemental Notes: Practical Aspects of Transactions THIS MATERIAL IS NOT COVERED IN THE BOOK Dan Suciu -- 444 Spring 20102 Buffer Manager Policies • STEAL or NO-STEAL – Can an update made by an uncommitted transaction overwrite the most recent committed value of a data item on disk? • FORCE or NO-FORCE – Should all updates of a transaction be forced to disk before the transaction commits? • Easiest for recovery: NO-STEAL/FORCE • Highest performance: STEAL/NO-FORCE Dan Suciu -- 444 Spring 20103 Write-Ahead Log • Enables the use of STEAL and NO-FORCE • Log: append-only file containing log records • For every update, commit, or abort operation – Write physical, logical, physiological log record – Note: multiple transactions run concurrently, log records are interleaved • After a system crash, use log to: – Redo some transaction that did commit – Undo other transactions that didn’t commit Dan Suciu -- 444 Spring 20104 Write-Ahead Log • All log records pertaining to a page are written to disk before the page is overwritten on disk • All log records for transaction are written to disk before the transaction is considered committed – Why is this faster than FORCE policy? • Committed transaction: transactions whose commit log record has been written to disk Dan Suciu -- 444 Spring 20105 ARIES Method 1. Analysis pass – Figure out what was going on at time of crash – List of dirty pages and active transactions 2. Redo pass (repeating history principle) – Redo all operations, even for transactions that will not commit – Get back to state at the moment of the crash 3. Undo pass – Remove effects of all uncommitted transactions – Log changes during undo in case of another crash during undo Dan Suciu -- 444 Spring 20106 ARIES Method Illustration [Figure 3 from Franklin97] Dan Suciu -- 444 Spring 2010 First undo and first redo log entry might be in reverse order7 ARIES Method Elements • Each page contains a pageLSN – Log Sequence Number of log record for latest update to that page – Will serve to determine if an update needs to be redone • Physiological logging – page-oriented REDO • Possible because will always redo all operations in order – logical UNDO • Needed to undo only one transaction Dan Suciu -- 444 Spring 20108 ARIES Data Structures • Active Transactions Table – Lists all running transactions (active transactions) – For each txn: lastLSN = most recent update by transaction • Dirty Page Table – Lists all dirty pages – For each dirty page: recoveryLSN (recLSN)= first LSN that caused page to become dirty • Write Ahead Log contains log records – LSN, prevLSN = previous LSN for same transaction – other attributes Dan Suciu -- 444 Spring 2010ARIES Data Structures 9 pageID recLSN P5 102 P6 103 P7 101 LSN prevLSN transID pageID Log entry 101 - T100 P7 102 - T200 P5 103 102 T200 P6 104 101 T100 P5 Dirty pages Log transID lastLSN T100 104 T200 103 Active transactions Dan Suciu -- 444 Spring 2010 P5 PageLSN=104 P6 PageLSN=103 P7 PageLSN=101 Buffer Pool10 ARIES Method Details • Steps under normal operations – Add log record – Update transactions table – Update dirty page table – Update pageLSN Dan Suciu -- 444 Spring 201011 Checkpoints Write into the log • Entire active transactions table • Entire dirty pages table Dan Suciu -- 444 Spring 201012 1. Analysis Phase • Goal – Determine point in log where to start REDO – Determine set of dirty pages when crashed • Conservative estimate of dirty pages – Identify active transactions when crashed • Approach – Rebuild active transactions table and dirty pages table – Reprocess the log from the beginning (or checkpoint) • Only update the two data structures – Compute: firstLSN = smallest of all recoveryLSN Dan Suciu -- 444 Spring 20101. Analysis Phase Dan Suciu -- 444 Spring 2010 13 (crash) Checkpoint Dirty pages Active transactions Log Replay history firstLSN2. Redo Phase Main principle: replay history • Process Log forward, starting from firstLSN • Read every log record, sequentially • Redo actions are not recorded in the log • Needs the Dirty Page Table Dan Suciu -- 444 Spring 2010 1415 2. Redo Phase: Details For each Log entry record LSN • If affected page is not in Dirty Page Table then do not update • If recoveryLSN > LSN, then no update • Read page from disk; If pageLSN > LSN, then no update • Otherwise perform update Dan Suciu -- 444 Spring 20103. Undo Phase Main principle: “logical” undo • Start from the end of the log, move backwards • Read only affected log entries • Redo actions are written in the Log as special entries: CLR (Compensating Log Records) • CLRs are redone, but never undone Dan Suciu -- 444 Spring 2010 163. Undo Phase: Details • “Loser transactions” = uncommitted transactions in Active Transactions Table • ToUndo = set of lastLSN of loser transactions • While ToUndo not empty: – Choose most recent (largest) LSN in ToUndo – If LSN = regular record: undo; write a CLR where CLR.undoNextLSN = LSN.prevLSN – If LSN = CLR record: (don’t undo !) insert CLR.undoNextLSN in ToUndo Dan Suciu -- 444 Spring 2010 1718 Handling Crashes during Undo [Figure 4 from Franklin97] Dan Suciu -- 444 Spring 201019 Implementation: Locking • Can serve to enforce serializability • Two types of locks: Shared and Exclusive • Also need two-phase locking (2PL) – Rule: once transaction releases lock, cannot acquire any additional locks! – So two phases: growing then shrinking • Actually, need strict 2PL – Release all locks when transaction commits or aborts Dan Suciu -- 444 Spring 201020 Phantom Problem • A “phantom” is a tuple that is invisible during part of a transaction execution but not all of it. • Example: – T0: reads list of books in catalog – T1: inserts a new book into the catalog – T2: reads list of books in catalog • New book will appear! • Can this occur? • Depends on locking details (eg, granularity of locks) • To avoid phantoms needs predicate locking Dan Suciu -- 444 Spring 201021 Deadlocks • Two or more transactions are
View Full Document