Lecture 28: RecoveryOutlineExample 7.38Slide 4Slide 5Slide 6Slide 7Slide 8Slide 9Types of FailuresSystem FailuresTransactionsSlide 13Correctness PrinciplePrimitive Operations of TransactionsExampleThe LogUndo LoggingUndo-Logging RulesPowerPoint PresentationRecovery with Undo LogSlide 22Slide 23Slide 24Slide 25CheckpointingUndo Recovery with CheckpointingNonquiescent CheckpointingSlide 29Undo Recovery with Nonquiescent CheckpointingLecture 28: RecoveryFriday, December 1, 2000Outline•Finish example 7.38•Logging (8.1)•Undo loging (8.2)Example 7.38•Logical plan is:•Main memory M = 101 buffersR(w,x)5,000 blocksS(x,y)10,000 blocksU(y,z)10,000 blocksk blocksExample 7.38Naïve evaluation: •2 partitioned hash-joins•Cost 3B(R) + 3B(S) + 4k + 3B(U) = 75000 + 4kR(w,x)5,000 blocksS(x,y)10,000 blocksU(y,z)10,000 blocksk blocksExample 7.38Smarter:•Step 1: hash R on x into 100 buckets, each of 50 blocks; to disk•Step 2: hash S on x into 100 buckets; to disk•Step 3: read each Ri in memory (50 buffer) join with Si (1 buffer); hash result on y into 50 buckets (50 buffers) -- here we pipeline•Cost so far: 3B(R) + 3B(S)R(w,x)5,000 blocksS(x,y)10,000 blocksU(y,z)10,000 blocksk blocksExample 7.38Continuing:•How large are the 50 buckets on y ? Answer: k/50.•If k <= 50 then keep all 50 buckets in Step 3 in memory, then:•Step 4: read U from disk, hash on y and join with memory•Total cost: 3B(R) + 3B(S) + B(U) = 55,000R(w,x)5,000 blocksS(x,y)10,000 blocksU(y,z)10,000 blocksk blocksExample 7.38Continuing:•If 50 < k <= 5000 then send the 50 buckets in Step 3 to disk–Each bucket has size k/50 <= 100•Step 4: partition U into 50 buckets•Step 5: read each partition and join in memory•Total cost: 3B(R) + 3B(S) + 2k + 3B(U) = 75,000 + 2kR(w,x)5,000 blocksS(x,y)10,000 blocksU(y,z)10,000 blocksk blocksExample 7.38Continuing:•If k > 5000 then materialize instead of pipeline•2 partitioned hash-joins•Cost 3B(R) + 3B(S) + 4k + 3B(U) = 75000 + 4kR(w,x)5,000 blocksS(x,y)10,000 blocksU(y,z)10,000 blocksk blocksExample 7.38Summary:•If k <= 50, cost = 55,000•If 50 < k <=5000, cost = 75,000 + 2k•If k > 5000, cost = 75,000 + 4kTypes of Failures•Wrong data entry–Prevent by having constraints in the database–Fix with data cleaning•Disk crashes–Prevent by using redundancy (RAID, archive)–Fix by using archives•Fire, theft, bankruptcy…–Buy insurance, change profession…•System failures: most frequent (e.g. power)–Use recoverySystem Failures•Each transaction has internal state•When system crashes, internal state is lost–Don’t know which parts executed and which didn’t•Remedy: use a log–A file that records every single action of the transactionTransactions•In ad-hoc SQL–each command = 1 transaction•In embedded SQL–Transaction starts = first SQL command issued–Transaction ends =•COMMIT•ROLLBACK (=abort)Transactions•Assumption: the database is composed of elements–Usually 1 element = 1 block–Can be smaller (=1 record) or larger (=1 relation)•Assumption: each transaction reads/writes some elementsCorrectness Principle•There exists a notion of correctness for the database–Explicit constraints (e.g. foreign keys)–Implicit conditions (e.g. sum of sales = sum of invoices)•Correctness principle: if a transaction starts in a correct database state, it ends in a correct database state•Consequence: we only need to guarantee that transactions are atomic, and the database will be correct foreverPrimitive Operations of Transactions•INPUT(X): read element X to memory buffer•READ(X,t): copy element X to transaction local variable t•WRITE(X,t): copy transaction local variable t to element X•OUTPUT(X): write element X to diskExampleREAD(A,t); t := t*2;WRITE(A,t)READ(B,t); t := t*2;WRITE(B,t)Action T Mem A Mem B Disk A Disk BREAT(A,t) 8 8 8 8t:=t*2 16 8 8 8WRITE(A,t) 16 16 8 8READ(B,t) 8 16 8 8 8t:=t*2 16 16 8 8 8WRITE(B,t) 16 16 16 8 8OUTPUT(A) 16 16 16 16 8OUTPUT(B) 16 16 16 16 16The Log•An append-only file containing log records•Note: multiple transactions run concurrently, log records are interleaved•After a system crash, use log to:–Redo some transaction that didn’t commit–Undo other transactions that didn’t commitUndo LoggingLog records•<START T> = transaction T has begun•<COMMIT T> = T has committed•<ABORT T>= T has aborted•<T,X,v>= T has updated element X, and its old value was vUndo-Logging RulesU1: If T modifies X, then <T,X,v> must be written to disk before X is written to diskU2: If T commits, then <COMMIT T> must be written to disk only after all changes by T are written to diskAction T Mem A Mem B Disk A Disk B Log<START T>REAT(A,t) 8 8 8 8t:=t*2 16 8 8 8WRITE(A,t) 16 16 8 8 <T,A,8>READ(B,t) 8 16 8 8 8t:=t*2 16 16 8 8 8WRITE(B,t) 16 16 16 8 8 <T,B,8>OUTPUT(A) 16 16 16 16 8OUTPUT(B) 16 16 16 16 16<COMMIT T>Recovery with Undo LogAfter system’s crash, run recovery manager •Idea 1. Decide for each transaction T whether it is completed or not–<START T>….<COMMIT T>…. = yes–<START T>….<ABORT T>……. = yes–<START T>……………………… = no•Idea 2. Undo all modifications by incompleted transactionsRecovery with Undo LogRecovery manager:•Read log from the end; cases:–<COMMIT T>: mark T as completed–<ABORT T>: mark T as completed–<T,X,v>: if T is not completedthen write X=v to disk else ignore–<START T>: ignoreRecovery with Undo Log……<T6,X6,v6>……<START T5><START T4><T1,X1,v1><T5,X5,v5><T4,X4,v4><COMMIT T5><T3,X3,v3><T2,X2,v2>Recovery with Undo Log•Note: all undo commands are idempotent–If we perform them a second time, no harm is done–E.g. if there is a system crash during recovery, simply restart recovery from scratchRecovery with Undo LogWhen do we stop reading the log ?•We cannot stop until we reach the beginning of the log file•This is impractical•Better idea: use checkpointingCheckpointingCheckpoint the database periodically•Stop accepting new transactions•Wait until all curent transactions complete•Flush log to disk•Write a <CKPT> log record, flush•Resume transactionsUndo Recovery with Checkpointing……<T9,X9,v9>……(all completed)<CKPT><START T2><START T3<START T5><START T4><T1,X1,v1><T5,X5,v5><T4,X4,v4><COMMIT T5><T3,X3,v3><T2,X2,v2>During recovery,Can stop at first<CKPT> transactions T2,T3,T4,T5 other transactionsNonquiescent Checkpointing•Problem with checkpointing: database freezes during
View Full Document