Duke CPS 212 - Transactional Recovery - D258014

Course: CPS 212 - Distributed Information Systems

Pages 41

Transactional Recovery
Transactions: ACID Properties
The Problem of Distributed Recovery
Logging
Committing Distributed Transactions
Two-Phase Commit (2PC)
The 2PC Protocol
Handling Failures in 2PC
Achieving Atomic Durability
Atomic Durability with Force
Shadowing
No-Force Durability with Logging
Anatomy of a Log
Redo Logging: The Easy Way
Why It’s Not That Easy
Fast Durability 1: Rio Vista
Fast Durability II: Group Commit
A Quick Look at Transaction Performance
The Need for Checkpointing
Atomic Checkpointing: Example
How to Deal with Steal?
Goals of ARIES
Introduction to ARIES
ARIES Structures
The Dirty Page List
ARIES Recovery: The Big Picture
ARIES Recovery
Redo Pruning
Redo Pruning: Explanation
Evaluating ARIES
Client/Server Exodus (ESM-CS)
Client/Server ARIES
Distributed ARIES
Problem 1: the Dirty Page List
Reconstructing the Dirty Page List
Conditional Undo
Problem 2: The Trouble with PageLSN
Handling PageLSN
Problem 3: RecoveryLSN
Handling RecoveryLSN
Evaluating ARIES for ESM/CS

Transactions: ACID Properties

“Full-blown” transactions guarantee four intertwined properties:

• Atomicity. Transactions can never “partly commit”; their updates are applied “all or nothing”.
  The system guarantees this using logging, shadowing, distributed commit.
• Consistency. Each transaction T transitions the dataset from one semantically consistent state to another.
  The application guarantees this by correctly marking transaction boundaries.
• Independence/Isolation. All updates by T1 are either entirely visible to T2, or are not visible at all.
  Guaranteed through locking or timestamp-based concurrency control.
• Durability.
Updates made by T are “never” lost once T commits.
  The system guarantees this by writing updates to stable storage.

The Problem of Distributed Recovery

In a distributed system, a recovered node’s state must also be consistent with the states of other nodes.
  E.g., what if a recovered node has forgotten an important event that others have remembered?

A functioning node may need to respond to a peer’s recovery:
• rebuild the state of the recovering node, and/or
• discard local state, and/or
• abort/restart operations/interactions in progress
  e.g., two-phase commit protocol

How to know if a peer has failed and recovered?

Logging

[Figure: volatile memory, home image, log]

Key idea: supplement the home data image with a log of recent updates and/or events.
  append-only
  sequential access (faster)
  preserves order of log entries
  enables atomic commit with a single write

Recover by traversing, e.g., “replaying”, the log.

Logging is fundamental to database systems and other storage systems.

Committing Distributed Transactions

Transactions may touch data stored at more than one site.
  Each site commits (i.e., logs) its updates independently.

Problem: any site may fail while a commit is in progress, but after updates have been logged at another site.
  An action could “partly commit”, violating atomicity.

Basic problem: individual sites cannot unilaterally choose to abort without notifying other sites.

“Log locally, commit globally.”

Two-Phase Commit (2PC)

Solution: all participating sites must agree on whether or not each action has committed.

• Phase 1. The sites vote on whether or not to commit.
  precommit: Each site prepares to commit by logging its updates before voting “yes” (and enters prepared phase).
• Phase 2.
Commit iff all sites voted to commit.
  A central transaction coordinator gathers the votes.
  If any site votes “no”, the transaction is aborted.
  Else, the coordinator writes the commit record to its log.
  The coordinator notifies participants of the outcome.

Note: one server ==> no 2PC is needed, even with multiple clients.

The 2PC Protocol

1. Tx requests commit by notifying the coordinator (C).
   C must know the list of participating sites.
2. Coordinator C requests each participant (P) to prepare.
3. Participants validate, prepare, and vote.
   Each P validates the request, logs the validated updates locally, and responds to C with its vote to commit or abort.
   If P votes to commit, Tx is said to be “prepared” at P.
4. Coordinator commits.
   Iff the P votes are unanimous to commit, C writes a commit record to its log and reports “success” for the commit request. Else abort.
5. Coordinator notifies participants.
   C asynchronously notifies each P of the outcome for Tx.
   Each P logs the outcome locally and releases any resources held for Tx.

Handling Failures in 2PC

How to ensure consensus if a site fails during the 2PC protocol?

1. A participant P fails before preparing.
   Either P recovers and votes to abort, or C times out and aborts.
2. Each P votes to commit, but C fails before committing.
   Participants wait until C recovers and notifies them of the decision to abort. The outcome is uncertain until C recovers.
3. P or C fails during phase 2, after the outcome is determined.
   Carry out the decision by reinitiating the protocol on recovery.
   Again, if C fails, the outcome is uncertain until C recovers.

Achieving Atomic Durability

Atomic durability dictates that the system schedule its stable writes in a way that guarantees two key properties:

1.
Each transaction’s updates are tentative until commit.
   Database state must not be corrupted with uncommitted updates.
   If uncommitted updates can be written to the database, it must be possible to undo them if the transaction fails to commit.
2. Buffered updates are written to stable storage synchronously with commit.
   Option 1: force dirty data out to the permanent (home) database image at commit time.
   Option 2: commit by recording updates in a log on stable storage, and defer writes of modified data to home (no-force).

Atomic Durability with Force

A force strategy writes all updates to the home database file on each commit.
• must be synchronous
• disks are block-oriented devices
  What if items modified by two different transactions live on the same block?
  need page/block granularity locking
• writes may be scattered across file
  poor performance

What if the system fails in the middle of the stream of writes?

[Figure: volatile memory, stable storage (home)]

Shadowing

1. starting point
   modify purple/grey blocks
2. write new blocks to disk
   prepare new block map
3. overwrite block map (atomic commit)
   and free old blocks

Shadowing is the basic technique for doing an atomic force.

Frequent
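
The three shadowing steps listed above can be illustrated with a small sketch. This is only a toy model under stated assumptions: a Python dictionary stands in for the disk, and the names ShadowStore, block_map, next_free, and crash_before_swap are hypothetical, not taken from the slides. A real system would also have to make the block-map overwrite a single atomic disk write.

```python
class ShadowStore:
    """Toy block store that commits by swapping its block map (assumed design)."""

    def __init__(self, blocks):
        # "Disk": physical block number -> contents (hypothetical layout).
        self.disk = dict(enumerate(blocks))
        # Block map: logical block number -> physical block number.
        self.block_map = {i: i for i in range(len(blocks))}
        self.next_free = len(blocks)

    def read(self, logical):
        # Reads always go through the current (committed) block map.
        return self.disk[self.block_map[logical]]

    def commit(self, updates, crash_before_swap=False):
        # Step 2: write new copies of the modified blocks to fresh locations
        # and prepare a new block map; the old blocks are left untouched.
        new_map = dict(self.block_map)
        for logical, data in updates.items():
            physical = self.next_free
            self.next_free += 1
            self.disk[physical] = data
            new_map[logical] = physical
        if crash_before_swap:
            # Simulated failure: the old map still describes the old state.
            return False
        # Step 3: overwrite the block map in one step (the atomic commit),
        # then free the old blocks that the new map no longer references.
        old_map, self.block_map = self.block_map, new_map
        for logical in updates:
            old_physical = old_map.get(logical)
            if old_physical is not None:
                del self.disk[old_physical]
        return True


store = ShadowStore(["a0", "b0", "c0"])
store.commit({0: "a1", 2: "c1"}, crash_before_swap=True)
print(store.read(0))  # a0: the failed commit left the old image intact
store.commit({0: "a1", 2: "c1"})
print([store.read(i) for i in range(3)])  # ['a1', 'b0', 'c1']
```

In this sketch a failure before the map swap leaves only unreferenced new blocks behind, so readers never observe a partly applied set of updates.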
