Stanford CS 140 - Handout #18 - Failure & Recovery - D584697

Home> Schools> Stanford University> Computer Science (CS) > CS 140> Handout #18 - Failure & Recovery

Stanford CS 140 - Handout #18 - Failure & Recovery

Course Cs 140- Operating Systems and Systems Programming

Pages 6

Download Save

Unformatted text preview:

Surviving failure!OSes and computers crash. – Your file system should not be destroyed.!Readings:– Silberschatz: »7th ed.: ch. 6.9 & 12» 6th ed.: ch 7.9 & 14– man fsck What is the problem?!The Big File System Promise: persistence– It will hold your data until you explicitly delete it– (and sometimes even beyond that: backup/restore)!What’s hard about this? Crashes– If your data is in main memory, a crash nukes it.– Performance tension: need to cache everything. But if so, then crash = lose everything.– More fundamental: interesting ops = multiple block modifications, but can only atomically modify disk a sector at a time.memorycrashWhat failure looks like!Crash = concurrency– two threads r/w same shared state?!Crash = time travel– Current volatile state lost; suddenly go back to old state– Plus: write back buffer cache = reordered disk writes!FSCrash (context switch) FS’crashBuffer cacheDisk!In general, coping with failure consists of first defining a failure model composed of:– Acceptable failures. E.g., the earth is destroyed by the weirdos from Mars. The loss of a file viewed as unavoidable.– Unacceptable failures. E.g. power outage: lost file not OK !Second, devise a recovery procedure for each unacceptable failure: – Takes system from a precisely understood but incorrect state to a new precisely understood and correct state.!Dealing with failure is *hard*– Containing effects of failure is complicated. – How to anticipate everything you haven’t anticipated?Fighting failureFS Caches: Three main approaches!Sol’n 1: Throw everything away and start over. – Done for most things (e.g., interrupted compiles).– Probably not what you want to happen to your email!Sol’n 2: Make updates seem indivisible (atomic)– Build arbitrary sized atomic units from smaller atomic ones (e.g., a sector write).– Similar to how we built critical sections from locks, and locks from atomic instructions.!Sol’n 3: Reconstruction– Try to fix things after crash (many Fses do this: “fsck”)– Usually do changes in stylized way so that if crash happens, can look at entire state and figure out where you left off.!Atomic operation: bundles a set of operations such that they appear to execute indivisibly.!For disk: construct a pair of operations:– put(blk, address) : writes data in blk on disk at address.– get(address) -> blk : returns blk at given disk address.– Such that “put” appears to place data on disk in its entirety or not at all and “get” returns the latest version.– What we have to guard against: a system crash during a call to “put”, which results in a partial write.!How? State duplication.– The algorithm was first described in 1961 for the SABRE American Airlines seat-reservation system.– Still relevant: LFS uses a variant to write checkpointsArbitrary-sized atomic disk operationsCS 140 - Summer 2008 - Handout #18: Failure & Recoveryvoid atomic-put(data) ! ! blk atomic-get() ! version++; #unique int! ! V1data := get(V1)! put(version, V1);!! D1data := get(D1);! put(data, D1);! ! ! V2data := get(V2);! put(version, V2);!! D2data := get(D2);! put(data, D2);! ! ! if(V1data == V2data)! ! ! ! ! ! return D1data;! ! ! ! ! ! else ! ! ! ! ! ! return D2data;!V1, D1, V2, D2: (different) disk addresses !version is a integer in volatile storage!a call to atomic-put(“seat 25”) might result in:– { #2, “seat 25”, #2, “seat 25” }SABRE atomic disk operationsput # fails possible disk contents atomic-get returns?Before {#2, “seat 25”, #2, “seat 25”} the first {#2.5, “seat 25”, #2, “seat 25” } the second {#3, “seat 35”, #2, “seat 25”}the third {#3, “seat 31”, #2.5, “seat 25”}the fourth {#3, “seat 31”, #3, “seat 35”}After {#3, “seat 31”, #3, “seat 31”}!Assume we have correctly written to disk:– { #2, “seat 25”, #2, “seat 25” }!And that the system has crashed during the operation atomic-put(“seat 31”) !There are 6 cases, depending on where we failed in atomic-put:Does it work?!Once data written, the disk returns it correctly– If data can be corrupted? Detect using checksums.– Checksum ~ a hash function s.t. corrupted block gives different value.– detection: store checksum with blk– and re-check on read.!Disk is in a correct state when atomic-put starts– before doing the next “put” after a failure, we need to repair the disk to get it back to a correct state– tricky part: if we crash during recovery, the disk should not get even more trashed!Two assumptionscksum( blk ) 4514845148Recovery: built on idempotent operationsvoid recover(void) { V1data := get(V1); # following 4 ops same as in a-get D1data := get(D1); V2data := get(V2); D2data := get(D2); if (V1data == V2data) if(D1data != D2data) # if we crash & corrupt D2, will get here again. put(D1data, D2); else # if we crash and corrupt D1 will get back here put(D2data, D1); # if we crash and corrupt V1, will get back here put(V2data, V1); version := V1data!Most approaches to tolerating failure have at their core a similar notion of state duplication– Want a reliable tire? Have a spare.– Want a reliable disk? Keep a tape backup. If disk fails, get data from backup. (Make sure not in same building.)– Want a reliable server? Have two, with identical copies of the same information. Primary fails? Switch. (Make sure not on same side of the country)– Like caches (another state duplication): easy to generalize to more (have ‘n’ spares)The power of state duplicationConcrete cases!What happens during crash happens during?– Creating, moving, deleting, growing a file?!How to deal with errors?– The simplest approach: synchronous writes + fsckSynchronous writes + fsck!Synchronous writes = ordering state updates – to do N modifications: – simple but slowwww.!fsck:– After crash, sweep down entire FS tree, finding what is broken and try to fix. – Cost = O(size of FS). Yuck.r=1write block 1Wait for diskwrite block 2Wait for disk...Unix file system invariants!File and directory names are unique.!All free objects are on free list.– + free list only holds free objects!Data blocks have exactly one pointer to them.!Inode’s ref count = the number of pointers to it.!All objects are initialized.– A new file should have no data blocks, a just

View Full Document


School:
Email:
New Password:
Confirm Password:

Stanford CS 140 - Handout #18 - Failure & Recovery

Sign up for free to view:

Please select your school