UW-Madison CS 736 - Checksumming RAID - D233998

School name University of Wisconsin, Madison

Course Cs 736- Advanced Operating Systems

Introduction
Background
  The Problem
  RAID
  End to End Detection
  Filesystem Detection
Implementation
  Checksumming RAID Layout
  Layout Analysis
  Computing the Checksum
  Typical Operation Processes
    Typical Write Process
    Typical Read Process
    Data Block Corruption Recovery
    Checksum Block Corruption Recovery
  Cache Policy
  Changes to Linux's Software RAID Driver
  Crash Recovery
Evaluation
  Overheads
  Test Setup
  Correctness
  Effects of Increasing Corruptions
  Effects of Disk Counts
  Single Disk Tests
Conclusions

Checksumming RAID

Brian [email protected] [email protected]

Storage systems exhibit silent data corruptions that go unnoticed until too late, potentially resulting in whole trees of lost data. To deal with this, we’ve integrated a checksumming mechanism into Linux’s Multi-Device Software RAID layer so that we are able to detect and correct these silent data corruptions. The analysis of our naive implementation shows that this can be done with a reasonable performance overhead.

1 Introduction

Storage systems, perhaps more than other computer system components, exhibit silent partial failures. These failures can occur anywhere along the storage stack: from the operating system arranging and scheduling requests, to the device drivers, through the communication channel, to the backing device medium itself and all of its internal firmware and mechanical underpinnings. The failures themselves can be partial or complete, temporary or permanent, and, despite some hardware support for it, detected or not. They occur in the form of bit flips along the path, bit rot on the medium over time, misdirected writes to the medium, and phantom writes that never actually make it to the medium.

At the same time, though perhaps unwise, most software in a system presumes a stop-failure environment in which all the underlying components either completely work or fail in their entirety. For example, if a region in memory is damaged, it is likely that either the program will crash or, in the case of virtual memory, perhaps the machine will crash.
These particular problems are probably not the end of the world though, since the program or the machine can simply be restarted to restore a clean state. However, when this happens in a storage component, the resulting state can remain with the system across reboots. The effect of this could be as simple as a corrupted block within a file. In some cases the application format can deal with this; in others, however, it renders the file, including the rest of its intact data, useless. It is also possible that the corrupted block occurs in some metadata for the filesystem living on top of the storage component, such as a directory. In this case whole subtrees of data may become inaccessible. The results of this can be anywhere from lost data to an inoperable machine.

Given that the storing and processing of data is the primary purpose of most computers, and that the data people store represents an investment in time and energy, these failures and their ensuing losses can be very costly.

Thus, we propose a storage system that is not only capable of detecting these silent corruptions, but also of correcting them before it’s too late. To that end, we’ve extended the Software RAID layer in Linux’s Multi-Device driver, which already includes some redundancy information to be able to recover from some stop-failure scenarios, to include an integrity checking component in the form of checksums. Using this, we show that our solution is capable of detecting and correcting single block corruptions within a RAID stripe at a reasonable performance overhead.

The remainder of this paper is structured as follows. In Section 2 we’ll discuss some background and related work.
Section 3 will discuss our implementation details, followed by our evaluation and results in Section 4, and finally our conclusions in Section 5.

2 Background

In this section we review the problem of silent corruptions and discuss a number of other approaches to dealing with it.

2.1 The Problem

The issue of silent data corruption within storage systems and beyond has been well understood and studied. The IRON Filesystems paper [9] presents a good overview of the possible sources of corruption along the storage stack. Studies at CERN in 2007 [5] and statistics quoted by [4] showed error rates of only 1 in 10^12 to 10^14 bits (8 to 12TB). However, other studies [9] note that this rate will only increase as disks become more complex and new technologies such as flash take hold, even as market pressures force the cost and quality of these components downwards. Indeed, these statistics already increase when we take into account the other components in the system, such as memory and I/O controllers [?]. Given that this problem is well understood, it is natural to ask what existing solutions there are, which we now discuss.

2.2 RAID

One common technique used to deal with failures in storage infrastructures is RAID [6]. RAID typically provides data redundancy to protect against drive failure by replicating either whole copies of the data or calculated parity data to a separate disk in the array. However, this mechanism only protects against whole disk failures, since the parity block isn’t usually read, or more importantly compared, when a data block is read. Indeed, in a mostly read oriented array, a silent data corruption could occur in a seldom read parity block, resulting in the eventual restoration of bad data. The integrity of a RAID system can be verified periodically by a "scrubbing" process. However, this process can take a very long time, so it is seldom done.
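
The limitation described above can be illustrated with a small sketch. It is not the authors' implementation: the block contents, the helper names (`xor_blocks`, `make_stripe`, `parity_consistent`, `find_and_repair`), and the use of CRC32 from Python's `zlib` as the checksum are all assumptions made for illustration, since this excerpt does not say which checksum the authors use. The sketch shows that a plain read returns a silently corrupted block without complaint, that a parity comparison reveals only that something in the stripe is wrong, and that a per-block checksum is what identifies which block to rebuild from the others.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, as RAID-style parity does."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return (data blocks, parity block, per-block CRC32 checksums).

    CRC32 is an illustrative choice, not necessarily the paper's checksum.
    """
    data = list(data_blocks)
    return data, xor_blocks(data), [zlib.crc32(block) for block in data]


def parity_consistent(data_blocks, parity):
    """Scrub-style check: says *whether* the stripe is damaged, not *where*."""
    return xor_blocks(data_blocks) == parity


def find_and_repair(data_blocks, parity, checksums):
    """Repair a single corrupted data block in place.

    Returns the index of the repaired block, or None if there is no bad
    block or more than one (single parity cannot rebuild two blocks).
    """
    bad = [i for i, block in enumerate(data_blocks)
           if zlib.crc32(block) != checksums[i]]
    if len(bad) != 1:
        return None
    i = bad[0]
    survivors = data_blocks[:i] + data_blocks[i + 1:]
    # XOR of the surviving data blocks and the parity yields the lost block.
    data_blocks[i] = xor_blocks(survivors + [parity])
    return i


if __name__ == "__main__":
    data, parity, sums = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
    data[1] = b"BxBB"  # simulate a silent corruption on one disk
    print("plain read returns:", data[1])
    print("parity consistent:", parity_consistent(data, parity))
    print("repaired block index:", find_and_repair(data, parity, sums))
    print("block after repair:", data[1])
```

Note that this sketch checksums only the data blocks and assumes the parity block and the stored checksums are themselves intact; a corrupted parity or checksum block would need its own handling.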

In contrast, our solution incorporates an active check of data and parity block integrity at the time of each read or write operation to the array. This allows us to detect and repair individual active stripes much sooner. For idle data that is not being actively read, it may still be wise to run periodic scrubbing processes in order to detect corruptions while we can still repair them. However, this process could possibly be optimized to only examine the inactive data.

2.3 End to End Detection

Modern drives already provide some ECC capabilities to detect individual sector failures. However, studies [?] have shown that these don’t detect all errors. Other solutions, such as Data Integrity Extension (DIX) [8], attempt to integrate integrity checks throughout the entire storage stack, from the application through the OS down to the actual device medium. These particular extensions operate by adding an extra 8 bytes to a typical on-disk sector to store
