Automatic RAID Construction
Ba-Quy Vuong (Bryan) and Yiying Zhang
Department of Computer Sciences
University of Wisconsin-Madison

Outline
• Introduction
• Architecture and Design
• Implementation
• Performance Evaluation
• Conclusion & Future Work

What is RAID?
• Redundant Arrays of Inexpensive Disks
• Purposes:
  – Reliability
  – Performance

RAID Implementation
• Hardware RAID:
  – Uses dedicated hardware to control the disk array
  – Host independent
• Software RAID:
  – Uses a software layer sitting above the disk drivers to control the disk array
  – Host dependent

Problems with Software RAID
• There are many ways to build RAID systems, including:
  – Different checksum-based schemes
  – Different parity-based schemes
• Not flexible: each RAID level requires a specific RAID driver
• Not robust: writing a new RAID driver is time-consuming, and the result may have many bugs

Our Solution: Automatic RAID Construction
• Approach:
  – A way to describe checksum-based and parity-based schemes
  – Mapping the specified scheme to a RAID driver
• Advantages:
  – Flexibility
  – Robustness

Architecture
• Design considerations:
  – Parity on top of checksum
  – Checksum on top of parity
• Example:
  – 3-disk RAID 5
  – Mirroring checksum

Automatic Parity
• Goal: allow any parity scheme
• Two data structures:
  – Layout matrix: how blocks are laid out
    • The whole matrix corresponds to a stripe
    • Each row corresponds to one strip
    • Zeros mean data blocks, ones mean parity blocks
    • The number of columns is the number of disks
  – Parity matrix: which data blocks contribute to a parity block
    • Number of rows: number of parity blocks in one stripe
    • Number of columns: number of data blocks in one stripe
    • A one at row i, column j means data block j is used to calculate parity block i
• Layout matrix examples:
    4-disk RAID 5:
      0 0 0 1
      0 0 1 0
      0 1 0 0
      1 0 0 0
    4-disk RAID 4:
      0 0 0 1
    4-disk RAID 0+1:
      0 1 0 1
• Parity matrix examples:
    4-disk RAID 5:
      1 1 1 0 0 0 0 0 0 0 0 0
      0 0 0 1 1 1 0 0 0 0 0 0
      0 0 0 0 0 0 1 1 1 0 0 0
      0 0 0 0 0 0 0 0 0 1 1 1
    4-disk RAID 4:
      1 1 1
    4-disk RAID 0+1:
      1 0
      0 1

Automatic Checksum - Goals
• Checksum over data and parity blocks
• Flexible number of blocks as a checksum unit
• Flexible checksum size
• Flexible functions
• Flexible locations

Automatic Checksum - Design
• User-specified parameters:
  – Number of blocks as a checksum unit
  – Checksum size for each block
  – Checksum function
• Example:
  – 3 blocks as a checksum unit
  – 1 block for checksums
• One more level of mapping

Implementation
• The RAID driver is implemented as a device driver in Linux
• Checksums and parities are specified by users
• Checksum functions provided: sum, hash-based
• Memory-based version
  – Uses each memory chunk as a disk
  – Easy to build and debug
  – No significant effect on the overall code
• Disk-based version
  – Uses real disks
  – Communicates with disk drivers through the bio structure
  – Synchronization problems due to asynchronous I/Os

Performance Evaluation: Setup
• Host: VMware, Fedora 8, Intel Core 2 Duo 2.2GHz, 1GB RAM
• Memory-based
• Simulating disk delay
  – Each low-level disk read: 15ms
  – Each low-level disk write: 17ms
• Simulating disk failure
  – Unable to read (20%)
  – Read inconsistency (20%)

Performance Evaluation: Settings
• Evaluation settings
  – With and without reconstruction
  – Different layouts, parity logics, and checksum functions
  – Different workloads
• Systems:
  – System 1: 4-disk, no parity, no checksum
  – System 2: 4-disk RAID 0+1 with hash-based checksum
  – System 3: 4-disk RAID 0+1 with sum checksum
  – System 4: 4-disk RAID 5 with hash-based checksum
  – System 5: 4-disk RAID 5 with sum checksum
• Workload:
  – Reading and writing 30KB files
  – mkfs, mount

Performance results (figures):
• Avg Read and Write Time
• Deviation of Read & Write
• Reconstruction
• Timeline

Conclusion
• Why automatic RAID?
  – Flexible vs. fixed RAID drivers
  – Robustness
• Approach
  – Automatic Parity with two matrices
  – Automatic Checksum with user-defined parameters
• Lessons learned
  – Performance is a big issue
  – Disk-based RAID is much harder to implement than memory-based RAID

Future Work
• Complete the disk-based version
• Improve the performance
• Check for input correctness
• Extend the parity and checksum layers to handle more schemes

Questions & Answers
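
The layout and parity matrices from the Automatic Parity slides can be turned into a stripe mechanically. The following is a minimal sketch, not the authors' driver code. It assumes that data blocks and parity blocks are each numbered in row-major order over the layout matrix, and that the parity function is bytewise XOR; the slides do not state either choice. The names `build_stripe` and `xor_blocks` are hypothetical.

```python
# Layout and parity matrices as shown on the slides.
RAID5_LAYOUT = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
RAID5_PARITY = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
]
RAID01_LAYOUT = [[0, 1, 0, 1]]
RAID01_PARITY = [[1, 0], [0, 1]]


def xor_blocks(blocks, size):
    """Bytewise XOR of equally sized blocks (assumed parity function)."""
    out = bytearray(size)
    for block in blocks:
        for k, byte in enumerate(block):
            out[k] ^= byte
    return bytes(out)


def build_stripe(layout, parity, data_blocks):
    """Place data and computed parity blocks according to the layout matrix.

    Assumption: data block j is the j-th zero and parity block i is the
    i-th one when the layout matrix is read in row-major order.
    """
    n_parity = sum(cell for row in layout for cell in row)
    n_data = sum(len(row) for row in layout) - n_parity
    if len(parity) != n_parity or any(len(r) != n_data for r in parity):
        raise ValueError("parity matrix shape does not match layout matrix")
    if len(data_blocks) != n_data:
        raise ValueError("expected %d data blocks" % n_data)

    size = len(data_blocks[0])
    # Row i of the parity matrix selects the data blocks feeding parity block i.
    parity_blocks = [
        xor_blocks([data_blocks[j] for j in range(n_data) if row[j]], size)
        for row in parity
    ]
    data_iter, parity_iter = iter(data_blocks), iter(parity_blocks)
    return [
        [next(parity_iter) if cell else next(data_iter) for cell in row]
        for row in layout
    ]
```

With the RAID 0+1 matrices, each parity block has a single contributing data block, so XOR reduces to a copy and the stripe comes out mirrored. With the RAID 5 matrices, the XOR of all four blocks in any row is zero, which is the property that lets one lost block be rebuilt from the other three.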
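
The checksum design (a unit of several blocks, a fixed checksum size per block, and a pluggable function) can be sketched in the same way. This is an illustration under stated assumptions, not the driver's code: the per-block checksums are assumed to be packed back to back into one zero-padded checksum block, the "sum" function is assumed to be a byte sum truncated to the checksum size, and the "hash-based" function is assumed to be a truncated SHA-256, since the slides do not name the hash. `BLOCK_SIZE` and all function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 16  # deliberately tiny, for illustration only


def sum_checksum(block, size):
    """Sum of all bytes, truncated to `size` bytes (assumed 'sum' function)."""
    return (sum(block) % (1 << (8 * size))).to_bytes(size, "big")


def hash_checksum(block, size):
    """Truncated SHA-256 digest (assumed stand-in for 'hash-based')."""
    if size > hashlib.sha256().digest_size:
        raise ValueError("checksum size exceeds digest size")
    return hashlib.sha256(block).digest()[:size]


def make_checksum_block(unit, size, func, block_size=BLOCK_SIZE):
    """Pack one checksum per block of the unit into a single checksum block."""
    packed = b"".join(func(block, size) for block in unit)
    if len(packed) > block_size:
        raise ValueError("checksums do not fit in one block")
    return packed.ljust(block_size, b"\0")


def find_corrupt(unit, checksum_block, size, func):
    """Return the indices of blocks whose checksum no longer matches."""
    return [
        i
        for i, block in enumerate(unit)
        if func(block, size) != checksum_block[i * size:(i + 1) * size]
    ]
```

For the slide's example of 3 blocks per checksum unit and 1 block for checksums, `make_checksum_block(unit, 4, sum_checksum)` stores three 4-byte checksums in one block, and `find_corrupt` reports which of the three blocks would need reconstruction from parity.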
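
The memory-based evaluation setup (one memory chunk per disk, fixed read and write delays, injected failures) can be approximated as below. This is a hedged sketch: it reads the slide's "20%" figures as independent per-read probabilities, which the slides do not confirm, and it models a read inconsistency by flipping the bits of the first byte. `SimulatedDisk` and `ReadError` are hypothetical names.

```python
import random
import time


class ReadError(Exception):
    """Raised when the simulated disk cannot serve a read."""


class SimulatedDisk:
    """In-memory disk with artificial latency and injected read faults."""

    def __init__(self, n_blocks, block_size, read_delay=0.015,
                 write_delay=0.017, p_unreadable=0.2, p_inconsistent=0.2,
                 rng=None, sleep=time.sleep):
        self.blocks = [bytes(block_size) for _ in range(n_blocks)]
        self.read_delay = read_delay      # 15 ms per low-level read
        self.write_delay = write_delay    # 17 ms per low-level write
        self.p_unreadable = p_unreadable
        self.p_inconsistent = p_inconsistent
        self.rng = rng or random.Random()
        self.sleep = sleep                # injectable so tests can skip waiting

    def write(self, index, data):
        self.sleep(self.write_delay)
        self.blocks[index] = bytes(data)

    def read(self, index):
        self.sleep(self.read_delay)
        roll = self.rng.random()
        if roll < self.p_unreadable:
            raise ReadError("block %d is unreadable" % index)
        if roll < self.p_unreadable + self.p_inconsistent:
            # Return silently corrupted data; a checksum layer should catch it.
            data = bytearray(self.blocks[index])
            data[0] ^= 0xFF
            return bytes(data)
        return self.blocks[index]
```

A RAID layer under test would treat `ReadError` as a failed disk read and rely on the checksum layer to detect the corrupted reads, triggering reconstruction from parity in both cases.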