CORNELL CS 614 - Storage & File Systems

Storage & File Systems
3 February 2004
William Josephson

A Typical Disk
• Logically organized as an array of sectors
  – Each sector protected by ECC
• Platters divided into concentric tracks
  – CAV on older disks
  – Newer disks are multi-zoned
• Tracks organized into cylinders
• Key performance characteristics:
  – Rotational delay
  – Seek time
  – Head/track and cylinder switch
  – Sustained transfer rate
  – Scheduling (zero-latency transfers)

Storage Technology Trends
• Disk trends in the last decade
  – Head switch time little changed
  – 2.5x improvement in seek time
  – 3x improvement in rotational speed
  – 10x improvement in bandwidth
  – ≈ 10^2 denser
  – ≈ 10^3 cheaper
  – Compare: the memory wall between processor and core
• Other mass storage technologies becoming popular, too
  – e.g. flash in small devices

Dealing with Disaster
• A typical modern disk has
  – MTBF of ≈ 1.2M hours
  – Unrecoverable ECC errors on the order of 1 in 10^15
• Failure modes
  – Manufacturing defects (holes in the film, etc.)
  – Magnetic domains decay/flip (thermodynamics!)
  – Head crashes (physical/thermal shock, contamination)
  – ECC errors due to partial writes (esp. ATA disks)
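The mechanical delays listed above dominate small random requests. As an illustrative aside, a back-of-the-envelope estimate of per-request service time (all drive figures are hypothetical, chosen to be plausible for a circa-2004 disk):

```python
# Hypothetical figures for a 2004-era disk; a sketch, not a datasheet.
RPM = 10_000            # spindle speed
AVG_SEEK_MS = 5.0       # average seek time
XFER_MB_S = 50.0        # sustained transfer rate

def random_io_ms(request_kb: float) -> float:
    """Estimate the service time of one random request:
    seek + average rotational delay (half a revolution) + transfer."""
    rotational_ms = 0.5 * (60_000 / RPM)             # half-rotation on average
    transfer_ms = request_kb / 1024 / XFER_MB_S * 1000
    return AVG_SEEK_MS + rotational_ms + transfer_ms

# Small random I/Os are dominated by seek + rotation, which is why
# scheduling and layout policies matter so much.
print(f"4 KB random read: {random_io_ms(4):.2f} ms")
print(f"1 MB transfer:    {random_io_ms(1024):.2f} ms")
```

Note that the 1 MB request pays the same mechanical cost as the 4 KB one; only the transfer term grows, so large sequential transfers amortize the fixed delays.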
The Unix Filesystem
• Filesystem consists of a (fixed) number of blocks
• Basic unit of organization is the i-node
• User data stored as a sequence of bytes in data blocks
• Directories are just special files containing index nodes
  – Directories map path components to i-node numbers
• Ken's filesystem was slow and vulnerable to failures
  – For instance, it allocated blocks from a free list on disk

The BSD Fast File System
• CSRG addressed performance and reliability concerns
  – Increased the block size and introduced fragments
  – Improved allocation and layout policies
    ∗ Allocate file blocks in a "rotationally optimal" manner
    ∗ Allocate file blocks in one cylinder group if possible to reduce fragmentation
  – Further work includes softupdates, clusters, traxtents, etc.
• Most operations still require multiple disk I/Os

Softupdates: Motivation
• Metadata updates are a headache:
  – Performance, integrity, security, and availability problems
• Traditional filesystems either:
  – Compromise on safety (e.g. Ext2, FAT)
  – Make extensive use of synchronous updates (e.g. FFS)
  – Use special-purpose hardware (e.g. WAFL)
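The "directories map path components to i-node numbers" point can be made concrete with a toy in-memory model of path resolution, echoing the kernel's namei walk (the dictionaries here are purely illustrative, not the on-disk format):

```python
# Toy model: directories are just files whose contents map names to
# i-node numbers. I-node 2 is the root, by Unix convention.
inodes = {
    2:  {"type": "dir",  "entries": {"usr": 10}},
    10: {"type": "dir",  "entries": {"bin": 11}},
    11: {"type": "dir",  "entries": {"sh": 12}},
    12: {"type": "file", "data": b"(program text)"},
}

def namei(path: str, root: int = 2) -> int:
    """Resolve an absolute path to an i-node number, one component at a
    time, reading each directory's name -> i-node table along the way."""
    ino = root
    for comp in path.strip("/").split("/"):
        node = inodes[ino]
        if node["type"] != "dir":
            raise NotADirectoryError(comp)
        ino = node["entries"][comp]
    return ino

print(namei("/usr/bin/sh"))  # -> 12
```

On a real disk each directory lookup may cost an I/O for the directory block plus one for the i-node, which is why the deck notes that most operations still require multiple disk I/Os.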
  – Use shadow paging or write-ahead logging
• Softupdates allow write-back caches to delay writes safely
  – Low-cost sequencing of fine-grained updates

Softupdates: Operational Overview
• Goal: better performance through fine-grained dependency tracking
• Softupdates allow safe use of write-back caches for metadata:
  – Track dependencies to present a consistent view to the client
  – Ensure state on stable storage is also consistent
    ∗ May lose data not yet on disk, but the disk image is not corrupt
  – Dependency information is consulted when flushing a dirty block
    ∗ Aging problems avoided, as new dependencies are never added to existing update sequences

Implementing Softupdates
• Maintain an update list for each pointer in the cache
  – File system operations add updates to each pointer affected
  – Updates can be rolled backwards or forwards as needed
  – Blocks are locked during a roll-back
• Simple block-based ordering is insufficient
  – Cycles must still be broken with synchronous writes
  – Some blocks may "starve" waiting for dependencies
  – Block granularity introduces false sharing

Softupdates: Cyclic Dependencies
• Block-level granularity of writes can introduce dependencies
(Figure: shaded regions indicate free metadata structures)

Softupdates for FFS
• Structural changes: block (de)allocation, link addition/removal
• File system semantics essentially unchanged
  – Synchronous metadata updates do not imply synchronous semantics (the last write is typically asynchronous)
  – Softupdates allow caching metadata with the same write-back strategies as for file data
• With cheap update sequencing, can afford stronger guarantees
  – Can therefore safely mount the filesystem immediately
  – A background fsck can reclaim leaked blocks

Softupdates: Performance, I
• Compare create and delete throughput as a function of file size for conventional, no-order, and softupdates
(Figures: create microbenchmark, delete microbenchmark)

Softupdates: Performance, II
• Read performance improved by delayed writes
  – Better indirect block placement
• Softupdates coalesces metadata updates in the create benchmark
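The ordering discipline behind these results can be sketched in a few lines. This is a toy block-level model only; real softupdates track per-pointer update records and roll them backward and forward, as the implementation slide describes:

```python
# Toy sketch of softupdates-style ordering: a dirty block may only be
# flushed once every block it depends on is stable on disk.
dirty = {"bitmap_block", "inode_block", "dir_block"}
# A new directory entry must not reach disk before the i-node it names,
# which in turn must not precede the allocation bitmap update:
depends_on = {"dir_block": {"inode_block"}, "inode_block": {"bitmap_block"}}
stable = set()
write_order = []

def flush(block: str) -> None:
    """Flush `block`, first flushing anything it depends on."""
    for dep in depends_on.get(block, ()):
        if dep not in stable:
            flush(dep)
    stable.add(block)
    dirty.discard(block)
    write_order.append(block)

flush("dir_block")
print(write_order)  # dependencies reach disk before their dependents
```

A crash at any point leaves the disk image consistent (at worst, blocks are leaked for a background fsck to reclaim); the simple recursive flush also shows why cyclic dependencies at block granularity are a problem, since the recursion would never terminate.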
(Figures: read throughput (MB/s), total disk writes for the create benchmark)

Softupdates: Performance, III
• In macrobenchmarks, softupdates also performs well
• Postmark – small, ephemeral file workload (mail server):
  – No order: 165 tps; softupdates: 170 tps; conventional: 45 tps
• On a real mail server, softupdates offers 70% fewer writes
• fsck: virtually instantaneous on a 4.5 GB file system vs. a conventional fsck time of almost three minutes

The Log-structured File System: Motivation
• CPU speed increases faster than disk speed – I/O bottleneck
• Aggressive caching can improve read performance
• Relative write performance suffers
  – Can't naively cache writes and still maintain safety
  – Many filesystems use synchronous writes for metadata
  – So metadata dominates for the small files typical of Unix

LFS: Operational Overview
• Goal: improve throughput through better write scheduling
• Write performance drives filesystem design:
  – Treat the disk as a circular log
  – Write all data and metadata blocks together in the log
  – Attempt to keep large free extents
    ∗ Batch writes in "segments"
    ∗ Segment size chosen on the basis of disk geometry

LFS: On-Disk Data Structures & Crash Recovery
• Data, index nodes, and other metadata all written to the log
  – Inodes are not written at fixed locations
• Fixed checkpoint regions record inode map locations
  – Inode maps are aggressively cached to avoid disk seeks
• Log is regularly checkpointed
  – Flush the active segment and inode maps
  – Serialize dependent operations with additional log records
• On restart, either truncate or roll the log forward; no fsck

LFS: The Cleaner
• Logically, the log is infinite, but the disk is finite
  – Garbage collect ("clean") old segments
• The cleaner can either "thread" or copy and compact the log
  – LFS uses a hybrid approach, copying entire segments
• Compare cleaning policies on the basis of write cost
  – Average time the disk is busy per byte of new data:

      write cost = (read + write live + write new) / write new = 2 / (1 − u)
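The write-cost formula can be checked numerically. Cleaning a segment whose live fraction is u means reading one whole segment, rewriting the u of live data, and writing (1 − u) of new data, all charged against that (1 − u) of new data:

```python
# Cleaner write cost as a function of segment utilization u
# (live fraction), per the formula on the cleaner slide.
def write_cost(u: float) -> float:
    assert 0.0 <= u < 1.0, "u is the live fraction of a cleaned segment"
    # (read segment + write live + write new) / write new
    return (1.0 + u + (1.0 - u)) / (1.0 - u)   # simplifies to 2 / (1 - u)

for u in (0.0, 0.5, 0.8, 0.9):
    print(f"u = {u:.1f}  ->  write cost = {write_cost(u):.1f}")
```

Even an empty segment costs 2 (read it, write new data), and the cost blows up as u approaches 1, which is what motivates the cleaner policy discussion that follows: keep the segments the cleaner chooses as empty as possible.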
LFS: Cleaner Policies
• When to clean?
  – At night, in the

