CORNELL CS 614 - Storage & File Systems

Storage & File Systems
3 February 2004
William Josephson

A Typical Disk
• Logically organized as an array of sectors
  – Each sector protected by ECC
• Platters divided into concentric tracks
  – CAV on older disks
  – Newer disks are multi-zoned
• Tracks organized into cylinders
• Key performance characteristics:
  – Rotational delay
  – Seek time
  – Head/track and cylinder switch
  – Sustained transfer rate
  – Scheduling (zero-latency transfers)

Storage Technology Trends
• Disk trends in the last decade
  – Head switch time little changed
  – 2.5x improvement in seek time
  – 3x improvement in rotational speed
  – 10x improvement in bandwidth
  – ≈ 10^2 denser
  – ≈ 10^3 cheaper
  – Compare: the memory wall between processor and core
• Other mass storage technologies becoming popular, too
  – e.g. flash in small devices

Dealing with Disaster
• A typical modern disk has
  – MTBF of ≈ 1.2M hours
  – Unrecoverable ECC errors on the order of 1 in 10^15
• Failure modes
  – Manufacturing defects (holes in the film, etc.)
  – Magnetic domains decay/flip (thermodynamics!)
  – Head crashes (physical/thermal shock, contamination)
  – ECC errors due to partial writes (esp. ATA disks)
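The mechanical delays listed above dominate small random requests. As an illustrative aside, a back-of-the-envelope estimate of per-request service time (all drive figures are hypothetical, chosen to be plausible for a circa-2004 disk):

```python
# Hypothetical figures for a 2004-era disk; a sketch, not a datasheet.
RPM = 10_000            # spindle speed
AVG_SEEK_MS = 5.0       # average seek time
XFER_MB_S = 50.0        # sustained transfer rate

def random_io_ms(request_kb: float) -> float:
    """Estimate the service time of one random request:
    seek + average rotational delay (half a revolution) + transfer."""
    rotational_ms = 0.5 * (60_000 / RPM)             # half-rotation on average
    transfer_ms = request_kb / 1024 / XFER_MB_S * 1000
    return AVG_SEEK_MS + rotational_ms + transfer_ms

# Small random I/Os are dominated by seek + rotation, which is why
# scheduling and layout policies matter so much.
print(f"4 KB random read: {random_io_ms(4):.2f} ms")
print(f"1 MB transfer:    {random_io_ms(1024):.2f} ms")
```

Note that the 1 MB request pays the same mechanical cost as the 4 KB one; only the transfer term grows, so large sequential transfers amortize the fixed delays.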
The Unix Filesystem
• Filesystem consists of a (fixed) number of blocks
• Basic unit of organization is the i-node
• User data stored as a sequence of bytes in data blocks
• Directories are just special files containing index nodes
  – Directories map path components to i-node numbers
• Ken's filesystem was slow and vulnerable to failures
  – For instance, it allocated blocks from a free list on disk

The BSD Fast File System
• CSRG addressed performance and reliability concerns
  – Increased the block size and introduced fragments
  – Improved allocation and layout policies
    ∗ Allocate file blocks in a "rotationally optimal" manner
    ∗ Allocate file blocks in one cylinder group if possible to reduce fragmentation
  – Further work includes softupdates, clusters, traxtents, etc.
• Most operations still require multiple disk I/Os

Softupdates: Motivation
• Metadata updates are a headache:
  – Performance, integrity, security, and availability problems
• Traditional filesystems either:
  – Compromise on safety (e.g. Ext2, FAT)
  – Make extensive use of synchronous updates (e.g. FFS)
  – Use special-purpose hardware (e.g. WAFL)
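The "directories map path components to i-node numbers" point can be made concrete with a toy in-memory model of path resolution, echoing the kernel's namei walk (the dictionaries here are purely illustrative, not the on-disk format):

```python
# Toy model: directories are just files whose contents map names to
# i-node numbers. I-node 2 is the root, by Unix convention.
inodes = {
    2:  {"type": "dir",  "entries": {"usr": 10}},
    10: {"type": "dir",  "entries": {"bin": 11}},
    11: {"type": "dir",  "entries": {"sh": 12}},
    12: {"type": "file", "data": b"(program text)"},
}

def namei(path: str, root: int = 2) -> int:
    """Resolve an absolute path to an i-node number, one component at a
    time, reading each directory's name -> i-node table along the way."""
    ino = root
    for comp in path.strip("/").split("/"):
        node = inodes[ino]
        if node["type"] != "dir":
            raise NotADirectoryError(comp)
        ino = node["entries"][comp]
    return ino

print(namei("/usr/bin/sh"))  # -> 12
```

On a real disk each directory lookup may cost an I/O for the directory block plus one for the i-node, which is why the deck notes that most operations still require multiple disk I/Os.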
  – Use shadow paging or write-ahead logging
• Softupdates allow write-back caches to delay writes safely
  – Low-cost sequencing of fine-grained updates

Softupdates: Operational Overview
• Goal: better performance through fine-grained dependency tracking
• Softupdates allow safe use of write-back caches for metadata:
  – Track dependencies to present a consistent view to the client
  – Ensure state on stable storage is also consistent
    ∗ May lose data not yet on disk, but the disk image is not corrupt
  – Dependency information is consulted when flushing a dirty block
    ∗ Aging problems avoided, as new dependencies are never added to existing update sequences

Implementing Softupdates
• Maintain an update list for each pointer in the cache
  – File system operations add updates to each pointer affected
  – Updates can be rolled backwards or forwards as needed
  – Blocks are locked during a roll-back
• Simple block-based ordering is insufficient
  – Cycles must still be broken with synchronous writes
  – Some blocks may "starve" waiting for dependencies
  – Block granularity introduces false sharing

Softupdates: Cyclic Dependencies
• Block-level granularity of writes can introduce dependencies
(Figure: shaded regions indicate free metadata structures)

Softupdates for FFS
• Structural changes: block (de)allocation, link addition/removal
• File system semantics essentially unchanged
  – Synchronous metadata updates do not imply synchronous semantics (the last write is typically asynchronous)
  – Softupdates allow caching metadata with the same write-back strategies as for file data
• With cheap update sequencing, can afford stronger guarantees
  – Can therefore safely mount the filesystem immediately
  – A background fsck can reclaim leaked blocks

Softupdates: Performance, I
• Compare create and delete throughput as a function of file size for conventional, no-order, and softupdates
(Figures: create microbenchmark, delete microbenchmark)

Softupdates: Performance, II
• Read performance improved by delayed writes
  – Better indirect block placement
• Softupdates coalesces metadata updates in the create benchmark
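The ordering discipline behind these results can be sketched in a few lines. This is a toy block-level model only; real softupdates track per-pointer update records and roll them backward and forward, as the implementation slide describes:

```python
# Toy sketch of softupdates-style ordering: a dirty block may only be
# flushed once every block it depends on is stable on disk.
dirty = {"bitmap_block", "inode_block", "dir_block"}
# A new directory entry must not reach disk before the i-node it names,
# which in turn must not precede the allocation bitmap update:
depends_on = {"dir_block": {"inode_block"}, "inode_block": {"bitmap_block"}}
stable = set()
write_order = []

def flush(block: str) -> None:
    """Flush `block`, first flushing anything it depends on."""
    for dep in depends_on.get(block, ()):
        if dep not in stable:
            flush(dep)
    stable.add(block)
    dirty.discard(block)
    write_order.append(block)

flush("dir_block")
print(write_order)  # dependencies reach disk before their dependents
```

A crash at any point leaves the disk image consistent (at worst, blocks are leaked for a background fsck to reclaim); the simple recursive flush also shows why cyclic dependencies at block granularity are a problem, since the recursion would never terminate.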
(Figures: read throughput (MB/s), total disk writes for the create benchmark)

Softupdates: Performance, III
• In macrobenchmarks, softupdates also performs well
• Postmark – small, ephemeral file workload (mail server):
  – No order: 165 tps; softupdates: 170 tps; conventional: 45 tps
• On a real mail server, softupdates offers 70% fewer writes
• fsck: virtually instantaneous on a 4.5 GB file system vs. a conventional fsck time of almost three minutes

The Log-structured File System: Motivation
• CPU speed increases faster than disk speed – I/O bottleneck
• Aggressive caching can improve read performance
• Relative write performance suffers
  – Can't naively cache writes and still maintain safety
  – Many filesystems use synchronous writes for metadata
  – So metadata dominates for the small files typical of Unix

LFS: Operational Overview
• Goal: improve throughput through better write scheduling
• Write performance drives filesystem design:
  – Treat the disk as a circular log
  – Write all data and metadata blocks together in the log
  – Attempt to keep large free extents
    ∗ Batch writes in "segments"
    ∗ Segment size chosen on the basis of disk geometry

LFS: On-Disk Data Structures & Crash Recovery
• Data, index nodes, and other metadata all written to the log
  – Inodes are not written at fixed locations
• Fixed checkpoint regions record inode map locations
  – Inode maps are aggressively cached to avoid disk seeks
• Log is regularly checkpointed
  – Flush the active segment and inode maps
  – Serialize dependent operations with additional log records
• On restart, either truncate or roll the log forward; no fsck

LFS: The Cleaner
• Logically, the log is infinite, but the disk is finite
  – Garbage collect ("clean") old segments
• The cleaner can either "thread" or copy and compact the log
  – LFS uses a hybrid approach, copying entire segments
• Compare cleaning policies on the basis of write cost
  – Average time the disk is busy per byte of new data:

      write cost = (read + write live + write new) / write new = 2 / (1 − u)
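The write-cost formula can be checked numerically. Cleaning a segment whose live fraction is u means reading one whole segment, rewriting the u of live data, and writing (1 − u) of new data, all charged against that (1 − u) of new data:

```python
# Cleaner write cost as a function of segment utilization u
# (live fraction), per the formula on the cleaner slide.
def write_cost(u: float) -> float:
    assert 0.0 <= u < 1.0, "u is the live fraction of a cleaned segment"
    # (read segment + write live + write new) / write new
    return (1.0 + u + (1.0 - u)) / (1.0 - u)   # simplifies to 2 / (1 - u)

for u in (0.0, 0.5, 0.8, 0.9):
    print(f"u = {u:.1f}  ->  write cost = {write_cost(u):.1f}")
```

Even an empty segment costs 2 (read it, write new data), and the cost blows up as u approaches 1, which is what motivates the cleaner policy discussion that follows: keep the segments the cleaner chooses as empty as possible.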
LFS: Cleaner Policies
• When to clean?
  – At night, in the

