Review: FFS backgroundReview: FFS href {http://www.scs.stanford.edu/10wi-cs140/sched/readings/ffs.pdf}{[McKusic]} basicsFFS disk layoutBasic FFS data structuresFFS superblockCylinder groupsInode allocationFragment allocationBlock allocationDirectoriesUpdating FFS for the 90sFixing corruption -- fsckCrash recovery permeates FS codeOrdering of updatesPerformance vs. consistencyFirst attempt: Ordered updatesOrdered updates (continued)Problem: Cyclic dependenciesCyclic dependencies illustratedMore problemsBreaking dependencies w. rollbackBreaking dependencies w. rollbackBreaking dependencies w. rollbackBreaking dependencies w. rollbackSoft updateshypertarget {simple-example}{Simple example}Operations requiring soft updates (1)Operations requiring soft updates (2)Soft update issuesSoft updates fsckAn alternative: JournalingJournaling detailsCase study: XFS href {http://www.scs.stanford.edu/10wi-cs140/sched/readings/xfs.pdf}{[Sweeney]}B+-treesB+-trees continuedB+-trees in XFSMore B+-trees in XFSContiguous allocationJournaling vs. soft updatesReview: FFS background• 1980s improvement to original Unix FS, which had:- 512-byte blocks- Free blocks in linked list- All inodes at beginning of disk- Low throughput: 512 bytes per average seek time• Unix FS performance problems:- Transfers only 512 bytes per disk access- Eventually random allocation → 512 bytes / disk seek- Inodes far from directory and file data- Within directory, inodes far from each other• Also had some usability problems:- 14-character file names a pain- Can’t atomically update file in crash-proof way1/36Review: FFS [McKusic] basics• Change block size to at least 4K- To avoid wasting space, use “fragments” for ends of files• Cylinder groups spread inodes around disk• Bitmaps replace free list• FS reserves space to improve allocation- Tunable parameter, default 10%- Only superuser can u se space when over 90% full• Usability improvements:- File names up to 255 characters- Atomic rename system call- Symbolic links assign one file name to another2/36FFS disk layoutsuperblockscylindergroupsinodes data blocksinformationbookkeeping• Each cylinder group has its own:- Superblock- Bookkeeping information- Set of inodes- Data/directory blocks3/36Basic FFS data structuresdatadatadatadatanamei-number...contentsdirectory...inode...indirectblock...double indirindirect ptr...metadata......data ptrdata ptrdata ptrdata ptr. . .• Inode is key data structure for each file- Has permissions and access/modification/inode-change times- Has link count (# directories containing file); file deleted when 0- Points to data blocks of file (and indirect blocks)• By convention, inode #2 always root directory4/36FFS superblock• Superblock contains file system parameters- Disk characteristics, block size, CG info- Information necessary to get inode given i-number• Replicated once per cylinder group- At shifting offsets, so as to span multiple platters- Contains magic number to find replicas if 1st superblock dies• Contains non-replicated “summary info”- # blocks, fragments, inodes, directories in FS- Flag stating if FS was cleanly unmounted5/36Cylinder groups• Groups related inodes and their data• Contains a number of inodes (set when FS created)- Default one inode per 2K data• Contains file and directory blocks• Contains bookkeeping information- Block map – bit map of available fragments- Summary info within CG – # free inodes, blocks/frags, files,directories- # free blocks by rotational position (8 positions)[Was reasonable in 1980s when disks weren’t commonly zoned]6/36Inode allocation• Allocate inodes in same CG as directory if possible• New directories put in new cylinder groups- Consider CGs with greater than average # free inodes- Chose CG with smallest # directories• Within CG, inodes allocate d randomly (next free)- Wou ld like related inodes as close as possible- OK, because one CG doesn’t have that many inodes7/36Fragment allocatio n• Allocate space when user writes beyond end of file• Want last block to be a fragment if not full-size- If already a fragment, may contain space for write – done- Else, must deallocate any existing fragment, allocate new• If no appropriate free fragments, break full block• Problem: Slow for many small writes- (Partial) soution: new stat struct field st blksize- Tells applications file system block size- stdio library can buffer this much data8/36Block allocation• Try to optimize for sequential acces s- If available, use rotationally close block in same cylinder- Otherwise, use block in same CG- If CG totally full, find other CG w ith quadratic hashingi.e., if CG #n is full, try n + 12, n + 22, n + 32, . . . (mod #CGs)- Otherwise, search all CGs for some free space• Problem: Don’t w a nt one file filling up whole CG- Otherwise other inodes will have data far away• Solution: Break big files over many CGs- But large extents in each CGs, so sequential access doesn’trequire many seeks9/36Directories• Inodes like files, but with different type bits• Contents considered as 512-byte chunks• Each chunk has direct structure(s) with:- 32-bit inumber- 16-bit size of directory entry- 8-bit file type (NEW)- 8-bit length of file name• Coalesce when deleting- If first direct in chunk deleted, set inumber = 0• Periodically compact directory chunks10/36Updating FFS for the 90s• No longer wanted to assume rotational delay- With disk caches, want data contiguously allocated• Solution: Cluster writes- FS delays writing a block back to get more blocks- Accumulates blocks into 64K clusters, written at once• Allocation of clusters similar to fragments/blocks- Summary info- Cluster map has one bit for each 64K if all free• Also read in 64K chunks when doing read ahead11/36Fixing corruption – fsck• Must run FS check (fsck) program after crash• Summary info usually bad after crash- Scan to check free block map, block/inode counts• System may have corrupt inodes (not simple crash)- Bad block numbers, cross-allocation, etc.- Do sanity check, clear inodes with garbage• Fields in inodes may be wrong- Count number of directory entries to verify link count, if noentries but count 6= 0, move to lost+found- Make sure size and used data counts match blocks• Directories may be bad- Holes illegal, . and .. must be valid, . . .- All directories must be reachable12/36Crash recov e ry permeates FS code•Have to ensure fsck can recover file
View Full Document