CS 105 Tour of the Black Holes of Computing File Systems Topics fs ppt Design criteria History of file systems Berkeley Fast File System Effect of file systems on programs File Systems Disk Organization A disk is a sequence of 512 byte sectors Unfortunate historical fact now we re stuck with it First comes boot block and partition table Partition table divides the rest of disk into partitions May appear to operating system as logical disks Useful for multiple OSes etc Otherwise bad idea hangover from earlier days File system partition structured to hold files data 2 Usually aggregates sectors into blocks or clusters Typical size 2 16 sectors 1K 8K bytes Increases efficiency by reducing overhead CS 105 Design Problems As seen before disks have mechanical delays Seek time move heads to where data is 2 20 ms Rotational delay wait for data to get under heads 2 8 ms Transfer time move data into memory 1 15 s Fundamental problem in file system design how to hide or at least minimize these delays Side problems also important 3 Making things reliable in face of s w and h w crashes Organizing data e g in directories or databases Enforcing security CS 105 Important File Systems FAT old Windows and MSDOS standard NTFS Windows current standard FFS Unix standard since 80 s AFS distributed system developed at CMU LFS Berkeley redesign for high performance ZFS redesigned Unix system recently released by Sun ISO 9660 CD ROM standard EXT2 EXT3 Linux standards variants of FFS ReiserFS redesigned Linux system Other OS s have own file organization VMS MVS 4 CS 105 Typical Similarities Among File Systems A secondary boot record A top level directory Support for hierarchical directories Management of free and used space Metadata about files e g creation date Protection and security 5 CS 105 Typical Differences Between File Systems Naming conventions case length special symbols File size and placement Speed Error recovery Metadata details Support for special files 6 CS 105 Case Study Berkeley Fast File System FFS First public Unix Unix V7 introduced many important concepts in Unix File System UFS I nodes Indirect blocks Unix directory structure and permissions system UFS was simple elegant and slow Berkeley initiated project to solve the slowness Many modern file systems are direct or indirect descendants of FFS 7 In particular EXT2 and EXT3 CS 105 FFS Headers Boot block first few sectors Typically all of cylinder 0 is reserved for boot blocks partition tables etc Superblock file system parameters including Size of partition note that this is dangerously redundant Location of root directory Block size Cylinder groups including 8 Data blocks List of inodes Bitmap of used blocks and fragments in the group Replica of superblock not always at start of group CS 105 FFS File Tracking Directory file containing variable length records File name I node number Inode holds metadata for one file Located by number using info from superblock Integral number of inodes in a block Includes 9 Owner and group File type regular directory pipe symbolic link Access permissions Time of last i node change last modification last access Number of links reference count Size of file for directories and regular files Pointers to data blocks CS 105 FFS Inodes Inode has 15 pointers to data blocks 12 point directly to data blocks 13th points to an indirect block containing pointers to data blocks 14th points to a double indirect block 15th points to a triple indirect block With 4K blocks and 4 byte pointers triple indirect block can address 4 terabytes 242 bytes of data Data blocks might not be contiguous on disk But OS tries to cluster related items in cylinder group 10 Directory entries Corresponding inodes Their data blocks CS 105 FFS Free Space Management Free space managed by bitmaps One bit per block Makes it easy to find groups of contiguous blocks Each cylinder group has own bitmap Can find blocks that are physically nearby Prevents long scans on full disks Prefer to allocate block in cylinder group of last previous block 11 If can t pick group that has most space Heuristic tries to maximize number of blocks close to each other CS 105 FFS Fragmentation Blocks are typically 4K or 8K Amortizes overhead of reading or writing block On average wastes 1 2 block total per file FFS divides blocks into 4 16 fragments Free space bitmap manages fragments Small files or tails of files are placed in fragments This turned out to be terrible idea Greatly complicates OS code Didn t foresee how big disks would get Linux EXT2 uses smaller block size typically 1K instead of fragments 12 CS 105 File Systems and Data Structures Almost every data structure you can think of is present in FFS 13 Arrays of blocks and i nodes Variable length records in directories Heterogeneous records Indirection directories and inodes Reference counting Lists inodes in different cylinder groups Trees indirect data blocks Bitmaps free space management Caching CS 105 Effect of File Systems on Programs Software can take advantage of FFS design Small files are cheap spread data across many files Directories are cheap use as key value database where file name is the key Large files well supported don t worry about file size limits Random access adds little overhead OK to store database inside large file But don t forget you re still paying for disk latencies and indirect blocks FFS design also suggests optimizations Put related files in single directory Example of serious violator CVS SVN 14 Keep directories relatively small Recognize that single large file will eat all remaining free space in cylinder group Create small files before large ones CS 105 Summary Goals of Unix File Systems Simple data model Hierarchical directory tree Uninterpreted by OS sequences of bytes Extensions are just strings in long filename Multiuser protection model High speed 15 Reduce disk latencies by careful layout Hide latencies with caching Amortize overhead with large transfers Sometimes trade off reliability for speed CS 105
View Full Document
Unlocking...