W4118 Operating Systems Instructor: Junfeng Yang1Outline File system conceptsWhat is a file?What operations can be performed on files?What is a directory and how is it organized? File implementationHow to allocate disk space to files?12What is a file User viewNamed byte array• Types defined by userPersistent across reboots and power failures OS viewMap bytes as collection of blocks on physical storageStored on nonvolatile storage device• Magnetic Disks23Role of file systemNamingHow to “name” filesTranslate “name” + offset logical block #ReliabilityMust not lose file dataProtectionMust mediate file access from different usersDisk managementFair, efficient use of disk spaceFast access to files34File metadataName – only information kept in human-readable formIdentifier – unique tag (number) identifies file within file system (inode number in UNIX)Location – pointer to file location on deviceSize – current file sizeProtection – controls who can do reading, writing, executingTime, date, and user identification – data for protection, security, and usage monitoringHow is metadata stored? (inode in UNIX)45File operations int creat(const char* pathname, mode_t mode) int unlink(const char* pathname) int rename(const char* oldpath, const char* newpath) int open(const char* pathname, int flags, mode_tmode) int read(int fd, void* buf, size_t count); int write(int fd, const void* buf, size_t count) int lseek(int fd, offset_t offset, int whence) int truncate(const char* pathname, offset_t len) ...6Open filesProblem: expensive to resolve name to identifier on each accessSolution: open file before accessName resolution: search directories for file name and check permissionRead relevant file metadata into open file table in memoryReturn index in open file table (file descriptor)Application pass index to OS for subsequent accessSystem-wide open file table shared across processesPer-process open file table stores current pointer position and index to system-wide open file table7Directories Organization techniqueMap file name to location on diskAlso stored on disk Single-Level directorySingle directory for entire disk• Each file must have unique nameNot very usable Two-level directoryDirectory for each userStill not very usable78Tree-structured directory Directory stored on disk just like filesData consists of <name, index> pairs• Name can be another directoryDesignated by special bit in meta-dataReference by separating names with slashesOperations• User programs can read (readdir())• Only special system calls can write Special directoriesRoot (/): fixed index for metadata. : this directory.. : parent directory89Acyclic-graph directories Directories can share files Create links from one file Two types of linksHard link• Multiple directory entries point to same file• Store reference count in file metadata• Cannot refer to directories; why?Symbolic link• Special file, designated by bit in meta-data• File data is name to another file910Path names Absolute path name (full path name)Start at root directory• E.g. /home/junfeng/teaching Relative path nameFull path is lengthy and inflexibleGive each process current working directoryAssume file in current directory1011Directories as files Direction as special files that store pointers to the contained filesFile data is interpreted by FS code Separate functionality in two levelsLowest: storage managementHighest: naming, directory Advantage: simplifies design and implementation12ProtectionType of access Read, write, execute, append, delete, list …Access control list Associate lists of users with access rights for every file Advantage: complete control Disadvantage• Tedious to construct list (may not know in advance for all users)• Require variable-size informationClassify users user, group, other Advantage: easier to implement Disadvantage: no fine grained control13Outline File system conceptsWhat is a file?What operations can be performed on files?What is a directory and how is it organized? File implementationHow to allocate disk space to files?1314Typical file access patterns Sequential AccessData read or written in order• Most common access pattern– E.g., copy files, compiler read and write files, Can be made very fast (peak transfer rate from disk) Random AccessRandomly address any block• E.g., update records in a database fileDifficult to make fast (seek time and rotational delay)1415Disk management Need to track where file data is on diskHow should we map logical sector # to surface #, track #, and sector #?• Order disk sectors to minimize seek time for sequential access Need to track where file metadata is on disk Need to track free versus allocated areas of diskE.g., block allocation bitmap (Unix)• Array of bits, one per block• Usually keep entire bitmap in memory1516Allocation strategiesVarious approaches (similar to memory allocation)ContiguousExtent-basedLinkedFAT tablesIndexedMulti-Level IndexedKey metricsFragmentation (internal & external)?Grow file over time after initial creation?Fast to find data for sequential and random access?Easy to implement?Storage overhead?1617Contiguous allocation Allocate files like continuous memory allocation (base & limit)User specifies length, file system allocates space all at onceCan find disk space by examining bitmap Metadata: contains starting location and size of file1718Contiguous allocation example19Pros and cons ProsEasy to implementLow storage overhead (two variables to specify disk area for file)Fast sequential access since data stored in continuous blocksFast to compute data location for randomaddresses. Just an array index ConsLarge external fragmentationDifficult to grow file1920Extent-based allocation Multiple contiguous regions per file (like segmentation)Each region is an extentMetadata: contains small array of entries designating extents• Each entry: start and size of extent2021Pros and cons ProsEasy to implementLow storage overhead (a few entries to specify file blocks)File can grow overtime (until run out of
View Full Document