Advanced File Systems Issues
Andy Wang
COP 5611 Advanced Operating Systems

Outline
  - File systems basics
  - Making file systems faster
  - Making file systems more reliable
  - Making file systems do more
  - Using other forms of persistent storage

File System Basics
  - File system: a collection of files
  - An OS may support multiple file systems
    - Instances of the same type
    - Different types of file systems
  - All file systems are typically bound into a single namespace
    - Often hierarchical

A Hierarchy of File Systems
  [Figure: a hierarchy of file systems]

Some Questions
  - Why hierarchical?
  - What are some alternative ways to organize a namespace?
  - Why not a single file system?

Types of Namespaces
  - Flat
  - Hierarchical
  - Relational
  - Contextual
  - Content-based

Example: Internet "FS"
  - Flat: each URL mapped to one file
  - Hierarchical: navigation within a site
  - Relational: keyword search via search engines
  - Contextual: page rank to improve search results
  - Content-based: searching for images without knowing their names

Why Not a Single FS?

Advantages of Independent File Systems
  - Easier support for multiple hardware devices
  - More control over disk usage
  - Fault isolation
  - Quicker to run consistency checks
  - Support for multiple types of file systems

Overall Hierarchical Organizations
  - Constrained
  - Unconstrained

Constrained Organizations
  - Independent file systems only located at particular places
  - Usually at the highest level in the hierarchy (e.g., DOS/Windows and Mac)
  - Pro: simplicity, simple user model
  - Con: lack of flexibility

Unconstrained Organizations
  - Independent file systems can be put anywhere in the hierarchy (e.g., UNIX)
  - Pro: generality, invisible to user
  - Con: complexity, not always what the user expects
  - These organizations require mounting

Mounting File Systems
  - Each FS is a tree with a single root
  - Its root is spliced into the overall tree
    - Typically on top of another file or directory (the mount point)
  - Complexities in traversing mount points

Mounting Example
  [Figure: directory trees before the mount, with nodes labeled tmp and root; mount /dev/sd01 on /w/x/y/z/tmp]

After the Mount
  [Figure: the same trees after the mount of /dev/sd01 on /w/x/y/z/tmp]

Before and After the Mount
  - Before mounting, if you issue ls /w/x/y/z/tmp
    - You see the contents of /w/x/y/z/tmp
  - After mounting, if you issue ls /w/x/y/z/tmp
    - You see the contents of root (the root of the mounted file system)

Questions
  - Can we end up with a cyclic graph?
  - What are some implications?
  - What are some security concerns?

What is a File?
  - A collection of data and metadata (often called attributes)
  - Usually in persistent storage
  - In UNIX, the metadata of a file is represented by the i-node data structure

Logical File Representation
  [Figure labels: name(s), file, i-node, file attributes, data]

File Attributes
  - Typical attributes include
    - File length
    - File ownership
    - File type
    - Access permissions
  - Typically stored in a special fixed-size area

Extended Attributes
  - Some systems store more information with attributes (e.g., Mac OS)
  - Sometimes user-defined attributes
  - Some such data can be very large
    - In such cases, treat attributes similarly to file data

Storing File Data
  - Where do you store the data?
    - Next to the attributes, or elsewhere?
  - Usually elsewhere
    - Data is not of a single size
    - Data is changeable
    - Storing elsewhere allows more flexibility

Physical File Representation
  [Figure labels: name(s), file, i-node, file attributes, data locations, data blocks]

Ext2 i-node
  [Figure: an ext2 i-node with 12 data block locations followed by three index block locations]

A Major Design Assumption
  - File size distribution
  [Plot: number of files vs. file size, with marks at 22 KB and 64 KB]

Pros/Cons of i-node Design
  - Pro: faster accesses for small files (which are also accessed more frequently)
  - Pro: no external fragmentation
  - Con: internal fragmentation
  - Con: limited maximum file size

Directories
  - A directory is a special type of file
    - Instead of normal data, it contains "pointers" to other files
  - Directories are hooked together to create the hierarchical namespace

Ext2 Directory Representation
  [Figure: a directory's i-node and its data block; the entries file1 and file2 each hold an i-node number that gives the location of that file's i-node]

Links
  - Multiple different names for the same file
  - A hard link: a second name that
    points to the same file
  - A symbolic link: a special file that directs name translation to take another path

Hard Link Diagram
  [Figure: the directory entries file1 and file2 both hold the i-node number of file1's i-node]

Implications of Hard Links
  - Multiple indistinguishable pathnames for the same file
  - Need to keep a link count with the file for garbage collection
  - "Remove" sometimes only removes a name
  - Rather odd and unexpected semantics

Symbolic Link Diagram
  [Figure: the directory entries file1 and file2 hold the i-node numbers of separate i-nodes; file2's i-node refers to the name file1]

Implications of Symbolic Links
  - If the file at the other end of the link is removed, the link is left dangling
  - Only one true pathname per file
  - Just a mechanism to redirect pathname translation
  - Fewer system complications

Disk Hardware in Brief
  - One or more rotating disk platters
  - One disk head per platter; the heads typically move together, with one head activated at a time
  - Disk arm
  [Figure labels: track, sector, cylinder]

Modern Disk Complexities
  - Zone-bit recording
    - More sectors near outer tracks
  - Track skews
    - Track starting positions are not aligned
    - Optimize sequential transfers across multiple tracks
  - Thermal calibrations

Laying Out Files on Disks
  - Consider a long sequential file
  - And a disk divided into sectors with 1 KB blocks
  - Where should you put the bytes?

File Layout Methods
  - Contiguous allocation
  - Threaded allocation
  - Segment-based (variable-sized, extent-based) allocation
  - Indexed (fixed-sized, extent-based) allocation
  - Multi-level indexed allocation
  - Inverted (hashed) allocation

Contiguous Allocation
  - Pro: fast sequential access
  - Pro: easy to compute random offsets
  - Con: external fragmentation

Threaded Allocation
  - Example: FAT
  - Pro: easy to grow files
  - Con: internal fragmentation
  - Con: not good for random accesses
  - Con: unreliable

Segment-Based Allocation
  - A number of contiguous
    regions of blocks
  - Pro: combines strengths of contiguous and threaded allocations
  - Con: internal fragmentation
  - Con: random accesses are not as fast as contiguous allocation

Segment-Based Allocation
  [Figure: the i-node holds a segment list location; the segment list holds a begin block location and an end block location for each segment]

Indexed Allocation
  - Pro: fast random accesses
  - Con: internal fragmentation
  - Con: complexity in growing/shrinking indices
  [Figure: an i-node holding data block locations]

Multi-level Indexed Allocation
  - UNIX, ext2
  - Pro: easy to grow indices
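
To make the multi-level indexed layout concrete, here is a minimal sketch of how a logical block number could be mapped to a path through an ext2-style i-node. It assumes the 12 direct block locations and three index block locations from the ext2 i-node figure, the 1 KB block size used in the layout discussion, and 4-byte block locations; the pointer size and the names `locate_block` and `max_file_blocks` are assumptions made for illustration, not taken from the slides or from the ext2 source.

```python
DIRECT_POINTERS = 12  # direct data block locations kept in the i-node


def locate_block(logical_block, block_size=1024, pointer_size=4):
    """Return (level, indices) for a logical block number.

    level 0 means a direct location in the i-node; levels 1 to 3 mean the
    single, double, and triple index blocks. indices lists the slot to
    follow at each step, from the top index block down to the data block.
    """
    if logical_block < 0:
        raise ValueError("logical block number must be non-negative")
    per_index = block_size // pointer_size  # locations held by one index block
    if logical_block < DIRECT_POINTERS:
        return 0, [logical_block]
    remaining = logical_block - DIRECT_POINTERS
    for level in (1, 2, 3):
        span = per_index ** level  # data blocks reachable through this level
        if remaining < span:
            indices = []
            for depth in range(level - 1, -1, -1):
                indices.append(remaining // per_index ** depth)
                remaining %= per_index ** depth
            return level, indices
        remaining -= span
    raise ValueError("block number exceeds the maximum file size")


def max_file_blocks(block_size=1024, pointer_size=4):
    """Total data blocks addressable under the assumed sizes."""
    per_index = block_size // pointer_size
    return DIRECT_POINTERS + per_index + per_index ** 2 + per_index ** 3
```

Under these assumed sizes an index block holds 256 locations, so `locate_block(11)` stays in the i-node, `locate_block(12)` is the first block that needs the single index block, and `max_file_blocks()` is 16,843,020 blocks (about 16 GB), which illustrates the limited maximum file size noted for the i-node design.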
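
The trade-offs listed for threaded allocation can also be sketched in a few lines. The example below models a FAT-like table as a Python dictionary that maps each block number to the next block of the same file. The marker `END_OF_CHAIN`, the function names, and the block numbers are hypothetical and chosen only for illustration; this is not the on-disk FAT format.

```python
END_OF_CHAIN = -1  # hypothetical marker for the last block of a file


def nth_block(table, first_block, n):
    """Return (block, hops) for the n-th block of a file, counting from 0.

    Reaching block n means following n links from the first block, which is
    why random access is slow compared with contiguous allocation.
    """
    block = first_block
    hops = 0
    for _ in range(n):
        block = table[block]
        hops += 1
        if block == END_OF_CHAIN:
            raise IndexError("offset is past the end of the file")
    return block, hops


def append_block(table, first_block, free_block):
    """Grow a file by linking a free block at the end of its chain."""
    block = first_block
    while table[block] != END_OF_CHAIN:
        block = table[block]
    table[block] = free_block
    table[free_block] = END_OF_CHAIN


def contiguous_nth_block(first_block, n):
    """With contiguous allocation the same lookup is a single addition."""
    return first_block + n
```

For a file stored in blocks 7, 2, and 9, `nth_block({7: 2, 2: 9, 9: END_OF_CHAIN}, 7, 2)` returns block 9 after two hops, whereas the contiguous lookup needs no table at all. Growing the file only changes two table entries, which matches the "easy to grow files" point. A single damaged entry cuts off the rest of the chain, which is one plausible reading of "unreliable".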