Slide 1Slide 2Slide 3Slide 4Slide 5Slide 6Slide 7Slide 8Slide 9Slide 10Slide 11Slide 12Slide 13Slide 14Slide 15Slide 16Slide 17Slide 18Slide 19Slide 20Slide 21Slide 22Slide 23Slide 24Slide 25Slide 26Slide 27Slide 28Slide 29Slide 30Slide 31Slide 32Slide 33Slide 34Slide 35Slide 36Slide 37Slide 38Slide 39Slide 40Slide 41Slide 42Slide 43Slide 4415-410, S’04- 1 -File System (Internals)Mar. 31, 2004Dave EckhardtDave EckhardtBruce MaggsBruce MaggsL26_Filesystem15-410“...Does this look familiar?...”15-410, S’04- 1 -SynchronizationProject 3 statusProject 3 statusSeveral groups skipped Checkpoint 3 (the easiest one!)Not everybody took advantage of opportunity to planSeveral groups seem on track to finish earlySeveral groups dangerously close to the “90% problem”First 90% of the work takes the first 90% of the timeLast 10% of the work takes the second 90% of the timeWe want everybody to finish!We want everybody to finish!Project 3 is the core experience of the classCan't bury it and move on!15-410, S’04- 1 -SynchronizationProject 3 / Project 4 “hurdle” test suiteProject 3 / Project 4 “hurdle” test suiteReleased this weekTwo sectionsBasic tests, solidity testsAt P3 deadline, you will run the testsAt P3 deadline, you will run the testsGoal: pass ~80% of each sectionRegister to begin Project 4 (some P3 extensions)Not passing the hurdle?Not passing the hurdle?Extra week to work on P3Cannot submit P4, grade will be 0%15-410, S’04- 1 -SynchronizationTodayTodayChapter 12 (not: Log-structured, NFS)15-410, S’04- 1 -OutlineFile system code layers (abstract)File system code layers (abstract)Disk, memory structuresDisk, memory structuresUnix “VFS” layering indirectionUnix “VFS” layering indirectionDirectoriesDirectoriesBlock allocation strategies, free spaceBlock allocation strategies, free spaceCache tricksCache tricksRecovery, backupsRecovery, backups15-410, S’04- 1 -File System LayersDevice driversDevice driversread/write(disk, start-sector, count)Block I/OBlock I/Oread/write(partition, block) [cached]File I/OFile I/Oread/write (file, block)File systemFile systemmanage directories, free space15-410, S’04- 1 -File System LayersMulti-filesystem namespaceMulti-filesystem namespacePartitioning, names for devicesMountingUnifying multiple file system typesUFS, ext2fs, ext3fs, reiserfs, FAT, 9660, ...15-410, S’04- 1 -Shredding DisksSplit disk into Split disk into partitionspartitions/slices/minidisks/.../slices/minidisks/...–PC: 4 “partitions” – Windows, FreeBSD, Plan 9–Mac: “volumes” – OS 9, OS X, system vs. user dataOr: glue disks together into Or: glue disks together into volumesvolumes/logical disks/logical disksPartition may contain...Partition may contain...–Paging area●Indexed by in-memory structures●“random garbage” when OS shuts down–File system●Block allocation: file # block list●Directory: name file #15-410, S’04- 1 -Disk StructuresBoot area (first block/track/cylinder)Boot area (first block/track/cylinder)Interpreted by hardware bootstrap (“BIOS”)May include partition tableFile system control blockFile system control blockKey parameters: #blocks, metadata layoutUnix: “superblock”““File control block” (Unix: “inode”)File control block” (Unix: “inode”)ownership/permissionsdata location15-410, S’04- 1 -Memory StructuresIn-memory partition tablesIn-memory partition tablesSanity check file system I/O in correct partitionCached directory informationCached directory informationSystem-wide open-file tableSystem-wide open-file tableIn-memory file control blocksProcess open-file tablesProcess open-file tablesOpen mode (read/write/append/...)“Cursor” (read/write position)15-410, S’04- 1 -VFS layerGoalGoalAllow one machine to use multiple file system typesUnix FFSMS-DOS FATCD-ROM ISO9660Remote/distributed: NFS/AFSStandard system calls should work transparentlySolutionSolutionInsert a level of indirection!15-410, S’04- 1 -Single File Systemn = read(fd, buf, size)INT 54sys_read(fd, buf, len)rdblk(dev, N)sleep() wakeup()namei() iget() iput()startIDE() IDEintr()15-410, S’04- 1 -VFS “Virtualization”n = read(fd, buf, size)INT 54iget() iput()vfs_read()ufs_read() procfs_read()namei() procfs_domem()15-410, S’04- 1 -VFS layer – file system operationsstruct vfsops { char *name; int (*vfs_mount)(); int (*vfs_statfs)(); int (*vfs_vget)(); int (*vfs_unmount)(); ...}15-410, S’04- 1 -VFS layer – file operationsEach VFS provides an array of methodsEach VFS provides an array of methodsVOP_LOOKUP(vnode, new_vnode, name)VOP_CREATE(vnode, new_vnode, name, attributes)VOP_OPEN(vnode, mode, credentials, process)VOP_READ(vnode, uio, readwrite, credentials)Operating system provides fs-independent codeOperating system provides fs-independent codeValidating system call parametersMoving data from/to user memoryThread sleep/wakeupCaches (data blocks, name inode mappings)15-410, S’04- 1 -DirectoriesExternal interfaceExternal interfacevnode2 = lookup(vnode1, name)Traditional Unix FFS directoriesTraditional Unix FFS directoriesList of (name,inode #) - not sorted!Names are variable-lengthLookup is linearHow long does it take to delete N files?Common alternative: hash-table directoriesCommon alternative: hash-table directories15-410, S’04- 1 -Allocation / MappingAllocation problemAllocation problemWhere do I put the next block of this file?Near the previous block?Mapping problemMapping problemWhere is block 32 of this file?Similar to virtual memoryMultiple large “address spaces” specific to each fileOnly one underlying “address space” of blocksSource address space may be sparse!15-410, S’04- 1 -Allocation – ContiguousApproachApproachFile location defined as (start, length)MotivationMotivationSequential disk accesses are cheapBookkeeping is easyIssuesIssuesDynamic storage allocation (fragmentation, compaction)Must pre-declare file size at creation15-410, S’04- 1 -Allocation – LinkedApproachApproachFile location defined as (start)Each disk block contains pointer to nextMotivationMotivationAvoid fragmentation problemsAllow file growthIssues?Issues?15-410, S’04- 1 -Allocation – LinkedIssuesIssues508-byte blocks don't match memory pagesIn general, one seek per block
View Full Document