Slide 1Slide 2Slide 3Slide 4Slide 5Slide 6Slide 7Slide 8Slide 9Slide 10Slide 11Slide 12Slide 13Slide 14Slide 15Slide 16Slide 17Slide 18Slide 19Slide 20Slide 21Slide 22Slide 23Slide 24Slide 25Slide 26Slide 27Slide 28Slide 29Slide 30Slide 31Slide 32Slide 33Slide 34Slide 35Slide 36Slide 37Slide 38Slide 39Slide 40Slide 41Slide 42Slide 43Slide 44Slide 45Slide 46Slide 4715-410, F’04- 1 -File System (Internals)Nov. 12, 2004Dave EckhardtDave EckhardtBruce MaggsBruce MaggsL28_Filesystem15-410“...Does this look familiar?...”15-410, F’04- 1 -SynchronizationProject 3 statusProject 3 statusMany groups took advantage of opportunity to planSeveral groups seem on track to finish earlyWatch out for the “90% problem”First 90% of the work takes the first 90% of the timeLast 10% of the work takes the second 90% of the timeWe want everybody to finish!We want everybody to finish!Project 3 is the core experience of the classCan't bury it and move on!15-410, F’04- 1 -SynchronizationProject 3 / Project 4 “hurdle” test suiteProject 3 / Project 4 “hurdle” test suiteReleased later todayFour sectionsShell, basic tests, solidity tests, stability testsAt P3 deadline, you will run the hurdle testsAt P3 deadline, you will run the hurdle testsGoal: pass ~80% of each section (details on web page)Register to begin Project 4 (some P3 extensions)Not passing the hurdle?Not passing the hurdle?Extra week to work on P3Cannot submit P4, grade will be 0%15-410, F’04- 1 -SynchronizationAgonizing over “close” probably doesn't make senseAgonizing over “close” probably doesn't make senseEither you have completed the kernel...Pass basically all testsCode quality is at least “good”Preemptibility & concurrency control are reasonableYou feel confident»(except maybe a little unsure about thread support)Or you probably haven'tGrading test suite much larger than hurdle suite»So “almost passing” could be hiding a lot of problems“Almost passing the hurdle” plus two “I hope they don't read this part” issues probably isn't passing15-410, F’04- 1 -SynchronizationTodayTodayChapter 12 (not: Log-structured, NFS)15-410, F’04- 1 -OutlineFile system code layers (abstract)File system code layers (abstract)Disk, memory structuresDisk, memory structuresUnix “VFS” layering indirectionUnix “VFS” layering indirectionDirectoriesDirectoriesBlock allocation strategies, free spaceBlock allocation strategies, free spaceCache tricksCache tricksRecovery, backupsRecovery, backups15-410, F’04- 1 -File System LayersDevice driversDevice driversread/write(disk, start-sector, count)Block I/OBlock I/Oread/write(partition, block) [cached]File I/OFile I/Oread/write (file, block)File systemFile systemmanage directories, free space15-410, F’04- 1 -File System LayersMulti-filesystem namespaceMulti-filesystem namespacePartitioning, names for devicesMountingUnifying multiple file system typesUFS, ext2fs, ext3fs, reiserfs, FAT, 9660, ...15-410, F’04- 1 -Shredding DisksSplit disk into Split disk into partitionspartitions/slices/minidisks/.../slices/minidisks/...–PC: 4 “partitions” – Windows, FreeBSD, Plan 9–Mac: “volumes” – OS 9, OS X, system vs. user dataOr: glue disks together into Or: glue disks together into volumesvolumes/logical disks/logical disksPartition may contain...Partition may contain...–Paging area●Indexed by in-memory structures●“random garbage” when OS shuts down–File system●Block allocation: file # block list●Directory: name file #15-410, F’04- 1 -Shredding Disks# fdisk -s/dev/ad0: 993 cyl 128 hd 63 secPart Start Size Type Flags 1: 63 1233729 0x06 0x00 2: 1233792 6773760 0xa5 0x8015-410, F’04- 1 -Shredding Disks8 partitions:# size offset fstype [fsize bsize bps/cpg] a: 131072 0 4.2BSD 2048 16384 101 # (Cyl. 0 - 16*) b: 393216 131072 swap # (Cyl. 16*- 65*) c: 6773760 0 unused 0 0 # (Cyl. 0 - 839) e: 65536 524288 4.2BSD 2048 16384 104 # (Cyl. 65*- 73*) f: 6183936 589824 4.2BSD 2048 16384 89 # (Cyl. 73*- 839*)Filesystem 1K-blocks Used Avail Capacity Mounted on/dev/ad0s2a 64462 55928 3378 94% //dev/ad0s2f 3043806 2608458 191844 93% /usr/dev/ad0s2e 32206 7496 22134 25% /varprocfs 4 4 0 100% /proc(FreeBSD 4.7 on ThinkPad 560X)(FreeBSD 4.7 on ThinkPad 560X)15-410, F’04- 1 -Disk StructuresBoot area (first block/track/cylinder)Boot area (first block/track/cylinder)Interpreted by hardware bootstrap (“BIOS”)May include partition tableFile system control blockFile system control blockKey parameters: #blocks, metadata layoutUnix: “superblock”““File control block” (Unix: “inode”)File control block” (Unix: “inode”)ownership/permissionsdata location15-410, F’04- 1 -Memory StructuresIn-memory partition tablesIn-memory partition tablesSanity check file system I/O in correct partitionCached directory informationCached directory informationSystem-wide open-file tableSystem-wide open-file tableIn-memory file control blocksProcess open-file tablesProcess open-file tablesOpen mode (read/write/append/...)“Cursor” (read/write position)15-410, F’04- 1 -VFS layerGoalGoalAllow one machine to use multiple file system typesUnix FFSMS-DOS FATCD-ROM ISO9660Remote/distributed: NFS/AFSStandard system calls should work transparentlySolutionSolutionInsert a level of indirection!15-410, F’04- 1 -Single File Systemn = read(fd, buf, size)INT 54sys_read(fd, buf, len)rdblk(dev, N)sleep() wakeup()namei() iget() iput()startIDE() IDEintr()15-410, F’04- 1 -VFS “Virtualization”n = read(fd, buf, size)INT 54iget() iput()vfs_read()ufs_read() procfs_read()namei() procfs_domem()15-410, F’04- 1 -VFS layer – file system operationsstruct vfsops { char *name; int (*vfs_mount)(); int (*vfs_statfs)(); int (*vfs_vget)(); int (*vfs_unmount)(); ...}15-410, F’04- 1 -VFS layer – file operationsEach VFS provides an array of methodsEach VFS provides an array of methodsVOP_LOOKUP(vnode, new_vnode, name)VOP_CREATE(vnode, new_vnode, name, attributes)VOP_OPEN(vnode, mode, credentials, process)VOP_READ(vnode, uio, readwrite, credentials)Operating system provides fs-independent codeOperating system
View Full Document