Filesystems – Metadata, Paths, & Caching
Vivek Pai
Princeton University

Diskgedanken
- Assuming you back up and restore files, what factors affect the time involved?
- How are these factors changing?
- What issues affect the rates of change?
- How is total backup time changing over the years?
- What is Occam's razor?

Today's Overview
- Quiz recap
- Finish up metadata and reliability
- A little discussion of mounting, etc.
- Move on to performance

Quiz 1 Observations
- I'm disappointed
- Quizzes not yet graded, but...
- Most people did poorly on question 1
- Lots of dimensional analysis
- Lots of sleepers, chatting, weird faces
- Very little (too little) feedback in general
- Open question – looking for a methodical approach

Occam's Razor
- From William of Occam (philosopher): "entities should not be multiplied unnecessarily"
- Often reduced to other statements:
  - "One should not increase, beyond what is necessary, the number of entities required to explain anything"
  - "Make as few assumptions as possible"
  - "Once you have eliminated all other possible explanations, what remains must be the answer"

A Reasonable Approach
- Disk size: 40 GB (20-80 GB common)
- File size: 10 KB (5-20 KB common)
- Access time: 10 ms (5-20 ms common)
- Assume 1 seek per file (reasonable)
- 100 files = 1 MB, each access 0.01 s, so roughly 1 MB/s
- So 40 GB at 1 MB/s = 40,000 s = 11+ hours (sanity-checked in the sketch after the Linked Files slide below)

Changes Over Time
- Disk density doubling each year
- Seek time dropping < 10% per year
- File size growing slowly
- Results:
  - The number of files grows faster than access time shrinks
  - Backup time increases

Most Common Answer
- Disk size / maximum transfer rate
- In other words, read sectors, not files
- Can this be done? Yes, if you have access to the "raw" disk
  - Which means you have "root" permission
  - And the system has raw disk support
- Faster than file-based dump/restore
- No concept of files, however
- What happens if you restore to a disk with a different geometry?

Linked Files (Alto)
- File header points to the 1st block on disk
- Each block points to the next
- Pros:
  - Can grow files dynamically
  - The free list is similar to a file
- Cons:
  - Random access is horrible
  - Unreliable: losing a block means losing the rest
[diagram: file header pointing to the first block; each block points to the next, ending in null]
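To make the cons concrete, here is a minimal C sketch of reading block n of a linked file. The LinkedBlock layout, read_block() helper, and NULL_BLOCK sentinel are hypothetical stand-ins, not the Alto's actual on-disk format; the point is the loop, which turns one logical read into n+1 disk reads.

    #include <stdint.h>

    #define BLOCK_SIZE 512
    #define NULL_BLOCK 0       /* hypothetical end-of-chain sentinel */

    /* Hypothetical on-disk layout: each block stores the number of
     * the next block in the file, followed by the file's data. */
    typedef struct {
        uint32_t next;         /* next block in the file, or NULL_BLOCK */
        uint8_t  data[BLOCK_SIZE - sizeof(uint32_t)];
    } LinkedBlock;

    /* Assume some lower layer that reads one physical block. */
    extern void read_block(uint32_t blockno, LinkedBlock *out);

    /* Read logical block n of the file whose first block is 'first'.
     * Returns 0 on success, -1 if the file has fewer than n+1 blocks.
     * Cost: n+1 disk reads -- this is why random access is horrible,
     * and one corrupted 'next' pointer loses everything after it. */
    int read_nth_block(uint32_t first, unsigned n, LinkedBlock *out)
    {
        uint32_t cur = first;
        for (unsigned i = 0; i < n; i++) {
            if (cur == NULL_BLOCK)
                return -1;             /* chain ended too early */
            read_block(cur, out);      /* fetch just to learn 'next' */
            cur = out->next;
        }
        if (cur == NULL_BLOCK)
            return -1;
        read_block(cur, out);          /* the block we actually wanted */
        return 0;
    }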
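Stepping back, the backup estimate from "A Reasonable Approach" above can be sanity-checked in a few lines of C. All of the figures are the slide's assumed values, not measurements:

    #include <stdio.h>

    int main(void)
    {
        double disk_bytes   = 40e9;    /* 40 GB disk   (assumed)     */
        double file_bytes   = 10e3;    /* 10 KB/file   (assumed)     */
        double seek_seconds = 0.010;   /* 10 ms/access, 1 seek/file  */

        double files   = disk_bytes / file_bytes;  /* 4 million files */
        double seconds = files * seek_seconds;     /* seek-bound time */

        printf("files: %.0f\n", files);
        printf("backup: %.0f s = %.1f hours\n", seconds, seconds / 3600);
        /* 4e6 * 0.01 s = 40,000 s, about 11.1 hours, i.e. ~1 MB/s */
        return 0;
    }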
Contiguous Allocation
- Request space in advance for the size of the file
- Search a bitmap or linked list to locate a free space
- File header holds the first sector in the file and the number of sectors
- Pros:
  - Fast sequential access
  - Easy random access
- Cons:
  - External fragmentation
  - Hard to grow files

Single-Level Indexed Files or Extent-Based Filesystems
- The user declares a max size
- The file header holds an array of pointers to disk blocks
- Pros:
  - Can grow, up to a limit
  - Random access is fast
- Cons:
  - Clumsy to grow beyond the limit
  - Periodic cleanup of new files
  - Up-front declaration is a real pain
[diagram: file header with an array of pointers to disk blocks]

File Allocation Table (FAT)
- Approach:
  - A section of each partition is reserved for the table
  - One FAT entry for each block
  - A file is a linked list of blocks
  - A directory entry points to the 1st block of the file
- Pros:
  - Simple
- Cons:
  - Always have to go through the FAT
  - Wastes space
[diagram: directory entry for "foo" pointing to its first block; FAT entries chain the file's blocks (217, 619, 399) to EOF]

Multi-Level Indexed Files (Unix)
- 13 pointers in a header:
  - 10 direct pointers
  - 11: 1-level indirect
  - 12: 2-level indirect
  - 13: 3-level indirect
- Pros & cons:
  - In favor of small files
  - Can grow
  - Limit is 16 GB, and deep access means lots of seeks
- What happens to reach block 23? Block 5? Block 340? (see the lookup sketch after the RAID slides below)
[diagram: inode pointers reaching data blocks directly and through 1-, 2-, and 3-level indirect blocks]

Reliability in Disk Systems
- Make sure certain actions have occurred before the function completes
  - Known as "synchronous" operation
  - Ex: make sure the new inode is on disk and the directory has been modified before declaring that file creation is complete
- Drawback: speed
  - Some ops are easily made asynchronous: access time
  - Some filesystems don't care: Linux ext2fs

Recovery After Failure
- Need to ensure consistency:
  - Does the free bitmap match a tree walk?
  - Do reference counts in inodes match directory entries?
  - Do blocks appear in multiple inodes?
- This kind of recovery grows with disk size
- Clean shutdown – mark it as such, and no recovery is needed

Reducing Synchronous Times
- Write to faster storage
  - Nonvolatile memory – expensive, requires some additional OS/firmware support
- Write to a special disk or section – logging
  - Only have to examine the log when recovering
  - Eventually have to put the information in place
  - Some information dies in the log itself
- Write in a special order
  - Write metadata in a way that is consistent but possibly recovers less

Challenges
- The Unix filesystem has great flexibility
- Extent-based filesystems have speed
- Seeks kill performance – locality matters
- Bitmaps show contiguous free space
- Linked lists are easy to search
- How do you perform backup/restore?

Bigger, Faster, Stronger
- Making individual disks larger is hard
- Throw more disks at the problem:
  - Capacity increases
  - Effective access speed may increase
  - Probability of failure also increases
- Use some disks to provide redundancy
- Generally assume a fail-stop model
  - Fail-stop versus Byzantine failures

RAID (Redundant Array of Inexpensive Disks)
- Main idea:
  - Store the error-correcting codes on other disks
  - General error-correcting codes are too powerful; use XORs, i.e., single parity (sketched below)
  - Upon any failure, one can recover the entire block from the remaining disks using XORs
- Pros:
  - Reliability
  - High bandwidth
- Cons:
  - The controller is complex
[diagram: RAID controller computing XOR parity across the disks]

Synopsis of RAID Levels
- RAID Level 0: Non-redundant (JBOD)
- RAID Level 1: Mirroring
- RAID Level 2: Byte-interleaved, ECC
- RAID Level 3: Byte-interleaved, parity
- RAID Level 4: Block-interleaved, parity
- RAID Level 5: Block-interleaved, distributed parity

Did RAID Work?
- Performance: yes
- Reliability:
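Returning to the question on the multi-level indexed slide above: a small C sketch of mapping a logical block number onto the 13 inode pointers. The geometry is an assumption for illustration (512-byte blocks with 4-byte pointers, so 128 pointers per indirect block; the slide gives no numbers), and 23, 5, and 340 are read as logical block numbers:

    #include <stdio.h>

    /* Assumed geometry: 512-byte blocks, 4-byte block pointers,
     * so 128 pointers fit in one indirect block. */
    #define NDIRECT   10u
    #define PER_BLOCK 128u

    /* How many levels of indirection (0-3) does logical block 'n'
     * need? Returns -1 if n is beyond the triple-indirect limit. */
    int indirection_level(unsigned long n)
    {
        unsigned long single = PER_BLOCK;
        unsigned long dbl    = single * PER_BLOCK;
        unsigned long triple = dbl * PER_BLOCK;

        if (n < NDIRECT) return 0;    /* direct pointer in the inode */
        n -= NDIRECT;
        if (n < single)  return 1;    /* one extra indirect block    */
        n -= single;
        if (n < dbl)     return 2;    /* two extra indirect blocks   */
        n -= dbl;
        if (n < triple)  return 3;    /* three extra indirect blocks */
        return -1;                    /* file too big for this inode */
    }

    int main(void)
    {
        unsigned long probes[] = { 5, 23, 340 };
        for (int i = 0; i < 3; i++)
            printf("block %lu: %d extra read(s)\n",
                   probes[i], indirection_level(probes[i]));
        /* 5 -> 0 (direct), 23 -> 1 (single), 340 -> 2 (double):
         * each indirection level costs one more disk read and seek. */
        return 0;
    }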
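Finally, the XOR trick from the RAID slide as a short sketch; the 4-data-disk array and 512-byte block size are illustrative assumptions. The parity block is the byte-wise XOR of the data blocks, so any single lost block equals the XOR of the parity with all the survivors:

    #include <stdint.h>
    #include <string.h>

    #define NDATA      4        /* data disks (illustrative) */
    #define BLOCK_SIZE 512

    /* Compute the parity block: byte-wise XOR of the NDATA data blocks. */
    void compute_parity(const uint8_t data[NDATA][BLOCK_SIZE],
                        uint8_t parity[BLOCK_SIZE])
    {
        memset(parity, 0, BLOCK_SIZE);
        for (int d = 0; d < NDATA; d++)
            for (int b = 0; b < BLOCK_SIZE; b++)
                parity[b] ^= data[d][b];
    }

    /* Rebuild the block of failed disk 'lost' by XOR-ing the parity
     * block with every surviving data block: x ^ x = 0, so only the
     * lost block's contribution remains. */
    void reconstruct(const uint8_t data[NDATA][BLOCK_SIZE],
                     const uint8_t parity[BLOCK_SIZE],
                     int lost, uint8_t out[BLOCK_SIZE])
    {
        memcpy(out, parity, BLOCK_SIZE);
        for (int d = 0; d < NDATA; d++)
            if (d != lost)
                for (int b = 0; b < BLOCK_SIZE; b++)
                    out[b] ^= data[d][b];
    }

Note that a small write must also update parity (new parity = old parity XOR old data XOR new data, so the controller reads two blocks and writes two), which is one reason the slide calls the controller complex.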