FSU COP 5611 - Lecture 3 FFS LFS and RAID - D2446164

Home> Schools> Florida State University> Computer Science (COP) > COP 5611> Lecture 3 FFS LFS and RAID

DOC PREVIEW

FSU COP 5611 - Lecture 3 FFS LFS and RAID

School name Florida State University

Course Cop 5611- Advanced Operating Systems (3).

Pages 66

This preview shows page 1-2-3-4-31-32-33-34-35-63-64-65-66 out of 66 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 66 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

FFS LFS and RAID Andy Wang COP 5611 Advanced Operating Systems UNIX Fast File System Designed to improve performance of UNIX file I O Two major areas of performance improvement Bigger block sizes Better on disk layout for files Block Size Improvement Quadrupling of block size quadrupled amount of data gotten per disk fetch But could lead to fragmentation problems So fragments introduced Small files stored in fragments Fragments addressable but not independently fetchable Disk Layout Improvements Aimed toward avoiding disk seeks Bad if finding related files takes many seeks Very bad if find all the blocks of a single file requires seeks Spatial locality keep related things close together on disk Cylinder Groups A cylinder group a set of consecutive disk cylinders in the FFS Files in the same directory stored in the same cylinder group Within a cylinder group tries to keep things contiguous But must not let a cylinder group fill up Locations for New Directories Put new directory in relatively empty cylinder group What is empty Many free i nodes Few directories already there The Importance of Free Space FFS must not run too close to capacity No room for new files Layout policies ineffective when too few free blocks Typically FFS needs 10 of the total blocks free to perform well Performance of FFS 4 to 15 times the bandwidth of old UNIX file system Depending on size of disk blocks Performance on original file system Limited by CPU speed Due to memory to memory buffer copies FFS Not the Ultimate Solution Based on technology of the early 80s And file usage patterns of those times In modern systems FFS achieves only 5 of raw disk bandwidth The Log Structured File System Good large buffer caches can catch almost all reads But most writes have to go to disk So file system performance can be limited by writes So produce a FS that writes quickly Like an append only log Basic LFS Architecture Buffer writes send them sequentially to disk Data blocks Attributes Directories And almost everything else Converts small sync writes to large async writes A Simple Log Disk Structure File A File Z File M File A File F File A File L File L Block Block Block Block Block Block Block Block 7 1 202 3 1 7 26 25 Head of Log Key Issues in Log Based Architecture 1 Retrieving information from the log No matter how well you cache sooner or later you have to read 2 Managing free space on the disk You need contiguous space to write in the long run how do you get more Finding Data in the Log File A File Z File M File A File F File A File L File L Block Block Block Block Block Block Block Block 7 1 202 3 1 7 26 25 Give me block 25 of file L Or Give me block 1 of file F Retrieving Information From the Log Must avoid sequential scans of disk to read files Solution store index structures in log Index is essentially the most recent version of the i node Finding Data in the Log Foo Block1 old Foo Foo Block3 Block2 Foo Block 1 How do you find all blocks of file Foo Finding Data in the Log with an I node Foo Block1 old Foo Foo Block3 Block2 Foo Block 1 How Do You Find a File s I node You could search sequentially LFS optimizes by writing i node maps to the log The i node map points to the most recent version of each i node A file system s i nodes cover multiple blocks of i node map How Do You Find the Inode The Inode Map How Do You Find Inode Maps Use a fixed region on the disk that always points to the most recent i node map blocks But cache i node maps in main memory Small enough that few disk accesses required to find i node maps Finding I node Maps An old i node map New i node maps Reclaiming Space in the Log Eventually the log reaches the end of the disk partition So LFS must reuse disk space like superseded data blocks Space can be reclaimed in background or when needed Goal is to maintain large free extents on disk Example of Need for Reuse Head of log New data to be logged Major Alternatives for Reusing Log Threading Fast Fragmentation Slower reads Head of log New data to be logged Major Alternatives for Reusing Log Copying Simple Avoids fragmentation Expensive New data to be logged LFS Space Reclamation Strategy Combination of copying and threading Copy to free large fixed size segments Thread free segments together Try to collect long lived data permanently into segments A Threaded Segmented Log Head of log Cleaning a Segment 1 Read several segments into memory 2 Identify the live blocks 3 Write live data back hopefully into a smaller number of segments Identifying Live Blocks Clearly not feasible to track down live blocks of all files Instead each segment maintains a segment summary block Identifying what is in each block Crosscheck blocks with owning i node s block pointers Written at end of log write for low overhead Segment Cleaning Policies What are some important questions When do you clean segments How many segments to clean Which segments to clean How to group blocks in their new segments When to Clean Periodically Continuously During off hours When disk is nearly full On demand LFS uses a threshold system How Many Segments to Clean The more cleaned at once the better the reorganization of the disk But the higher the cost of cleaning LFS cleans a few tens at a time Till disk drops below threshold value Empirically LFS not very sensitive to this factor Which Segments to Clean Cleaning segments with lots of dead data gives great benefit Some segments are hot some segments are cold But cold free space is more valuable than hot free space Since cold blocks tend to stay cold Cost Benefit Analysis u utilization A age Benefit to cost u A u 1 Clean cold segments with some space hot segments with a lot of space What to Put Where Given a set of live blocks and some cleaned segments which goes where Order blocks by age Write them to segments oldest first Goal is very cold highly utilized segments Goal of LFS Cleaning number of segments empty number of segments 100 full empty 100 full Performance of LFS On modified Andrew benchmark 20 faster than FFS LFS can create and delete 8 times as many files per second as FFS LFS can read 1 times as many small files LFS slower than FFS at sequential reads of randomly written files Logical Locality vs Temporal Locality Logical locality spatial locality Normal file systems keep a file s data blocks close together Temporal locality LFS keeps data written at the same time close together When temporal locality logical locality Systems perform the same Major Innovations of LFS

View Full Document