UT CS 372 - Lecture 19 File system – data layout, naming - D467972


School name University of Texas at Austin

Course Cs 372- Introduction to Operating Systems

Pages 13


1.1 Case study: sector layout
2. Technology trends
3. Data layout on disk
4. Disk management mechanisms
4.1 contiguous allocation
4.2 Linked files
4.3 FAT (MS-DOS, Windows9x, OS2)
4.4 Indexed files (VMS)
4.5 Multilevel index (Unix 4.1)
4.6 DEMOS
4.7 UNIX BSD 4.2
4.8 NTFS
5. Policy v. mechanism

CS 372: Operating Systems
Mike Dahlin
Lecture# 19: File system – data layout, naming

*********************************
Review -- 1 min
*********************************
Intro to I/O
Performance model: Log
Disk physical characteristics/desired abstractions

Physical reality                          Desired abstraction
disks are slow                            fast access to data
sector addresses                          named files, directories
  (“platter 2, cylinder 42, sector 15”)
write 1 sector at a time                  atomic writes, transactions

1.1 Case study: sector layout
What is the fastest way to lay out a sequential file on disk?

answer 1: a series of sequential sectors on a track
problem (in old systems):
  read sector 1
  process sector 1
  read sector 2 -- whoops, sector 2 is already past
  wait 1 rotation
  read sector 2
  …
-> N rotations to read N blocks
-> BW for sequential read is 512 bytes/rotation = 100KB/s

answer 2: (in old systems) skip 1 sector (or 2 sectors) between sequential blocks
-> 2 rotations to read N blocks

answer 3: (modern systems) track buffer -- on-disk cache
  read entire track into track buffer
  in parallel (once sector 1 arrives…) read sector 1 (from track buffer)
  then read sector 2
  …
-> 1 rotation to read N blocks

Moral: OS designer needs to understand physical properties of disk

Latency, overhead, bandwidth:
From disk
-- what is overhead for a 1-sector read?
-- what is latency for a 1-sector read?
-- what is bandwidth term for a 1-sector read?
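The three answers in the sector-layout case study can be compared with a little arithmetic. The sketch below is only an illustration of the rotation counts stated in the notes: the function names, the layout labels, and the default of 200 rotations per second are assumptions made here (200 is chosen because 512 bytes/rotation then works out to roughly the 100KB/s figure quoted above). It models reading one full track of several sectors.

```python
def rotations_to_read_track(n_sectors: int, layout: str) -> int:
    """Rotations needed to read one track of n_sectors, per the three cases in the notes.

    Layout labels are hypothetical names used only in this sketch.
    """
    if n_sectors < 2:
        raise ValueError("model assumes a track with at least 2 sectors")
    if layout == "sequential_no_buffer":
        # Each next sector slips past while the previous one is processed,
        # so every sector costs a full rotation: N rotations for N blocks.
        return n_sectors
    if layout == "interleaved":
        # Skipping 1 sector between logically adjacent blocks means the
        # whole track is picked up in two passes.
        return 2
    if layout == "track_buffer":
        # The drive buffers the entire track in a single pass.
        return 1
    raise ValueError(f"unknown layout: {layout}")


def bandwidth_bytes_per_sec(n_sectors: int, layout: str,
                            sector_bytes: int = 512,
                            rotations_per_sec: float = 200.0) -> float:
    """Sequential-read bandwidth for one track under the simple model above.

    rotations_per_sec is an assumed value, not a figure from the notes.
    """
    rotations = rotations_to_read_track(n_sectors, layout)
    return n_sectors * sector_bytes * rotations_per_sec / rotations


if __name__ == "__main__":
    for layout in ("sequential_no_buffer", "interleaved", "track_buffer"):
        print(layout, bandwidth_bytes_per_sec(32, layout))
```

Under these assumptions the unbuffered sequential layout delivers one sector per rotation regardless of track size, while the track buffer delivers the whole track per rotation, which is the gap the case study is pointing at.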
From CPU/memory system
-> what is overhead for a 1-sector read?
-> what is latency for a 1-sector read?
-> what is BW term for a 1-sector read?

Be careful: What is end-to-end average bandwidth for a 1-sector read?
(people phrase this question to mean end-to-end bytes/sec including latency and overhead)

2. Technology trends
1. Disks getting smaller for similar capacity
   smaller -> disk spins faster (less rotational delay, higher BW)
   smaller -> less distance for head to travel (faster seeks)
   smaller -> lighter weight (for portables)
2. Disk data getting denser (more bits/square inch; allows smaller disks w/o sacrificing capacity)
   Tracks closer together -> faster seeks
3. Disks getting cheaper (2x/year since 1991)
4. Disks getting (a little) faster
   seek, rotation – 5-10%/year (2-3x per decade)
   bandwidth – 20-30%/year (~10x per decade)

Overall – disk density ($/byte) improving much faster than mechanical limitations (seek, rotation)

Key to improving density: get head close to surface
Heads are spring loaded, aerodynamically designed to fly as close to surface as possible (also, lightweight to allow for faster seeks)

What happens if head contacts surface?
Head crash – scrapes off magnetic material (and data)

*********************************
Outline - 1 min
*********************************
Data layout -- given a file header, find the file’s blocks
mechanism v. policy

*********************************
Preview - 1 min
*********************************
File systems
• Performance -- data layout
• Performance/persistence -- naming
• Reliability -- transactions
Networks
Security

*********************************
Lecture - 20 min
*********************************
3. Data layout on disk
2 driving forces
1) technology: avoid seeks, rotation (last time)
2) workloads: How do users access files?
   1. Sequential access – bytes read in order (give me the next X bytes, then give me the next)
   2. Random access – read/write elements out of middle of array (give me bytes j-k)

How are files typically used?
1. Most files are small (e.g. .login, .c files)
2. Large files use up most of the disk space
3. Large files account for most of the bytes transferred to/from disk

Bad news: need everything to be efficient
• Need small files to be efficient since there are lots of them
• Need large files to be efficient, b/c most of the disk space and most of the I/O are due to them

4. Disk management mechanisms
How do we organize files on disk?
recall – seeks are slow; for good bandwidth, lay data out on disk sequentially

2 tasks
(1) find ith block of a file easily
(2) quickly access ith block of file

common data structures
file header – one per file; records which disk sectors are associated with each file
-> Head of linked list, array, root of tree
-> find ith block of file

What about performance?
Separate mechanism from policy – once I can find where ith block of file is no matter where it is, then I have freedom to place any block anywhere
-> policy choice to lay data out sequentially when possible.

To support such policies:
free space (bitmap) – 1 bit per block or sector; blocks numbered in cylinder-major order, so that adjacent numbered blocks can be accessed without seeks or rotational delay

Other aspect of performance
caching – every OS today keeps a cache of recently used disk blocks in memory to avoid having to go to disk. Common to all organizations. For now, assume no cache; add it later.
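The free-space bitmap mentioned above can be sketched in a few lines. This is a minimal illustration, not the layout of any particular file system: the class name `FreeBitmap`, its methods, and the use of a Python list of booleans in place of packed bits are all assumptions made for readability. Because blocks are numbered so that adjacent numbers are physically adjacent, finding a run of consecutive free entries is what lets a sequential-layout policy be applied.

```python
from typing import List, Optional


class FreeBitmap:
    """One entry per disk block; True means the block is free (hypothetical sketch)."""

    def __init__(self, n_blocks: int) -> None:
        self.free: List[bool] = [True] * n_blocks

    def find_run(self, length: int) -> Optional[int]:
        """First fit: lowest starting block of `length` adjacent free blocks, or None."""
        if length < 1:
            raise ValueError("length must be at least 1")
        run_start, run_len = 0, 0
        for i, is_free in enumerate(self.free):
            if is_free:
                if run_len == 0:
                    run_start = i          # a new run of free blocks begins here
                run_len += 1
                if run_len == length:
                    return run_start
            else:
                run_len = 0                # run broken by an allocated block
        return None

    def allocate(self, length: int) -> Optional[int]:
        """Mark a run of `length` adjacent blocks as used; return its first block."""
        start = self.find_run(length)
        if start is None:
            return None
        for i in range(start, start + length):
            self.free[i] = False
        return start

    def release(self, start: int, length: int) -> None:
        """Mark blocks start .. start+length-1 as free again."""
        for i in range(start, start + length):
            self.free[i] = True


if __name__ == "__main__":
    bm = FreeBitmap(8)
    print(bm.allocate(3), bm.allocate(2))
    bm.release(0, 3)
    print(bm.allocate(4))  # six blocks are free in total, but no 4 are adjacent
```

The last call in the usage example fails even though enough blocks are free overall, which is the kind of fragmentation a placement policy has to cope with.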
4.1 contiguous allocation
User says in advance how big file will be
Search bit map (using best fit/first fit) to locate space for file
File header contains:
• first sector in file
• file size (# sectors)
Pros & cons:
+ fast sequential access
+ easy random access
DA: external fragmentation
DA: hard to grow files

4.2 Linked files
Each block, pointer to next on disk (Xerox Alto)
(DRAW PICTURE)
file header – points to first block on disk
Pros & cons:
+ can grow files dynamically
+ free list managed same as file
DA: sequential access horrible: seek between each block
DA: random access is horrible
DA: unreliable (lose block, lose rest of file)

4.3 FAT (MS-DOS, Windows9x, OS2)
Store linked list in separate table ("File allocation table")
A table entry for each block on disk
Each table entry in a file has pointer to next table entry in file (with special "eof" value to mark end)
Use "0" value to mean "free" (why not just put free elements on linked free list?)
compare to linked allocation
Sequential access OK if FAT is
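The table-based chaining described for FAT can be sketched as follows. This is a toy model under stated assumptions, not the on-disk MS-DOS format: the `EOF` sentinel value of -1, the function names, and the choice to treat block 0 as reserved (so that a "next" pointer of 0 is never confused with the "free" marker) are all choices made for this example.

```python
from typing import List

FREE = 0    # the notes use "0" to mean a free block
EOF = -1    # assumed sentinel for end of file; the real value is format-specific


def file_blocks(fat: List[int], first_block: int) -> List[int]:
    """Follow the chain in the table and return the file's blocks in order."""
    blocks = []
    b = first_block
    while b != EOF:
        blocks.append(b)
        b = fat[b]          # table entry for block b names the next block
    return blocks


def nth_block(fat: List[int], first_block: int, i: int) -> int:
    """Disk block holding the ith block of the file (i counts from 0).

    Needs i table lookups, but no data block has to be read along the way.
    """
    b = first_block
    for _ in range(i):
        b = fat[b]
        if b == EOF:
            raise IndexError("file has fewer blocks than requested")
    return b


def free_blocks(fat: List[int]) -> List[int]:
    """Scan the table for free entries; block 0 is treated as reserved here."""
    return [i for i in range(1, len(fat)) if fat[i] == FREE]


if __name__ == "__main__":
    fat = [FREE] * 8
    # One file occupying blocks 2 -> 5 -> 3, scattered across the disk.
    fat[2], fat[5], fat[3] = 5, 3, EOF
    print(file_blocks(fat, 2), nth_block(fat, 2, 2), free_blocks(fat))
```

Compared with linked files, the pointers live in one table rather than inside the data blocks, so locating the ith block walks the table instead of seeking from block to block, and free space can be found by scanning the same table.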
