Recap of Feb 25

Physical Storage Media
  - Issues are speed, cost, and reliability
  - Media types
      - Primary storage (volatile): cache, main memory
      - Secondary or on-line storage (non-volatile): flash memory, magnetic disk
      - Tertiary or off-line storage (non-volatile): optical storage, tape storage
  - Magnetic disk issues
      - definitions: sector, track, cylinder
      - disk controllers
      - multiple disks
      - disk performance measures: seek time, rotational latency, data-transfer rate, MTTF

Now we start with Optimization of Disk Block Access.

Optimization of Disk Block Access: Motivation
  - Requests for disk I/O are generated both by the file system and by the virtual memory manager.
  - Each request specifies the address on the disk to be referenced, in the form of a block number.
      - A block is a contiguous sequence of sectors from a single track on one platter.
      - Block sizes range from 512 bytes to several KB (4-16 KB is typical).
      - Smaller blocks mean more transfers from disk; larger blocks mean more wasted space due to partially filled blocks.
      - The block is the standard unit of data transfer between disk and main memory.
  - Since disk access is much slower than main-memory access, methods for optimizing disk block access are important.

Optimization of Disk Block Access, Methods: Disk-Arm Scheduling
  - Requests for several blocks may be sped up by requesting them in the order in which they will pass under the head.
  - If the blocks are on different cylinders, it is advantageous to ask for them in an order that minimizes disk-arm movement.
  - Elevator algorithm: move the disk arm in one direction until all requests in that direction are satisfied, then reverse and repeat.
  - Sequential access is 1-2 orders of magnitude faster than random access; put the other way, random access is about 2 orders of magnitude slower.

Optimization of Disk Block Access, Methods: Non-Volatile Write Buffers
  - Store written data in a RAM buffer rather than on disk.
  - Write the buffer to disk whenever it becomes full or when no other disk requests are pending.
  - The buffer must be non-volatile to protect against power failure.
      - called non-volatile random-access memory (NV-RAM)
      - typically implemented with battery-backed-up RAM
  - Dramatic speedup on writes: with a reasonably sized buffer, write latency essentially disappears.
  - Why can't we do the same for reads? (Hints: ESP, clustering.)

Optimization of Disk Block Access, Methods: File Organization (Clustering)
  - Reduce access time by organizing blocks on disk in a way that corresponds closely to the way we expect them to be accessed.
      - Sequential files should be kept organized sequentially.
      - Hierarchical files should be organized with mothers next to daughters.
      - For joining tables (relations), put the joining tuples next to each other.
  - Over time, fragmentation can become an issue.
      - Restoring the disk structure (copying and rewriting the blocks in reordered form) controls fragmentation.

Optimization of Disk Block Access, Methods: Log-Based File System
  - Does not update in place; rather, it writes updates to a log disk.
      - The log disk is essentially a disk functioning as a non-volatile RAM write buffer.
  - All access in the log disk is sequential, eliminating seek time.
  - Eventually the updates must be propagated to the original blocks.
      - As with NV-RAM write buffers, this can occur at a time when no disk requests are pending.
      - The updates can be ordered to minimize arm movement.
  - This can generate a high degree of fragmentation on files that require constant updates.
      - Fragmentation increases seek time for sequential reading of files.

Storage Access (11.5)
  - Basic concepts (some already familiar):
      - Block-based: a block is a contiguous sequence of sectors from a single track; blocks are units of both storage allocation and data transfer.
      - A file is a sequence of records stored in fixed-size blocks (pages) on the disk.
      - Each block (page) has a unique address called its BID.
      - Optimization is done by reducing I/O, seek time, etc.
      - Database systems seek to minimize the number of block transfers between the disk and memory.
  - We can reduce the number of disk accesses by keeping as many blocks as possible in main memory.
  - Buffer: the portion of main memory used to store copies of disk blocks.
  - Buffer manager: the subsystem responsible for allocating buffer space in main memory and handling block transfers between the buffer and the disk.

Buffer Management
  - The buffer pool is the part of main memory allocated for temporarily storing disk blocks read from disk and made available to the CPU.
  - The buffer manager is the subsystem responsible for the allocation and management of the buffer space (transparent to users).
  - On a process (user) request for a block (page), the buffer manager checks whether the page is already in the buffer pool:
      - if so, it passes the address to the process;
      - if not, it loads the page from disk and then passes the address to the process;
      - loading a page might require clearing (writing out) a page to make space.
  - Very similar to the way virtual memory managers work, although it can do a lot better (why?).

Buffer Replacement Strategies
  - Most operating systems use an LRU replacement scheme. In database environments, MRU is better for some common operations (e.g., join).
      - LRU strategy: replace the least recently used block.
      - MRU strategy: replace the most recently used block.
  - Sometimes it is useful to fasten, or pin, blocks to keep them available during an operation and not let the replacement strategy touch them.
      - A pinned block is thus a block that is not allowed to be written back to disk.
  - There are situations where it is necessary to write a block back to disk even though the buffer space it occupies is not yet needed.
      - This write is called the forced output of a block; it is useful in recovery situations.
  - Toss-immediate strategy: free the space occupied by a block as soon as the final tuple of that block has been processed.

Buffer Replacement Strategies (cont.)
  - Most recently used (MRU) strategy: the system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned and becomes the most recently used block. This is essentially toss-immediate with pinning, and it works very well with joins.
  - The buffer manager can often use other information (design or statistical) to predict the probability that a request will reference a particular page.
      - e.g., the data dictionary is frequently accessed, so keep the data dictionary blocks in the main-memory buffer
      - if several pages are available for overwrite, choose for replacement the one with the lowest number of recent access requests

Buffer Management (cont.)
  - An existing OS affects DBMS operations through:
      - read ahead, write behind
      - wrong replacement strategies
  - Unix is not a good platform for a DBMS to run on top of.
  - Most commercial systems implement their own I/O on a raw disk partition.
  - Variations of buffer
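
The elevator algorithm from the disk-arm scheduling notes can be sketched in a few lines. This is a minimal illustration, not a real disk scheduler: it assumes that all requests are known up front, that a request is just a cylinder number, and that the arm reverses at the last pending request rather than at the edge of the disk. The function names `elevator_order` and `arm_travel` and the sample request queue are made up for the example.

```python
def elevator_order(requests, head, moving_up=True):
    """Return the order in which pending cylinder requests are serviced.

    The arm keeps moving in its current direction until no requests remain
    in that direction, then reverses and services the rest.
    """
    lower = sorted(c for c in requests if c < head)
    upper = sorted(c for c in requests if c >= head)
    if moving_up:
        return upper + lower[::-1]
    return lower[::-1] + upper


def arm_travel(order, head):
    """Total number of cylinders the arm crosses when servicing `order`."""
    travel = 0
    for cylinder in order:
        travel += abs(cylinder - head)
        head = cylinder
    return travel


if __name__ == "__main__":
    pending = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical request queue
    start = 53                                     # hypothetical head position
    order = elevator_order(pending, start)
    print("elevator order:", order)
    print("elevator travel:", arm_travel(order, start))
    print("arrival-order travel:", arm_travel(pending, start))
```

For this queue, servicing requests in arrival order moves the arm across 640 cylinders, while the elevator order needs 299, which is the reduction in arm movement the notes are after.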
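
The claim in the buffer replacement notes that MRU can beat LRU for join-like access patterns is easier to see with a toy buffer pool. The sketch below is an assumption-laden model, not how any particular DBMS implements its buffer manager: the class `BufferPool`, its methods, and the helper `scan_reads` are hypothetical, the disk read is a stand-in that only counts calls, and dirty pages and forced output are left out entirely.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer manager with a fixed number of frames and LRU or MRU replacement."""

    def __init__(self, capacity, policy="LRU"):
        if policy not in ("LRU", "MRU"):
            raise ValueError("policy must be 'LRU' or 'MRU'")
        self.capacity = capacity
        self.policy = policy
        self.frames = OrderedDict()  # block id -> contents, least recently used first
        self.pinned = set()
        self.disk_reads = 0

    def _read_from_disk(self, block_id):
        # Stand-in for a real disk read; it only counts how often it is called.
        self.disk_reads += 1
        return f"contents of block {block_id}"

    def _evict(self):
        # Look from the LRU end or the MRU end and skip pinned blocks.
        candidates = self.frames if self.policy == "LRU" else reversed(self.frames)
        victim = next((b for b in candidates if b not in self.pinned), None)
        if victim is None:
            raise RuntimeError("all buffered blocks are pinned")
        del self.frames[victim]

    def get(self, block_id):
        """Return the block, loading it from disk if it is not buffered."""
        if block_id in self.frames:
            self.frames.move_to_end(block_id)  # now the most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[block_id] = self._read_from_disk(block_id)
        return self.frames[block_id]

    def pin(self, block_id):
        self.pinned.add(block_id)

    def unpin(self, block_id):
        self.pinned.discard(block_id)


def scan_reads(policy, blocks=4, capacity=3, passes=3):
    """Disk reads needed to scan `blocks` blocks sequentially, `passes` times."""
    pool = BufferPool(capacity, policy)
    for _ in range(passes):
        for block_id in range(blocks):
            pool.get(block_id)
    return pool.disk_reads


if __name__ == "__main__":
    print("LRU disk reads:", scan_reads("LRU"))
    print("MRU disk reads:", scan_reads("MRU"))
```

Repeatedly scanning 4 blocks through 3 frames, as the inner relation of a nested-loop join would be scanned, costs 12 disk reads under LRU (every access misses) and 6 under MRU. Pinning is modeled only as "never choose this block as a victim", which is the behavior the notes describe.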

UMD CMSC 424 - Physical Storage Media