1Physical Data OrganizationCPS 116Introduction to Database Systems2Announcements (November 3) Homework #3 due today Project milestone #2 due in a week3Outline It’s all about disks! That’s why we always draw databases as And why the single most important metric in database processing is the number of disk I/O’s performed Storing data on a disk Record layout Block layout24Storage hierarchyRegistersCacheMemoryDiskTapesWhy a hierarchy?5How far away is data?Location CyclesRegisters 1On-chip cache 2On-board cache 10Memory 100Disk 106Tape 109Location Time(Source: AlphaSort paper, 1995)) I/O dominates—design your algorithms to reduce I/O!6A typical diskSpindle rotationPlatterPlatterSpindlePlatterTracksArm movementDisk armDisk headCylinders“Moving parts” are slow37Top viewTrackTrackTrackSectorsHigher-density sectors on inner tracksand/or more sectorson outer tracksA block is alogical unitof transferconsisting ofone or more sectors8Disk access timeSum of: Seek time: time for disk heads to move to the correct cylinder Rotational delay: time for the desired block to rotate under the disk head Transfer time: time to read/write data in the block (= time for disk to rotate over the block)9Random disk accessSeek time + rotational delay + transfer time Average seek time Time to skip one half of the cylinders? “Typical” value: 5 ms Average rotational delay Time for a half rotation (a function of RPM) “Typical” value: 4.2 ms (7200 RPM)410Sequential disk accessSeek time + rotational delay + transfer time Seek time 0 (assuming data is on the same track) Rotational delay 0 (assuming data is in the next block on the track) Easily an order of magnitude faster than random disk access!11Performance tricks Disk layout strategy Keep related things (what are they?) close together: same sector/block → same track → same cylinder → adjacent cylinder Double buffering While processing the current block in memory, prefetch the next block from disk (overlap I/O with processing) Disk scheduling algorithm Example: “elevator” algorithm Track buffer Read/write one entire track at a time Parallel I/O More disk heads working at the same time12Record layoutRecord = row in a table Variable-format records Rare in DBMS—table schema dictates the format Relevant for semi-structured data such as XML Focus on fixed-format records With fixed-length fields only, or With possible variable-length fields513Fixed-length fields All field lengths and offsets are constant Computed from schema, stored in the system catalog Example: CREATE TABLE Student(SID INT, name CHAR(20), age INT, GPA FLOAT);14204Bart (padded with space)2410 2.328 36Watch out for alignment May need to pad; reorder columns if that helps What about NULL?14Variable-length records Example: CREATE TABLE Student(SID INT, name VARCHAR(20), age INT, GPA FLOAT,comment VARCHAR(100)); Approach 1: use field delimiters (‘\0’ okay?) Approach 2: use an offset array Put all variable-length fields at the end (why?) Update is messy if it changes the length of a field14204Bart\010 2.3816Weird kid!\014204Bart10 2.3816Weird kid!18 22 3222 3215LOB fields Example: CREATE TABLE Student(SID INT, name CHAR(20), age INT, GPA FLOAT, picture BLOB(32000));Decomposition (automatically done by DBMS and transparent to the user) Student(SID, name, age, GPA) StudentPicture(SID, picture)616Block layoutHow do you organize records in a block? NSM (N-ary Storage Model) Most commercial DBMS PAX (Partition Attributes Across) Ailamaki et al., VLDB 200117NSM Store records from the beginning of each block Use a directory at the end of each block To locate records and manage free space Necessary for variable-length records142 Bart 10 2.3 123 Milhouse 10 3.1456 Ralph 8 2.3857 Lisa 8 4.3Why store data and directoryat two different ends?18Options Reorganize after every update/delete to avoid fragmentation (gaps between records) Need to rewrite half of the block on average What if records are fixed-length? Reorganize after delete• Only need to move one record• Need a pointer to the beginning of free space Do not reorganize after update• Need a bitmap indicating which slots are in use719Cache behavior of NSM Query: SELECT SID FROM Student WHERE GPA > 2.0;Assumption: cache block size < record size Lots of cache misses ID and GPA are not close enough by memory standards142 Bart 10 2.3 123 Milhouse 10 3.1456 Ralph 8 2.3857 Lisa 8 4.3142 Bart 10 2.3 123 Milhouse10 3.1 857 Lisa8 4.3 456 Ralph 8Cache2.3 20PAX Most queries only access a few columns Cluster values of the same columns in each block When a particular column of a row is brought into the cache, thesame column of the next row is brought in together142 123 857 4561111Bart Milhouse Lisa Ralph10 10 8 82.3 3.1 4.3 2.34(number of records)1111Reorganize after every update(for variable-length records only)and delete to keep fields together(IS NOT NULL bitmap)21Summary Storage hierarchy Why I/O’s dominate the cost of database operations Disk Steps in completing a disk access Sequential versus random accesses Record layout Handling variable-length fields Handling NULL Handling modifications Block layout NSM: the traditional layout PAX: a layout that tries to improve cache
View Full Document