Data StorageOverviewStorage HierarchySecondary Storage DevicesDisksComponents of A DiskComponents of A Disk (cont.)Slide 8Disk Access TimeSeek TimeAverage Random Seek TimeRotational DelayTransfer TimeFormulasSlide 15Units!!!How many bytes?What about Reading Next Block?What About Writing and Updating?Disk Example: IBM Ultrastar 36LPArranging Pages on DiskDisk Space ManagementBuffer Management in a DBMSWhen a Page is Requested ...Buffer Replacement PolicyRepresenting RecordsExample: Fixed Format and LengthFile of RecordsPacking Records into BlocksFile Size: Spanned vs UnspannedSummarySummary (cont.)Look AheadData StorageJohn OrtizLecture 17 Data Storage 2OverviewDatabase stores data on secondary storage Disk has distinct storage and access characteristicsDBMS must bring data from disk to main memory buffer for processing & write data from memory buffer to disk for storageLow level DBMS software must effectively manage disk and buffer spaceHow disks are accessed?How to manage a buffer?Lecture 17 Data Storage 3Storage HierarchyArchitecture of a typical computerProcessor: fast, slow, RISC, cache, pipelined. Typical speed: 100 500 1000 MIPSMain Memory: fast, slow, volatile, read-only Access time: 1s – 1nsPMSecondaryStorageCLecture 17 Data Storage 4Secondary Storage DevicesDisk:Floppy (hard, soft)Removable PacksWinchesterRam disksOptical, CDROM, DVD ROMDisk ArraysTapeReal, cartridge, robotsLecture 17 Data Storage 5DisksDBMS stores information on (“hard”) disks.This has major implications for DBMS design!Data must be transferred from disk to main memory (for read) & vice versa (for write)Transfer unit: block (= 1 or more sectors)Why not store everything in main memory?Costs too much. $147 will buy you 128MB of RAM or 40GB of disk (Oct. 2000).Main memory is volatile. We want data to be saved between runs.Lecture 17 Data Storage 6Components of A DiskPlattersSpindleDisk headArm movementArm assemblyTracksSectorTop viewLecture 17 Data Storage 7Components of A Disk (cont.)Terms: Platter, Head, Actuator, Cylinder, Track, Sector, Block (Page) , Gap, ClusterCommon measurementDiameter: 1 inch 15 inchesCylinders: 100 16383Surfaces: 1 (CDs) 10Tracks/cyl: 2 (floppies) 30Sector Size: 512B 64KCapacity: 360 KB (old floppy) 200 GBLecture 17 Data Storage 8Components of A Disk (cont.)Division of tracks into sectors is hard coded on disk surfaceA sector is subdivided into one or more blocksFormatting sets the block sizeInterblock gaps make the difference between formatted and unformatted capacityLecture 17 Data Storage 9block xin memory?I wantblock XDisk Access TimeAccess Time = Seek Time + Rotational Delay + Transfer Time + OtherLecture 17 Data Storage 10Seek TimeTime to move arms to position disk head on a given track3 or 5xx1 NCylinders TraveledTimeLecture 17 Data Storage 11Average Random Seek TimeTypical S: 8 ms 40 ms N = number of tracks/surface (= # cylinders))()(11 1 NNjiSeektimeSNiNijjLecture 17 Data Storage 12Rotational DelayTime to wait for block to rotate under head also called rotational latencyAverage R = 1/2 revolutionTypically R = 4.2 ms (7200 RPM)Complication: may need to wait for track start Head HereStart TrackBlock I WantLecture 17 Data Storage 13Transfer TimeTime for actually moving data to/from disk surfaceTypical transfer rate tr: 16 100 MB/sBlock Transfer Time btt = block size / trTypically, 4 KB: about 1 msOther DelayCPU time to issue I/OContention for controllerContention for bus, memoryTypical value: 0Lecture 17 Data Storage 14Formulastr = track size * rpm (transfer rate)rd = ½ * 1/rpm (rotational delay)btt = B/tr (block transfer time)B = Block Sizebtr = ( B/(B + G) * tr) (bulk transfer rate)G = interblock gap sizeRead ‘K’ consecutive blocks:s + rd + (B/btr) * KRead ‘K’ random blocks(s + rd + (B/btr) ) * KLecture 17 Data Storage 15FormulasLog base N of X = log X/log NIf no base indicated, assume base 10Log2 500 = log 500/log 2bfr = floor(B/record size)File Size (in blocks) = ceiling (# records/bfr) (unspanned)= ceiling(# bytes in file/B) (spanned)Lecture 17 Data Storage 16Units!!!The units are just as important as the valueUse the units to check your resultExample: tr = track size * rpmGiven track size = 32768 bytes and 3600 rpm, what is the transfer rate?32768 * 3600 = 117,964,800 which SEEMS very unreasonable!However, it is 117,964,800 bytes per minute!1,966,080 bytes per second, = 1.875 MB/secPretty Slow for a modern disk!Lecture 17 Data Storage 17How many bytes?1 kilobyte = 1024 bytes (most of the time!)1 megabyte = 10242 = 1,048,576 bytes1 gigabyte = 10243 bytes1 terabyte = 10244 bytesDisk manufacturers often use 1000 bytes as a kilobyte for marketing purposes!My 200 GB disk drive, is actually 186 GB, though it’s 200 billion bytesRAM marketing has so far not followed suit.Lecture 17 Data Storage 18What about Reading Next Block?If we do things right (double buffer, stagger blocks …) Time = T + NegligibleSkip gap, change track, change cylinderRule of Thumb:Random I/O: ExpensiveSequential I/O: Much lessEX: For 1 KB block Random I/O : about 20ms Sequential I/O: about 1msLecture 17 Data Storage 19What About Writing and Updating?Cost of writing is similar to reading, unless we need to verifyAdd full rotation + TCost of modifying a block:(a) Read Block(b) Modify in Memory(c) Write Block[(d) Verify?]Block Address: Device, Cylinder #, Surface #, SectorLecture 17 Data Storage 20Disk Example: IBM Ultrastar 36LPFormatted capacity: 36.9 GBSector size: 512 to 528 variable (2-byte inc)Platters: 10Max. recording density: 350000 BPITrack density: 18400 TPI (per surface)Rotation speed: 7200 RPMAverage rotational delay: 4.17 msSustained data rate: 19.5-31.9 MBSeek time: average 6.8ms, next track 0.6msLecture 17 Data Storage 21Arranging Pages on DiskBlocks in a file should be arranged sequentially on disk (by next block), to minimize seek and rotational delay.The concept of next block : 1. Next block on the same track2. 1st block on next track of the same cylinder3. 1st block on 1st track of the next cylinder For a sequential scan, pre-fetching several pages at a time is a big
View Full Document