Unformatted text preview:

Data StorageOverviewStorage HierarchySecondary Storage DevicesDisksComponents of A DiskComponents of A Disk (cont.)Slide 8Disk Access TimeSeek TimeAverage Random Seek TimeRotational DelayTransfer TimeFormulasSlide 15Units!!!How many bytes?What about Reading Next Block?What About Writing and Updating?Disk Example: IBM Ultrastar 36LPArranging Pages on DiskDisk Space ManagementBuffer Management in a DBMSWhen a Page is Requested ...Buffer Replacement PolicyRepresenting RecordsExample: Fixed Format and LengthFile of RecordsPacking Records into BlocksFile Size: Spanned vs UnspannedSummarySummary (cont.)Look AheadData StorageJohn OrtizLecture 17 Data Storage 2OverviewDatabase stores data on secondary storage Disk has distinct storage and access characteristicsDBMS must bring data from disk to main memory buffer for processing & write data from memory buffer to disk for storageLow level DBMS software must effectively manage disk and buffer spaceHow disks are accessed?How to manage a buffer?Lecture 17 Data Storage 3Storage HierarchyArchitecture of a typical computerProcessor: fast, slow, RISC, cache, pipelined. Typical speed: 100  500  1000 MIPSMain Memory: fast, slow, volatile, read-only Access time: 1s – 1nsPMSecondaryStorageCLecture 17 Data Storage 4Secondary Storage DevicesDisk:Floppy (hard, soft)Removable PacksWinchesterRam disksOptical, CDROM, DVD ROMDisk ArraysTapeReal, cartridge, robotsLecture 17 Data Storage 5DisksDBMS stores information on (“hard”) disks.This has major implications for DBMS design!Data must be transferred from disk to main memory (for read) & vice versa (for write)Transfer unit: block (= 1 or more sectors)Why not store everything in main memory?Costs too much. $147 will buy you 128MB of RAM or 40GB of disk (Oct. 2000).Main memory is volatile. We want data to be saved between runs.Lecture 17 Data Storage 6Components of A DiskPlattersSpindleDisk headArm movementArm assemblyTracksSectorTop viewLecture 17 Data Storage 7Components of A Disk (cont.)Terms: Platter, Head, Actuator, Cylinder, Track, Sector, Block (Page) , Gap, ClusterCommon measurementDiameter: 1 inch  15 inchesCylinders: 100  16383Surfaces: 1 (CDs)  10Tracks/cyl: 2 (floppies)  30Sector Size: 512B  64KCapacity: 360 KB (old floppy)  200 GBLecture 17 Data Storage 8Components of A Disk (cont.)Division of tracks into sectors is hard coded on disk surfaceA sector is subdivided into one or more blocksFormatting sets the block sizeInterblock gaps make the difference between formatted and unformatted capacityLecture 17 Data Storage 9block xin memory?I wantblock XDisk Access TimeAccess Time = Seek Time + Rotational Delay + Transfer Time + OtherLecture 17 Data Storage 10Seek TimeTime to move arms to position disk head on a given track3 or 5xx1 NCylinders TraveledTimeLecture 17 Data Storage 11Average Random Seek TimeTypical S: 8 ms  40 ms N = number of tracks/surface (= # cylinders))()(11 1 NNjiSeektimeSNiNijjLecture 17 Data Storage 12Rotational DelayTime to wait for block to rotate under head also called rotational latencyAverage R = 1/2 revolutionTypically R = 4.2 ms (7200 RPM)Complication: may need to wait for track start Head HereStart TrackBlock I WantLecture 17 Data Storage 13Transfer TimeTime for actually moving data to/from disk surfaceTypical transfer rate tr: 16  100 MB/sBlock Transfer Time btt = block size / trTypically, 4 KB: about 1 msOther DelayCPU time to issue I/OContention for controllerContention for bus, memoryTypical value: 0Lecture 17 Data Storage 14Formulastr = track size * rpm (transfer rate)rd = ½ * 1/rpm (rotational delay)btt = B/tr (block transfer time)B = Block Sizebtr = ( B/(B + G) * tr) (bulk transfer rate)G = interblock gap sizeRead ‘K’ consecutive blocks:s + rd + (B/btr) * KRead ‘K’ random blocks(s + rd + (B/btr) ) * KLecture 17 Data Storage 15FormulasLog base N of X = log X/log NIf no base indicated, assume base 10Log2 500 = log 500/log 2bfr = floor(B/record size)File Size (in blocks) = ceiling (# records/bfr) (unspanned)= ceiling(# bytes in file/B) (spanned)Lecture 17 Data Storage 16Units!!!The units are just as important as the valueUse the units to check your resultExample: tr = track size * rpmGiven track size = 32768 bytes and 3600 rpm, what is the transfer rate?32768 * 3600 = 117,964,800 which SEEMS very unreasonable!However, it is 117,964,800 bytes per minute!1,966,080 bytes per second, = 1.875 MB/secPretty Slow for a modern disk!Lecture 17 Data Storage 17How many bytes?1 kilobyte = 1024 bytes (most of the time!)1 megabyte = 10242 = 1,048,576 bytes1 gigabyte = 10243 bytes1 terabyte = 10244 bytesDisk manufacturers often use 1000 bytes as a kilobyte for marketing purposes!My 200 GB disk drive, is actually 186 GB, though it’s 200 billion bytesRAM marketing has so far not followed suit.Lecture 17 Data Storage 18What about Reading Next Block?If we do things right (double buffer, stagger blocks …) Time = T + NegligibleSkip gap, change track, change cylinderRule of Thumb:Random I/O: ExpensiveSequential I/O: Much lessEX: For 1 KB block Random I/O : about 20ms Sequential I/O: about 1msLecture 17 Data Storage 19What About Writing and Updating?Cost of writing is similar to reading, unless we need to verifyAdd full rotation + TCost of modifying a block:(a) Read Block(b) Modify in Memory(c) Write Block[(d) Verify?]Block Address: Device, Cylinder #, Surface #, SectorLecture 17 Data Storage 20Disk Example: IBM Ultrastar 36LPFormatted capacity: 36.9 GBSector size: 512 to 528 variable (2-byte inc)Platters: 10Max. recording density: 350000 BPITrack density: 18400 TPI (per surface)Rotation speed: 7200 RPMAverage rotational delay: 4.17 msSustained data rate: 19.5-31.9 MBSeek time: average 6.8ms, next track 0.6msLecture 17 Data Storage 21Arranging Pages on DiskBlocks in a file should be arranged sequentially on disk (by next block), to minimize seek and rotational delay.The concept of next block : 1. Next block on the same track2. 1st block on next track of the same cylinder3. 1st block on 1st track of the next cylinder For a sequential scan, pre-fetching several pages at a time is a big


View Full Document

UTSA CS 3743 - Data Storage

Download Data Storage
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Data Storage and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Data Storage 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?