UW-Milwaukee COMPSCI 557 - RAID - D2655244

Home> Schools> University of Wisconsin-Milwaukee> Computer Science (COMPSCI) > COMPSCI 557> RAID

DOC PREVIEW

UW-Milwaukee COMPSCI 557 - RAID

School name University of Wisconsin-Milwaukee

Course Compsci 557- Introduction to Database Systems

Pages 24

This preview shows page 1-2-23-24 out of 24 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 24 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 24 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 24 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 24 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 24 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Announcements• Today–RAID– Begin Indexes• Program 1 due Friday– Office Hours today 2-3 pm– I’ll have limited email contact over the weekend– later today I’ll give info for turning in the programRAIDRedundant Arrays of Inexpensive Disks• Goal of RAID is to even out rates of disk improvements (small) w/ those in RAM and CPU• RAID use multiple physical disks to behave as a single logical diskData StripingData striping stores data across multiple disksThere are different granularitiesbit level granularityblock level granularityNaïve Striping Reduces Reliability• Likelihood of failure increases w/ # of disks– Mirroring, error correcting codes are used to increase reliability at the expense of speed• But is this statement correct? – (from Section 13.10.1)“For an array of n disks, the likelihood of failure is ntimes as much as that for one disk. Hence, if the MTTF of a disk drive is 200,000 hours (22.8 years), that of a bank of 100 disk drives becomes only 2000 hours (83 days)”RAID Organizationsbalance speed and reliabilityIndexing Structures for FilesChapter 14“If you don’t find it in the index, look very carefully through the whole catalog”- Sears, Roebuck and Co. consumers’ Guide, 1897Indexes provide alternative access pathsQuery: Find record for student “Troy Allen”Index on “name”Step 1: query the index for the RID for the record (hopefully a few IOs)Step 2: query the buffer manager for the appropriate block (1 IO)RID = (3438, 9)“Troy Allen”An index• Is a collection of data entries• Is associated with a specific file• Is associated with a specific field called the indexing field (sometimes called the search or key field)• Contains data so that BlkIDs (or RIDs) whose indexing fields match a given value can be found quicklySome Considerations• What is the organization of the underlying file– Eg, is it ordered on the search key?• Are the values of the indexing field unique (ie, is the indexing field a key field)?• How are the data entries of the index organized?– Example: make index a hashed file on index field where each record contains (value, RID) pairsSome Definitions• primary index: an index on the ordering key field of a ordered file• secondary index: an index on any non-ordering field of the file• clustered index: an index whose data entries are ordered in the same way as the underlying file• dense index: has an index entry for every search key value (and hence every record) in the data file. • sparse index: has index entries for only some of the search valuesPrimary Index• A primary index is an ordered file of <value,blockID> pairs• A record is stored for each block in the file. The records for Blk B contains the value of the first record on that blockCost of Maintaining a Clustered Primary Index• Inserting of record in the ordered file (already expensive) may require significant updates to the index– Why is this?Clustering Indexes• Recall a clustering index is an index on a non-key ordering field of an ordered file• What do we need to store in the index?– as with pri idx, <value, blk> pairs– but now we need a record in the index for every unique value of the indexing key– the blk field of the index gives the first block that a record for value appears onone way to handle the “insert” problem of ordered filesSecondary Indexes• An index on field that is not the ordering field of the underlying file• The indexing field may or may not be a key field for the file• What is the format for records in a secondary index on a key field? How many records are needed?More Secondary Indexes• What if the indexing field is not a key field?– Option 1: Keep index entry for each record, so we will have multiple index entries for each value– Option 2: Have one record / value and store a “RID list” for each value. Thus the index records are variable length records• <‘Jim’, { (389, 3), (3239,30), (193, 78) } >– Option 3: Mixed type of index records (next slide)Properties of Index TypesSQL to Create an IndexCREATE INDEX idxAge ON StudentsWITH STRUCTURE = BTREEKEY = (age)Next time: Multilevel

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-23-24 out of 24 pages.

UW-Milwaukee COMPSCI 557 - RAID

Sign up for free to view:

Please select your school