DOC PREVIEW
UMD CMSC 424 - Class Notes

This preview shows page 1-2-3-4-5-6 out of 19 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Quick Review of Apr 15 materialTodayIndex Definition in SQLMultiple-Key AccessMultiple-Key Access (2)Indices on Multiple AttributesMulti-Attribute IndexingGrid FilesExample Grid File for accountQueries on a Grid FileGrid Files (cont)Grid Files (end)Bitmap IndicesBitmap Indices (cont)Slide 15Slide 16Slide 17Efficient Implementation of Bitmap OperationsBitmaps and B+-treesQuick Review of Apr 15 material •Overflow–definition, why it happens–solutions: chaining, double hashing•Hash file performance–loading factor–search speed•Hash Indices (as separate from file organization)•Static vs. Dynamic Hashing•Extendable Hashing–with a detailed example of how it worksToday •HW#3 due today•HW #4: due Thursday April 24 (next week)–Questions: 12.11, 12.12, 12.13, 12.16•Today:–indices on multiple attributes–grid files–bitmap indicesIndex Definition in SQL•Create an indexcreate index <index-name> on <relation-name> (<attribute-list>)e.g.,create index br-index on branch(branch-name)•Use create unique index to indirectly specify and enforce the condition that the search key is a candidate key–not really required if SQL unique integrity constraint is supported•To drop an indexdrop index <index-name>Multiple-Key Access•With some queries we can use multiple indices•Example: select account-numberfrom accountwhere branch-name=“Perryridge” and balance=1000•Possible strategies for processing this query using indices on single attributes:–use index on balance to find accounts with balances =1000, then test them individually to see if branch-name=“Perryridge”–use index on branch-name to find accounts with branch-name=“Perryridge”, then test them individually to see if balances =1000–use branch-name index to find pointers to all records of the Perryridge branch, and use balance index similarly, then take intersection of both sets of pointersMultiple-Key Access (2)•With some queries using a single-attribute index is unnecessarily expensive–with methods (1) and (2) from the earlier slide, we might have the index we use return a very large set, even though the final result is quite small–Even with method (3) (use both indices and then find the intersection) we might have both indices return a large set, which will make for a lot of unneeded work if the final result (the intersection) is small•An alternative strategy is to create and use an index on more than one attribute -- in this example, an index on (branch-name, balance) (both attributes)Indices on Multiple Attributes•Suppose we have an ordered index on the combined search-key (branch-name, balance)•Examining the earlier query with the clausewhere branch-name=“Perryridge” and balance=1000Our new index will fetch only records that satisfy both conditions -- much more efficient than trying to answer the query with separate single-valued indices•We can also handle range queries like:where branch-name=“Perryridge” and balance<1000•But our index will not work forwhere branch-name<“Perryridge” and balance=1000Multi-Attribute IndexingExample: EMP(eno, ename, age, sal)lots of ways to handle this problem•separate indices: lots of false hits•combined index based on composite key–key= sal*100 + age–search for 30<sal<40 translates into 3000<key<4000 (easy)–search for 60<age<80 difficult•Grid files (up next)•R-tree (B-tree generalization)•Quad-trees, K-d trees, etc...Grid Files•Structure used to speed processing of general multiple search-key queries involving one or more comparison operators•The grid file has:–a single grid array–array has number of dimensions equal to the number of search-key attributes–one linear scale for each search-key attribute•Multiple cells of the grid array can point to the same bucket•To find the bucket for a search-key value, locate the row and column of the cell using the linear scales to get the grid location, then follow the pointer in that grid location to the bucketExample Grid File for accountQueries on a Grid File•A grid file on two attributes A and B can handle queries of all the following forms with reasonable efficiency:–(a1 < A < a2)–(b1 < B < b2)–(a1 < A < a2 and b1 < B < b2)•For example, to answer (a1 < A < a2 and b1 < B < b2), use the linear scales to find corresponding candidate grid array cells, and look up all the buckets pointed-to from those cellsGrid Files (cont)•During insertion, if a bucket becomes full, a new bucket can be created if more than one cell points to it–idea similar to extendable hashing, but in multiple dimensions–if only one cell points to the bucket, either an overflow bucket must be created or the grid array size must be increased•Linear scales can be chosen to uniformly distribute records across buckets–not necessary to have scale uniform across the domain -- if records are distributed in some other pattern, the linear scale can mirror it (as shown in the example earlier)Grid Files (end)•Periodic re-organization to increase grid array size helps with good performance–reorganization can be very expensive, though•Space overhead of the grid array can be high•R-trees (chapter 23) are an alternativeBitmap Indices•Bitmap indices are a special type of index designed for efficient queries on multiple keys•Records in a relation are assumed to be numbered sequentially from 0–given a number n the objective is for it to be easy to retrieve record n–very easy to achieve if we’re looking at fixed-length records•Bitmap indices are applicable on attributes that take on a relatively small number of distinct values–e.g. gender, country, state, hockey team–or an arbitrary mapping of a wider spectrum of values into a small number of categories (e.g. income level: divide income into a small number of levels such as 0-9999, 10K-19999, 20K-49999, 50K and greater)•A bitmap is simply an array of bitsBitmap Indices (cont)•In the simplest form a bitmap index on an attribute has a bitmap for each value of the attribute–bitmap has as many bits as there are records–in a bitmap for value v, the bit for a record is 1 if the record has the value v for the attribute, and 0 otherwiseBitmap Indices (cont)•Bitmap indices are useful for queries on multiple attributes–not particularly useful for single-attribute queries•Queries are answered using bitmap operations–intersection (and)–union (or)–complementation (not)•Each operation takes two bitmaps of


View Full Document

UMD CMSC 424 - Class Notes

Documents in this Course
Lecture 2

Lecture 2

36 pages

Databases

Databases

44 pages

Load more
Download Class Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Class Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Class Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?