DOC PREVIEW
USC CSCI 585 - Session12

This preview shows page 1-2-3-24-25-26 out of 26 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing ((CntCnt.).)Basic ConceptsOrdered Indices B+-Tree Index FilesB-Tree Index FilesStatic HashingDynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQLMultiple-Key AccessDatabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 2Dynamic HashingDynamic HashingGood for database that grows and shrinks in sizeAllows the hash function to be modified dynamicallyExtendable hashing – one form of dynamic hashing Hash function generates values over a large range —typically b-bit integers, with b = 32. At any time use only a prefix of the hash function to index into a table of bucket addresses.  Let the length of the prefix be i bits, 0 ≤ i ≤ 32.  Bucket address table size = 2i.Initially i = 0 Value of i grows and shrinks as the size of the database grows and shrinks. Multiple entries in the bucket address table may point to a bucket.  Thus, actual number of buckets is < 2i• The number of buckets also changes dynamically due to coalescing and splitting of buckets.Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 3General Extendable Hash Structure General Extendable Hash Structure In this structure, i2= i3= i, whereas i1= i – 1 (see next slide for details)Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 4Use of Extendable Hash Structure: Example Use of Extendable Hash Structure: Example Initial Hash structure, bucket size = 2Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 5Example (Cont.)Example (Cont.)Hash structure after insertion of one Brighton and two Downtown recordsBrighton 0010Downtown 1010Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 6Example (Cont.)Example (Cont.)Hash structure after insertion of Mianus recordBrighton 0010Downtown 1010Mianus 1100Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 7Example (Cont.)Example (Cont.)Hash structure after insertion of three Perryridge recordsBrighton 0010Downtown 1010Mianus 1100Perryridge 1111Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 8Example (Cont.)Example (Cont.)Hash structure after insertion of Redwood and Round Hill recordsBrighton 0010Downtown 1010Mianus 1100Perryridge 1111Redwood 0011Round 1101Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 9Use of Extendable Hash StructureUse of Extendable Hash StructureEach bucket j stores a value ij; all the entries that point to the same bucket have the same values on the first ijbits.To locate the bucket containing search-key Kj:1. Compute h(Kj) = X2. Use the first i high order bits of X as a displacement into bucket address table, and follow the pointer to appropriate bucketTo insert a record with search-key value Kj follow same procedure as look-up and locate the bucket, say j.  If there is room in the bucket j insert record in the bucket.  Else the bucket must be split and insertion re-attempted (next slide.)• Overflow buckets used instead in some cases (as the case for Perryridge in previous example)Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 10Updates in Extendable Hash Structure Updates in Extendable Hash Structure If i > ij(more than one pointer to bucket j) allocate a new bucket z, and set ijand izto the old ij-+ 1. make the second half of the bucket address table entries pointing to j to point to z remove and reinsert each record in bucket j. recompute new bucket for Kjand insert record in the bucket (further splitting is required if the bucket is still full)If i = ij(only one pointer to bucket j) increment i and double the size of the bucket address table. replace each entry in the table by two entries that point to the same bucket. recompute new bucket address table entry for KjNow i > ijso use the first case above. To split a bucket j when inserting record with search-key value Kj:Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 11Updates in Extendable Hash Structure (Cont.)Updates in Extendable Hash Structure (Cont.)When inserting a value, if the bucket is full after several splits (that is, i reaches some limit b) create an overflow bucket instead of splitting bucket entry table further.To delete a key value,  locate it in its bucket and remove it.  The bucket itself can be removed if it becomes empty (with appropriate updates to the bucket address table).  Coalescing of buckets can be done (can coalesce only with a “buddy” bucket having same value of ijand same ij–1 prefix, if it is present)  Decreasing bucket address table size is also possible• Note: decreasing bucket address table size is an expensive operation and should be done only if number of buckets becomes much smaller than the size of the tableDatabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 12Extendable Hashing vs. Other SchemesExtendable Hashing vs. Other SchemesBenefits of extendable hashing:  Hash performance does not degrade with growth of file Minimal space overheadDisadvantages of extendable hashing Extra level of indirection to find desired record Bucket address table may itself become very big (larger than memory)• Need a tree structure to locate desired record in the structure! Changing size of bucket address table is an expensive operationLinear hashing is an alternative mechanism which avoids these disadvantages at the possible cost of more bucket overflowsDatabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 13Comparison of Ordered Indexing and HashingComparison of Ordered Indexing and HashingCost of periodic re-organizationRelative frequency of insertions and deletionsIs it desirable to optimize average access time at the expense of worst-case access time?Expected type of queries: Hashing is generally better at retrieving records having a specified value of the key. If range queries are common, ordered indices are to be preferredDatabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 14Index Definition in SQLIndex Definition in SQLCreate an indexcreate index <index-name> on <relation-name><attribute-list>)E.g.: create index b-index on branch(branch-name)Use create unique index to


View Full Document

USC CSCI 585 - Session12

Download Session12
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Session12 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Session12 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?