UH COSC 3480 - Tree-Structured Indexes Chapter 9

Unformatted text preview:

Tree-Structured IndexesIntroductionRange SearchesB+ Tree: The Most Widely Used IndexExample B+ Tree (order p=5, m=4)B+ Trees in PracticeInserting a Data Entry into a B+ TreeInserting 8* into Example B+ TreeExample B+ Tree After Inserting 8*Deleting a Data Entry from a B+ TreeExample Tree After (Inserting 8*, Then) Deleting 19* and 20* ...... And Then Deleting 24*Example of Non-leaf Re-distributionAfter Re-distributionClarifications B+ TreeSummary B+ TreeTransparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick1Tree-Structured IndexesChapter 9Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick2IntroductionAs for any index, 3 alternatives for data entries k*: Data record with key value k <k, rid of data record with search key value k> <k, list of rids of data records with search key k>Choice is orthogonal to the indexing technique used to locate data entries k*.Tree-structured indexing techniques support both range searches and equality searches.ISAM: static structure; B+ tree: dynamic, adjusts gracefully under inserts and deletes.Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick3Range Searches``Find all students with gpa > 3.0’’–If data is in sorted file, do binary search to find first such student, then scan to find others.–Cost of binary search can be quite high.Simple idea: Create an `index’ file. Can do binary search on (smaller) index file!Page 1Page 2Page NPage 3Data Filek2kNk1Index FileTransparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick4B+ Tree: The Most Widely Used IndexInsert/delete at log F N cost; keep tree height-balanced. (F = fanout, N = # leaf pages)Minimum 50% occupancy (except for root). Supports equality and range-searches efficiently.Index EntriesData Entries("Sequence set")(Direct search)Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick5Example B+ Tree (order p=5, m=4)Search begins at root, and key comparisons direct it to a leaf (as in ISAM).Search for 5*, 15*, all data entries >= 24* ... Based on the search for 15*, we know it is not in the tree!Root16 22292*3* 5*7* 14* 16*19* 20* 22* 24* 27*29* 33* 34*38*39*7p=5 because tree can have at most 5 pointers in intermediate node; m=4 because at most 4 entries in leaf node.Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick6B+ Trees in PracticeTypical order: 200. Typical fill-factor: 67%.–average fanout = 133Typical capacities:–Height 4: 1334 = 312,900,700 records–Height 3: 1333 = 2,352,637 recordsCan often hold top levels in buffer pool:–Level 1 = 1 page = 8 Kbytes–Level 2 = 133 pages = 1 Mbyte–Level 3 = 17,689 pages = 133 MBytesTransparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick7Inserting a Data Entry into a B+ TreeFind correct leaf L. Put data entry onto L.–If L has enough space, done!–Else, must split L (into L and a new node L2)Redistribute entries evenly, copy up middle key.Insert index entry pointing to L2 into parent of L.This can happen recursively–To split index node, redistribute entries evenly, but push up middle key. (Contrast with leaf splits.)Splits “grow” tree; root split increases height. –Tree growth: gets wider or one level taller at top.Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick8Inserting 8* into Example B+ TreeObserve how minimum occupancy is guaranteed in both leaf and index pg splits.Note difference between copy-up and push-up; be sure you understand the reasons for this.2*3* 5*7*8*5Entry to be inserted in parent node.(Note that 5 iscontinues to appear in the leaf.)s copied up andappears once in the index. Contrast5 24 301713Entry to be inserted in parent node.(Note that 17 is pushed up and onlythis with a leaf split.)Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick9Example B+ Tree After Inserting 8* Notice that root was split, leading to increase in height. In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice.2* 3*Root16222914* 16*19* 20* 22* 24* 27*29* 33* 34*38*39*837*5* 8*Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick10Deleting a Data Entry from a B+ TreeStart at root, find leaf L where entry belongs.Remove the entry.–If L is at least half-full, done! –If L has only d-1 entries,Try to re-distribute, borrowing from sibling (adjacent node with same parent as L).If re-distribution fails, merge L and sibling.If merge occurred, must delete entry (pointing to L or sibling) from parent of L.Merge could propagate to root, decreasing height.Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick11Example Tree After (Inserting 8*, Then) Deleting 19* and 20* ...Deleting 19* is easy.Deleting 20* is done with re-distribution. Notice how middle key is copied up.2* 3*Root162914* 16*33* 34*38*39*837*5* 8* 22* 24*2427* 29*Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick12 ... And Then Deleting 24*Must merge.Observe `toss’ of index entry (on right), and `pull down’ of index entry (below).2922* 27*29* 33* 34*38*39*2*3*7*14* 16*22*27*29*33* 34*38*39*5* 8*Root298316Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick13Example of Non-leaf Re-distributionTree is shown below during deletion of 24*. (What could be a possible initial tree?)In contrast to previous example, can re-distribute entry from left child of root to right child. Root8316 18212914* 16*17* 18*20* 33* 34*38*39*22* 27* 29*21*7*5* 8*3*2*Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick14After Re-distributionIntuitively, entries are re-distributed by `pushing through’ the splitting entry in the parent node.It suffices to re-distribute index entry with key 20; we’ve re-distributed 17 as well for illustration.14* 16*33* 34*38*39*22* 27* 29*17* 18*20* 21*7*5* 8*2* 3*Root8316291821Transparencies on B+ Trees R. Ramakrishnan and J. Gehrke, revised by Christoph F. Eick15Clarifications B+ TreeB+ trees can be used to store relations as well as index structuresIn the drawn B+ trees we


View Full Document

UH COSC 3480 - Tree-Structured Indexes Chapter 9

Download Tree-Structured Indexes Chapter 9
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Tree-Structured Indexes Chapter 9 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Tree-Structured Indexes Chapter 9 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?