DOC PREVIEW
UW CSE 444 - Lecture Notes

This preview shows page 1-2-3-19-20-39-40-41 out of 41 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 41 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1 Lecture 15: Indexes Friday, May 7, 2010 Dan Suciu -- 444 Spring 20102 Outline • Index structures (14.1, 14.2) • B-trees (14.3) Note: in old edition this is Chapter 13 instead of 14 Dan Suciu -- 444 Spring 20103 File Types The data file can be one of: • Heap file: – Set of records, partitioned into blocks – Unsorted • Sequential file: – Sorted according to some attribute(s) called key Dan Suciu -- 444 Spring 2010 Note: “key” here means something else than “primary key”Index • A (possibly separate) file, that allows fast access to records in the data file • The index contains (key, value) pairs: – The key = an attribute value – The value = one of: • pointer to the record secondary index • or the record itself primary index 4 Dan Suciu -- 444 Spring 2010 Note: “key” (aka “search key”) again means something else5 Index Classification • Clustered/unclustered – Clustered = data file is ordered by the index’ search key – Unclustered = othewise • Primary/secondary: – Meaning 1: same as clustered/unclustured – Meaning 2: • Primary = is over attributes part of the primary • Secondary = cannot reorder data • Organization: B+ tree or Hash table6 Clustered Index • File is sorted on the index attribute • Only one per table 10 20 30 40 50 60 70 80 10 20 30 40 50 60 70 807 Unclustered Index • Several per table 10 10 20 20 20 30 30 30 20 30 30 20 10 20 10 30Clustered vs. Unclustered Index Data entries (Index File) (Data file) Data Records Data entries Data Records CLUSTERED UNCLUSTERED B+ Tree B+ Tree 8 Dan Suciu -- 444 Spring 20109 B+ Trees • Search trees • Idea in B Trees: – make 1 node = 1 block • Idea in B+ Trees: – Make leaves into a linked list (range queries are easier) Dan Suciu -- 444 Spring 201010 • Parameter d = the degree • Each node has >= d and <= 2d keys (except root) • Each leaf has >=d and <= 2d keys: B+ Trees Basics 30 120 240 Keys k < 30 Keys 30<=k<120 Keys 120<=k<240 Keys 240<=k 40 50 60 40 50 60 Next leaf11 B+ Tree Example 80 20 60 100 120 140 10 15 18 20 30 40 50 60 65 80 85 90 10 15 18 20 30 40 50 60 65 80 85 90 d = 2 Find the key 40 40 ≤ 80 20 < 40 ≤ 60 30 < 40 ≤ 40 Dan Suciu -- 444 Spring 201012 Using a B+ Tree • Exact key values: – Start at the root – Proceed down, to the leaf • Range queries: – As above – Then sequential traversal Select name From People Where age = 25 Select name From People Where 20 <= age and age <= 30 Dan Suciu -- 444 Spring 2010 Index on People(age)Which queries can use this index ? Dan Suciu -- 444 Spring 2010 13 Select * From People Where name = ‘Smith’ and zipcode = 12345 Index on People(name, zipcode) Select * From People Where name = ‘Smith’ Select * From People Where zipcode = 1234514 B+ Tree Design • How large d ? • Example: – Key size = 4 bytes – Pointer size = 8 bytes – Block size = 4096 byes • 2d x 4 + (2d+1) x 8 <= 4096 • d = 170 Dan Suciu -- 444 Spring 2010B+ Trees in Practice • Typical order: 100. Typical fill-factor: 67%. – average fanout = 133 • Typical capacities: – Height 4: 1334 = 312,900,700 records – Height 3: 1333 = 2,352,637 records • Can often hold top levels in buffer pool: – Level 1 = 1 page = 8 Kbytes – Level 2 = 133 pages = 1 Mbyte – Level 3 = 17,689 pages = 133 MBytes 15 Dan Suciu -- 444 Spring 201016 Insertion in a B+ Tree Insert (K, P) • Find leaf where K belongs, insert • If no overflow (2d keys or less), halt • If overflow (2d+1 keys), split node, insert in parent: • If leaf, keep K3 too in right node • When root splits, new root has 1 key only K1 K2 K3 K4 K5 P0 P1 P2 P3 P4 p5 K1 K2 P0 P1 P2 K4 K5 P3 P4 p5 parent K3 parent17 Insertion in a B+ Tree 80 20 60 100 120 140 10 15 18 20 30 40 50 60 65 80 85 90 10 15 18 20 30 40 50 60 65 80 85 90 Insert K=1918 Insertion in a B+ Tree 80 20 60 100 120 140 10 15 18 19 20 30 40 50 60 65 80 85 90 10 15 18 20 30 40 50 60 65 80 85 90 19 After insertion19 Insertion in a B+ Tree 80 20 60 100 120 140 10 15 18 19 20 30 40 50 60 65 80 85 90 10 15 18 20 30 40 50 60 65 80 85 90 19 Now insert 2520 Insertion in a B+ Tree 80 20 60 100 120 140 10 15 18 19 20 25 30 40 50 60 65 80 85 90 10 15 18 20 25 30 40 60 65 80 85 90 19 After insertion 5021 Insertion in a B+ Tree 80 20 60 100 120 140 10 15 18 19 20 25 30 40 50 60 65 80 85 90 10 15 18 20 25 30 40 60 65 80 85 90 19 But now have to split ! 5022 Insertion in a B+ Tree 80 20 30 60 100 120 140 10 15 18 19 20 25 60 65 80 85 90 10 15 18 20 25 30 40 60 65 80 85 90 19 After the split 50 30 40 5023 Deletion from a B+ Tree 80 20 30 60 100 120 140 10 15 18 19 20 25 60 65 80 85 90 10 15 18 20 25 30 40 60 65 80 85 90 19 Delete 30 50 30 40 5024 Deletion from a B+ Tree 80 20 30 60 100 120 140 10 15 18 19 20 25 60 65 80 85 90 10 15 18 20 25 40 60 65 80 85 90 19 After deleting 30 50 40 50 May change to 40, or not25 Deletion from a B+ Tree 80 20 30 60 100 120 140 10 15 18 19 20 25 60 65 80 85 90 10 15 18 20 25 40 60 65 80 85 90 19 Now delete 25 50 40 5026 Deletion from a B+ Tree 80 20 30 60 100 120 140 10 15 18 19 20 60 65 80 85 90 10 15 18 20 40 60 65 80 85 90 19 After deleting 25 Need to rebalance Rotate 50 40 5027 Deletion from a B+ Tree 80 19 30 60 100 120 140 10 15 18 19 20 60 65 80 85 90 10 15 18 20 40 60 65 80 85 90 19 Now delete 40 50 40 5028 Deletion from a B+ …


View Full Document

UW CSE 444 - Lecture Notes

Documents in this Course
XML

XML

48 pages

SQL

SQL

25 pages

SQL

SQL

42 pages

Recovery

Recovery

30 pages

SQL

SQL

36 pages

Indexes

Indexes

35 pages

Security

Security

36 pages

Wrap-up

Wrap-up

6 pages

SQL

SQL

37 pages

More SQL

More SQL

48 pages

SQL

SQL

35 pages

XML

XML

46 pages

Triggers

Triggers

26 pages

Load more
Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?