DOC PREVIEW
UW CSE 444 - Lecture Notes

This preview shows page 1-2-3-24-25-26-27-48-49-50 out of 50 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 50 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Introduction to Database Systems CSE 444 Lecture 20: Operator Algorithms Magda Balazinska - CSE 444, Fall 2010 1Why Learn About Op Algos? • Implemented in commercial DBMSs – DBMSs implement different subsets of known algorithms • Good algorithms can greatly improve performance • Need to know about physical operators to understand query optimization Magda Balazinska - CSE 444, Fall 2010 2Magda Balazinska - CSE 444, Fall 2010 Cost Parameters • In database systems the data is on disk • Cost = total number of I/Os • Parameters: – B(R) = # of blocks (i.e., pages) for relation R – T(R) = # of tuples in relation R – V(R, a) = # of distinct values of attribute a • When a is a key, V(R,a) = T(R) • When a is not a key, V(R,a) can be anything < T(R) 3Magda Balazinska - CSE 444, Fall 2010 Cost • Cost of an operation = number of disk I/Os to – Read the operands – Compute the result • Cost of writing the result to disk is not included – Need to count it separately when applicable 4Magda Balazinska - CSE 444, Fall 2010 Cost of Scanning a Table • Result may be unsorted: B(R) • Result needs to be sorted: 3B(R) – We will discuss sorting later 5Magda Balazinska - CSE 444, Fall 2010 Outline for Today • Join operator algorithms – One-pass algorithms (Sec. 15.2 and 15.3) – Index-based algorithms (Sec 15.6) – Two-pass algorithms (Sec 15.4 and 15.5) – Note about readings: • In class, we will discuss only algorithms for join operator (because other operators are easier) • Read the book to get more details about these algos • Read the book to learn about algos for other operators 6Magda Balazinska - CSE 444, Fall 2010 Basic Join Algorithms • Logical operator: – Product(pname, cname) ⋈ Company(cname, city) • Propose three physical operators for the join, assuming the tables are in main memory: – Hash join – Nested loop join – Sort-merge join 7Magda Balazinska - CSE 444, Fall 2010 Hash Join Hash join: R ⋈ S • Scan R, build buckets in main memory • Then scan S and join • Cost: B(R) + B(S) • One-pass algorithm when B(R) <= M – By “one pass”, we mean that the operator reads its operands only once. It does not write intermediate results back to disk. 8Hash Join Example 9 Patient Insurance Patient(pid, name, address) Insurance(pid, provider, policy_nb) 1 ‘Bob’ ‘Seattle’ 2 ‘Ela’ ‘Everett’ 3 ‘Jill’ ‘Kent’ 4 ‘Joe’ ‘Seattle’ Patient 2 ‘Blue’ 123 4 ‘Prem’ 432 Insurance 4 ‘Prem’ 343 3 ‘GrpH’ 554 Two tuples per pageHash Join Example 10 Patient Insurance 1 2 3 4 Patient 2 4 Insurance 4 3 Showing pid only 8 5 9 6 2 8 8 9 6 6 1 3 Disk Memory M = 21 pagesHash Join Example 11 Step 1: Scan Patient and create hash table in memory 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Memory M = 21 pages Hash h: pid % 5 Input buffer 1 2 4 3 9 6 8 5 1 2Hash Join Example 12 Step 2: Scan Insurance and probe into hash table 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Memory M = 21 pages Hash h: pid % 5 Input buffer 1 2 4 3 9 6 8 5 1 2 2 4 Output buffer 2 2 Write to diskHash Join Example 13 Step 2: Scan Insurance and probe into hash table 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Memory M = 21 pages Hash h: pid % 5 Input buffer 1 2 4 3 9 6 8 5 1 2 2 4 Output buffer 4 4Hash Join Example 14 Step 2: Scan Insurance and probe into hash table 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Memory M = 21 pages Hash h: pid % 5 Input buffer 1 2 4 3 9 6 8 5 1 2 4 3 Output buffer 4 4 Keep going until read all of Insurance Cost: B(R) + B(S)Hash Join Details 15 Open( ) { H = newHashTable( ); S.Open( ); x = S.GetNext( ); while (x != null) { H.insert(x); x = S.GetNext( ); } S.Close( ); R.Open( ); buffer = [ ]; }Hash Join Details 16 GetNext( ) { while (buffer == [ ]) { x = R.GetNext( ); if (x==Null) return NULL; buffer = H.find(x); } z = buffer.first( ); buffer = buffer.rest( ); return z; }Hash Join Details 17 Close( ) { release memory (H, buffer, etc.); R.Close( ) } Magda Balazinska - CSE 444, Fall 2010Magda Balazinska - CSE 444, Fall 2010 Nested Loop Joins • Tuple-based nested loop R ⋈ S • R is the outer relation, S is the inner relation • Cost: B(R) + T(R) B(S) • Not quite one-pass since S is read many times for each tuple r in R do for each tuple s in S do if r and s join then output (r,s) 18Magda Balazinska - CSE 444, Fall 2010 Page-at-a-time Refinement • Cost: B(R) + B(R)B(S) for each page of tuples r in R do for each page of tuples s in S do for all pairs of tuples if r and s join then output (r,s) 191 2 Nested Loop Example 20 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Input buffer for Patient Output buffer 2 2 Input buffer for Insurance 2 4Nested Loop Example 21 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Input buffer for Patient 1 2 Output buffer Input buffer for Insurance 4 3 1 2Nested Loop Example 22 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Input buffer for Patient 1 2 Output buffer Input buffer for Insurance 2 8 1 2 2 2 Cost: B(R) + B(R)B(S) Keep going until read all of Insurance Then repeat for next page of Patient… until end of PatientMagda Balazinska - CSE 444, Fall 2010 Sort-Merge Join Sort-merge join: R ⋈ S • Scan R and sort in main memory • Scan S and sort in main memory • Merge R and S • Cost: B(R) + B(S) • One pass algorithm when B(S) + B(R) <= M • Typically, this is NOT a one pass algorithm 23Sort-Merge Join Example 24 1 2 3 4 Patient 2 4 Insurance 4 3 8 5 9 6 2 8 8 9 6 6 1 3 Disk Memory M = 21 pages 1 2 4 3 9 6 8 5 Step 1: Scan Patient and sort in memorySort-Merge Join Example 25 …


View Full Document

UW CSE 444 - Lecture Notes

Documents in this Course
XML

XML

48 pages

SQL

SQL

25 pages

SQL

SQL

42 pages

Recovery

Recovery

30 pages

SQL

SQL

36 pages

Indexes

Indexes

35 pages

Security

Security

36 pages

Wrap-up

Wrap-up

6 pages

SQL

SQL

37 pages

More SQL

More SQL

48 pages

SQL

SQL

35 pages

XML

XML

46 pages

Triggers

Triggers

26 pages

Load more
Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?