DOC PREVIEW
UW CSE 444 - Query Optimization

This preview shows page 1-2-3 out of 8 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Outline Search space Lecture 22 Query Optimization 2 Algorithms for enumerating query plans Friday November 19 2010 Estimating the cost of a query plan 1 Key Decisions Dan Suciu 444 Spring 2010 2 Key Decisions Logical plan What logical plans do we consider leftdeep bushy Search Space Which algebraic laws do we apply and in which context s Optimization rules In what order do we explore the search space Optimization algorithm Physical plan What physical operators to use What access paths to use file scan or index 3 Optimizers 4 The Search Space Heuristic based optimizers Complete plans Apply greedily rules that always improve Typically push selections down Bottom up plans Very limited no longer used today Cost based optimizers Top down plans Use a cost model to estimate the cost of each plan Select the cheapest plan Dan Suciu 444 Spring 2010 5 Dan Suciu 444 Spring 2010 6 1 Complete Plans R A B S B C T C D Bottom up Partial Plans SELECT FROM R S T WHERE R B S B and S C T C and R A 40 A 40 T A 40 S R Why is this search space inefficient S R 7 SELECT FROM R S WHERE R B S B and R A 40 A 40 T SELECT FROM R WHERE R A 40 T S S A 40 T R S R A 40 S T S R 8 Dynamic programming in class Classical algorithm 1979 Limited to joins join reordering algorithm Bottom up Rule based algorithm will not discuss SELECT R A T D FROM R S T WHERE R B S B and S C T C 9 Dynamic Programming Database of rules algebraic laws Usually dynamic programming Usually top down Dan Suciu 444 Spring 2010 10 Dynamic Programming Originally proposed in System R 1979 Only handles single block queries Search space join trees SELECT list FROM R1 Rn WHERE cond1 AND cond2 AND AND condk Algebraic laws commutativity associativity Algorithm dynamic programming Heuristics selections down projections up Dan Suciu 444 Spring 2010 Plan Enumeration Algorithms SELECT FROM R S T WHERE R B S B and S C T C and R A 40 Why is this better R Top down Partial Plans R A B S B C T C D SELECT FROM R S T WHERE R B S B and S C T C and R A 40 A 40 T Dan Suciu 444 Spring 2010 R A B S B C T C D 11 Dan Suciu 444 Spring 2010 12 2 Join Trees Types of Join Trees R1 R2 Rn Join tree Left deep R4 R3 R1 R2 R2 R4 A plan a join tree A partial plan a subtree of a join tree R5 R3 Dan Suciu 444 Spring 2010 R1 Dan Suciu 444 Spring 2010 13 Types of Join Trees 14 Types of Join Trees Bushy Right deep R3 R1 R3 R2 R5 R4 R2 R1 R4 R5 Dan Suciu 444 Spring 2010 15 SELECT list FROM R1 Rn WHERE cond1 AND cond2 AND AND condk Dynamic Programming Dan Suciu 444 Spring 2010 16 SELECT list FROM R1 Rn WHERE cond1 AND cond2 AND AND condk Dynamic Programming Join ordering For each subquery Q R1 Rn compute the following Size Q the estimated size of Q Plan Q a best plan for Q Cost Q the estimated cost of that plan Given a query R1 R2 Rn Find optimal order Assume we have a function cost that gives us the cost of every join tree Dan Suciu 444 Spring 2010 17 Dan Suciu 444 Spring 2010 18 3 SELECT list FROM R1 Rn WHERE cond1 AND cond2 AND AND condk SELECT list FROM R1 Rn WHERE cond1 AND cond2 AND AND condk Dynamic Programming Dynamic Programming Step 1 For each Ri set Step 2 For each Q R1 Rn involving i relations What s a reasonable Size Ri B Ri Plan Ri Ri Cost Ri cost of scanning Ri Size Q estimate it recursively estimate For every pair of subqueries Q Q s t Q Q Q compute cost Plan Q Plan Q Cost Q the smallest such cost Plan Q the corresponding plan Dan Suciu 444 Spring 2010 Dan Suciu 444 Spring 2010 19 20 SELECT list FROM R1 Rn WHERE cond1 AND cond2 AND AND condk Dynamic Programming Example To illustrate ad hoc cost model from the book Step 3 Return Plan R1 Rn Cost P1 P2 Cost P1 Cost P2 size intermediate results for P1 P2 Cost of a scan 0 Dan Suciu 444 Spring 2010 Dan Suciu 444 Spring 2010 21 SELECT FROM R S T U WHERE cond1 AND cond2 AND Example Subquery T R 2000 T S 5000 T T 3000 T U 1000 R S T U Assumptions Size Cost 22 Plan RS RT RU ST SU All join selectivities 1 T R 2000 T S 5000 T T 3000 T U 1000 TU T R S 0 01 T R T S T S T 0 01 T S T T etc RST RSU RTU STU Dan Suciu 444 Spring 2010 23 RSTU 24 4 T R 2000 T S 5000 T T 3000 T U 1000 Subquery Size Cost Plan RS 100k 0 RS RT 60k 0 RT RU 20k 0 RU ST 150k 0 ST SU 50k 0 SU TU 30k 0 TU RST 3M 60k RT S RSU 1M 20k RU S RTU 0 6M 20k RU T STU 1 5M 30k TU S RSTU 30M 60k 50k 110k RT SU Reducing the Search Space Restriction 1 only left linear trees no bushy Why Restriction 2 no trees with cartesian product R A B S B C T C D Plan R A B T C D S B C has a cartesian product Most query optimizers will not consider it 25 Dynamic Programming Summary Dan Suciu 444 Spring 2010 26 Rule Based Optimizers Extensible collection of rules Handles only join queries Rule Algebraic law with a direction Selections are pushed down i e early Projections are pulled up i e late Algorithm for firing these rules Generate many alternative plans in some order Prune by cost Takes exponential time in general BUT Left linear joins may reduce time Non cartesian products may reduce time further Volcano later SQL Sever Starburst later DB2 Dan Suciu 444 Spring 2010 27 Completing the Physical Query Plan 28 Access Path Selection Access path a way to retrieve tuples from a table Choose algorithm for each operator A file scan An index plus a matching selection condition How much memory do we have Are the input operand s sorted Index matches selection condition if it can be used to retrieve just tuples that satisfy the condition Access path selection for base tables Decide for each intermediate result Example Supplier sid sname scity sstate B tree index on scity sstate matches scity Seattle does not match sid 3 does not match sstate WA To materialize To pipeline Dan Suciu 444 Spring 2010 Dan Suciu 444 Spring 2010 29 Dan Suciu 444 Spring 2010 30 5 Access Path Selection Access Path Selectivity Access path …


View Full Document

UW CSE 444 - Query Optimization

Documents in this Course
XML

XML

48 pages

SQL

SQL

25 pages

SQL

SQL

42 pages

Recovery

Recovery

30 pages

SQL

SQL

36 pages

Indexes

Indexes

35 pages

Security

Security

36 pages

Wrap-up

Wrap-up

6 pages

SQL

SQL

37 pages

More SQL

More SQL

48 pages

SQL

SQL

35 pages

XML

XML

46 pages

Triggers

Triggers

26 pages

Load more
Download Query Optimization
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Query Optimization and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Query Optimization 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?