DOC PREVIEW
UMD CMSC 424 - Lecture 18 Query optimization

This preview shows page 1-2-3-4-5-6 out of 19 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CMSC 424 – Database designLecture 18Query optimizationMihai PopAdmin•More midterm solutions•Projects – do not be late!IntroductionIntroduction■Alternative ways of evaluating a given query●Equivalent expressions●Different algorithms for each operationIntroduction (Cont.)Introduction (Cont.)■An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.Introduction (Cont.)Introduction (Cont.)■Cost difference between evaluation plans for a query can be enormous●E.g. seconds vs. days in some cases■Steps in cost-based query optimization★Generate logically equivalent expressions using equivalence rules★Annotate resultant expressions to get alternative query plans★Choose the cheapest plan based on estimated cost■Estimation of plan cost based on:●Statistical information about relations. Examples:number of tuples, number of distinct values for an attribute●Statistics estimation for intermediate resultsto compute cost of complex expressions●Cost formulae for algorithms, computed using statisticsGenerating Equivalent ExpressionsTransformation of Relational ExpressionsTransformation of Relational Expressions■Two relational algebra expressions are said to be equivalent if the two expressions generate the same set of tuples on every legal database instance●Note: order of tuples is irrelevant■In SQL, inputs and outputs are multisets of tuples●Two expressions in the multiset version of the relational algebra are said to be equivalent if the two expressions generate the same multiset of tuples on every legal database instance. ■An equivalence rule says that expressions of two forms are equivalent●Can replace expression of first form by second, or vice versaEquivalence RulesEquivalence Rules1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.2. Selection operations are commutative.3. Only the last in a sequence of projection operations is needed, the others can be omitted.4. Selections can be combined with Cartesian products and theta joins.■σθ(E1 X E2) = E1 θ E2 ■σθ1(E1 θ2 E2) = E1 θ1∧ θ2 E2 ))(())((1221EEθθθθσσσσ=))(()(2121EEθθθθσσσ=∧)())))((((121EELLnLLΠ=ΠΠΠEquivalence Rules (Cont.)5. Theta-join operations (and natural joins) are commutative.E1 θ E2 = E2 θ E16. (a) Natural join operations are associative: (E1 E2) E3 = E1 (E2 E3)(b) Theta joins are associative in the following manner: (E1 θ1 E2) θ2∧ θ3 E3 = E1 θ1∧ θ3 (E2 θ2 E3) where θ2 involves attributes from only E2 and E3.Pictorial Depiction of Equivalence RulesPictorial Depiction of Equivalence RulesTransformation rules•Many more...•Read chapter 14!!!!Transformation Example: Pushing Selections•Query: Find the names of all customers who have an account at some branch located in Brooklyn.Πcustomer_name(σbranch_city = “Brooklyn”(branch (account depositor)))•Transformation using rule 7a. Πcustomer_name ((σbranch_city =“Brooklyn” (branch)) (account depositor))•Performing the selection as early as possible reduces the size of the relation to be joined.Example with Multiple Transformations•Query: Find the names of all customers with an account at a Brooklyn branch whose account balance is over $1000.Πcustomer_name((σbranch_city = “Brooklyn” ∧ balance > 1000 (branch (account depositor)))•Transformation using join associatively (Rule 6a):Πcustomer_name((σbranch_city = “Brooklyn” ∧ balance > 1000 (branch account)) depositor) •Second form provides an opportunity to apply the “perform selections early” rule, resulting in the subexpression σbranch_city = “Brooklyn” (branch) σ balance > 1000 (account)•Thus a sequence of transformations can be usefulMultiple Transformations (Cont.)Transformation Example: Pushing Projections•When we compute(σbranch_city = “Brooklyn” (branch) account )we obtain a relation whose schema is:(branch_name, branch_city, assets, account_number, balance)•Push projections using equivalence rules 8a and 8b; eliminate unneeded attributes from intermediate results to get: Πcustomer_name (( Πaccount_number ( (σbranch_city = “Brooklyn” (branch) account )) depositor )•Performing the projection as early as possible reduces the size of the relation to be joined. Πcustomer_name((σbranch_city = “Brooklyn” (branch) account) depositor)Join Ordering Example•For all relations r1, r2, and r3,(r1 r2) r3 = r1 (r2 r3 )(Join Associativity)•If r2 r3 is quite large and r1 r2 is small, we choose (r1 r2) r3 so that we compute and store a smaller temporary relation.Join Ordering Example (Cont.)•Consider the expressionΠcustomer_name ((σbranch_city = “Brooklyn” (branch)) (account depositor))•Could compute account depositor first, and join result with σbranch_city = “Brooklyn” (branch)but account depositor is likely to be a large relation.•Only a small fraction of the bank’s customers are likely to have accounts in branches located in Brooklyn– it is better to compute σbranch_city = “Brooklyn” (branch) account first.Enumeration of Equivalent Expressions•Query optimizers use equivalence rules to systemati cally generate expressions equivalent to the given expression•Can generate all equivalent expressions as follows: – Repeat•apply all applicable equivalence rules on every equivalent expression found so far•add newly generated expressions to the set of equivalent expressions Until no new equivalent expressions are generated above•The above approach is very expensive in space and time–Two approaches•Optimized plan generation based on transformation rules•Special case approach for queries with only selections, projections and joinsImplementing Transformation Based Optimization•Space requirements reduced by sharing common sub-expressions:–when E1 is generated from E2 by an equivalence rule, usually only the top level of the two are different, subtrees below are the same and can be shared using pointers•E.g. when applying join commutativity–Same sub-expression may get generated multiple times•Detect duplicate sub-expressions and share one copy•Time requirements are reduced by not generating all expressions–Dynamic programming•We will study only the


View Full Document

UMD CMSC 424 - Lecture 18 Query optimization

Documents in this Course
Lecture 2

Lecture 2

36 pages

Databases

Databases

44 pages

Load more
Download Lecture 18 Query optimization
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture 18 Query optimization and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture 18 Query optimization 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?