
Chapter 14: Query Optimization

Amol Deshpande (adapted from the slides at db-book.com)
Database System Concepts, 5th Ed. Silberschatz, Korth and Sudarshan. See www.db-book.com for conditions on re-use.


Query Planning/Optimization

Generation of query-evaluation plans for an expression involves several steps:
1. Generating logically equivalent expressions using equivalence rules.
2. Annotating resultant expressions to get alternative query plans.
3. Choosing the cheapest plan based on estimated cost.
The overall process is called cost-based optimization.

Database System Concepts, 5th Edition, Aug 27, 2005. Silberschatz, Korth and Sudarshan


Transformation of Relational Expressions

Two relational algebra expressions are said to be equivalent if, on every legal database instance, the two expressions generate the same set of tuples.
Note: order of tuples is irrelevant.


Equivalence Rules

1. Conjunctive selection operations can be deconstructed into a sequence of individual selections:
   $\sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E))$
2. Selection operations are commutative:
   $\sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E))$
3. Only the last in a sequence of projection operations is needed; the others can be omitted:
   $\Pi_{L_1}(\Pi_{L_2}(\ldots(\Pi_{L_n}(E))\ldots)) = \Pi_{L_1}(E)$
4. Selections can be combined with Cartesian products and theta joins:
   a. $\sigma_{\theta}(E_1 \times E_2) = E_1 \bowtie_{\theta} E_2$
   b. $\sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) = E_1 \bowtie_{\theta_1 \wedge \theta_2} E_2$


Equivalence Rules (Cont.)

5. Theta-join operations (and natural joins) are commutative:
   $E_1 \bowtie_{\theta} E_2 = E_2 \bowtie_{\theta} E_1$
6. a. Natural join operations are associative:
      $(E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3)$
   b. Theta joins are associative in the following manner:
      $(E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \wedge \theta_3} E_3 = E_1 \bowtie_{\theta_1 \wedge \theta_3} (E_2 \bowtie_{\theta_2} E_3)$
      where $\theta_2$ involves attributes from only $E_2$ and $E_3$.


Pictorial Depiction of Equivalence Rules

[Figure: pictorial depiction of equivalence rules]


Equivalence Rules (Cont.)

7. The selection operation distributes over the theta-join operation under the following two conditions:
   a. When all the attributes in $\theta_0$ involve only the attributes of one of the expressions ($E_1$) being joined:
      $\sigma_{\theta_0}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_0}(E_1)) \bowtie_{\theta} E_2$
   b. When $\theta_1$ involves only the attributes of $E_1$ and $\theta_2$ involves only the attributes of $E_2$:
      $\sigma_{\theta_1 \wedge \theta_2}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} (\sigma_{\theta_2}(E_2))$


Equivalence Rules (Cont.)

8. The projection operation distributes over the theta-join operation as follows:
   a. If $\theta$ involves only attributes from $L_1 \cup L_2$:
      $\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) = (\Pi_{L_1}(E_1)) \bowtie_{\theta} (\Pi_{L_2}(E_2))$
   b. Consider a join $E_1 \bowtie_{\theta} E_2$.
      Let $L_1$ and $L_2$ be sets of attributes from $E_1$ and $E_2$, respectively.
      Let $L_3$ be attributes of $E_1$ that are involved in join condition $\theta$ but are not in $L_1 \cup L_2$, and
      let $L_4$ be attributes of $E_2$ that are involved in join condition $\theta$ but are not in $L_1 \cup L_2$. Then:
      $\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) = \Pi_{L_1 \cup L_2}((\Pi_{L_1 \cup L_3}(E_1)) \bowtie_{\theta} (\Pi_{L_2 \cup L_4}(E_2)))$


Equivalence Rules (Cont.)

9. The set operations union and intersection are commutative:
   $E_1 \cup E_2 = E_2 \cup E_1$
   $E_1 \cap E_2 = E_2 \cap E_1$
   (Set difference is not commutative.)
10. Set union and intersection are associative:
   $(E_1 \cup E_2) \cup E_3 = E_1 \cup (E_2 \cup E_3)$
   $(E_1 \cap E_2) \cap E_3 = E_1 \cap (E_2 \cap E_3)$
11. The selection operation distributes over $\cup$, $\cap$ and $-$:
   $\sigma_{\theta}(E_1 - E_2) = \sigma_{\theta}(E_1) - \sigma_{\theta}(E_2)$
   and similarly for $\cup$ and $\cap$ in place of $-$.
   Also: $\sigma_{\theta}(E_1 - E_2) = \sigma_{\theta}(E_1) - E_2$
   and similarly for $\cap$ in place of $-$, but not for $\cup$.
12. The projection operation distributes over union:
   $\Pi_{L}(E_1 \cup E_2) = (\Pi_{L}(E_1)) \cup (\Pi_{L}(E_2))$


Transformation Example

Query: Find the names of all customers who have an account at some branch located in Brooklyn.
   $\Pi_{customer\_name}(\sigma_{branch\_city = \text{"Brooklyn"}}(branch \bowtie (account \bowtie depositor)))$
Transformation using rule 7a:
   $\Pi_{customer\_name}((\sigma_{branch\_city = \text{"Brooklyn"}}(branch)) \bowtie (account \bowtie depositor))$
Performing the selection as early as possible reduces the size of the relation to be joined.


Example with Multiple Transformations

Query: Find the names of all customers with an account at a Brooklyn branch whose account balance is over 1000.
   $\Pi_{customer\_name}(\sigma_{branch\_city = \text{"Brooklyn"} \wedge balance > 1000}(branch \bowtie (account \bowtie depositor)))$
Transformation using join associativity (Rule 6a):
   $\Pi_{customer\_name}((\sigma_{branch\_city = \text{"Brooklyn"} \wedge balance > 1000}(branch \bowtie account)) \bowtie depositor)$
The second form provides an opportunity to apply the "perform selections early" rule (7b), resulting in the subexpression
   $\sigma_{branch\_city = \text{"Brooklyn"}}(branch) \bowtie \sigma_{balance > 1000}(account)$
Thus a sequence of transformations can be useful.


Multiple Transformations (Cont.)

[Figure: expression trees for the multiple-transformations example]


Join Ordering Example

For all relations $r_1$, $r_2$, and $r_3$:
   $(r_1 \bowtie r_2) \bowtie r_3 = r_1 \bowtie (r_2 \bowtie r_3)$
If $r_2 \bowtie r_3$ is quite large and $r_1 \bowtie r_2$ is small, we choose
   $(r_1 \bowtie r_2) \bowtie r_3$
so that we compute and store a smaller temporary relation.


Cost Estimation

Cost of each operator is computed as described in Chapter 13.
   Need statistics of input relations, e.g., number of tuples, sizes of tuples.
Inputs can be results of sub-expressions.
   Need to estimate statistics of expression results.
   To do so, we require additional statistics, e.g., number of distinct values for an attribute.
More on cost estimation later.


Statistical Information for Cost Estimation

$n_r$: number of tuples in a relation $r$.
$b_r$: number of blocks containing tuples of $r$.
$l_r$: size of a tuple of $r$.
$f_r$: blocking factor of $r$, i.e., the number of tuples of $r$ that fit into one block.
$V(A, r)$: number of distinct values that appear in $r$ for attribute $A$; same as the size of $\Pi_A(r)$.
If tuples of $r$ are stored together physically in a file, then:
   $b_r = \lceil n_r / f_r \rceil$


Histograms

[Figure: histogram on attribute age of relation person]
Equi-width histograms
Equi-depth histograms

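
The slides only name the two histogram types, so here is a minimal sketch of how they differ. The function names and the sample ages are illustrative assumptions, not taken from the slides. An equi-width histogram divides the value range into intervals of equal size and counts the tuples in each; an equi-depth histogram picks the boundaries so that every bucket holds roughly the same number of tuples.

```python
from typing import List, Tuple


def equi_width(values: List[int], buckets: int) -> List[Tuple[float, float, int]]:
    """Return (low, high, count) per bucket; every bucket spans the same width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bucket
        else:
            # The maximum value is placed in the last bucket, not a new one.
            idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i]) for i in range(buckets)]


def equi_depth(values: List[int], buckets: int) -> List[Tuple[int, int, int]]:
    """Return (low, high, count) per bucket; buckets hold about equally many values."""
    ordered = sorted(values)
    n = len(ordered)
    result = []
    for i in range(buckets):
        chunk = ordered[i * n // buckets:(i + 1) * n // buckets]
        if chunk:  # skip empty buckets when there are fewer values than buckets
            result.append((chunk[0], chunk[-1], len(chunk)))
    return result


# Hypothetical sample of the "age" attribute of relation "person".
ages = [18, 19, 20, 21, 22, 23, 25, 30, 41, 58]
print(equi_width(ages, 4))
print(equi_depth(ages, 5))
```

On skewed data such as this sample, the equi-width buckets have very uneven counts, while the equi-depth buckets have equal counts but uneven ranges.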

Selection Size Estimation

$\sigma_{A = v}(r)$
   $n_r / V(A, r)$: number of records that will satisfy the selection.
   Equality condition on a key attribute: size estimate = 1.
$\sigma_{A \le v}(r)$ (the case of $\sigma_{A \ge v}(r)$ is symmetric)
   Let $c$ denote the estimated number of tuples satisfying the condition.
   If $\min(A, r)$ and $\max(A, r)$ are available in the catalog:
      $c = 0$ if $v < \min(A, r)$
      $c = n_r \cdot \dfrac{v - \min(A, r)}{\max(A, r) - \min(A, r)}$ otherwise
   If histograms are available, the above estimate can be refined.
   In the absence of statistical information, $c$ is assumed to be $n_r / 2$.
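
The selection size estimates above, together with the block-count relation from the statistics slide, can be sketched in a few lines of Python. This is a minimal sketch, not the textbook's implementation: the `ColumnStats` class, the function names, and the example numbers are hypothetical, and returning $n_r$ when $v \ge \max(A, r)$ is an added assumption that the slide does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical catalog entry for attribute A of relation r."""
    n_r: int                          # number of tuples in r
    distinct: int                     # V(A, r)
    min_val: Optional[float] = None   # min(A, r), if kept in the catalog
    max_val: Optional[float] = None   # max(A, r), if kept in the catalog
    is_key: bool = False


def blocks_needed(n_r: int, f_r: int) -> int:
    """b_r = ceil(n_r / f_r), computed with integer arithmetic."""
    return (n_r + f_r - 1) // f_r


def estimate_equality(stats: ColumnStats) -> float:
    """Estimated size of sigma_{A = v}(r)."""
    if stats.is_key:
        return 1.0
    return stats.n_r / stats.distinct


def estimate_less_equal(stats: ColumnStats, v: float) -> float:
    """Estimated size of sigma_{A <= v}(r)."""
    if stats.min_val is None or stats.max_val is None:
        return stats.n_r / 2          # no statistics: assume half the tuples
    if v < stats.min_val:
        return 0.0
    if v >= stats.max_val:            # assumption: clamp to n_r (not on the slide)
        return float(stats.n_r)
    return stats.n_r * (v - stats.min_val) / (stats.max_val - stats.min_val)


# Hypothetical statistics for a "balance" attribute.
balance = ColumnStats(n_r=10_000, distinct=50, min_val=0, max_val=5_000)
print(estimate_equality(balance))            # 10000 / 50
print(estimate_less_equal(balance, 1_000))   # 10000 * 1000 / 5000
print(blocks_needed(10_000, 25))
```

The linear interpolation assumes values are uniformly distributed between the minimum and maximum, which is why a histogram, when available, gives a better estimate.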



UMD CMSC 424 - Chapter 14: Query Optimization
