Query OptimizationQuery Optimization Process (simplified a bit)Building BlocksKey Lessons in OptimizationOperations (revisited)Algebraic LawsSlide 7Slide 8Slide 9Query Rewrites: Sub-queriesThe Un-Nested QueryConverting Nested QueriesSlide 13Slide 14Slide 15Semi-Joins, Magic SetsRewrites: Magic SetsRewrites: SIPsSupporting ViewsAnd Finally…Rewrites: Group By and JoinSchema for Some ExamplesRewrites:Operation IntroductionQuery Rewriting: Predicate PushdownQuery Rewrites: Predicate Pushdown (through grouping)Query Rewrite: Predicate MovearoundSlide 27Slide 28Query Rewrite SummaryCost EstimationStatistics and CatalogsSize Estimation and Reduction FactorsHistogramsHistogramsSlide 35Slide 36Slide 37Plans for Single-Relation Queries (Prep for Join ordering)ExampleDetermining Join OrderingTypes of Join TreesSlide 42Slide 43ProblemDynamic ProgrammingSlide 46Slide 47Slide 48Slide 49Slide 50Motivating ExampleAlternative Plans 1Alternative Plans 2 With IndexesQuery OptimizationMarch 7th, 2003Query Optimization Process(simplified a bit)•Parse the SQL query into a logical tree:–identify distinct blocks (corresponding to nested sub-queries or views). •Query rewrite phase:–apply algebraic transformations to yield a cheaper plan.–Merge blocks and move predicates between blocks. •Optimize each block: join ordering.•Complete the optimization: select scheduling (pipelining strategy).Building Blocks•Algebraic transformations (many and wacky).•Statistical model: estimating costs and sizes.•Finding the best join trees:–Bottom-up (dynamic programming): System-R•Newer architectures:–Starburst: rewrite and then tree find–Volcano: all at once, top-down.Key Lessons in Optimization•There are many approaches and many details to consider in query optimization–Classic search/optimization problem!–Not completely solved yet!•Main points to take away are:–Algebraic rules and their use in transformations of queries.–Deciding on join ordering: System-R style (Selinger style) optimization.–Estimating cost of plans and sizes of intermediate results.Operations (revisited)•Scan ([index], table, predicate):–Either index scan or table scan.–Try to push down sargable predicates.•Selection (filter)•Projection (always need to go to the data?)•Joins: nested loop (indexed), sort-merge, hash, outer join.•Grouping and aggregation (usually the last).Algebraic Laws•Commutative and Associative Laws–R U S = S U R, R U (S U T) = (R U S) U T–R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T–R S = S R, R (S T) = (R S) T•Distributive Laws–R (S U T) = (R S) U (R T)Algebraic Laws•Laws involving selection:– C AND C’(R) = C( C’(R)) = C(R) ∩ C’(R)– C OR C’(R) = C(R) U C’(R)– C (R S) = C (R) S •When C involves only attributes of R– C (R – S) = C (R) – S– C (R U S) = C (R) U C (S)– C (R ∩ S) = C (R) ∩ SAlgebraic Laws•Example: R(A, B, C, D), S(E, F, G)– F=3 (R S) = ?– A=5 AND G=9 (R S) = ?D=ED=EAlgebraic Laws•Laws involving projections– M(R S) = N(P(R) Q(S))•Where N, P, Q are appropriate subsets of attributes of M– M(N(R)) = M,N(R)•Example R(A,B,C,D), S(E, F, G)– A,B,G(R S) = ? (?(R) ?(S)) D=ED=EQuery Rewrites: Sub-queriesSELECT Emp.NameFROM EmpWHERE Emp.Age < 30 AND Emp.Dept# IN (SELECT Dept.Dept# FROM Dept WHERE Dept.Loc = “Seattle” AND Emp.Emp#=Dept.Mgr)The Un-Nested QuerySELECT Emp.NameFROM Emp, DeptWHERE Emp.Age < 30 AND Emp.Dept#=Dept.Dept# AND Dept.Loc = “Seattle” AND Emp.Emp#=Dept.MgrConverting Nested QueriesSelect distinct x.name, x.makerFrom product xWhere x.color= “blue” AND x.price >= ALL (Select y.price From product y Where x.maker = y.maker AND y.color=“blue”)Select distinct x.name, x.makerFrom product xWhere x.color= “blue” AND x.price >= ALL (Select y.price From product y Where x.maker = y.maker AND y.color=“blue”)Converting Nested QueriesSelect distinct x.name, x.makerFrom product xWhere x.color= “blue” AND x.price < SOME (Select y.price From product y Where x.maker = y.maker AND y.color=“blue”)Select distinct x.name, x.makerFrom product xWhere x.color= “blue” AND x.price < SOME (Select y.price From product y Where x.maker = y.maker AND y.color=“blue”)Let’s compute the complement first:Converting Nested QueriesSelect distinct x.name, x.makerFrom product x, product yWhere x.color= “blue” AND x.maker = y.maker AND y.color=“blue” AND x.price < y.priceSelect distinct x.name, x.makerFrom product x, product yWhere x.color= “blue” AND x.maker = y.maker AND y.color=“blue” AND x.price < y.priceThis one becomes a SFW query:This returns exactly the products we DON’T want, so…Converting Nested Queries(Select x.name, x.maker From product x Where x.color = “blue”)EXCEPT(Select x.name, x.maker From product x, product y Where x.color= “blue” AND x.maker = y.maker AND y.color=“blue” AND x.price < y.price)(Select x.name, x.maker From product x Where x.color = “blue”)EXCEPT(Select x.name, x.maker From product x, product y Where x.color= “blue” AND x.maker = y.maker AND y.color=“blue” AND x.price < y.price)Semi-Joins, Magic Sets•You can’t always un-nest sub-queries (it’s tricky).•But you can often use a semi-join to reduce the computation cost of the inner query.•A magic set is a superset of the possible bindings in the result of the sub-query.•Also called “sideways information passing”.•Great idea; reinvented every few years on a regular basis.Rewrites: Magic SetsCreate View DepAvgSal AS (Select E.did, Avg(E.sal) as avgsal From Emp E Group By E.did)Select E.eid, E.salFrom Emp E, Dept D, DepAvgSal VWhere E.did=D.did AND D.did=V.did And E.age < 30 and D.budget > 100k And E.sal > V.avgsalRewrites: SIPsSelect
View Full Document