UW CSE 444 - Query Optimization- Transformations - D1735045

Home> Schools> University of Washington> Computer Science and Engineering (CSE) > CSE 444> Query Optimization- Transformations

DOC PREVIEW

UW CSE 444 - Query Optimization- Transformations

School name University of Washington

Course Cse 444- Database Internals

Pages 27

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Query Optimization: TransformationsSchema for Some ExamplesMotivating ExampleAlternative Plans 1Alternative Plans 2 With IndexesBuilding Blocks for OptimizationQuery Optimization Process (simplified a bit)Key Lessons in OptimizationOperations (revisited)Algebraic LawsSlide 11Slide 12Slide 13Query Rewrites: Sub-queriesThe Un-Nested QueryConverting Nested QueriesSlide 17Slide 18Slide 19Rewrites: Group By and JoinRewrites:Operation IntroductionQuery Rewriting: Predicate PushdownQuery Rewrites: Predicate Pushdown (through grouping)Query Rewrite: Predicate MovearoundSlide 25Slide 26Query Rewrite SummaryQuery Optimization: TransformationsMay 29th, 2002Schema for Some Examples•Reserves:–Each tuple is 40 bytes long, 100 tuples per page, 1000 pages (4000 tuples)•Sailors:–Each tuple is 50 bytes long, 80 tuples per page, 500 pages (4000 tuples). Sailors (sid: integer, sname: string, rating: integer, age: real)Reserves (sid: integer, bid: integer, day: dates, rname: string)Motivating Example•Cost: 500+500*1000 I/Os•By no means the worst plan! •Misses several opportunities: selections could have been `pushed’ earlier, no use is made of any available indexes, etc.•Goal of optimization: To find more efficient plans that compute the same answer. SELECT S.snameFROM Reserves R, Sailors SWHERE R.sid=S.sid AND R.bid=100 AND S.rating>5ReservesSailorssid=sidbid=100 rating > 5snameReservesSailorssid=sidbid=100 rating > 5sname(Simple Nested Loops)(On-the-fly)(On-the-fly)RA Tree:Plan:Alternative Plans 1 •Main difference: push selects.•With 5 buffers, cost of plan:–Scan Reserves (1000) + write temp T1 (10 pages, if we have 100 boats, uniform distribution).–Scan Sailors (500) + write temp T2 (250 pages, if we have 10 ratings).–Sort T1 (2*2*10), sort T2 (2*3*250), merge (10+250), total=1800 –Total: 3560 page I/Os.•If we used BNL join, join cost = 10+4*250, total cost = 2770.•If we `push’ projections, T1 has only sid, T2 only sid and sname:–T1 fits in 3 pages, cost of BNL drops to under 250 pages, total < 2000.ReservesSailorssid=sidbid=100 sname(On-the-fly)rating > 5(Scan;write to temp T1)(Scan;write totemp T2)(Sort-Merge Join)Alternative Plans 2With Indexes•With clustered index on bid of Reserves, we get 100,000/100 = 1000 tuples on 1000/100 = 10 pages.•INL with pipelining (outer is not materialized). Decision not to push rating>5 before the join is based on availability of sid index on Sailors. Cost: Selection of Reserves tuples (10 I/Os); for each, must get matching Sailors tuple (1000*1.2); total 1210 I/Os. Join column sid is a key for Sailors.–At most one matching tuple, unclustered index on sid OK.ReservesSailorssid=sidbid=100 sname(On-the-fly)rating > 5(Use hashindex; donot writeresult to temp)(Index Nested Loops,with pipelining )(On-the-fly)Building Blocks for Optimization•Algebraic transformations (many and wacky).•Statistical model: estimating costs and sizes.•Finding the best join trees:–Bottom-up (dynamic programming): System-R•Newer architectures:–Starburst: rewrite and then tree find–Volcano: all at once, top-down.Query Optimization Process(simplified a bit)•Parse the SQL query into a logical tree:–identify distinct blocks (corresponding to nested sub-queries or views). •Query rewrite phase:–apply algebraic transformations to yield a cheaper plan.–Merge blocks and move predicates between blocks. •Optimize each block: join ordering.•Complete the optimization: select scheduling (pipelining strategy).Key Lessons in Optimization•There are many approaches and many details to consider in query optimization–Classic search/optimization problem!–Not completely solved yet!•Main points to take away are:–Algebraic rules and their use in transformations of queries.–Deciding on join ordering: System-R style (Selinger style) optimization.–Estimating cost of plans and sizes of intermediate results.Operations (revisited)•Scan ([index], table, predicate):–Either index scan or table scan.–Try to push down sargable predicates.•Selection (filter)•Projection (always need to go to the data?)•Joins: nested loop (indexed), sort-merge, hash, outer join.•Grouping and aggregation (usually the last).Algebraic Laws•Commutative and Associative Laws–R U S = S U R, R U (S U T) = (R U S) U T–R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T–R S = S R, R (S T) = (R S) T•Distributive Laws–R (S U T) = (R S) U (R T)Algebraic Laws•Laws involving selection:–  C AND C’(R) =  C( C’(R)) =  C(R) ∩  C’(R)–  C OR C’(R) =  C(R) U  C’(R)–  C (R S) =  C (R) S •When C involves only attributes of R–  C (R – S) =  C (R) – S–  C (R U S) =  C (R) U  C (S)–  C (R ∩ S) =  C (R) ∩ SAlgebraic Laws•Example: R(A, B, C, D), S(E, F, G)–  F=3 (R S) = ?–  A=5 AND G=9 (R S) = ?D=ED=EAlgebraic Laws•Laws involving projections– M(R S) = N(P(R) Q(S))•Where N, P, Q are appropriate subsets of attributes of M– M(N(R)) = M,N(R)•Example R(A,B,C,D), S(E, F, G)– A,B,G(R S) =  ? (?(R) ?(S)) D=ED=EQuery Rewrites: Sub-queriesSELECT Emp.NameFROM EmpWHERE Emp.Age < 30 AND Emp.Dept# IN (SELECT Dept.Dept# FROM Dept WHERE Dept.Loc = “Seattle” AND Emp.Emp#=Dept.Mgr)The Un-Nested QuerySELECT Emp.NameFROM Emp, DeptWHERE Emp.Age < 30 AND Emp.Dept#=Dept.Dept# AND Dept.Loc = “Seattle” AND Emp.Emp#=Dept.MgrConverting Nested QueriesSelect distinct x.name, x.makerFrom product xWhere x.color= “blue” AND x.price >= ALL (Select y.price From product y Where x.maker = y.maker AND y.color=“blue”)Select distinct x.name, x.makerFrom product xWhere x.color= “blue” AND x.price >= ALL (Select y.price From product y Where x.maker = y.maker AND y.color=“blue”)How do we convert this one to logical plan ?Converting Nested QueriesSelect distinct x.name, x.makerFrom product xWhere x.color= “blue” AND x.price < SOME (Select y.price

View Full Document