1Chapters 15-16a 1(Slides by Hector Garcia-Molina,http://www-db.stanford.edu/~hector/cs245/notes.htm)Chapters 15 and 16:Query ProcessingChapters 15-16a 2Query ProcessingQ → Query PlanFocus: Relational SystemChapters 15-16a 3ExampleSelect B,DFrom R,SWhere R.A = “c” AND S.E = 2 AND R.C=S.C2Chapters 15-16a 4 R: A B C S: C D Ea 1 10 10 x 2b 1 20 20 y 2c 2 10 30 z 2d 2 35 40 x 1e 3 45 50 y 3Answer B D2 xChapters 15-16a 5• How do we execute query?- Do Cartesian product- Select tuples- Do projectionOne ideaChapters 15-16a 6RXS R.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 . . C 2 10 10 x 2 . .Bingo!Got one...3Chapters 15-16a 7Relational Algebra - can be used to describe plans...Ex: Plan IΠB,D σR.A=“c”∧ S.E=2 ∧ R.C=S.C XR SOR: ΠB,D [ σR.A=“c”∧ S.E=2 ∧ R.C = S.C (RXS)]Chapters 15-16a 8Another idea:ΠB,D σR.A = “c” σS.E = 2R SPlan II natural joinChapters 15-16a 9 R SA B C σ (R) σ(S) C D Ea 1 10 A B C C D E 10 x 2b 1 20 c 2 10 10 x 2 20 y 2c 2 10 20 y 2 30 z 2d 2 35 30 z 2 40 x 1e 3 45 50 y 34Chapters 15-16a 10Plan IIIUse R.A and S.C Indexes(1) Use R.A index to select R tuples with R.A = “c”(2) For each R.C value found, use S.C index to find matching tuples(3) Eliminate S tuples S.E ≠ 2(4) Join matching R,S tuples, project B,D attributes and place in resultChapters 15-16a 11 R SA B C C D Ea 1 10 10 x 2b 1 20 20 y 2c 2 10 30 z 2d 2 35 40 x 1e 3 45 50 y 3A CI1 I2=“c”<c,2,10><10,x,2>check=2?output: <2,x>next tuple:<c,7,15>Chapters 15-16a 12Overview of Query Optimization5Chapters 15-16a 13parseconvertapply lawsestimate result sizesconsider physical plansestimate costspick bestexecute{P1,P2,…..}{(P1,C1),(P2,C2)...}Pi answerSQL queryparse treelogical query plan“improved” l.q.pl.q.p. +sizesstatisticsChapters 15-16a 14Example: SQL querySELECT titleFROM StarsInWHERE starName IN (SELECT nameFROM MovieStarWHERE birthdate LIKE ‘%1960’);(Find the movies with stars born in 1960)Chapters 15-16a 15Example: Parse Tree<Query><SFW>SELECT <SelList> FROM <FromList> WHERE <Condition><Attribute> <RelName> <Tuple> IN <Query>title StarsIn <Attribute> ( <Query> )starName <SFW>SELECT <SelList> FROM <FromList> WHERE <Condition><Attribute> <RelName> <Attribute> LIKE <Pattern>name MovieStar birthDate ‘%1960’6Chapters 15-16a 16Example: Generating Relational AlgebraΠtitleσStarsIn <condition><tuple> IN Πname<attribute> σbirthdate LIKE ‘%1960’starName MovieStarAn expression using a two-argument σ, midway between a parse tree andrelational algebraChapters 15-16a 17Example: Logical Query PlanΠtitleσstarName=nameStarsIn Πname σbirthdate LIKE ‘%1960’ MovieStarApplying the rule for IN conditions×Chapters 15-16a 18Example: Improved Logical Query PlanΠtitlestarName=nameStarsIn Πname σbirthdate LIKE ‘%1960’ MovieStarFig. 7.20: An improvement on fig. 7.18.Question:Push project toStarsIn?7Chapters 15-16a 19Example: Estimate Result Sizes Need expected size StarsInMovieStarΠσChapters 15-16a 20Example: One Physical Plan Parameters: join order, memory size, ...Hash joinSEQ scan index scanParameters:Select Condition,...StarsIn MovieStarChapters 15-16a 21Example: Estimate costsL.Q.PP1 P2 …. PnC1 C2 …. CnPick best!8Chapters 15-16a 22Textbook outlineChapter 5: Algebra for queries[bags vs sets]- Select, project, join, ….[project lista,a+b->x,…]- Duplicate elimination, grouping, sortingChapter 15:- Physical operators- Scan,sort, …- Implementing operators and estimating their costChapters 15-16a 23Chapter 16:- Parsing- Algebraic laws- Parse tree -> logical query plan- Estimating result sizes- Cost based optimizationChapters 15-16a 24Query Optimization• Relational algebra level• Detailed query plan level– Estimate Costs• without indexes• with indexes– Generate and compare plans9Chapters 15-16a 25Relational algebra optimization• Transformation rules(preserve equivalence)• What are good transformations?Chapters 15-16a 26R x S = S x R(R x S) x T = R x (S x T)R U S = S U RR U (S U T) = (R U S) U TRules: Natural joins & cross products & unionR S = S R(R S) T = R (S T)Chapters 15-16a 27Note:• Can also write as trees, e.g.: T RR S S T10Chapters 15-16a 28Rules: Selectsσp1∧p2(R) =σp1vp2(R) =σp1 [ σp2 (R)][ σp1 (R)] U [ σp2 (R)]Chapters 15-16a 29Let p = predicate with only R attribs q = predicate with only S attribsσp (R S) =σq (R S) =Rules: σ + combined [σp (R)] S R [σq (S)]Chapters 15-16a 30σp1∧p2 (R) → σp1 [σp2 (R)]σp (R S) → [σp (R)] SR S → S RWhich are “good” transformations?11Chapters 15-16a 31Conventional wisdom:do projects earlyExample: R(A,B,C,D,E) x={E} P: (A=3) ∧ (B=“cat”)πx {σp (R)} vs. πE {σp{πABE(R)}}Chapters 15-16a 32What if we have A, B indexes?B = “cat” A=3Intersect pointers to getpointers to matching tuplesButChapters 15-16a 33Bottom line:• No transformation is always good• Usually good: early selections12Chapters 15-16a 34In textbook: more transformations• Eliminate common sub-expressions• Other operations: duplicate eliminationChapters 15-16a 35Outline - Query Processing• Relational algebra level– transformations– good transformations• Detailed query plan level– estimate costs– generate and compare plansChapters 15-16a 36• Estimating cost of query plan(1) Estimating size of results(2) Estimating # of IOs13Chapters 15-16a 37Estimating result size• Keep statistics for relation R– T(R) : # tuples in R– S(R) : # of bytes in each R tuple– B(R): # of blocks to hold all R tuples– V(R, A) : # distinct values in Rfor attribute AChapters 15-16a 38Example R A: 20 byte stringB: 4 byte integerC: 8 byte dateD: 5 byte stringA B C Dcat 1 10 acat 1 20 bdog 1 30 adog 1 40 cbat 1 50 dT(R) = 5
View Full Document