UW CSE 444 - Query Execution - D2940530

Home> Schools> University of Washington> Computer Science and Engineering (CSE) > CSE 444> Query Execution

DOC PREVIEW

UW CSE 444 - Query Execution

School name University of Washington

Course Cse 444- Database Internals

Pages 30

This preview shows page 1-2-14-15-29-30 out of 30 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 30 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 30 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 30 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 30 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 30 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 30 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 30 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

1Lecture 24:Query Execution Monday, November 27, 20062Outline• Query optimization: algebraic laws 16.23ExampleProduct(pname, maker), Company(cname, city)• How do we execute this query ?Select Product.pnameFrom Product, CompanyWhere Product.maker=Company.cnameand Company.city = “Seattle”Select Product.pnameFrom Product, CompanyWhere Product.maker=Company.cnameand Company.city = “Seattle”4ExampleProduct(pname, maker), Company(cname, city)Assume:Clustered index: Product.pname, Company.cnameUnclustered index: Product.maker, Company.city5σcity=“Seattle”Product(pname,maker)Company(cname,city)maker=cnameLogical Plan:6σcity=“Seattle”Product(pname,maker)Company(cname,city)cname=makerPhysical plan 1:Index-basedselectionIndex-basedjoin7σcity=“Seattle”Product(pname,maker)Company(cname,city)maker=cnamePhysical plans 2a and 2b:Index-scanMerge-joinScan and sort (2a)index scan (2b)Which one is better ??Which one is better ??8σcity=“Seattle”Product(pname,maker)Company(cname,city)cname=makerPhysical plan 1:Index-basedselectionIndex-basedjoinT(Company) / V(Company, city) × T(Product) / V(Product, maker)Total cost:T(Company) / V(Company, city) × T(Product) / V(Product, maker)Total cost:T(Company) / V(Company, city) × T(Product) / V(Product, maker)9σcity=“Seattle”Product(pname,maker)Company(cname,city)maker=cnamePhysical plans 2a and 2b:Table-scanMerge-joinScan and sort (2a)index scan (2b)B(Company) 3B(Product)T(Product)No extra cost(why ?)Total cost:(2a): 3B(Product) + B(Company) (2b): T(Product) + B(Company)Total cost:(2a): 3B(Product) + B(Company) (2b): T(Product) + B(Company)10Plan 1: T(Company)/V(Company,city) ×T(Product)/V(Product,maker)Plan 2a: B(Company) + 3B(Product)Plan 2b: B(Company) + T(Product)Plan 1: T(Company)/V(Company,city) ×T(Product)/V(Product,maker)Plan 2a: B(Company) + 3B(Product)Plan 2b: B(Company) + T(Product)Which one is better ??Which one is better ??It depends on the data !!It depends on the data !!11Example•Case 1: V(Company, city) ≈ T(Company) •Case 2: V(Company, city) << T(Company)T(Company) = 5,000 B(Company) = 500 M = 100T(Product) = 100,000 B(Product) = 1,000We may assume V(Product, maker) ≈ T(Company) (why ?)T(Company) = 5,000 B(Company) = 500 M = 100T(Product) = 100,000 B(Product) = 1,000We may assume V(Product, maker) ≈ T(Company) (why ?)V(Company,city) = 2,000V(Company,city) = 2,000V(Company,city) = 20V(Company,city) = 2012Which Plan is Best ?Plan 1: T(Company)/V(Company,city) × T(Product)/V(Product,maker)Plan 2a: B(Company) + 3B(Product)Plan 2b: B(Company) + T(Product)Plan 1: T(Company)/V(Company,city) × T(Product)/V(Product,maker)Plan 2a: B(Company) + 3B(Product)Plan 2b: B(Company) + T(Product)Case 1:Case 2:13Lessons• Need to consider several physical plan– even for one, simple logical plan• No magic “best” plan: depends on the data• In order to make the right choice– need to have statistics over the data– the B’s, the T’s, the V’s14Query Optimzation• Have a SQL query Q• Create a plan P• Find equivalent plans P = P’ = P’’ = …• Choose the “cheapest”. HOW ??15Logical Query PlanSELECT P.buyerFROM Purchase P, Person QWHERE P.buyer=Q.name ANDP.city=‘seattle’ ANDQ.phone > ‘5430000’SELECT P.buyerFROM Purchase P, Person QWHERE P.buyer=Q.name ANDP.city=‘seattle’ ANDQ.phone > ‘5430000’PurchasePersonBuyer=nameCity=‘seattle’ phone>’5430000’buyerσIn class:find a “better” plan P’P=Purchasse(buyer, city)Person(name, phone)16Logical Query PlanSELECT city, sum(quantity)FROM salesGROUP BY cityHAVING sum(quantity) < 100SELECT city, sum(quantity)FROM salesGROUP BY cityHAVING sum(quantity) < 100sales(product, city, quantity)γ city, sum(quantity)→pσ p < 100T1(city,p)T2(city,p)In class:find a “better” plan P’Q=P=17The three components of an optimizerWe need three things in an optimizer:• Algebraic laws• An optimization algorithm• A cost estimator18Algebraic Laws• Commutative and Associative LawsR ∪ S = S ∪ R, R ∪ (S ∪ T) = (R ∪ S) ∪ TR |×|S = S |×|R, R |×|(S |×|T) = (R |×|S) |×|TR |×|S = S |×|R, R |×|(S |×|T) = (R |×|S) |×|T• Distributive LawsR |×|(S ∪ T) = (R |×|S) ∪ (R |×|T)19Algebraic Laws• Laws involving selection:σC AND C’(R) = σC(σC’(R)) = σC(R) ∩σC’(R)σC OR C’(R) = σC(R) ∪σC’(R)σC (R |×|S) = σC (R) |×|S • When C involves only attributes of RσC (R – S) = σC (R) – SσC (R ∪ S) = σC (R) ∪σC (S)σC (R |×|S) = σC (R) |×|S20Algebraic Laws• Example: R(A, B, C, D), S(E, F, G)σF=3 (R |×|D=ES) = ?σA=5 AND G=9 (R |×|D=ES) = ?21Algebraic Laws• Laws involving projectionsΠM(R |×|S) = ΠM(ΠP(R) |×| ΠQ(S))ΠM(ΠN(R)) = ΠM,N(R)• Example R(A,B,C,D), S(E, F, G)ΠA,B,G(R |×|D=ES) = Π? (Π?(R) |×|D=EΠ?(S))22Algebraic Laws• Laws involving grouping and aggregation:δ(γA, agg(B)(R)) = γA, agg(B)(R)γA, agg(B)(δ(R)) = γA, agg(B)(R) if agg is “duplicate insensitive”• Which of the following are “duplicate insensitive” ?sum, count, avg, min, maxγA, agg(D)(R(A,B) |×|B=CS(C,D)) = γA, agg(D)(R(A,B) |×|B=C(γC, agg(D)S(C,D)))23Optimizations Based on SemijoinsTHIS IS ADVANCED STUFF; NOT ON THE FINAL• R S = ΠA1,…,An(R S)• Where the schemas are:– Input: R(A1,…An), S(B1,…,Bm)– Output: T(A1,…,An)24Optimizations Based on SemijoinsSemijoins: a bit of theory (see [AHV])• Given a query:• A full reducer for Q is a program:• Such that no dangling tuples remain in any relationQ :- Π (σ (R1 |x| R2|x| . . . |x| Rn))Q :- Π (σ (R1 |x| R2|x| . . . |x| Rn))Ri1:= Ri1Rj1Ri2:= Ri2Rj2. . . . .Rip:= RipRjpRi1:= Ri1Rj1Ri2:= Ri2Rj2. . . . .Rip:= RipRjp25Optimizations Based on Semijoins•Example:• A full reducer is:Q(A,E) :- R1(A,B) |x| R2(B,C) |x| R3(C,D,E)Q(A,E) :- R1(A,B) |x| R2(B,C) |x| R3(C,D,E)R2(B,C) := R2(B,C) |x R1(A,B)R3(C,D,E) := R3(C,D,E) |x R2(B,C)R2(B,C) := R2(B,C) |x R3(C,D,E)R1(A,B) := R1(A,B) |x R2(B,C)R2(B,C) := R2(B,C) |x R1(A,B)R3(C,D,E) := R3(C,D,E) |x R2(B,C)R2(B,C) := R2(B,C) |x R3(C,D,E)R1(A,B) := R1(A,B) |x R2(B,C)The new tables have only the tuples necessary to compute Q(E)26Optimizations Based on Semijoins• Example: • Doesn’t have a full reducer (we can reduce forever)Theorem a query has a full reducer iff it is “acyclic”Q(E) :- R1(A,B) |x| R2(B,C) |x|

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-14-15-29-30 out of 30 pages.

UW CSE 444 - Query Execution

Sign up for free to view:

Please select your school