Introduction to Database SystemsCSE 444CSE 444Lecture 3: SQL (part 2)CSE 444 - Summer 2010 1Outline• Aggregations (6.4.3 – 6.4.6)•Examples, examples, examples…ppp• Nulls (6.1.6 - 6.1.7)• Outer joins (6.3.8)j()2CSE 444 - Summer 2010AggregationSELECT count(*)FROM ProductWHEREyear > 1995SELECT avg(price)FROM ProductWHEREmaker=‘Toyota’WHEREyear > 1995WHERE maker ToyotaSQL supports several aggregation operations:SQL supports several aggregation operations:sum, count, min, max, avgExcept count, all aggregations apply to a single attribute3Except count, all aggregations apply to a single attributeCSE 444 - Summer 2010Aggregation: CountCOUNT applies to duplicates, unless otherwise stated:Ct(*)SELECT Count(category) FROM ProductWHEREyear > 1995same as Count(*)yWe probably want:SELECT Count(DISTINCT category)FROM Product4WHERE year > 1995More ExamplesPurchase(product, date, price, quantity)SELECT Sum(price * quantity)FROMPhFROMPurchaseSELECTSum(price*quantity)What dothey mean ?SELECTSum(price quantity)FROM PurchaseWHERE product = ‘bagel’they mean ?5CSE 444 - Summer 2010Simple AggregationsPurchaseProductPriceQuantityProductPriceQuantityBagel 3 20lBagel1.50 20Banana 0.5 50Banana 2 10Banana410SELECT Sum(price * quantity)FROMPurchase90 (=60+30)Banana4106FROMPurchaseWHERE product = ‘Bagel’90 ( 60+30)Grouping and Aggregationh(dii)Purchase(product, price, quantity)Find total quantities for all sales over $1, by product.SELECTdtS ( tit)ASTtlSlq$,ypSELECT product, Sum(quantity) AS TotalSalesFROM PurchaseWHERE price > 1GROUP BY productLt’ htthi7Let’s see what this means…CSE 444 - Summer 2010Grouping and Aggregation1. Compute the FROM and WHERE clauses.p2. Group by the attributes in the GROUP BY3. Compute the SELECT clause: grouped attributes and aggregates.8CSE 444 - Summer 20101&2. FROM-WHERE-GROUPBYProductPriceQuantityProductPriceQuantityBagel 3 20Bagel15020Bagel1.5020Banana 0.5 50Banana210Banana210Banana 4 10SELECT product, Sum(quantity) AS TotalSalesFROM PurchaseWHERE price > 19CSE 444 - Summer 2010GROUP BYproduct3. SELECTProductPriceQuantityProduct TotalSalesBagel40ProductPriceQuantityBagel 3 20Bagel15020age0Banana 20Bagel1.5020Banana 0.5 50Banana210Banana210Banana 4 10SELECT product, Sum(quantity) AS TotalSalesFROM Purchase10WHEREprice > 1GROUP BY productGROUP BY v.s. Nested QueriesSELECT product, Sum(quantity) AS TotalSalesFROM PurchaseWHERE price > 1GROUP BYproductGROUP BYproductSELECT DISTINCT x.product, (SELECT Sum(y.quantity)FROM Purchase yWHERE x.product = y.productAND price > 1)AND price > 1)AS TotalSalesFROM Purchase xWHEREi>111WHEREprice > 1Why twice ?Another ExampleWhat doesit mean ?SELECT product,sum(quantity) AS SumQuantity,(i)ASimax(price) AS MaxPriceFROM PurchaseGROUP BYproductp12CSE 444 - Summer 2010HAVING ClauseSame query as earlier, except that we consider only productsthat had at least 30 sales.SELECT product, Sum(quantity)FROM PurchaseWHERE price > 1GROUP BY productHAVINGSum(quantity)>30HAVINGSum(quantity) > 30HAVING clause contains conditions on aggregates13HAVING clause contains conditions on aggregates.CSE 444 - Summer 2010General form ofGroupingGeneral form of Groupingand AggregationSELECT SFROM R1,…,RnWHEREC1Why ?WHEREC1GROUP BY a1,…,akHAVINGC2S = may contain attributes a1,…,akand/or any aggregates b t NO OTHER ATTRIBUTESHAVINGC2but NO OTHER ATTRIBUTESC1 = is any condition on the attributes in R1,…,RnC2 = is any condition on aggregate expressions14ygggpand on attributes a1,…,akGeneral form ofGroupingGeneral form of Groupingand AggregationSELECT SFROM R1,…,RnWHEREC1WHEREC1GROUP BY a1,…,akHAVINGC2Evaluation steps:1Evaluate FROMWHERE apply condition C1HAVINGC21.Evaluate FROM-WHERE, apply condition C12. Group by the attributes a1,…,ak3Apply condition C2 to each group (may have aggregates)153.Apply condition C2 to each group (may have aggregates)4. Compute aggregates in S and return the resultAdvanced SQLizing1. Getting around INTERSECT and EXCEPT2. Unnesting Aggregates3. Finding witnesses16CSE 444 - Summer 2010INTERSECT and EXCEPT: not in some DBMSsINTERSECT and EXCEPT:Can unnest(SELECT R.A, R.BFROM R)SELECT R.A, R.BFROM RCan unnest.How ?)INTERSECT(SELECT S.A, S.BFROM S)WHEREEXISTS(SELECT *FROM S)WHERE R.A=S.A and R.B=S.B)(SELECT R.A, R.BFROMR)SELECT R.A, R.BFROMRFROMR)EXCEPT(SELECT S.A, S.BFROMS)FROMRWHERENOT EXISTS(SELECT *OS17FROMS)FROMSWHERE R.A=S.A and R.B=S.B)Unnesting Aggregatesd(i)Product ( pname, price, company)Company(cname, city)Find the number of companies in each cityFind the number of companies in each citySELECT DISTINCT city, (SELECT count(*) FROMCompany YFROMCompany Y WHERE X.city = Y.city)FROM Company XSELECT city, count(*)FROM CompanyEquivalent queriesNote: no need for DISTINCT18pyGROUP BY cityNote: no need for DISTINCT(DISTINCT is the same as GROUP BY)Unnesting Aggregatesd(i)What if thereFind the number of products made in each cityProduct ( pname, price, company)Company(cname, city)What if thereare no productsfor a city?Find the number of products made in each citySELECT DISTINCT X.city, (SELECT count(*) FROM Product Y, Company ZWHERE Z.cname=Y.companyAND Z.city = X.city)FROM Company XSELECT X.city, count(*)FROM Company X, Product YThey are NOTequivalent !19WHERE X.cname=Y.companyGROUP BY X.cityequivalent !(WHY?)More UnnestingAuthor(loginname)•Find authors who wrote≥10 documents:Author(login,name)Wrote(login,url)This is•Find authors who wrote ≥10 documents:• Attempt 1: with nested queriesSELECTDISTINCTAuthor nameSQL bya noviceSELECTDISTINCTAuthor.nameFROM AuthorWHERE (SELECT count(Wrote.url)FROM WroteWHERE Author.login=Wrote.login) > 1020CSE 444 - Summer 2010More Unnesting• Find all authors who wrote at least 10 documents:• Attempt 2: SQL style (with GROUP BY)SELECT Author.nameFROM Author, WroteWHERE Author.login=Wrote.loginGROUP BYAuthor nameThis isSQL byan expertGROUP BYAuthor.nameHAVING count(wrote.url) > 10an expert21CSE 444 - Summer 2010Finding WitnessesStore(sid, sname)Product(pidpnamepricesid)Product(pid, pname, price, sid)For each store, find its most expensive products22CSE 444 - Summer 2010Finding WitnessesFinding the maximum price is easy…SELECT Store.sid, max(Product.price)FROM Store, Product,WHERE Store.sid = Product.sidGROUP BY Store.sidBut we need the witnesses, i.e. the products with max price23,p pCSE 444 - Summer 2010Finding WitnessesTo find the witnesses, compute the maximum priceSELECT Store.sname, Product.pname,p pin a subquery,pFROM
View Full Document