Introduction to Database Systems CSE 444 Lecture 3: SQL (part 2) Magda Balazinska - CSE 444, Fall 2010Outline • Aggregations (6.4.3 – 6.4.6) • Examples, examples, examples… • Nulls (6.1.6 - 6.1.7) [Old edition: 6.1.5-6.1.6] • Outer joins (6.3.8) Magda Balazinska - CSE 444, Fall 2010Aggregation SELECT count(*) FROM Product WHERE year > 1995 Except count, all aggregations apply to a single attribute SELECT avg(price) FROM Product WHERE maker=‘Toyota’ SQL supports several aggregation operations: sum, count, min, max, avg Magda Balazinska - CSE 444, Fall 2010COUNT applies to duplicates, unless otherwise stated: SELECT Count(category) FROM Product WHERE year > 1995 same as Count(*) We probably want: SELECT Count(DISTINCT category) FROM Product WHERE year > 1995 Aggregation: Count Magda Balazinska - CSE 444, Fall 2010Purchase(product, date, price, quantity) More Examples SELECT Sum(price * quantity) FROM Purchase SELECT Sum(price * quantity) FROM Purchase WHERE product = ‘bagel’ What do they mean ? Magda Balazinska - CSE 444, Fall 2010Simple Aggregations Purchase SELECT Sum(price * quantity) FROM Purchase WHERE product = ‘Bagel’ 90 (= 60+30) Product Price Quantity Bagel 3 20 Bagel 1.50 20 Banana 0.5 50 Banana 2 10 Banana 4 10 Magda Balazinska - CSE 444, Fall 2010Grouping and Aggregation Purchase(product, price, quantity) SELECT product, Sum(quantity) AS TotalSales FROM Purchase WHERE price > 1 GROUP BY product Let’s see what this means… Find total quantities for all sales over $1, by product. Magda Balazinska - CSE 444, Fall 2010Grouping and Aggregation 1. Compute the FROM and WHERE clauses. 2. Group by the attributes in the GROUPBY 3. Compute the SELECT clause: grouped attributes and aggregates. Magda Balazinska - CSE 444, Fall 20101&2. FROM-WHERE-GROUPBY Product Price Quantity Bagel 3 20 Bagel 1.50 20 Banana 0.5 50 Banana 2 10 Banana 4 10 Magda Balazinska - CSE 444, Fall 20103. SELECT SELECT product, Sum(quantity) AS TotalSales FROM Purchase WHERE price > 1 GROUP BY product Product TotalSales Bagel 40 Banana 20 Product Price Quantity Bagel 3 20 Bagel 1.50 20 Banana 0.5 50 Banana 2 10 Banana 4 10GROUP BY v.s. Nested Queries SELECT product, Sum(quantity) AS TotalSales FROM Purchase WHERE price > 1 GROUP BY product SELECT DISTINCT x.product, (SELECT Sum(y.quantity) FROM Purchase y WHERE x.product = y.product AND price > 1) AS TotalSales FROM Purchase x WHERE price > 1 Why twice ?Another Example SELECT product, sum(quantity) AS SumQuantity, max(price) AS MaxPrice FROM Purchase GROUP BY product What does it mean ? Magda Balazinska - CSE 444, Fall 2010HAVING Clause SELECT product, Sum(quantity) FROM Purchase WHERE price > 1 GROUP BY product HAVING Sum(quantity) > 30 Same query as earlier, except that we consider only products that had at least 30 sales. HAVING clause contains conditions on aggregates. Magda Balazinska - CSE 444, Fall 2010General form of Grouping and Aggregation S = may contain attributes a1,…,ak and/or any aggregates but NO OTHER ATTRIBUTES C1 = is any condition on the attributes in R1,…,Rn C2 = is any condition on aggregate expressions and on attributes a1,…,ak Why ? SELECT S FROM R1,…,Rn WHERE C1 GROUP BY a1,…,ak HAVING C2General form of Grouping and Aggregation Evaluation steps: 1. Evaluate FROM-WHERE, apply condition C1 2. Group by the attributes a1,…,ak 3. Apply condition C2 to each group (may have aggregates) 4. Compute aggregates in S and return the result SELECT S FROM R1,…,Rn WHERE C1 GROUP BY a1,…,ak HAVING C2 Magda Balazinska - CSE 444, Fall 2010Advanced SQLizing 1. Getting around INTERSECT and EXCEPT 2. Unnesting Aggregates 3. Finding witnesses Magda Balazinska - CSE 444, Fall 2010INTERSECT and EXCEPT: (SELECT R.A, R.B FROM R) INTERSECT (SELECT S.A, S.B FROM S) SELECT R.A, R.B FROM R WHERE EXISTS(SELECT * FROM S WHERE R.A=S.A and R.B=S.B) (SELECT R.A, R.B FROM R) EXCEPT (SELECT S.A, S.B FROM S) SELECT R.A, R.B FROM R WHERE NOT EXISTS(SELECT * FROM S WHERE R.A=S.A and R.B=S.B) Can unnest. How ? INTERSECT and EXCEPT: not in some DBMSsUnnesting Aggregates Product ( pname, price, company) Company(cname, city) Find the number of companies in each city SELECT DISTINCT city, (SELECT count(*) FROM Company Y WHERE X.city = Y.city) FROM Company X SELECT city, count(*) FROM Company GROUP BY city Equivalent queries Note: no need for DISTINCT (DISTINCT is the same as GROUP BY) Magda Balazinska - CSE 444, Fall 2010Unnesting Aggregates Find the number of products made in each city SELECT DISTINCT X.city, (SELECT count(*) FROM Product Y, Company Z WHERE Z.cname=Y.company AND Z.city = X.city) FROM Company X SELECT X.city, count(*) FROM Company X, Product Y WHERE X.cname=Y.company GROUP BY X.city They are NOT equivalent ! (WHY?) Product ( pname, price, company) Company(cname, city) What if there are no products for a city?More Unnesting • Find authors who wrote ≥ 10 documents: • Attempt 1: with nested queries SELECT DISTINCT Author.name FROM Author WHERE (SELECT count(Wrote.url) FROM Wrote WHERE Author.login=Wrote.login) > 10 This is SQL by a novice Author(login,name) Wrote(login,url) Magda Balazinska - CSE 444, Fall 2010More Unnesting • Find all authors who wrote at least 10 documents: • Attempt 2: SQL style (with GROUP BY) SELECT Author.name FROM Author, Wrote WHERE Author.login=Wrote.login GROUP BY Author.name HAVING count(wrote.url) > 10 This is SQL by an expert Magda Balazinska - CSE 444, Fall 2010Finding Witnesses Store(sid, sname) Product(pid, pname, price, sid) For each store, find its most expensive products Magda Balazinska - CSE 444, Fall 2010Finding Witnesses SELECT Store.sid, max(Product.price) FROM Store, Product WHERE Store.sid = Product.sid GROUP BY Store.sid Finding the maximum price is easy… But we need the witnesses, i.e. the products with max price Magda Balazinska - CSE 444, Fall 2010Finding Witnesses SELECT Store.sname, Product.pname FROM Store, Product, (SELECT Store.sid AS sid, max(Product.price) AS p FROM Store, Product WHERE Store.sid
View Full Document