Grouping, E/R, and updatesCSE 444 section, July 1, 2010Today• Practice with grouping and aggregation• Database design with E/R diagrams• Modifying the databaseDocument index databaseAuthor (aid, name)Auth_Doc (aid, did)Document (did, title)Doc_Word (did, word)Word (word)Underlined = key (unique identifier for a tuple)AUTHOR DOCUMENT WORDAUTH_DOC DOC_WORDSFind authors who wrote ≥ 20 docsFind authors who wrote ≥ 20 docsThis could work:SELECT nameFROM Author aWHERE 20 <= (SELECT COUNT(*) FROMAuth_Doc ad WHERE ad.aid = a.aid)Find authors who wrote ≥ 20 docsUse grouping to eliminate the subquery:SELECT nameFROM Author a, Auth_Doc ad WHERE a.aid = ad.aidGROUP BY a.aid, a.nameHAVING COUNT(*) >= 20Find authors who wrote ≥ 20 docsUse grouping to eliminate the subquery:SELECT nameFROM Author a, Auth_Doc ad WHERE a.aid = ad.aidGROUP BY a.aid, a.nameHAVING COUNT(*) >= 20One row per(a.aid, a.name) pairFind authors who wrote ≥ 20 docsUse grouping to eliminate the subquery:SELECT nameFROM Author a, Auth_Doc ad WHERE a.aid = ad.aidGROUP BY a.aid, a.nameHAVING COUNT(*) >= 20Only groups that combine ≥ 20 tuples will matchFind authors who wrote ≥ 20 docsUse grouping to eliminate the subquery:SELECT nameFROM Author a, Auth_Doc ad WHERE a.aid = ad.aidGROUP BY a.aid, a.nameHAVING COUNT(*) >= 20If aid is the key, why group by name?If we deleted a.name…ERROR: Column 'name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.Finding literate authorsHow can we find authors who use more than 10,000 distinct words?Authors who use > 10,000 wordsSELECT name FROM Author a, Auth_Doc ad, Doc_Words dwWHERE a.aid = ad.aid AND ad.did = dw.didGROUP BY a.aid, a.nameHAVING COUNT(DISTINCT word) > 10000Authors who use > 10,000 wordsSELECT name FROM Author a, Auth_Doc ad, Doc_Words dwWHERE a.aid = ad.aid AND ad.did = dw.didGROUP BY a.aid, a.nameHAVING COUNT(DISTINCT word) > 10000 What does DISTINCT mean within COUNT?More examples• For each author, give the total number of words in all documents he has (co-)written.• For each author, give the average length in words of his documents.• Give the author with the longest average documents.Total word count by authorAverage word count by authorWordiest-on-average authorTry these at home• All words used by at least 10 authors• The most frequently used word• The longest document• Authors of the longest documentToday• Practice with grouping and aggregation• Database design with E/R diagrams• Modifying the databaseWhy use E/R diagrams?E/R basics• Concepts and symbols– Entity vs. entity set– Attributes– Relationship– Arrows• ISA– Difference from OOP in C++/JavaFrom English to E/R diagrams• Each project is managed by one professor (principal investigator)• A professor can manage multiple projectsExample courtesy: Database Management Systems, 3rd E, R. Ramakrishnan and J. GehrkeFrom English to E/R diagrams• Each project is managed by one professor (principal investigator)• A professor can manage multiple projectsProfessorProjectrankSpecialtyagessnpidend-datesponsorstart-datebudgetExample courtesy: Database Management Systems, 3rd E, R. Ramakrishnan and J. GehrkeFrom English to E/R diagrams• Each project is managed by one professor (principal investigator)• A professor can manage multiple projectsProfessorProjectrankSpecialtyagessnpidend-datesponsorstart-datebudgetManagesExample courtesy: Database Management Systems, 3rd E, R. Ramakrishnan and J. GehrkeFrom English to E/R diagrams• Each project is worked on by one or more professors• Professors can work on multiple projectsProfessorProjectrankSpecialtyagessnpidend-datesponsorstart-datebudgetManagesExample courtesy: Database Management Systems, 3rd E, R. Ramakrishnan and J. GehrkeFrom English to E/R diagrams• Each project is worked on by one or more professors• Professors can work on multiple projectsProfessorProjectrankSpecialtyagessnpidend-datesponsorstart-datebudgetManagesWork-onExample courtesy: Database Management Systems, 3rd E, R. Ramakrishnan and J. GehrkeToday• Practice with grouping and aggregation• Database design with E/R diagrams• Modifying the database28Modifying the databaseThree kinds of modifications in SQL:• insertions• updates• deletionsSometimes they are all called “updates”InsertionsGeneral form:INSERT INTO R(A1,…., An) VALUES (v1,…., vn)InsertionsProduct (name, listPrice, category)Purchase (buyer, seller, product, price)Missing attributes NULL.May drop attribute names if you give them in order.INSERT INTO Purchase (buyer, seller, product, price)VALUES ('Joe', 'Fred', 'wakeup-clock-espresso-machine',199.99)Example: Insert a new purchase to the database:31Inserting results of a queryINSERT INTO Product (name)SELECT DISTINCT Purchase.productFROM PurchaseWHERE Purchase.date > "10/26/01";The query replaces the VALUES keyword.Here we insert many tuples into Product32UpdatesUPDATE ProductSET price = price/2WHERE Product.name IN(SELECT productFROM PurchaseWHERE Date =‘Oct, 25, 1999’);Example:WHERE works the same as in a query (SELECT).It chooses the tuples whose values are to be updated33DeletionsDELETE FROM PurchaseWHERE seller = ‘Joe’ ANDproduct = ‘Brooklyn Bridge’Similar to UPDATE but without the SET clause:Always specify a WHERE clause (in fact, write it first!)Otherwise, every tuple will be
View Full Document