MIT 11 521 - Handling One-to-Many Relations - Grouping and Aggregation - D2013960

Home> Schools> Massachusetts Institute of Technology> Urban Studies and Planning (11) > 11 521> Handling One-to-Many Relations - Grouping and Aggregation

DOC PREVIEW

MIT 11 521 - Handling One-to-Many Relations - Grouping and Aggregation

School name Massachusetts Institute of Technology

Course 11 521- Spatial Database Management and Advanced Geographic Information Systems

Pages 12

This preview shows page 1-2-3-4 out of 12 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 12 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Handling One-to-Many Relations - Grouping and Aggregation • No class/lab next Tuesday o Today's lab is also Problem Set #1 - due two weeks from today o We won't cover all of today's lab exercises in this lab preparation talk - the rest will be covered this Thursday • Handling One-to-Many Relations o Today's lab prep will review SQL queries from last Thursday's lecture notes o Key concept is use of GROUP BY statements to handle one-to-many relations o We also introduce another database (of URISA proceedings) to learn more about SQL queries and relational database design • SQL queries and GROUP BY expressions o See the Sample Parcel Database* o See SQL Notes* and Oracle help o See Lab 1* examples o The basic SELECT statement to query one or more tables: SELECT [DISTINCT] column_name1[, column_name2,...] FROM table_name1[, table_name2, ...]WHERE search_condition1 [AND search_condition2 ...][OR search_condition3...][GROUP BY column_names][ORDER BY column_names]; o Note that the order of the clauses matters! The clauses, if included, must appear in the order shown! Oracle will report an error if you make a mistake, but the error message (e.g., "ORA-00933: SQL command not properly ended") may not be very informative. * Kindly refer to Lecture Notes SectionAggregration: GROUP BY, Group Functions Simple GROUP BY Example From the Parcels Database When a SQL statement contains a GROUP BY clause, the rows are selected using the criteria in the WHERE clause and are then aggregated into groups that share common values for the GROUP BY expressions. The HAVING clause may be used to eliminate groups after the aggregation has occurred. These examples draw on the PARCELS sample database* that we have used previously. 1. List all the fires, including the date of the fire: 2. List the count of fires by parcel: SELECT PARCELID, FDATE FROM FIRES ORDER BY PARCELID, FDATE; SELECT PARCELID, COUNT(FDATE)FIRE_COUNT FROM FIRES GROUP BY PARCELID ORDER BY PARCELID; Groups are shown in color, but this query does not collapse groups into a single row. Groups and summary functions have been calculated; notice that no FDATE values are shown. PARCELID FDATE 2 02-AUG-88 2 02-APR-89 3 26-JUL-89 3 26-JUL-90 7 01-AUG-87 20 02-JUL-89 PARCELID FIRE_COUNT 2 2 3 2 7 1 20 1 The Different Roles of the WHERE Clause and the HAVING Clause The WHERE clause restricts the rows that are processed before any grouping occurs. The HAVING clause is used with GROUP BY to limit the groups returned after grouping has occurred. * Kindly refer to Lecture Notes Section3. List all the fires that occurred on or after 1 August 1988: 4. List the count of fires that occurred on or after 1 August 1988 by parcel: 5. List the count of fires that occurred on or after 1 August 1988 by parcel for parcels that had more than one fire: SELECT PARCELID,FDATE FROM FIRES WHERE FDATE >= TO_DATE('01-AUG-1988', 'DD-MON-YYYY')ORDER BY PARCELID,FDATE; SELECT PARCELID,COUNT(FDATE)FIRE_COUNT FROM FIRES WHERE FDATE >= TO_DATE('01-AUG-1988', 'DD-MON-YYYY')GROUP BY PARCELID ORDER BY PARCELID; SELECT PARCELID,COUNT(FDATE)FIRE_COUNT FROM FIRES WHERE FDATE >= TO_DATE('01-AUG-1988', 'DD-MON-YYYY')GROUP BY PARCELID HAVING COUNT(FDATE) > 1ORDER BY PARCELID; Groups are shown in color, but this query does not actually perform grouping. Note that the fire at parcel 7 on 1 August 1987 has been excluded by the WHERE clause. This query shows the result of grouping, but no HAVING clause is applied. Groups that satisfy the HAVING clause of Query 5 are shown in bold. Final result, after groups that fail the HAVING clause have been eliminated. PARCELI D FDAT E 2 02-AUG-88 2 02-APR-89 3 26-JUL-89 3 26-JUL-90 20 02-JUL-89 PARCELI D FIRE_COU NT 2 2 3 2 20 1 PARCELI D FIRE_COU NT 2 2 3 2Rules for GROUP BY Queries • In a GROUP BY query, all expressions in the SELECT list not containing group functions (SUM, AVG, COUNT, etc.) must appear in the GROUP BY clause. The following query is invalid because it includes a column (FDATE) that is not in the GROUP BY clause. This query will fail with the Oracle error "ORA-00979: not a GROUP BY expression": SELECT PARCELID, FDATE, COUNT(FDATE)FIRE_COUNT FROM FIRES GROUP BY PARCELID ORDER BY PARCELID, FDATE; Think for a minute why this query does not make sense. For each distinct value of PARCELID, there may be multiple values of FDATE. For example, the parcel with PARCELID = 2 had fires on both 2 Aug. 1988 and 2 Apr. 1989. When we group by PARCELID alone, the results of the query will have at most one row for each value of PARCELID. If we include FDATE in the SELECT list, which FDATE should Oracle pick for PARCELID = 2? The answer is undefined, and that is why the query is invalid. Oracle is unable to pick a single value of an unaggregated item to represent a group. To fix this query, we must ensure that the SELECT list and the GROUP BY clause contain the same expressions (excluding expressions that use group functions such as COUNT(FDATE)).We have two choices. First, we can remove FDATE from the SELECT list (and the ORDER BY clause): SELECT PARCELID, COUNT(FDATE) FIRE_COUNTFROM FIRES GROUP BY PARCELID ORDER BY PARCELID; Second, we can add FDATE to the GROUP BY clause:SELECT PARCELID, FDATE, COUNT(FDATE) FIRE_COUNTFROM FIRES GROUP BY PARCELID, FDATEORDER BY PARCELID, FDATE; Be careful when picking this second option! Adding a column to the GROUP BY clause may change the meaning of your groups, as in this example. Notice that all the FIRE_COUNT values for this last query are 1.That's because by adding FDATE to the GROUP BY clause we have effectively made each group a single row from the FIRES table--not very interesting! • All GROUP BY expressions should appear in the SELECT list. (How else will you know what group is being shown?) To find the names of the owners of exactly one parcel, we can use this query: SELECT OWNERS.ONAME, COUNT(*) PARCEL_COUNTFROM PARCELS, OWNERSWHERE PARCELS.ONUM = OWNERS.OWNERNUM GROUP BY OWNERS.ONAME HAVING COUNT(*) = 1; The query below is valid, but uninformative. Who are the single-parcel owners? SELECT COUNT(*) PARCEL_COUNTFROM PARCELS, OWNERSWHERE PARCELS.ONUM = OWNERS.OWNERNUM GROUP BY OWNERS.ONAME HAVING COUNT(*) = 1; Showing the count of parcels when we've restricted the parcel count to 1 is not very interesting. We can simply leave the count out of the SELECT list. Note that you can use a HAVING condition without including the group function in the SELECT list: SELECT

View Full Document