DOC PREVIEW
UW CSE 444 - Study Notes

This preview shows page 1-2-3-4 out of 11 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 11 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Computer Science & Engineering 444 – Final ExamJune 10, 2002Closed book & notes – 120 minutes100 points totalName: ____________________________ StuID:_____________________ Part Score1234Total 1. [30 points] Consider the following relational data:Id Time Title Composer Conductor3 1:01 Mad Rush J. Sibelius L. Bernstein4 1:47 Andante L. Beethoven NULL5 1:57Upon EnchantedGroundA. Hovhaness NULLId Parent Soloist13 3 A. Karis55 5 Y. Kondonassis56 5 F. Hendrickx57 5 H. CorynName:_________________________________________________________ a. [10 points] Write the XML document wnyc.xml obtained by exporting therelational data into the DTD specified below. <!ELEMENT wnyc (piece*)><!ELEMENT piece (time, title, composer, conductor?, soloist*><!ELEMENT time (#PCDATA)><!ELEMENT title (#PCDATA)><!ELEMENT composer (#PCDATA)><!ELEMENT conductor (#PCDATA)><!ELEMENT soloist (#PCDATA)><!ATTLIST piece id ID REQUIRED>Name:_________________________________________________________ b. [5 points] Write an XQuery to retrieve all conductors that conduct "Mad Rush" by"J. Sibelius". The result should not contain duplicates. c. [5 points] Write an XQuery to retrieve all soloists that conduct some pieces thatare broadcast at the same time as “Andante” (including “Andante”). The resultshould not contain duplicates.Name:_________________________________________________________ d. [10 points] What’s the result of the following XQuery when it is applied to theXML document in (a)<wnyc-by-solo> {FOR $s IN distinct-values (/wnyc/piece/soloist)RETURN <solo><name>{$s/text()}</name>{FOR $c IN distinct-values (//piece[soloist = $s/text()]/conductor)RETURN <conductor><name>{$c/text()}</name>{FOR $p IN (//piece[conductor = $c/text() AND soloist = $s/text()])RETURN <piece> { $p/title,$p/composer}</piece>}</conductor>}</solo> } </wnyc-by-solo>Name:_________________________________________________________ 2. [20 points] Consider the following relational data: Purchase (pname, date, buyer, seller) Product (name, price, manufacturer, category)o Relation Purchase contains 10000 tuples and has 50 tuples per page.o Relation Product contains 2000 tuples and has 10 tuples per page.o Attribute name is the primary key for Produceo Both relations are stored as simple heap files.o Neither relation has any indexes built on it.o 102 buffer pages are availableConsider the following query: Select name, seller, buyer From Product, Purchase Where Product.name = Purchase.pnamePlease answer the following questions:a. [5 points] What is the cost of joining Purchase and Product using a Page-oriented Nested Loops Join?b. [5 points] What is the cost of joining Purchase and Product using a HashJoin?Name:_________________________________________________________ c. [5 points] Suppose we had 205 pages of main memory buffer pages available.How would you do the join of Purchase and Product? What would be thecost of the join? d. [5 points] Now suppose we have a hash-index on pname of Purchase. Theindex is non-clustered. The records are NOT stored together with the keyvalues in the index. We need 1.2 I/O to get data entry in index page onaverage. What’s the cost of joining Purchase and Product using Index NestedLoop Join?Name:_________________________________________________________ 3. [26 points] Consider the following relational schema, that includes two relations: Author(pid, name) Paper(pid, title, year, citations) The Paper relation stores a set of published papers, including their publication ID,title, year of publication, and the number of citations by other papers. The Authorrelation stores for every paper ID, the set of author names (so there is a row in theAuthor table for every author). a. [7 points] Write an SQL query over the above schema that returns for everyauthor, the maximal number of citations on any of his/her publications, butonly considering papers that have at most 3 authors. The query returns a set ofpairs of the form (author, n), where n is the maximal number of citations on apaper of author.Name:_________________________________________________________ b. [6 points] Consider the following query, that returns for every year, themaximal number of citations for papers published that year, but only if thenumber of citations is more than 20: Select year, Max(citations) From Paper Group By year Having Max(citations) > 20. Can you propose a transformation on the above query that is likely to reduce thecost of evaluating the query?c. [5 points] Suppose the query was modified to also return the number of paperspublished in those years, i.e., Select year, Max(citations), Count(*) From Paper Group By year Having Max(citations) > 20. Would the optimization you proposed above still be valid? Why or why not?Name:_________________________________________________________ d. [8 points] Suppose we had a third relation in the schema, Cites(pid1, pid2),storing the actual citation references. That is, if the tuple (paper1, paper2) is inthe relation Cites, then paper1 cites paper2. Write a trigger that enforces thatwhenever a tuple (paper1, paper2) is inserted into the Cites relation, then theyear of publication of paper2 is not greater than the year of publication ofpaper1. When the condition is not satisfied, the tuple should not be added tothe relation.Name:_________________________________________________________ 4. [24 points] Answer the following short questions in 4 lines or less.a. What are the key differences between the Entity/Relationship and the ODL modeling formalisms?b. Is it always a good idea to push selections before joins in query execution plans? Why or why not? c. Give an example of where knowledge of a key constraint is useful for speeding up query processing.Name:_________________________________________________________ d. When is a sort-merge join preferable over a hash-join?e. When is a relation in Boyce-Codd Normal form?f. What is the difference between a sparse and dense


View Full Document

UW CSE 444 - Study Notes

Documents in this Course
XML

XML

48 pages

SQL

SQL

25 pages

SQL

SQL

42 pages

Recovery

Recovery

30 pages

SQL

SQL

36 pages

Indexes

Indexes

35 pages

Security

Security

36 pages

Wrap-up

Wrap-up

6 pages

SQL

SQL

37 pages

More SQL

More SQL

48 pages

SQL

SQL

35 pages

XML

XML

46 pages

Triggers

Triggers

26 pages

Load more
Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?