DOC PREVIEW
UW CSE 444 - Study Notes

This preview shows page 1 out of 3 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSE 444, Winter 2003CSE 444, Winter 2003Assignment #4: due Wednesday, March 12Objectives: To understand basic storage techniques, hash tables, query execution andoptimization. Number of points: 100 points Questions: 1. [40 points] Consider the following relational data: Purchase (pname, date, buyer, seller) Product (name, price, manufacturer, category)o Relation Purchase contains 10000 tuples and has 50 tuples per page.o Relation Product contains 3000 tuples and has 10 tuples per page.o Attribute name is the primary key for Producto Both relations are stored as simple heap files.o Neither relation has any indexes built on it.o 102 buffer pages are available.Consider the following query:SELECT name, seller, buyerFROM Product, PurchaseWHERE Product.name = Purchase.pnameAnswer the following questions:a.[5 points] What is the cost of joining Purchase and Product using a Page-oriented Nested Loops Join? b. [5 points] What is the cost of joining Purchase and Product using a BlockNested Loops Join? c.[5 points] What is the cost of joining Purchase and Product using a Sort-MergeJoin?d. [5 points] What is the cost of joining Purchase and Product using a Hash-Join?e.[10 points] What would be the lowest possible I/O cost for joining Purchase andProduct using any join algorithm, supposing we have enough buffer space?And how much buffer space would be needed to achieve this cost?f. [5 points] Now suppose we have a hash-index on pname of Purchase. The indexis non-clustered. The records are NOT stored together with the key values in theindex. We need 1.2 I/O to get data entry in index page on average. What’s thecost of joining Purchase and Product using Index Nested Loop Join?g. [5 points] How many tuples will the join of Purchase and Product produce, atmost?2. [20 points] Consider the following relational schema, that includes two relations: Author(pid, name) Paper(pid, title, year, citations)The Paper relation stores a set of published papers, including their publication ID, title, year of publication, and the number of citations by other papers. The Author relation stores for every paper ID, the set of author names (so there is a row in the Author table for every author).a. [10 points] Consider the following query, that returns for every year, the maximal number of citations for papers published that year, but only if the number of citations is more than 20: Select year, Max(citations) From Paper Group By year Having Max(citations) > 20.Can you propose a transformation on the above query that is likely to reduce the cost of evaluating the query?b. [10 points] Suppose the query was modified to also return the number of papers published in those years, i.e., Select year, Max(citations), Count(*) From Paper Group By year Having Max(citations) > 20. Would the optimization you proposed above still be valid? Why or why not? 3. [40 points] In SQL Server, when you write a query you can also view theexecution plan rather than executing the query. Hit “Ctrl + L”, or press the button, you will see the execution plan. If you squint just right, it looks like a tree.After you see the plan, you can hover over any of the operators and find out whatthe estimated I/O costs are, what the estimated CPU costs are, etc. To figure outthe total estimated I/O cost, etc, add together the costs for all of the operators.Don't forget to include the Table Scans, etc. If there's nothing given for a select,don't worry about it. Using this tool, and the database in homework 1, answer the following questions:a. [8 points] Find a query that the optimizer will pick a hash join for. (i) What is the query? (ii) What is the total estimated I/O cost? (iii) What is the total estimated CPU cost? (iv) Why does the optimizer use hash-join instead of nested-loop join?b. [8 points] Find a query that the optimizer will pick a nested-loop join for. (i) What is the query? (ii) What is the total estimated I/O cost? (iii) What is the total estimated CPU cost? (iv) Why does the optimizer use nested-loop join instead of hash-join?c. [14 points] For the following query:SELECT university, department, count(*) AS numFROM EducationGROUP BY university, departmentHAVING department != 'Computer Science and Engineering'ORDER BY university, department DESC(i) Does the optimizer apply the selection predicate in HAVING clausebefore or after grouping (GROUP BY)? Give the reason why theoptimizer chooses this order.(ii) Does the optimizer do sorting (ORDER BY) before or after grouping?Give the reason why the optimizer chooses this order.d. [10 points] Examine the execution plan for the following query.SELECT inFriendFROM IfriendWHERE inFriend NOT IN (SELECT outFriendFROM Ofriend) To optimize it, the optimizer transforms this query to an equivalent queryfor execution. What is that new query? Hint: Refer to the “Argument” contents in the popped up


View Full Document

UW CSE 444 - Study Notes

Documents in this Course
XML

XML

48 pages

SQL

SQL

25 pages

SQL

SQL

42 pages

Recovery

Recovery

30 pages

SQL

SQL

36 pages

Indexes

Indexes

35 pages

Security

Security

36 pages

Wrap-up

Wrap-up

6 pages

SQL

SQL

37 pages

More SQL

More SQL

48 pages

SQL

SQL

35 pages

XML

XML

46 pages

Triggers

Triggers

26 pages

Load more
Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?