DOC PREVIEW
UW CSE 444 - Relational Algebra and Query Execution

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Relational algebra and query executionCSE 444, fall 2010 — section 8 worksheetNovember 18, 20101 Relational algebra warm-up1. Given this database schema:• Product (pid, name, price)• Purchase (pid, cid, store)• Customer (cid, name, city)draw the logical query plan for each of the following SQL queries.(a) SELECT DISTINCT x.storeFROM Purchase x, Customer yWHERE x.cid = y.cidand y.city = ’Seattle’1(b) SELECT z.city, sum(x.price)FROM Product x, Purchase y, Customer zWHERE x.pid = y.pid and y.cid = z.cidand y.store = ’Wal-Mart’GROUP BY z.cityHAVING count(*) > 10022. Write a SQL query that is equivalent to the logical plan below:! "#$%&!'()*+!,!-./!01+(2!*3,*!)4!+01)5,6+7*!*8!*3+!689):,6 !;6,7!%+68<=! !!!%>!V,?@!A$,B%B>&! -$%B:B3 & C$:BDB 9&!%::9!V9EF!AG%H-G%-G:HCG:>9G!33. Consider two tables R(A, B, C) and S (D, E, F), and the logical query plan P:P = σA>9γA,sum(F)(R 1C=DS)Indicate which of the following three query plans are equivalent to P.P1= γA,sum(F)(σA>9R)1C=DγD,sum(F)SP2= γA,sum(F)(σA>9R)1C=DγD,E,sum(F)SP3= γA,sum(F)(σA>9R)1C=DγD,E,sum(F)((σA>9R)1C=D(S))Hint: Draw out each plan in tree form before solving.42 Spring 2009 final, problem 3 (parts a-b)Consider four tables R(a, b, c), S(d, e, f ), T(g, h) , U(i, j, k).(a) Consider the following SQL query:SELECT R.b, avg(U.k) as avgFROM R, S, T, UWHERE R.a = S.dAND S.e = T.gAND T.h = U.iAND U.j = 5AND (R.c + S.f) < 10GROUP BY R.bDraw a logical plan for the query. You may choose any plan as long as it is correct (i.e. no need toworry about efficiency).5(b) Consider the following two physical query plans. Give two reasons why plan B may be faster thanplan A. Explain each reason.A B A.b = B.b σ B.c < 2000 π A.d,B.d (Index nested loop) (B+ tree index on B.b) (On the fly) σ a>10 (B+ tree index on A.a) (Use B+ tree index) (On the fly) A B A.b = B.b σ B.c < 2000 π A.d,B.d (Hash join) (File scan) (On the fly) σ a>10 (File scan) Plan A Plan B63 Summer 2009 final, problem 2This problem and the next concern two relations R(a, b, c) and S(x, y, z) that have the following character-istics:B(R) = 600 B(S) = 800T(R) = 3000 T(S) = 4000V(R, a) = 300 V(S, x) = 100V(R, b) = 100 V(S, y) = 400V(R, c) = 50 V(S, z) = 40We also have M = 1000 (number of memory blocks).Relation R has a clustered index on attribute a and an unclustered index on attribute c. Relation S hasa clustered index on attribute x and an unclustered index on attribute z. All indices are B+ trees.Now consider the following logical query plan for a query involving these two relations.!"# $$$ %&'()* +,-,./ 01* 0223 4(-5 6 78 12 !"#$%&'( )* 97-&:() ;,5<= >)('. ?0@ >7&'/.A BC&. ><7D)5E ('F /C5 '5G/ :7':5<' /H7 <5)(/&7'. I?(* D* :A ('F "?G* =* JA /C(/ C(K5 /C5 87))7H&'- :C(<(:/5<&./&:.L M?IA N O22 M?"A N P22 B?IA N 6222 B?"A N $222 Q?I* (A N 622 Q?"* GA N 122 Q?I* DA N 122 Q?"* =A N $22 Q?I* :A N @2 Q?"* JA N $2 R5 ().7 C(K5 S N 1222 ?',ED5< 78 E5E7<= D)7:T.AU I5)(/&7' I C(. ( :),./5<5F &'F5G 7' (//<&D,/5 ( ('F (' ,':),./5<5F &'F5G 7' (//<&D,/5 :U I5)(/&7' " C(. ( :),./5<5F &'F5G 7' (//<&D,/5 G ('F (' ,':),./5<5F &'F5G 7' (//<&D,/5 JU +)) &'F&:5. (<5 MV /<55.U W7H :7'.&F5< /C5 87))7H&'- )7-&:() ;,5<= >)(' 87< ( ;,5<= &'K7)K&'- /C5.5 /H7 <5)(/&7'.U ?+'.H5< /C5 ;,5./&7'. (D7,/ /C&. ;,5<= 7' /C5 87))7H&'- >(-5A!"#$%&'(J!"#$%()*+,"-.o&'(V!"#%/%01%&'2%,"3%/%04վ%!"5%/%,"6 ! ,(Answer the questions about this query on the following page.)7(a) Write a SQL query that is equivalent to the logical query plan on the previous page.(b) Change or rearrange the original logical query plan to produce one that is equivalent (has the samefinal results), but which is estimated to be significantly faster, if possible. Recall that logical queryoptimization does not consider the final physical operators used to execute the query, but only thingsat the logical level, such as the sizes of relations and estimated sizes of intermediate results.You should include a brief but specific explanation of how much you expect your changes to improvethe speed of the query and why.Draw your new query plan and write your explanation below. If you can clearly and easily showthe changes on the original diagram, you may do so, otherwise draw a new diagram here.84 Summer 2009 final, problem 3This problem uses the same two relations and statistics as the previous one, but for a different operationnot related to the previous query.As before, the relations are R(a, b, c) and S(x, y, z). Here are the statistics again:B(R) = 600 B(S) = 800T(R) = 3000 T(S) = 4000V(R, a) = 300 V(S, x) = 100V(R, b) = 100 V(S, y) = 400V(R, c) = 50 V(S, z) = 40M = 1000 (number of memory blocks)Also as before, relation R has a clustered index on attribute a and an unclustered index on attribute c.Relation S has a clustered index on attribute x and an unclustered index on attribute z. All indices are B+trees.Your job for this problem is to specify and justify a good physical plan for performing the joinR 1a=zS (i.e., join R and S using the condition R.a = S.z)Your answer should specify the physical join operator used (hash, nested loop, sort-merge, or other)and the access methods used to read relations R and S (sequential scan, index, etc.). Be sure to giveessential details: i.e., if you use a hash join, which relations(s) are included in hash tables; if you usenested loops, which relation(s) are accessed in the inner and outer loops, etc.Give the estimated cost of your solution in terms of number of disk I/O operations needed (you shouldignore CPU time and the cost to read any index blocks).You should give a brief justification why this is the best (cheapest) way to implement the join. You donot need to exhaustively analyze all the other possible join implementations and access methods, but youshould give a brief discussion of why your solution is the preferred one compared to the other possibilities.(Write your answer on the next page. You may remove this page for reference if you wish.)9Give your solution and explanation


View Full Document

UW CSE 444 - Relational Algebra and Query Execution

Documents in this Course
XML

XML

48 pages

SQL

SQL

25 pages

SQL

SQL

42 pages

Recovery

Recovery

30 pages

SQL

SQL

36 pages

Indexes

Indexes

35 pages

Security

Security

36 pages

Wrap-up

Wrap-up

6 pages

SQL

SQL

37 pages

More SQL

More SQL

48 pages

SQL

SQL

35 pages

XML

XML

46 pages

Triggers

Triggers

26 pages

Load more
Download Relational Algebra and Query Execution
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Relational Algebra and Query Execution and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Relational Algebra and Query Execution 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?