DOC PREVIEW
MIT 6 830 - Quiz I Solutions - 6.830

This preview shows page 1-2-3-4-5 out of 15 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Department of Electrical Engineering and Computer ScienceMASSACHUSETTS INSTITUTE OF TECHNOLOGY6.830 Database Systems: Fall 2007Quiz I SolutionsThere are 19 questions and 15 pages in this quiz booklet. To receive credit for a question, answerit according to the instructions given. You can receive partial credit on questions. You have 80minutes to answer the questions.Write your name on this cover sheet AND at the bottom of each page of this booklet.Some questions may be harder than others. Attack them in the order that allows you to makethe most progress. If you find a question ambiguous, be sure to write down any assumptions youmake. Be neat. If we can’t understand your answer, we can’t give you credit!THIS IS AN OPEN BOOK, OPEN NOTES QUIZ.NO PHONES, NO LAPTOPS, NO PDAS, ETC.Do not write in the boxes below1-4 (xx/26) 5-10 (xx/26) 11-13 (xx/14) 14-16(xx/20) 17-19 (xx/14) Total (xx/100)Mean(Score): 62Median(Score): 60.5Stddev(Score): 12Name: Solutions6.830 Fall 2007, Quiz 1 Solutions Page 2 of 15I Short Answer Reading Questions1. [8 points]: The DBMIN paper (“An Evaluation of Buffer Management Strategies for RelationalDatabase Systems” by DeWitt and Chou) proposes using a different buffer management strategy foreach instance of an open database file (table, index, etc.) For each of the following access patterns,table sizes, and amount of free memory, indicate whether an LRU or MRU memory management strat-egy should be used and indicate the optimal number of pages of memory that should be allocated to thebuffer for that file. If the either memory management strategy is equally good, circle, “either”.(Indicate page management strategy buffer size.)A. A sequential scan of a file with 1000 pages, with 100 pages free memory.LRU MRUEitherPages of buffer memory for file: 1 pageB. A 100 page file being used as the inner loop of a nested loops join with 101 pages free memory.LRU MRUEitherPages of buffer memory for file: 100 pagesC. A 100 page file being used as the inner loop of a nested loops join with 99 pages free memory.LRUMRU EitherPages of buffer memory for file: 99 pagesD. A 100 page file being used as the outer loop of a nested loops join with 99 pages free memory.LRU MRUEitherPages of buffer memory for file: 1 pageName: Solutions6.830 Fall 2007, Quiz 1 Solutions Page 3 of 152. [6 points]: Assume that a database uses two-phase locking, but releases both read and write locksbefore end-of-transaction. What implications does this have on the ability of the database to ensure theatomicity of transactions? (Hint: Consider what happens on transaction ABORT.)(Write your answer below.)There are two problems here: cascading rollback and lack of recoverability. Suppose T1modifies some object A, and then releases its lock, later T2 reads A, and finally T1 decidesto abort. At this point, if T2 has not committed, then T2 also has to be aborted, causingcascading rollback. If T2 has already committed, then it is not “recoverable”.3. [6 points]: A, B and C are relations in a DBMS, each containing fields f1 and f2. Field f2 is theprimary key for C. B is many orders of magnitude larger than A, and C is many orders of magnitudelarger than B. A page in each of these relations can hold p records, where p is a small integer of theorder of 10. The whole of A fits in memory, but B and C do not. Disk seeks in this system are k timesslower than sequential page reads, where k is also a small integer of the order of 10. Only I/O costs aresignificant; CPU costs can be safely ignored.Assume that the database uses a block nested loops join: first the system reads a page P of the outerrelation, then either scans the entire inner relation (if not using an index) or reads all the index pagesmatching tuples in the page P.Which of the following statements are true? For each question, briefly explain your reasoning in thespace provided.(Circle T for true or F for false for each option.)T /kF For block nested loop joins of the form A.f1 = B.f1, a clustered index on B.f1 always improves perfor-mance, relative to not using an index.Since A can fit in memory, the I/O cost for a block nested loop join is |B| + |A| (the cost unitbeing 1 seq read), where |A| is the number of pages in A. On the other hand, using a clusteredindex on B.f1, the I/O cost could be as large as 2k|A| (for seeks to read the outer page) + (p|A|* ksp|B|) (to read matching tuples from B), where s ∈ [0, 1] is the join selectivity. Clearly, if s islarge, not using an index is better.kT / F For the join B.f2 = C.f2, a nested loops join using a primary clustered hash index on C.f2 will alwaysbe faster than a hash join.Since f2 is the primary key for C, there is at most one record in C that can match with eachrecord in B. Hence the nested loop join will have I/O cost at most 2k|B| + kp|B| (the cost unitbeing 1 sequential read). On the other hand, a hash join requires two reads and one write foreach table, so its I/O cost is at least 3 ∗ (|B| + |C|) in our units, and very likely more becausesome I/O is random. We know that k and p are small integers of the order of 10, and C ismany orders of magnitude larger than B. Hence, a nested loop join using a primary clusteredhash index on C.f2 will always outperform a hash join.Name: Solutions6.830 Fall 2007, Quiz 1 Solutions Page 4 of 154. [6 points]: The Selinger optimizer uses several heuristics to limit the number of plans it considers.Suppose you are joining three tables T 1, T 2, and T 3, with join predicates T 1.a = T 2.b and T 2.c =T 3.d, and that there is a selection predicate P 1 over T 1.e. Describe or show three plans that the Selingeroptimizer would not consider.(Write your answer in the spaces below.)Plan 1. Right deep quer y plan:Plan 2. Cross product join:Plan 3. Predicate not pushed down:Name: Solutions6.830 Fall 2007, Quiz 1 Solutions Page 5 of 15II Optimistic Concurrency Control5. [4 points]: True or False: In the OCC protocol presented in the Kung and Robinson Paper (“OnOptimistic Methods for Concurrency Control”, Red Book), every successfully committed transactionhas three phases. If false, state why.(Circle TRUE or FALSE, and explain, if needed.)TRUEFALSEFALSE, because read-only transactions do not have a write phase in Kung and Robinson’sOCC protocol.In the OCC protocol presented in the Kung and Robinson Paper (Red Book), transactions are “validated”based on the theory of serial equivalence. The paper


View Full Document
Download Quiz I Solutions - 6.830
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Quiz I Solutions - 6.830 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Quiz I Solutions - 6.830 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?