UH COSC 6340 - COSC 6340 Final Exam

Final Exam
COSC 6340 (Data Management)
May 9, 2000

Your Name:
Your SSN:

I agree that my grades are posted using the last 4 digits of my SSN
………………….
(signature, if you would like us to post your grades)

Problem 1 [18]:
Problem 2 [10]:
Problem 3 [8]:
Problem 4 [9]:
Problem 5 [14]:
Problem 6 [22]:
Problem 7 [7]:
Problem 8 [12]:
Grade:

The exam is “open books” and you have 145 minutes to complete the exam.

1) Relational Database Design [18]

Assume we have a relation R(A,B,C,D,E) with the following dependencies:
(1) AB → CDE
(2) CD → ABE
(3) E DB

Answer the following questions, giving reasons for your answers:
a) Is R in BCNF? [4]
b) Does ABE → D hold for R? [2]
c) Does CD → B hold for R? [2]
d) Does ED hold for R (either show that this dependency can be inferred from the given 3 dependencies, or give a counterexample of a relation that satisfies (1), (2), (3) but violates ED)? ** [10]

2) Design an ODL Schema [10]

Assume that the following E/R schema is given:

[E/R diagram: entity types PERSON, FEMALE, MALE, PRIEST, WEDDING; attributes ssn, name, date, city; relationship labels husband (0,1), wife (0,1), performs (0,n)]

Transform the E/R schema into an equivalent ODL schema. Give the schema using ODL syntax (not a diagram!). Assume that name and city are of type STRING, ssn is of type INTEGER, and date is of type DATE. Define an inverse relationship for every relationship you define in your ODL schema!

3) Write SQL Queries [8]

The following relational schema is given: person(ssn, name, address), works-for(employee, employer, salary) and company(C#, c-name, location), with ssn being used as a foreign key for employee in works-for, and C# being used as a foreign key for employer in works-for; name in person stores a person's last name. Create a table that gives the social security number, name, and number of employments for each person that has at least 2 employments (if you prefer, you can write a sequence of SQL queries instead of a single query)! [8]

4) Transaction Management [9]

a) What is isolation of transactions and why is it important for modern database systems? What techniques do database systems employ to guarantee isolation? [5]
b) Assume we have two transactions T1 and T2:
T1: A=A+10; B=B-10;
T2: A=A-30; B=B+30;
Give a schedule that interleaves the execution of T1 and T2 and that is not serializable! [4]

5) Data Mining [14]

a) What is the purpose of clustering? Why do scientists employ clustering (what do they try to find out)? [5]
b) Assume you have to apply the APRIORI algorithm, with a minimum support of 40% (4 out of 10), to the following set of 10 transactions that involve purchases of items A, B, C, D, E, F, G:
T1={A, D, E}
T2={A, D, F}
T3={A, E, F}
T4={A, B, D, F}
T5={B, D, F}
T6={A, D, E, G}
T7={A, B, D, F}
T8={A, B, D, F, G}
T9={B, D, E, G}
T10={A, D, E}
Indicate how Apriori's Large Item Set Generation algorithm works for the example. Indicate which candidate itemsets will be generated in each pass, and which remain in the candidate set after pruning (use the notation on page 11 of the survey paper and assume A>B>C>D>E>F>G). [9]

6) Decision Support [22]

a) What are the features of the multi-dimensional data model? Why is it quite popular for writing OLAP queries; why do decision makers prefer the multi-dimensional data model over the SQL/relational data model? [8]
b) Give an example of a Top-N query. Why is it difficult to implement Top-N queries using SQL? [4]
c) Now assume you are the leader of the ORACLE development team and your task is to extend ORACLE to provide a better solution for the Top-N query problem. What approach would you suggest your programmers should use? Describe your ideas in sufficient detail. [10 + up to 4 extra points] **

More Space for Problem 6

7) Internet Databases/Information Retrieval [7]

What are signature files and how can they be used to index free text documents on the web? What is the purpose and justification of using bit strings and of using a hashing function in signature files? [7]

8) Physical Database Design [12]

Assume a relation R(A, B, C, D) is given; R is stored as an unordered file and contains 1,000,000 (1 million) tuples. Attributes A, B, C, D need 4 bytes of storage each, and blocks have a size of 4096 bytes. Moreover, we assume that static hashing is used to implement index structures, and that index pointers require 4 bytes of storage; furthermore, you can assume that pages of index blocks are 80% full and do not contain any overflow pages. What index structures would you create to speed up the following 3 queries?

Q1: Select A, C from R where B=12; (returns 200 answers)
Q2: Select D from R where C=12; (returns 30000 answers)
Q3: Select sum(R.D) from R where C=12; (returns one answer)

Describe which index structures you would create (justify your design!), how they would be stored, and compute the cost for executing Q1, Q2, and Q3 for your chosen design (Hint: look for unusual
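The storage figures in Problem 8 determine a few baseline numbers that any cost computation would start from. The following is a minimal sketch of that arithmetic, not a solution to the design question. It assumes that tuples do not span blocks, that blocks carry no header overhead, and that an index holds one (key, pointer) entry per tuple; none of these assumptions is stated in the exam, and all variable names are illustrative.

```python
# Baseline storage arithmetic for Problem 8 (a sketch under stated assumptions).
NUM_TUPLES = 1_000_000
NUM_ATTRS = 4
ATTR_BYTES = 4
BLOCK_BYTES = 4096
POINTER_BYTES = 4
INDEX_FILL_PERCENT = 80  # index blocks are 80% full


def ceil_div(n: int, d: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-n // d)


# Data file: each tuple stores 4 attributes of 4 bytes each.
tuple_bytes = NUM_ATTRS * ATTR_BYTES
tuples_per_block = BLOCK_BYTES // tuple_bytes  # assumes no block header
data_blocks = ceil_div(NUM_TUPLES, tuples_per_block)

# Single-attribute index: assumed entry layout is 4-byte key + 4-byte pointer.
entry_bytes = ATTR_BYTES + POINTER_BYTES
entries_per_full_block = BLOCK_BYTES // entry_bytes
entries_per_index_block = entries_per_full_block * INDEX_FILL_PERCENT // 100
index_blocks = ceil_div(NUM_TUPLES, entries_per_index_block)

print("tuples per data block:  ", tuples_per_block)
print("data blocks (full scan):", data_blocks)
print("entries per index block:", entries_per_index_block)
print("index blocks:           ", index_blocks)
```

Under these assumptions a full scan of R reads every data block, which gives the reference cost that an index-based plan for Q1, Q2, or Q3 has to beat.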

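One possible way to approach Problem 3 is a single grouped join with a HAVING clause. The sketch below runs such a query against an in-memory SQLite database from Python. Several things here are assumptions and not part of the exam: the table is renamed works_for because a hyphen is not valid in an unquoted SQL identifier, the result table name multi_employed is hypothetical, each row of works_for is taken to be one employment, the sample rows are invented test data, and the company table is omitted because the query does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (ssn INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE works_for (
        employee INTEGER REFERENCES person(ssn),
        employer INTEGER,
        salary   REAL
    );

    -- Hypothetical sample data, for illustration only.
    INSERT INTO person VALUES
        (1, 'Smith', 'Houston'),
        (2, 'Jones', 'Austin'),
        (3, 'Lee',   'Dallas');
    INSERT INTO works_for VALUES
        (1, 10, 50000), (1, 20, 20000),
        (2, 10, 60000),
        (3, 10, 30000), (3, 20, 15000), (3, 30, 12000);

    -- One row per person with at least 2 rows in works_for.
    CREATE TABLE multi_employed AS
    SELECT p.ssn, p.name, COUNT(*) AS employments
    FROM person p
    JOIN works_for w ON w.employee = p.ssn
    GROUP BY p.ssn, p.name
    HAVING COUNT(*) >= 2;
    """
)

rows = conn.execute(
    "SELECT ssn, name, employments FROM multi_employed ORDER BY ssn"
).fetchall()
print(rows)
```

Grouping by both ssn and name keeps the query portable to systems that require every selected non-aggregate column to appear in GROUP BY.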

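The large itemset generation asked for in Problem 5b can be traced mechanically. The sketch below is a minimal textbook-style Apriori loop (count, join, prune) applied to the ten transactions from the problem with a minimum support of 4. It is not written in the notation of the survey paper the exam refers to, it uses plain alphabetical order for the items, and the function and variable names are illustrative.

```python
from itertools import combinations

# The ten transactions from Problem 5b.
TRANSACTIONS = [
    {"A", "D", "E"},            # T1
    {"A", "D", "F"},            # T2
    {"A", "E", "F"},            # T3
    {"A", "B", "D", "F"},       # T4
    {"B", "D", "F"},            # T5
    {"A", "D", "E", "G"},       # T6
    {"A", "B", "D", "F"},       # T7
    {"A", "B", "D", "F", "G"},  # T8
    {"B", "D", "E", "G"},       # T9
    {"A", "D", "E"},            # T10
]
MIN_SUPPORT = 4  # 40% of 10 transactions


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def apriori(transactions, min_support):
    """Return the large itemsets per pass as lists of sorted tuples."""
    items = sorted(set().union(*transactions))
    candidates = [(item,) for item in items]
    levels = []
    while candidates:
        large = [c for c in candidates if support(c, transactions) >= min_support]
        if not large:
            break
        levels.append(large)
        large_set = set(large)
        k = len(large[0])
        # Join step: merge two large k-itemsets that share their first k-1 items.
        joined = [
            a + (b[-1],)
            for a in large
            for b in large
            if a[:-1] == b[:-1] and a[-1] < b[-1]
        ]
        # Prune step: drop candidates that have a k-subset which is not large.
        candidates = [
            c for c in joined
            if all(sub in large_set for sub in combinations(c, k))
        ]
    return levels


levels = apriori(TRANSACTIONS, MIN_SUPPORT)
for k, large in enumerate(levels, start=1):
    print(f"L{k}:", ["".join(itemset) for itemset in large])
```

Representing itemsets as sorted tuples makes the join step a simple prefix comparison and lets the prune step look up subsets directly in a set.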