Unformatted text preview:

11/4/08 1 CMSC424: Database Design Instructor: Amol Deshpande [email protected]  Data$Models$ Conceptual$representa1on$of$the$data$ Data$Retrieval$ How$to$ask$ques1ons$of$the$database$ How$to$answer$those$ques1ons$ Data$Storage$ How/where$to$store$data,$how$to$access$it$ Data$Integrity$ Manage$crashes,$concurrency$ Manage$seman1c$inconsistencies$Databases11/4/08 2 Query Processing  Overview  Selection operation  Join operators  Sorting  Other operators  Putting it all together… Overview User select * from R, S where … R, B+Tree on R.a S, Hash Index on S.a … Results Query Parser Resolve the references, Syntax errors etc. Converts the query to an internal format relational algebra like Query Optimizer Find the best way to evaluate the query Which index to use ? What join method to use ? … Query Processor Read the data from the files Do the query processing joins, selections, aggregates …11/4/08 3 “Cost”  Complicated to compute  We will focus on disk:  Number of I/Os ?  Not sufficient  Number of seeks matters a lot… why ?  tT – time to transfer one block  tS – time for one seek  Cost for b block transfers plus S seeks b * tT + S * tS  Measured in seconds Query Processing  Overview  Selection operation  Join operators  Sorting  Other operators  Putting it all together…11/4/08 4 Selection Operation  select * from person where SSN = “123”  Option 1: Sequential Scan  Read the relation start to end and look for “123”  Can always be used (not true for the other options)  Cost ?  Let br = Number of relation blocks  Then:  1 seek and br block transfers  So:  tS + br * tT sec  Improvements:  If SSN is a key, then can stop when found  So on average, br/2 blocks accessed Selection Operation  select * from person where SSN = “123”  Option 2 : Binary Search:  Pre-condition:  The relation is sorted on SSN  Selection condition is an equality  E.g. can’t apply to “Name like ‘%424%’”  Do binary search  Cost of finding the first tuple that matches  log2(br) * (tT + tS)  All I/Os are random, so need a seek for all  The last few are closeby, but we ignore such small effects  Not quite: What if 10000 tuples match the condition ?  Incurs additional cost11/4/08 5 Selection Operation  select * from person where SSN = “123”  Option 3 : Use Index  Pre-condition:  An appropriate index must exist  Use the index  Find the first leaf page that contains the search key  Retrieve all the tuples that match by following the pointers  If primary index, the relation is sorted by the search key  Go to the relation and read blocks sequentially  If secondary index, must follow all pointers using the index Selection w/ B+-Tree Indexes n * (tT + tS) n = number of records that match This can be bad hi * (tT + tS) secondary index, not a key, equality 1 * (tT + tS) hi * (tT + tS) secondary index, candidate key, equality 1 * (tT + tS) + (b – 1) * tT Note: primary == sorted b = number of pages that contain the matches hi * (tT + tS) primary index, not a key, equality 1 * (tT + tS) hi * (tT + tS) primary index, candidate key, equality cost of retrieving the tuples cost of finding the first leaf hi = height of the index11/4/08 6 Selection Operation  Selections involving ranges  select * from accounts where balance > 100000  select * from matches where matchdate between ’10/20/06’ and ’10/30/06’  Option 1: Sequential scan  Option 2: Using an appropriate index  Can’t use hash indexes for this purpose  Cost formulas:  Range queries == “equality” on “non-key” attributes  So rows 3 and 5 in the preceding page Selection Operation  Complex selections  Conjunctive: select * from accounts where balance > 100000 and SSN = “123”  Disjunctive: select * from accounts where balance > 100000 or SSN = “123”  Option 1: Sequential scan  Option 2 (Conjunctive only): Using an appropriate index on one of the conditions  E.g. Use SSN index to evaluate SSN = “123”. Apply the second condtion to the tuples that match  Or do the other way around (if index on balance exists)  Which is better ?  Option 3 (Conjunctive only) : Choose a multi-key index  Not commonly available11/4/08 7 Selection Operation  Complex selections  Conjunctive: select * from accounts where balance > 100000 and SSN = “123”  Disjunctive: select * from accounts where balance > 100000 or SSN = “123”  Option 4: Conjunction or disjunction of record identifiers  Use indexes to find all RIDs that match each of the conditions  Do an intersection (for conjunction) or a union (for disjunction)  Sort the records and fetch them in one shot  Called “Index-ANDing” or “Index-ORing”  Heavily used in commercial systems Query Processing  Overview  Selection operation  Join operators  Sorting  Other operators  Putting it all together…11/4/08 8 Join  select * from R, S where R.a = S.a  Called an “equi-join”  select * from R, S where |R.a – S.a | < 0.5  Not an “equi-join”  Option 1: Nested-loops for each tuple r in R for each tuple s in S check if r.a = s.a (or whether |r.a – s.a| < 0.5)  Can be used for any join condition  As opposed to some algorithms we will see later  R called outer relation  S called inner relation Nested-loops Join  Cost ? Depends on the actual values of parameters, especially memory  br, bs  Number of blocks of R and S  nr, ns  Number of tuples of R and S  Case 1: Minimum memory required = 3 blocks  One to hold the current R block, one for current S block, one for the result being produced  Blocks transferred:  Must scan R tuples once: br  For each R tuple, must scan S: nr * bs  Seeks ?  nr + br11/4/08 9 Nested-loops Join  Case 1: Minimum memory required = 3 blocks  Blocks transferred: nr * bs +


View Full Document

UMD CMSC 424 - Database Design

Documents in this Course
Lecture 2

Lecture 2

36 pages

Databases

Databases

44 pages

Load more
Download Database Design
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Database Design and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Database Design 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?