DOC PREVIEW
UT Dallas CS 6360 - query processing example

This preview shows page 1-2-3-26-27-28 out of 28 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

ECS 165A WQ 11 136 8 Query Processing Goals Understand the basic concepts underlying the steps in query processing and optimization and estimating query processing cost apply query optimization techniques Contents Overview Catalog Information for Cost Estimation Measures of Query Cost Selection Join Operations Other Operations Evaluation and Transformation of Expressions Query Processing Optimization Task Find an efficient physical query plan aka execution plan for an SQL query Goal Minimize the evaluation time for the query i e compute query result as fast as possible Cost Factors Disk accesses read write operations I O page transfer CPU time is typically ignored Dept of Computer Science UC Davis 8 Query Processing and Optimization ECS 165A WQ 11 137 Basic Steps in Processing an SQL Query Parser Translator SQL Query Relational Algebra Expression Statistics Optimizer Query Result Evaluation Engine Execution Plan System Catalogs Data Files Parsing and Translating Translate the query into its internal form parse tree This is then translated into an expression of the relational algebra Parser checks syntax validates relations attributes and access permissions Evaluation The query execution engine takes a physical query plan aka execution plan executes the plan and returns the result Optimization Find the cheapest execution plan for a query Dept of Computer Science UC Davis 8 Query Processing and Optimization ECS 165A WQ 11 138 A relational algebra expression may have many equivalent expressions e g CName Price 5000 CUSTOMERS 1 ORDERS 1 OFFERS CName CUSTOMERS 1 ORDERS 1 Price 5000 OFFERS Representation as logical query plan a tree CName o Price 5000 CName OFFERS o Price 5000 CUSTOMERS ORDERS CUSTOMERS ORDERS OFFERS Non leaf nodes operations of relational algebra with parameters Leaf nodes relations A relational algebra expression can be evaluated in many ways An annotated expression specifying detailed evaluation strategy is called the execution plan includes e g whether index is used join algorithms Among all semantically equivalent expressions the one with the least costly evaluation plan is chosen Cost estimate of a plan is based on statistical information in the system catalogs Dept of Computer Science UC Davis 8 Query Processing and Optimization ECS 165A WQ 11 139 Catalog Information for Cost Estimation Information about relations and attributes NR number of tuples in the relation R BR number of blocks that contain tuples of the relation R SR size of a tuple of R FR blocking factor number of tuples from R that fit into one block FR dNR BRe V A R number of distinct values for attribute A in R SC A R selectivity of attribute A average number of tuples of R that satisfy an equality condition on A SC A R NR V A R Information about indexes HTI number of levels in index I B tree LBI number of blocks occupied by leaf nodes in index I first level blocks ValI number of distinct values for the search key Some relevant tables in the Oracle system catalogs USER TABLES NUM ROWS BLOCKS EMPTY BLOCKS AVG SPACE CHAIN CNT AVG ROW LEN USER TAB COLUMNS NUM DISTINCT LOW VALUE HIGH VALUE DENSITY NUM BUCKETS LAST ANALYZED Dept of Computer Science UC Davis USER INDEXES BLEVEL LEAF BLOCKS DISTINCT KEYS AVG LEAF BLOCKS PER KEY 8 Query Processing and Optimization ECS 165A WQ 11 140 Measures of Query Cost There are many possible ways to estimate cost e g based on disk accesses CPU time or communication overhead Disk access is the predominant cost in terms of time relatively easy to estimate therefore number of block transfers from to disk is typically used as measure Simplifying assumption each block transfer has the same cost Cost of algorithm e g for join or selection depends on database buffer size more memory for DB buffer reduces disk accesses Thus DB buffer size is a parameter for estimating cost We refer to the cost estimate of algorithm S as cost S We do not consider cost of writing output to disk Selection Operation A a R where a is a constant value A an attribute of R File Scan search algorithms that locate and retrieve records that satisfy a selection condition S1 Linear search cost S1 BR Dept of Computer Science UC Davis 8 Query Processing and Optimization ECS 165A WQ 11 141 Selection Operation cont S2 Binary search i e the file ordered based on attribute A primary index SC A R cost S2 dlog2 BR e 1 FR dlog2 BR e cost to locate the first tuple using binary search Second term blocks that contain records satisfying the selection If A is primary key cost S2 dlog2 BR e Dept of Computer Science UC Davis then SC A R 1 hence 8 Query Processing and Optimization ECS 165A WQ 11 142 Example for Employee DB FEmployee 10 V Deptno Employee 50 different departments NEmployee 10 000 Relation Employee has 10 000 tuples Assume selection Deptno 20 Employee and Employee is sorted on search key Deptno 10 000 50 200 tuples in Employee belong to Deptno 20 assuming an equal distribution 200 10 20 blocks for these tuples A binary search finding the first block would require dlog2 1 000 e 10 block accesses Total cost of binary search is 10 20 block accesses versus 1 000 for linear search and Employee not sorted by Deptno Dept of Computer Science UC Davis 8 Query Processing and Optimization ECS 165A WQ 11 143 Index scan search algorithms that use an index here a B tree selection condition is on search key of index S3 Primary index I for A A primary key equality A a cost S3 HTI 1 only 1 tuple satisfies condition S4 Primary index I on non key A equality A a SC A R cost S4 HTI FR S5 Non primary non clustered index on non key A equality A a cost S5 HTI SC A R Worst case each matching record resides in a different block Example Cont Assume primary B tree index for attribute Deptno 200 10 20 blocks accesses are required to read Employee tuples If B tree index stores 20 pointers per inner node then the B tree index must have between 3 and 5 leaf nodes and the entire tree has a depth of 2 a total of 22 blocks must be read Dept of Computer Science UC Davis 8 Query Processing and Optimization ECS 165A WQ 11 144 Selections Involving Comparisons Selections of the form A v R or A v R are implemented using a file scan or binary search or by using either a S6 A primary index on A or S7 A secondary index on A in this case typically a linear file scan may be cheaper but this depends on the selectivity of A Complex Selections General pattern Conjunction 1 n R Disjunction 1 n R Negation R The selectivity of a condition i is the


View Full Document

UT Dallas CS 6360 - query processing example

Documents in this Course
Ch22(1)

Ch22(1)

44 pages

Ch21

Ch21

38 pages

Ch19

Ch19

46 pages

Ch18

Ch18

25 pages

Ch17

Ch17

21 pages

Ch15

Ch15

42 pages

Ch09

Ch09

42 pages

Ch05

Ch05

34 pages

Ch22

Ch22

45 pages

Ch21

Ch21

38 pages

Ch19

Ch19

48 pages

Ch18

Ch18

24 pages

Ch17

Ch17

22 pages

Ch16

Ch16

17 pages

Ch15

Ch15

42 pages

Ch09

Ch09

42 pages

Ch08

Ch08

39 pages

Ch07

Ch07

34 pages

Ch06

Ch06

43 pages

Ch05

Ch05

34 pages

Ch04

Ch04

39 pages

Ch03(2)

Ch03(2)

36 pages

Ch02

Ch02

33 pages

Ch08

Ch08

28 pages

Ch07

Ch07

31 pages

Ch06

Ch06

43 pages

Ch05

Ch05

39 pages

Ch04(1)

Ch04(1)

39 pages

Ch03(1)

Ch03(1)

38 pages

Ch02

Ch02

38 pages

Ch01

Ch01

36 pages

Ch24

Ch24

36 pages

Ch21

Ch21

54 pages

Ch19

Ch19

48 pages

Ch18

Ch18

24 pages

Ch17

Ch17

22 pages

Ch03(1)

Ch03(1)

38 pages

Ch02

Ch02

38 pages

Ch01

Ch01

36 pages

Ch24

Ch24

36 pages

Ch21

Ch21

54 pages

Ch19

Ch19

48 pages

Ch18

Ch18

24 pages

Ch17

Ch17

22 pages

Ch08

Ch08

28 pages

Ch07

Ch07

31 pages

Ch06

Ch06

43 pages

Ch05

Ch05

39 pages

Ch04(1)

Ch04(1)

39 pages

Ch08

Ch08

39 pages

Ch07

Ch07

40 pages

Ch06

Ch06

47 pages

Ch05

Ch05

41 pages

Ch04

Ch04

43 pages

Ch03

Ch03

41 pages

Ch02

Ch02

38 pages

Ch01

Ch01

36 pages

Ch21

Ch21

54 pages

Ch19

Ch19

51 pages

Ch18

Ch18

24 pages

Ch08

Ch08

39 pages

Ch07

Ch07

40 pages

Ch06

Ch06

47 pages

Ch05

Ch05

41 pages

Ch04

Ch04

43 pages

Ch03

Ch03

41 pages

Ch02

Ch02

38 pages

Ch01

Ch01

36 pages

Ch21

Ch21

54 pages

Ch19

Ch19

51 pages

Ch18

Ch18

24 pages

Ch17

Ch17

25 pages

lab-manual

lab-manual

215 pages

Ch08

Ch08

39 pages

Ch07

Ch07

40 pages

Ch06

Ch06

47 pages

Ch05

Ch05

41 pages

Ch04

Ch04

43 pages

Ch03

Ch03

41 pages

Ch02

Ch02

38 pages

Ch01

Ch01

36 pages

Ch21

Ch21

54 pages

Ch19

Ch19

51 pages

Ch17

Ch17

25 pages

Ch21

Ch21

54 pages

Ch19

Ch19

51 pages

Ch18

Ch18

24 pages

Ch17

Ch17

25 pages

Ch08

Ch08

39 pages

Ch07

Ch07

40 pages

Ch06

Ch06

47 pages

Ch05

Ch05

41 pages

Ch04

Ch04

43 pages

Ch03

Ch03

41 pages

Ch02

Ch02

38 pages

Ch01

Ch01

36 pages

Ch04(1)

Ch04(1)

43 pages

Ch07

Ch07

40 pages

Ch03

Ch03

42 pages

Ch01

Ch01

36 pages

Ch02

Ch02

38 pages

Ch05

Ch05

41 pages

Ch06

Ch06

47 pages

Ch08

Ch08

39 pages

Ch17

Ch17

25 pages

Ch18

Ch18

24 pages

Ch09

Ch09

42 pages

Ch21

Ch21

54 pages

Ch19

Ch19

51 pages

Ch21

Ch21

54 pages

Ch19

Ch19

51 pages

Ch18

Ch18

24 pages

Ch17

Ch17

25 pages

Ch09

Ch09

42 pages

Ch08

Ch08

39 pages

Ch07

Ch07

40 pages

Ch06

Ch06

47 pages

Ch05

Ch05

41 pages

Ch04(1)

Ch04(1)

43 pages

Ch03

Ch03

42 pages

Ch02

Ch02

38 pages

Ch01

Ch01

36 pages

Load more
Download query processing example
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view query processing example and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view query processing example and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?