New version page

NYU CSCI-GA 2433 - Query Processing

Upgrade to remove ads
Upgrade to remove ads
Unformatted text preview:

Chapter 13: Query ProcessingChapter 13: Query ProcessingBasic Steps in Query ProcessingBasic Steps in Query Processing (Cont.)Basic Steps in Query Processing : OptimizationBasic Steps: Optimization (Cont.)Measures of Query CostMeasures of Query Cost (Cont.)Selection OperationSelection Operation (Cont.)Selections Using IndicesSelections Involving ComparisonsImplementation of Complex SelectionsAlgorithms for Complex SelectionsSortingExternal Sort-MergeExternal Sort-Merge (Cont.)Slide 18Example: External Sorting Using Sort-MergeExternal Merge Sort (Cont.)Slide 21Join OperationNested-Loop JoinNested-Loop Join (Cont.)Block Nested-Loop JoinBlock Nested-Loop Join (Cont.)Indexed Nested-Loop JoinExample of Nested-Loop Join CostsMerge-JoinMerge-Join (Cont.)Hash-JoinHash-Join (Cont.)Slide 33Hash-Join AlgorithmHash-Join algorithm (Cont.)Handling of OverflowsCost of Hash-JoinExample of Cost of Hash-JoinHybrid Hash–JoinComplex JoinsOther OperationsOther Operations : AggregationOther Operations : Set OperationsOther Operations : Outer JoinEvaluation of ExpressionsMaterializationMaterialization (Cont.)PipeliningPipelining (Cont.)Slide 50Evaluation Algorithms for PipeliningEnd of ChapterFigure 13.2Slide 54Database System Concepts, 5th Ed.©Silberschatz, Korth and SudarshanSee for conditions on re-use Chapter 13: Query ProcessingChapter 13: Query Processing©Silberschatz, Korth and Sudarshan13.2Database System Concepts - 5th Edition, Aug 27, 2005.Chapter 13: Query ProcessingChapter 13: Query ProcessingOverview Measures of Query CostSelection Operation Sorting Join Operation Other OperationsEvaluation of Expressions©Silberschatz, Korth and Sudarshan13.3Database System Concepts - 5th Edition, Aug 27, 2005.Basic Steps in Query ProcessingBasic Steps in Query Processing1. Parsing and translation2. Optimization3. Evaluation©Silberschatz, Korth and Sudarshan13.4Database System Concepts - 5th Edition, Aug 27, 2005.Basic Steps in Query Processing Basic Steps in Query Processing (Cont.)(Cont.)Parsing and translationtranslate the query into its internal form. This is then translated into relational algebra.Parser checks syntax, verifies relationsEvaluationThe query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.©Silberschatz, Korth and Sudarshan13.5Database System Concepts - 5th Edition, Aug 27, 2005.Basic Steps in Query Processing : Basic Steps in Query Processing : OptimizationOptimizationA relational algebra expression may have many equivalent expressionsE.g., balance2500(balance(account)) is equivalent to balance(balance2500(account))Each relational algebra operation can be evaluated using one of several different algorithmsCorrespondingly, a relational-algebra expression can be evaluated in many ways. Annotated expression specifying detailed evaluation strategy is called an evaluation-plan.E.g., can use an index on balance to find accounts with balance < 2500,or can perform complete relation scan and discard accounts with balance  2500©Silberschatz, Korth and Sudarshan13.6Database System Concepts - 5th Edition, Aug 27, 2005.Basic Steps: Optimization (Cont.)Basic Steps: Optimization (Cont.)Query Optimization: Amongst all equivalent evaluation plans choose the one with lowest cost.  Cost is estimated using statistical information from the database cataloge.g. number of tuples in each relation, size of tuples, etc.In this chapter we studyHow to measure query costsAlgorithms for evaluating relational algebra operationsHow to combine algorithms for individual operations in order to evaluate a complete expressionIn Chapter 14We study how to optimize queries, that is, how to find an evaluation plan with lowest estimated cost©Silberschatz, Korth and Sudarshan13.7Database System Concepts - 5th Edition, Aug 27, 2005.Measures of Query CostMeasures of Query CostCost is generally measured as total elapsed time for answering queryMany factors contribute to time costdisk accesses, CPU, or even network communicationTypically disk access is the predominant cost, and is also relatively easy to estimate. Measured by taking into accountNumber of seeks * average-seek-costNumber of blocks read * average-block-read-costNumber of blocks written * average-block-write-costCost to write a block is greater than cost to read a block –data is read back after being written to ensure that the write was successful©Silberschatz, Korth and Sudarshan13.8Database System Concepts - 5th Edition, Aug 27, 2005.Measures of Query Cost (Cont.)Measures of Query Cost (Cont.)For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measurestT – time to transfer one blocktS – time for one seekCost for b block transfers plus S seeks b * tT + S * tS We ignore CPU costs for simplicityReal systems do take CPU cost into accountWe do not include cost to writing output to disk in our cost formulaeSeveral algorithms can reduce disk IO by using extra buffer space Amount of real memory available to buffer depends on other concurrent queries and OS processes, known only during executionWe often use worst case estimates, assuming only the minimum amount of memory needed for the operation is availableRequired data may be buffer resident already, avoiding disk I/OBut hard to take into account for cost estimation©Silberschatz, Korth and Sudarshan13.9Database System Concepts - 5th Edition, Aug 27, 2005.Selection OperationSelection OperationFile scan – search algorithms that locate and retrieve records that fulfill a selection condition.Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.Cost estimate = br block transfers + 1 seekbr denotes number of blocks containing records from relation rIf selection is on a key attribute, can stop on finding recordcost = (br /2) block transfers + 1 seekLinear search can be applied regardless of selection condition orordering of records in the file, or availability of indices©Silberschatz, Korth and Sudarshan13.10Database System Concepts - 5th Edition, Aug 27, 2005.Selection Operation (Cont.)Selection Operation (Cont.)A2 (binary search). Applicable if selection is an equality comparison on the

View Full Document
Download Query Processing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...

Join to view Query Processing and access 3M+ class-specific study document.

We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Query Processing 2 2 and access 3M+ class-specific study document.


By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?