Hiram CPSC 356 - Query Optimization

Unformatted text preview:

Query Optimization (CB Chapter 23.1-23.3)SQLSQL is Declarative, RA is ProceduralQuery ProcessingQuery OptimizationSlide 6More Detail of Query ProcessingSteps in Query ProcessingParts of OptimizationQuery DecompositionQuery Transformation (Selection)Query Transformation (Projection)Query Transformation (Join)Pushing Select ExampleQuery Processing ExampleSlide 16Execution PlansChoosing TransformationsHeuristics (Rules of Thumb)Consequences for SQL ProgrammerQuery Optimization(CB Chapter 23.1-23.3)CPSC 356 DatabaseEllen WalkerHiram College(Includes figures from Database Systems: An Application Oriented Approach 2ed by Kifer, Bernstein & Lewis, © Addison Wesley 2005)SQL•Widely used (only?) standard query language for relational databases•Once SEQUEL (Structured English QUEry Language), now Structured Query Language•Objectives–Easy to learn, easy to use–Create and modify the database and query from it•DDL defines•DML manipulatesSQL is Declarative, RA is Procedural•SQL Statements describe the desired results, but do not specify a sequence of operations to get those results•Relational Algebra expressions describe a specific sequence of operations to perform•To evaluate a SQL statement, it needs to be translated into (a computer implementation of) RA first!Query Processing•Query Processing is the translation of SQL into RA-like nested function calls•One query can have multiple translationsSELECT roomNo, HotelName FROM Room, Hotel WHERE HotelName = ‘Savoy’ and Room.hotelNo=Hotel.hotelNo;roomNo,HotelName ( hotelName=Savoy and Room.hotelNo=Hotel.hotelNo (Room x Hotel) )roomNo,HotelName ( Room.hotelNo=Hotel.hotelNo ( (hotelName=Savoy (Hotel)) x Room) )Query Optimization•Choose the translation that minimizes resource use (time, space)•The second translation below is better. (Why?)SELECT roomNo, HotelName FROM Room, Hotel WHERE HotelName = ‘Savoy’ and Room.hotelNo=Hotel.hotelNo;roomNo,HotelName ( hotelName=Savoy and Room.hotelNo=Hotel.hotelNo (Room x Hotel) )roomNo,HotelName ( (hotelName=Savoy (Hotel))  Room.hotelNo=Hotel.hotelNo Room)Query ProcessingUserSQLDecom-positionRelationalAlgebraOptimi-zationProcessingEngineResult(table) Efficient Rel. AlgebraMore Detail of Query ProcessingSteps in Query Processing•Query Decomposition (create relational algebra expression)•Query Optimization (create execution plan)•Code Generation •Query ExecutionParts of Optimization•Query Plan Generator–Comes up with viable relational algebra expressions to improve the initial naïve one•Cost Estimator–Estimates the cost (time / space) of each plan•Optimization–Choosing the plan with the lowest cost, or at least “reasonably cheap”Query Decomposition•Check Syntax•Build a relational algebra treeHotel RoomhotelName=Savoy Room.hotelNo=Hotel.hotelNoXQuery Transformation (Selection)•Select with multiple AND conditions can be sequence of selectshotelName=Savoy and Room.hotelNo=Hotel.hotelNo( …) = hotelName=Savoy (Room.hotelNo=Hotel.hotelNo ( …))•Order of Select operations doesn’t matter=  Room.hotelNo=Hotel.hotelNo ( hotelName=Savoy ( …))Query Transformation (Projection)•Extra intermediate projections don’t matterName(Name, Status(Student))= Name (Student)•Order of select and project doesn’t matterStatus=‘SR’(Status(Student)) = Status(Status=‘SR’(Student))Query Transformation (Join)•Push Select through Join–Replace a select on a cross-product with a join–Joins can be implemented at the lowest level more efficiently than “materialized cross-product”•Push Select through Product–If the attributes of the condition all belong to one table of the join, put the select on only the one table, so a smaller table is joined–Joins on smaller tables are faster than on larger ones •More rules pp. 640-642Pushing Select Example•Find all seniors that take CPSC356stu_id=id & crs=‘CPSC356’(Student x Transcript)•Separate the selectsstu_id=id ( crs=‘CPSC356’(Student x Transcript))•Push the inner selectstu_id=id (Student x (crs=‘CPSC356’(Transcript)))•Replace Select/Product by JoinStudent |x| stu_id=id (crs=‘CPSC356’(Transcript)))Query Processing ExampleQuery Processing ExampleExecution Plans•Add specific algorithms to each relational algebra step•Determine whether/how indices will be used•Add pipelining (not storing intermediate data) where possibleChoosing Transformations•Estimate cost of each tree based on–Table sizes–Numbers of distinct attribute values–Average number of tuples for selection condition–Methods used for join (e.g. indexed, hashed)•Choose lowest cost tree•Because estimates aren’t perfect, the absolute best tree might not be chosen!Heuristics (Rules of Thumb)•Perform Selection as early as possible–Unless doing it later lets you use an index•Combine X and Selection into join operation•Execute most restrictive Selections first•Perform Projection as early as possible•Compute common expressions once–Creating a view is a way to do this!Consequences for SQL Programmer•Using RA operations in SQL (e.g. explicit Join) constrains optimization–Good when “programmer knows best”–Bad when programmer prevents a better optimization•Intermediate tables (i.e. views) can constrain optimization–Use views to compute common subexpressions•When performance is substandard, tweaking the SQL can help! (Remember the


View Full Document

Hiram CPSC 356 - Query Optimization

Download Query Optimization
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Query Optimization and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Query Optimization 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?