CORNELL CS 632 - Dynamic Query Optimization - D1878997

Home> Schools> Cornell University> Computer Science (CS) > CS 632> Dynamic Query Optimization

CORNELL CS 632 - Dynamic Query Optimization

Pages 7

Download Save

Unformatted text preview:

Dynamic Query Optimization1. Introduction2. Background2.1 Query OptimizerFigure 1: the structure of query optimizer. [7]2.2 Cost Estimation3. Previous Work3.1 Parametric Query OptimizationParametric Query PlanEfficient Search Strategy3.2 Dynamic Query PlansModified cost modelTechniques to reduce the overhead3.3 Dynamic Re-OptimizationModified Query PlansRe-Optimization AlgorithmWhen to Re-OptimizeTechniques to control the overhead3.4 Comparison and Comments4. Future DirectionReference:Dynamic Query OptimizationLantian Zheng ([email protected])March 6, 19991. IntroductionThe ability to optimize queries is considered one of the key enabling technologies forrelational database. The query optimizer is a module that examines equivalent queryplans and chooses the plan that needs the least amount of cost. Obviously, the premise ofeffective optimization is to accurately estimate the cost of a plan before execution.Unfortunately, it is often impossible or too expensive to collect all the information aboutrelations and run-time environment which is necessary for accurate cost estimation. As aresult, an optimizer often produces sub-optimal plans. To solve this problem, dynamicquery optimization was proposed. In this survey, we will give an overview of thistechnology, discuss the various methods, and comment on previous and present research.Finally we examine some of the possible future research topics in this area.2. Background2.1 Query OptimizerQuery optimization can be viewed as a difficult search problem[6]. In order to solve the problem, we need to provide- A search space of plans : the set of all plans that can be used to answer a given query.- A cost estimation model, so that a cost may be assigned to each plan in the search space. - A search strategy, which can search through the execution space.According to [7], an abstraction structure of a query optimizer is shown as Figure 1.Figure 1: the structure of query optimizer. [7]- Rewriter. This module applies transformations to a given query and produces equivalent queries intended to be more efficient, e.g. replacement of views by their definition, flattening out of nested queries, and the like.- Planner. This module employs a search strategy that explores the space of access plans determined by the Algebraic Space and the Method-Structure Space modules for each RewriterPlannerAlgebraic spaceMethod-structure SpaceCost ModelSize-Distribution Estimatorquery produced in the previous stage. It compares these plans based on estimates of their cost derived by the Cost Model and the Size-Distribution Estimator modules and selects the overall cheapest one.2.2 Cost EstimationIn general, an optimizer uses some specified arithmetic formulas and the statistics of the involved relations to compute the estimate cost. Because a query is often optimized once and executed many times, some run-time parameters, such as buffer size and multi-programming level, are not known at compile-time.1 Moreover, some statistics might change between the compile-time and the run-time, and it’s too expensive to keep up-to-date statistics for a database which is frequently updated. These factors make it difficult to accurately estimate the cost of a query plan. The situation becomes even worse as we consider ADT or distributed and heterogeneous queries. Due to the inaccuracy of cost estimation, query plan produced by optimizer is often sub-optimal. Motivated by this problem, people have been interested in dynamically adjusting query plans to the cost-model data collected at start-up-time or run-time. Another way is posed in [8] to improve the overall efficiency of a static query plan: instead of finding the plan of the lowest cost based on the expected (assumed) values of various parameters, the goal of the query optimizer is to find the plan of the lowest expected cost (LEC). Since LEC plan is still a static query plan, it will show stable performance only if it’s close to the corresponding optimal plans of different parameter values. But this condition doesn’t always hold. In fact, we desire that every query processing can return result in tolerable response time. Dynamic query optimization introduces certain overhead in each query processing, and avoids extremely bad cases to guarantee stable performance. So it satisfies our need better than LEC query optimization. On the other hand, dynamic query optimization increases the complexity of implementation.3. Previous WorkThere are several different approaches for dynamic query optimization:- The first approach is to produce a combination of a number of different query plans, each of which is optimal for a given set of values of run-time parameters. At the start-up time, the accurate values of those run-time parameters are known and one specific plan is chosen to execute. This approach is represented by parametric query optimization[1] and dynamic query plan [2]. - The second approach is to produce only one query plan, like traditional static optimizer. The difference is that the correspondent assumptions are annotated with each plan and some statistics-collect operators are included. At run-time, the specifiedstatistics is collected and are compared with the annotated assumed values. Then we can decide if the current plan is sub-optimal and if it is necessary to re-optimize the query using new data. The re-optimization approach emerged in [10] and was further explored in [3]. 1 In this paper, compile-time and optimize-time are the same concept.- Another way is represented by the competition model of Antoshenkov [4]. In his approach, multiple query plan are executed in parallel and compete to each other. After a while, it may turn out that one of the plans is better than others, and then othersub-optimal plans will be killed. Since the progress of plans with different join orders are not comparable, this approach cannot be extended easily to the case where the join order for a complex query might be sub-optimal. Furthermore, some resources are wasted to run those sub-optimal candidates. So this approach is not as influential as the above two.In the next three sections, three most influential dynamic optimization algorithms are described and compared.3.1 Parametric Query OptimizationParametric Query PlanLet p to denote the parameters that can change, and P denotes the domain of p. The task of parametric query optimization is to optimize the cost for all

View Full Document


School:
Email:
New Password:
Confirm Password:

CORNELL CS 632 - Dynamic Query Optimization

Sign up for free to view:

Please select your school