1 Lecture 17: Query execution Wednesday, May 12, 2010 Dan Suciu -- 444 Spring 20102 Outline of Next Few Lectures • Query execution • Query optimization Dan Suciu -- 444 Spring 2010Steps of the Query Processor Parse & Rewrite Query Select Logical Plan Select Physical Plan Query Execution Disk SQL query Query optimization Logical plan Physical plan 3Dan Suciu -- 444 Spring 2010 Example Database Schema Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price) View: Suppliers in Seattle CREATE VIEW NearbySupp AS SELECT sno, sname FROM Supplier WHERE scity='Seattle' AND sstate='WA' 4Dan Suciu -- 444 Spring 2010 Example Query Find the names of all suppliers in Seattle who supply part number 2 SELECT sname FROM NearbySupp WHERE sno IN ( SELECT sno FROM Supplies WHERE pno = 2 ) 5Dan Suciu -- 444 Spring 2010 Steps in Query Evaluation • Step 0: Admission control – User connects to the db with username, password – User sends query in text format • Step 1: Query parsing – Parses query into an internal format – Performs various checks using catalog • Correctness, authorization, integrity constraints • Step 2: Query rewrite – View rewriting, flattening, etc. 6Dan Suciu -- 444 Spring 2010 Rewritten Version of Our Query Original query: SELECT sname FROM NearbySupp WHERE sno IN ( SELECT sno FROM Supplies WHERE pno = 2 ) Rewritten query: SELECT S.sname FROM Supplier S, Supplies U WHERE S.scity='Seattle' AND S.sstate='WA’ AND S.sno = U.sno AND U.pno = 2; 7Dan Suciu -- 444 Spring 2010 Continue with Query Evaluation • Step 3: Query optimization – Find an efficient query plan for executing the query • A query plan is – Logical query plan: an extended relational algebra tree – Physical query plan: with additional annotations at each node • Access method to use for each relation • Implementation to use for each relational operator 8Dan Suciu -- 444 Spring 2010 Extended Algebra Operators • Union ∪, intersection ∩, difference - • Selection σ$• Projection π$• Join ⨝ • Duplicate elimination δ$• Grouping and aggregation γ$• Sorting τ$• Rename ρ$9Dan Suciu -- 444 Spring 2010 Logical Query Plan Suppliers Supplies sno = sno σ sscity=‘Seattle’ ∧sstate=‘WA’ ∧ pno=2 Π sname 10Dan Suciu -- 444 Spring 2010 Query Block • Most optimizers operate on individual query blocks • A query block is an SQL query with no nesting – Exactly one • SELECT clause • FROM clause – At most one • WHERE clause • GROUP BY clause • HAVING clause 11Dan Suciu -- 444 Spring 2010 Typical Plan for Block (1/2) R S join condition σ selection condition π fields join condition … SELECT-PROJECT-JOIN Query ... 12Dan Suciu -- 444 Spring 2010 Typical Plan For Block (2/2) γ fields, sum/count/min/max(fields) σhaving-ondition σ selection condition join condition … … 13Dan Suciu -- 444 Spring 2010 How about Subqueries? SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and not exists SELECT * FROM Supply P WHERE P.sno = Q.sno and P.price > 100 14 Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price)Dan Suciu -- 444 Spring 2010 How about Subqueries? SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and not exists SELECT * FROM Supply P WHERE P.sno = Q.sno and P.price > 100 15 Correlation !!Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price)Dan Suciu -- 444 Spring 2010 How about Subqueries? SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and not exists SELECT * FROM Supply P WHERE P.sno = Q.sno and P.price > 100 16 De-Correlation!SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and Q.sno not in SELECT P.sno FROM Supply P WHERE P.price > 100 Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price)Dan Suciu -- 444 Spring 2010 How about Subqueries? 17 Un-nesting!SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and Q.sno not in SELECT P.sno FROM Supply P WHERE P.price > 100 (SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’) EXCEPT (SELECT P.sno FROM Supply P WHERE P.price > 100) Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price)Dan Suciu -- 444 Spring 2010 How about Subqueries? Supply σsstate=‘WA’ Supplier σPrice > 100 18 (SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’) EXCEPT (SELECT P.sno FROM Supply P WHERE P.price > 100) −"Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price) Finally…!Dan Suciu -- 444 Spring 2010 Physical Query Plan • Logical query plan with extra annotations • Access path selection for each relation – Use a file scan or use an index • Implementation choice for each operator • Scheduling decisions for operators 19Dan Suciu -- 444 Spring 2010 Physical Query Plan Suppliers Supplies sno = sno σ sscity=‘Seattle’ ∧sstate=‘WA’ ∧ pno=2 π sname (File scan) (File scan) (Nested loop) (On the fly) (On the fly) 20 Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price)Dan Suciu -- 444 Spring 2010 Final Step in Query Processing • Step 4: Query execution – How to synchronize operators? – How to pass data between operators? • What techniques are possible? – One thread per process – Iterator interface – Pipelined execution – Intermediate result materialization 21Dan Suciu -- 444 Spring 2010 Iterator Interface • Each operator implements this interface • Interface has only three methods • open() – Initializes operator state – Sets parameters such as selection condition • get_next() – Operator invokes get_next() recursively on its inputs – Performs processing and produces an output tuple • close(): cleans-up state 22Dan Suciu -- 444 Spring 2010 Pipelined Execution • Applies parent operator to tuples directly as they are produced by child operators • Benefits – No operator synchronization issues – Saves cost of writing intermediate data to disk – Saves cost of reading
View Full Document