1Query Processing: A Systems ViewCPS 116Introduction to Database Systems2Announcements (November 13) Homework #3 sample solution available Homework #4 due in 1½ weeks3A query’s trip through the DBMSParserVa l i d a t o rSQL querySELECT title, SIDFROM Enroll, CourseWHERE Enroll.CID =Course.CID;Parse tree<SFW><select-list><from-list><where-cond><table><table><Query>……OptimizerExecutorResult<table><table>Enroll CoursePhysical planPROJECT (title, SID)MERGE-JOIN (CID)SCAN (Enroll)SCAN (Course)SORT (CID)Logical planπtitle, SIDσEnroll.CID = Course.CIDEnroll Course×24Parsing and validation Parser: SQL → parse tree Good old lex & yacc will do Detect and reject syntax errors Validator: parse tree → logical plan Detect and reject semantic errors• Nonexistent tables/views/columns?• Insufficient access privileges?• Type mismatches?– Examples: AVG(name), name + GPA, Student UNION Enroll Also•Expand *• Expand view definitions Information required for semantic checking is found in system catalog (contains all schema information)5Logical plan Nodes are logical operators (often relational algebra operators) There are many equivalent logical plansπtitleσStudent.name=“Bart” ∧ Student.SID = Enroll.SID ∧ Enroll.CID = Course.CID×EnrollCourse×StudentAn equivalent plan:πtitleEnroll.CID = Course.CIDEnrollCourseStudentStudent.SID = Enroll.SIDσname = “Bart”6Physical (execution) plan A complex query may involve multiple tables and various query processing algorithms E.g., table scan, index nested-loop join, sort-merge join, hash-based duplicate elimination… A physical plan for a query tells the DBMS query processor how to execute the query A tree of physical plan operators Each operator implements a query processing algorithm Each operator accepts a number of input tables/streams and produces a single output table/stream37Examples of physical plansPROJECT (title)INDEX-NESTED-LOOP-JOIN (CID)Inde on C(CID)PROJECT (title)MERGE-JOIN (CID)SELECT Course.titleFROM Student, Enroll, CourseWHERE Student.name = ‘Bart’AND Student.SID = Enroll.SID AND Enroll.CID = Course.CID; Many physical plans for a single query Equivalent results, but different costs and assumptions!)DBMS query optimizer picks the “best” possible physical planIndex on Enroll(SID)Index on Course(CID)Index on Student(name)INDEX-SCAN (name = “Bart”)INDEX-NESTED-LOOP-JOIN (SID)SCAN (Course)SORT (CID)MERGE-JOIN (SID)SCAN (Enroll)SORT (SID)SCAN (Student)FILTER (name = “Bart”)8Physical plan execution How are intermediate results passed from child operators to parent operators? Temporary files• Compute the tree bottom-up• Children write intermediate results to temporary files• Parents read temporary files Iterators• Do not materialize intermediate results• Children pipeline their results to parents9Iterator interface Every physical operator maintains its own execution state and implements the following methods: open(): Initialize state and get ready for processing getNext(): Return the next tuple in the result (or a null pointer if there are no more tuples); adjust state to allow subsequent tuples to be obtained close(): Clean up410An iterator for table scan State: a block of memory for buffering input R; a pointer to a tuple within the block open(): allocate a block of memory getNext()If bl k f Rh b d d h fi bl k f h If no block of Rhas been read yet, read the first block from the disk and return the first tuple in the block• Or the null pointer if R is empty If there is no more tuple left in the current block, read the next block of R from the disk and return the first tuple in the block• Or the null pointer if there are no more blocks in R Otherwise, return the next tuple in the memory block close(): deallocate the block of memory11An iterator for nested-loop joinR: An iterator for the left subtreeS: An iterator for the right subtree open()R.open(); S.open(); r = R.getNext();getNext()NESTED-LOOP-JOINRSgetNext()do {s = S.getNext();if (s == null) {S.close(); S.open(); s = S.getNext(); if (s == null) return null;r = R.getNext(); if (r == null) return null;}} until (r joins with s);return rs; close()R.close(); S.close();Is this tuple-based or block-based nested-loop join?12An iterator for 2-pass merge sort open() Allocate a number of memory blocks for sorting Call open() on child iterator getNext() If called for the first time• Call getNext() on child to fill all blocks, sort the tuples, and output a run• Repeat until getNext() on child returns null• Read one block from each run into memory, and initialize pointers to point to the beginning tuple of each block Return the smallest tuple and advance the corresponding pointer; if a block is exhausted bring in the next block in the same run close() Call close() on child Deallocate sorting memory and delete temporary runs513Blocking vs. non-blocking iterators A blocking iterator must call getNext()exhaustively (or nearly exhaustively) on its children before returning its first output tuple Examples: A non-blocking iterator expects to make only a few getNext() calls on its children before returning its first (or next) output tuple Examples:14Execution of an iterator tree Call root.open() Call root.getNext() repeatedly until it returns null Call root.close()) Requests go down the tree) Intermediate result tuples go up the tree) No intermediate files are
View Full Document