DOC PREVIEW
U of I CS 511 - LECTURE - System R

This preview shows page 1-2-15-16-31-32 out of 32 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 32 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Lecture 3 System R Sept 1 2006 ChengXiang Zhai Most slides are adapted from Kevin Chang s lecture slides CS511 Advanced Database Management Systems 1 System R System R 1974 1978 IBM San Jose Labs lots of PhD researchers Gray coming from OS first CS PhD of Berkeley lots of influence in RSS ACM SIGMOD Innovation Award 1992 Turing Award 1998 Won Kim is UIUC alum Dissertation Query Processing for Relational Database Systems CS511 Advanced Database Management Systems 2 INGRES INGRES 1973 1977 U C Berkeley faculty graduate students Mike Stonebraker then an asst prof ACM SIGMOD Innovation Award 1991 Eugene Wong Postgres PostgreSQL CS511 Advanced Database Management Systems 3 System R and INGRES Gray Jim Gray see System R 25th Reunion page Hostility developed between the San Jose IBM group and the Berkeley group because they were working on very very similar things and had very very similar ideas As a consequence we came to the conclusion that the best thing was not to talk to each other The Berkeley folks thought the IBM guys were ripping off ideas from the INGRES project We had a strained relationship CS511 Advanced Database Management Systems 4 Joint ACM Software System Award 1988 System R Donald Chamberlin James Gray Raymond Lorie Gianfranco Putzolu Patrici Selinger Irving Traiger INGRES Gerald Held Michael Stonebraker Eugene Wong Citation The INGRES and System R systems demonstrated that a practical and efficient database management system DBMS could be implemented based on the relational data model These systems were full function DBMS s that supported non procedural query languages QUEL and SQL automatic query optimization alternative storage structures transactions crash recovery views integrity and protection They have revolutionized the database system industry by showing how data stored in a computer can be conveniently accessed by end users and while at the same time it can be used by production application programs http awards acm org software system CS511 Advanced Database Management Systems 5 Contributions of System R CS511 Advanced Database Management Systems 6 Contributions of System R Bringing theory to practice nice theory implemented into practical system High level query language SQL Codd s relational algebra calculus were criticized as too mathematical System research in action macro design a complete system architecture micro identify key problems and provide solutions Defining database landscape industry product spec and research directions CS511 Advanced Database Management Systems 7 Complete System Study Phase 0 1974 1975 initial single user prototype try out ideas and find issues felt a good idea to plan to throw away ver 1 0 Phase 1 1976 1977 full function multi user prototype Phase 2 1978 1979 evaluation and feedback lots of good lessons learned Very similar process took place in INGRES CS511 Advanced Database Management Systems 8 Relational System Modules subsystems CS511 Advanced Database Management Systems 9 System Modules Identified view management query parser rewriter query optimizer query executor data storage access methods buffer manager lock manager log recovery system CS511 Advanced Database Management Systems 10 System R Architecture RDS RSS divide remains in many systems RDS query processing logical view query parser rewriter optimizer executor RSS storage access methods physical storage access methods buffer manager lock log recovery CS511 Advanced Database Management Systems 11 Views View defined as a query another consistent use of SQL no separate DDL Query on views query rewriter to flatten view unfold def form a composite query tree View transparency Almost any queries on any views Not fully transparent though update only for single relation views no right meanings in some cases many to one nature of view def ambiguity even none to one some view state has no correspondence CS511 Advanced Database Management Systems 12 SQL as Query Language High level declarative English based language declarative language what not how well founded simple semantics based on relational algebra small set of well understood operators so optimizer knows how operators can be interchanged transformed what equivalent implementations are for each op Consistent for different functionalities data definition e g table creation view definition data manipulation e g queries updates Uniform for different usage scenarios embedding from different host languages canned queries ad hoc user queries from command lines Unexpected benefit Standardized DB interface mid 80 s CS511 Advanced Database Management Systems 13 What makes SQL possible Query parse access path selection code gen execute Cost based access path selection optimization Pre compilation for canned queries remove preprocessing optimization from run time data indexes and statistics may change reoptimize and recompile by observing dependencies alternative approaches caching of recent used query plans trigger to invalidate cached plans on relevant events e g on rebuilding system statistics on index creation in contrast interpreted QUEL in INGRES admitted mistake CS511 Advanced Database Management Systems 14 Query Optimizer Cost Based Cost based optimizer set up paradigm largely unchanged since Cost model C weighted sum CPU time IO CPU time modeled as number of RSS calls CS511 Advanced Database Management Systems 15 Query Optimizer Access Path Selection Access path selection based on expected costs select people where job programmer and city champaign path 1 job index check city path 2 city index check job more paths Data independence what are hidden from users Cost estimation based on index selectivity job programmer more selective or city chamapign index clustering records of same neighboring key are packed physically together minimize IO to fetch records of same key or a range a relation can typically has at most one clustering index Why CS511 Advanced Database Management Systems 16 Query Optimizer Join Strategies To evaluate R a S a Nested loop join for each tuple r in R use index fetch S tuples s s t s a r a Q B tree or hashing index better Sort merge join sort R sort S merge tuples in R and S in order Q Use B tree index to speed up Hashing All joins two way n ary joins as binary trees prune away lots of alternative plans for n ary joins CS511 Advanced Database Management Systems 17 Storage Phase 0 Tuples have TID containing page number direct access by TID to fetch the page Tuples contain pointers to values in


View Full Document

U of I CS 511 - LECTURE - System R

Download LECTURE - System R
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view LECTURE - System R and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view LECTURE - System R and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?