1SQL: Part ICPS 116Introduction to Database Systems2Announcements (September 13) Homework #1 due next Tuesday Do we need a help session on Monday? Course project assigned today Choice of “standard” or “open” One- to three-person teams Two milestones + demo/report Milestone #1 due in 4 weeks, right after fall break3SQL SQL: Structured Query Language Pronounced “S-Q-L” or “sequel” The standard query language supported by most commercial DBMS A brief history IBM System R ANSI SQL89 ANSI SQL92 (SQL2) ANSI SQL99 (SQL3) ANSI SQL 2003 (+OLAP, XML, etc.)4Creating and dropping tables CREATE TABLE table_name(…, column_nameicolumn_typei, …); DROP TABLE table_name; Examplescreate table Student (SID integer,name varchar(30) email varchar(30)name varchar(30), email varchar(30),age integer, GPA float);create table Course (CID char(10), title varchar(100));create table Enroll (SID integer, CID char(10));drop table Student;drop table Course;drop table Enroll;-- everything from -- to the end of the line is ignored.-- SQL is insensitive to white space.-- SQL is insensitive to case (e.g., ...Course... is equivalent to-- ...COURSE...)5Basic queries: SFW statement SELECT A1, A2, …, AnFROM R1, R2, …, RmWHERE condition; Also called an SPJ (select-project-join) query Equivalent (not really!) to relational algebra queryπA1, A2, …, An( σcondition(R1× R2× … × Rm))6Example: reading a table SELECT * FROM Student; Single-table query, so no cross product here WHERE clause is optional * is a short hand for “all columns”27Example: selection and projection Name of students under 18 SELECT name FROM Student WHERE age < 18; When was Lisa born? SELECT 2007 – ageFROM StudentWHERE name = ’Lisa’; SELECT list can contain expressions• Can also use built-in functions such as SUBSTR, ABS, etc. String literals (case sensitive) are enclosed in singlequotes8Example: join SID’s and names of students taking courses with the word “Database” in their titles SELECT Student.SID, Student.nameFROM Student, Enroll, CourseWHERE Student SID = Enroll SIDWHERE Student.SID = Enroll.SIDAND Enroll.CID = Course.CIDAND title LIKE ’%Database%’; LIKE matches a string against a pattern• % matches any sequence of 0 or more characters Okay to omit table_name in table_name.column_name if column_name is unique9Example: rename SID’s of all pairs of classmates Relational algebra query:πe1.SID, e2.SID( ρe1Enroll !e1.CID = e2.CID∧ e1.SID > e2.SIDρe2Enroll )SQLSQL:SELECT e1.SID AS SID1, e2.SID AS SID2FROM Enroll AS e1, Enroll AS e2WHERE e1.CID = e2.CIDAND e1.SID > e2.SID; AS keyword is completely optional10A more complicated example Titles of all courses that Bart and Lisa are taking togetherFROM Student sb, Student sl, Enroll eb, Enroll el, Course cSELECT c.titleTip: Write the FROM clause first, then WHERE, and then SELECTWHERE sb.name = ’Bart’ AND sl.name = ’Lisa’AND eb.SID = sb.SID AND el.SID = sl.SIDAND eb.CID = c.CID AND el.CID = c.CID;11Why SFW statements? Out of many possible ways of structuring SQL statements, why did the designers choose SELECT-FROM-WHERE? A large number of queries can be written using only lddselection, projection, and cross product (or join) Any query that uses only these operators can be written in a canonical form: πL(σp(R1× … × Rm))•Example: πR.A, S.B(R !p1S) !p2(πT.Cσp3T) =πR.A, S.B, T.Cσp1 ∧ p2 ∧ p3( R × S × T ) SELECT-FROM-WHERE captures this canonical form12Set versus bag semantics Set No duplicates Relational model and algebra use set semantics Bag Duplicates allowed Number of duplicates is significant SQL uses bag semantics by default313Set versus bag exampleπSIDEnrollEnrollSID CID142 CPS116142CPS114SID142123857456…SELECT SIDFROM Enroll;142CPS114123 CPS116857 CPS116857 CPS130456 CPS114……SID142142123857857456…14A case for bag semantics Efficiency Saves time of eliminating duplicates Which one is more useful? πGPAStudentG SELECT GPA FROM Student; The first query just returns all possible GPA’s The second query returns the actual GPA distribution Besides, SQL provides the option of set semantics with DISTINCT keyword15Forcing set semantics SID’s of all pairs of classmates SELECT e1.SID AS SID1, e2.SID AS SID2FROM Enroll AS e1, Enroll AS e2WHERE e1.CID = e2.CIDAND e1.SID > e2.SID;• Say Bart and Lisa both take CPS116 and CPS114 SELECT DISTINCT e1.SID AS SID1, e2.SID AS SID2...•With DISTINCT, all duplicate (SID1, SID2) pairs are removed from the output16Operational semantics of SFW SELECT [DISTINCT] E1, E2, …, EnFROM R1, R2, …, RmWHERE condition; For each t1in R1:For eacht2inR2:For each t2in R2: … …For each tmin Rm:If condition is true over t1, t2, …, tm:Compute and output E1, E2, …, Enas a rowIf DISTINCT is presentEliminate duplicate rows in output t1, t2, …, tmare often called tuple variables17SQL set and bag operations UNION, EXCEPT, INTERSECT Set semantics• Duplicates in input tables, if any, are first eliminated Exactly like set ∪, −, and ∩ in relational algebraUNION ALLEXCEPT ALLINTERSECT ALLUNION ALL, EXCEPT ALL, INTERSECT ALL Bag semantics Think of each row as having an implicit count (the number of times it appears in the table) Bag union: sum up the counts from two tables Bag difference: proper-subtract the two counts Bag intersection: take the minimum of the two counts18Examples of bag operationsBag1 Bag2fruitappleappleorangefruitappleorangeorangeBag1 UNION ALL Bag2Bag1 EXCEPT ALL Bag2Bag1 INTERSECT ALL Bag2fruitappleappleorangeappleorangeorangefruitapplefruitappleorange419Examples of set versus bag operations Enroll(SID, CID), ClubMember(club, SID) (SELECT SID FROM ClubMember)EXCEPT(SELECT SID FROM Enroll);•SID’ftd t h i lbbt ttki l•SIDs of students who are in clubs but not taking any classes (SELECT SID FROM ClubMember)EXCEPT ALL(SELECT SID FROM Enroll);• SID’s of students who are in more clubs than classes20Summary of SQL features covered so far SELECT-FROM-WHERE statements (select-project-join queries) Set and bag operationsgp) Next: how to nest SQL queries21Table expression Use query result as a table In set and bag operations, FROM clauses, etc. A way to “nest” queries Example: names of students who are in more
View Full Document