1Relational Model & AlgebraCPS 116Introduction to Database Systems2Announcements Please read news: duke.cs.cps116 Please sign up for DB2 accounts and indicate your availability to attend discussion sessions (sign-up sheet is circulating) Lectures slides on Web “Notes” version available in hardcopy and online an hour before lecture Complete version available online after the lecture Homework #1 will be assigned next Tuesday Office hours: see course Web page Book update: Prentice-Hall sent something out yesterday3Relational data model A database is a collection of relations (or tables) Each relation has a list of attributes (or columns) Set-valued attributes not allowed Each attribute has a domain (or type) Each relation contains a set of tuples (or rows) Duplicate tuples are not allowed Simplicity is a virtue!24ExampleStudent CourseEnrollOrdering of rows doesn’t matter(even though the output isalways in some order)SID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1857 Lisa 8 4.3456 Ralph 8 2.3... ... ... ...CID titleCPS116 Intro. to Database SystemsCPS130 Analysis of AlgorithmsCPS114 Computer Networks... ...SID C ID142 CPS116142 CPS114123 CPS116857 CPS116857 CPS130456 CPS114... ...5Schema versus instance Schema (metadata) Specification of how data is to be structured logically Defined at set-up Rarely changes Instance Content Changes rapidly, but always conforms to the schema Compare to type and objects of type in a programming language6Example Schema Student (SID integer, name string, age integer, GPA float) Course (CID string, title string) Enroll (SID integer, CID integer)Instance { h142, Bart, 10, 2.3i, h123, Milhouse, 10, 3.1i, ...} { hCPS116, Intro. to Database Systemsi, ...} { h142, CPS116i, h142, CPS114i, ...}37Relational algebra operators Core set of operators: Selection, projection, cross product, union, difference, and renaming Additional, derived operators: Join, natural join, intersection, etc.RelOpRelOp8Selection Input: a table R Notation: σp(R) p is called a selection condition/predicate Purpose: filter rows according to some criteria Output: same columns as R, but only rows of R that satisfy p9Selection example Students with GPA higher than 3.0σGPA > 3.0( Student )σGPA > 3.0SID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1857 Lisa 8 4.3456 Ralph 8 2.3SID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1857 Lisa 8 4.3456 Ralph 8 2.3410More on selection Selection predicate in general can include any column of R, constants, comparisons such as =, ·, etc., and Boolean connectives ∧, ∨, and ¬ Example: straight A students under 18 or over 21σGPA ≥ 4.0 ∧ (age < 18 ∨ age > 21)( Student ) But you must be able to evaluate the predicate over a single row of the input table Example: student with the highest GPAσGPA ≥ all GPA in Student table( Student )11Projection Input: a table R Notation: πL( R ) L is a list of columns in R Purpose: select columns to output Output: same rows, but only the columns in L12Projection example ID’s and names of all studentsπSID, name( Student )πSID, nameSID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1857 Lisa 8 4.3456 Ralph 8 2.3SID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1857 Lisa 8 4.3456 Ralph 8 2.3513More on projection Duplicate output rows must be removed Example: student agesπage( Student )πageSID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1857 Lisa 8 4.3456 Ralph 8 2.3SID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1857 Lisa 8 4.3456 Ralph 8 2.314Cross product Input: two tables R and S Notation: R × S Purpose: pairs rows from two tables Output: for each row r in R and each row s in S, output a row rs (concatenation of r and s)15Cross product example Student × EnrollSID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1... ... ... ...SID CID142 CPS116142 CPS114123 CPS116... ...×SID name age GPA SID CID142 Bart 10 2.3 142 CPS116142 Bart 10 2.3 142 CPS114142 Bart 10 2.3 123 CPS116123 Milhouse 10 3.1 142 CPS116123 Milhouse 10 3.1 142 CPS114123 Milhouse 10 3.1 123 CPS116... ... ... ... ... ...616A note on column ordering The ordering of columns in a table is considered unimportant (so is the ordering of rows)SID name age GPA SID CID142 Bart 10 2.3 142 CPS116142 Bart 10 2.3 142 CPS114142 Bart 10 2.3 123 CPS116123 Milhouse 10 3.1 142 CPS116123 Milhouse 10 3.1 142 CPS114123 Milhouse 10 3.1 123 CPS116... ... ... ... ... ...SID CID SID name age GPA142 CPS116 142 Bart 10 2.3142 CPS114 142 Bart 10 2.3123 CPS116 142 Bart 10 2.3142 CPS116 123 Milhouse 10 3.1142 CPS114 123 Milhouse 10 3.1123 CPS116 123 Milhouse 10 3.1... ... ... ... ... ... That means cross product is commutative, i.e.,R × S = S × R for any R and S=17Derived operator: join Input: two tables R and S Notation: R pS p is called a join condition/predicate Purpose: relate rows from two tables according to some criteria Output: for each row r in R and each row s in S, output a row rs if r and s satisfy p Shorthand for18Join example Info about students, plus CID’s of their coursesStudent Student.SID = Enroll.SIDEnrollSID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1... ... ... ...SID CID142 CPS116142 CPS114123 CPS116... ...×SID name age GPA SID CID142 Bart 10 2.3 142 CPS116142 Bart 10 2.3 142 CPS114142 Bart 10 2.3 123 CPS116123 Milhouse 10 3.1 142 CPS116123 Milhouse 10 3.1 142 CPS114123 Milhouse 10 3.1 123 CPS116... ... ... ... ... ...Student.SID = Enroll.SIDUse table_name.column_name syntaxto disambiguate identically named columns from different input tables719Derived operator: natural join Input: two tables R and S Notation: R S Purpose: relate rows from two tables, and Enforce equality on all common attributes Eliminate one copy of common attributes Shorthand for πL( R pS ) L is the union of all attributes from R and S, with duplicate attributes removed p equates all attributes common to R and S20SID name age GPA SID CID142 Bart 10 2.3 142 CPS116142 Bart 10 2.3 142 CPS114142 Bart 10 2.3 123 CPS116123 Milhouse 10 3.1 142 CPS116123 Milhouse 10 3.1 142 CPS114123 Milhouse 10 3.1 123 CPS116... ... ... ... ... ...Student.SID = Enroll.SIDSID name age GPA142 Bart 10 2.3123 Milhouse 10 3.1... ... ... ...SID CID142 CPS116142 CPS114123 CPS116... ...Natural join example Student Enroll = π?( Student ?Enroll
View Full Document