Introduction to Database SystemsCSE 444Lecture 17: Relational Algebra1CSE 444 - Summer 2010Outline• Motivation and sets vs. bags• Relational Algebrag• Translation from SQL to the Relational Algebra• Read Sections 2.4, 5.1, and 5.22CSE 444 - Summer 2010The WHAT and the HOW• In SQL, we write WHAT we want to get form the data• The database system needs to figure out HOW to get the data we want• The passage from WHAT to HOW goes through the Relational Algebra3CSE 444 - Summer 2010SQL = WHATProduct(pid, name, price)Ph (ididt)Purchase(pid, cid, store)Customer(cid, name, city)SELECT DISTINCT x.name, z.nameFROM Product x, Purchase y, Customer z,y,WHERE x.pid = y.pid and y.cid = z.cid andx.price > 100 and z.city = ‘Seattle’It’s clear WHAT we want, unclear HOW to get it4CSE 444 - Summer 2010Relational Algebra = HOWgδProduct(pid, name, price)Ph (idid)T4(name name)Final answerx.name,z.namePurchase(pid, cid, store)Customer(cid, name, city)ΠT4(name,name)price>100 and city=‘Seattle’x.name,z.nameσT2( . . . .)T3(. . . )cid=cidT1(pid,name,price,pid,cid,store)pid=pidcid cidCustomerTemporary tablesT1, T2, . . .5ProductPurchaseCustomer,,Relational Algebra = HOWThe order is now clearly specified:• Iterate over PRODUCT…• …join with PURCHASE…• …join with CUSTOMER…• …select tuples with Price>100 and City=‘Seattle’…• …eliminate duplicates…•and that’s the final answer !•…and that s the final answer !6CSE 444 - Summer 2010Sets v.s. Bags• Sets: {a,b,c}, {a,d,e,f}, { }, . . .•Bags: {a, a, b, c}, {b, b, b, b, b}, . . .g{}{}Relational Algebra has two flavors:g• Over sets: theoretically elegant but limited• Over bags: needed for SQL queries + more efficient– Example: Compute average price of all productsWe discuss set semantics• We mention bag semantics only where needed7CSE 444 - Summer 2010Relational Algebra•Query languageassociated with relational model•Query languageassociated with relational model•Queries specified in an operational manner•Queries specified in an operational manner– A query gives a step-by-step procedure• Relational operators–Take one or two relation instances as argumentTake one or two relation instances as argument– Return one relation instance as result– Easy to compose into relational algebra expressionsCSE 444 - Summer 2010 8Relational Algebra (1/3)Five basic operators:• Union (∪) and Set difference (–)()()• Selection: : σcondition(S)– Condition is Boolean combination (∧,∨) of terms– Term is: attribute op constant, attr. op attr.– Op is: <, <=, =, ≠, >=, or > • Projection: πlist-of-attributes(S)• Cross-product or cartesian product (×)CSE 444 - Summer 2010 9Relational Algebra (2/3)Derived or auxiliary operators:• Intersection(∩), Division(R/S)()()•Join: R θS = σθ(R × S)• Variations of joinsj– Natural, equijoin, theta-join – Outer join and semi-join• Rename ρB1,…,Bn(S)CSE 444 - Summer 2010 10Relational Algebra (3/3)Extensions for bags•Duplicate elimination: δp• Group by: γ [Same symbol as aggregation]– Partitions tuples of a relation into “groups”• Sorting: τOther extensions•Aggregation: γ(min, max, sum, average, count)gg gγ(,,, g, )CSE 444 - Summer 2010 11Union and Difference•R1 ∪ R2•Example: p– ActiveEmployees ∪ RetiredEmployees•R1 –R2• Example:– AllEmployees – RetiredEmployeesBe careful when applying to bags!CSE 444 - Summer 2010 12Be careful when applying to bags!What about Intersection ?• It is a derived operator•R1 ∩ R2 = R1 –(R1 –R2)()• Also expressed as a join (will see later)•Examplep– UnionizedEmployees ∩ RetiredEmployeesCSE 444 - Summer 2010 13Sl tiSelection•Returns alltuplesthat satisfya condition•Returns all tuplesthat satisfy a condition• Notation: σc(R)•Examples•Examples– σSalary > 40000(Employee)–σ(Employee)σname = “Smith”(Employee)• The condition c can be–Boolean combination (∧∨)of termsBoolean combination (∧,∨) of terms– Term is: attribute op constant, attr. op attr.– Op is: <, <=, =, ≠, >=, or > 14CSE 444 - Summer 2010SSN Name Salary1234545 John 2000005423341 Smith 6000004352342 Fred 500000σSalary> 40000(Employee)SSN Name SalaryS5423341Smith 6000004352342 Fred 50000015CSE 444 - Summer 2010Projection• Eliminates columns• Notation: ΠA1 An(R)A1,…,An ()• Example: project social-security number and names:– ΠSSN, Name(Employee)– Output schema: Answer(SSN, Name)Semantics differs over set or over bagsCSE 444 - Summer 2010 16SSN Name Salary1234545 John 2000005423341John6000005423341John6000004352342 John 200000ΠName,Salary(Employee)Name SalaryJohn20000John20000John 6000017Set semantics: duplicate elimination automaticSSN Name Salary1234545 John 2000005423341John6000005423341John6000004352342 John 200000ΠName,Salary(Employee)Name SalaryJohn 20000John60000John60000John 2000018Bag semantics: no duplicate elimination; need explicit δCartesian Product• Each tuple in R1 with each tuple in R2• Notation: R1 ×R2• Example: – Employee × Dependents• Very rare in practice; mainly used to express joins19CSE 444 - Summer 2010Cartesian Product Example ElEmployeeName SSN John 999999999 T777777777Tony 777777777 Dependents E l SSNDEmployeeSSNDname 999999999 Emily 777777777 Joe Employee x Dependents Name SSN EmployeeSSN Dname John 999999999 999999999 EmilyJohn 999999999 777777777 Joe Tony 777777777 999999999 Emily 20Tony 777777777 777777777 Joe CSE 444 - Summer 2010Renaming• Changes the schema, not the instance• Notation: ρB1Bn(R)ρB1,…,Bn()• Example:– ρLastName,SocSocNo(Employee)LastName, SocSocNo– Output schema: Answer(LastName, SocSocNo)21CSE 444 - Summer 2010Different Types of Join•Theta-join:RθS=σθ(RxS)Thetajoin: R θS σθ(R xS)– Join of R and S with a join condition θ– Cross-product followed by selection θ• Equijoin: R θS = πA(σθ(R x S))– Join condition θ consists only of equalities–ProjectionπAdrops all redundantattributesProjection πAdrops all redundant attributes– By far most used join in practice•Naturaljoin:R S=π(σ(R x S))•Natural join: R S = πA(σθ(R x S))– Equijoin– Equality on all common attributes (names) in R and in S–Projection drops duplicate common attributes–Projection drops duplicate common attributes22Theta-Join ExampleAnonPatientPAnnonJobJage zip disease5498125heartAnonPatientPAnnonJobJjob age ziplawyer54981255498125heart20 98120
View Full Document