1Relational Database Design TheoryPart ICPS 116Introduction to Database Systems2Announcements (September 11) Homework #1 due in one week Details of the course project and a list of suggested ideas will be available this Thursday3MotivationHow do we tell if a design is bad e gSID name CID142 Bart CPS116142 Bart CPS114857 Lisa CPS116857 Lisa CPS130…… …How do we tell if a design is bad, e.g.,StudentEnroll (SID, name, CID)? This design has redundancy, because the name of a student is recorded multiple times, once for each course the student is taking How about a systematic approach to detecting and removing redundancy in designs? Dependencies, decompositions, and normal forms4Functional dependencies A functional dependency (FD) has the form X → Y, where X and Y are sets of attributes in a relation R X → Y means that whenever two tuples in R agree on all the attributes in X, they must also agree on all attributes in YXYZabca??………XYZabcab?………Must be b Could be anything5FD examplesAddress (street_address, city, state, zip) street_address, city, state → zip zip → city, statezipstate→zip?zip, state→zip? This is a trivial FD Trivial FD: LHS ⊇ RHS zip → state, zip? This is non-trivial, but not completely non-trivial Completely non-trivial FD: LHS ∩ RHS = ∅6Keys redefined using FD’sA set of attributes K is a key for a relation R if K → all (other) attributes of R That is, Kis a “super key”,py No proper subset of K satisfies the above condition That is, K is minimal27Reasoning with FD’sGiven a relation R and a set of FD’s F Does another FD follow from F? Are some of the FD’s in Fredundant (i.e., they follow (, yfrom the others)? Is K a key of R? What are all the keys of R?8Attribute closure Given R, a set of FD’s F that hold in R, and a set of attributes Z in R:The closure of Z (denoted Z+) with respect to F is the set of all attributes {A1, A2, …} functionally didbZ(h i ZAA)determined by Z(that is, Z →A1A2…) Algorithm for computing the closure Start with closure = Z If X → Y is in F and X is already in the closure, then also add Y to the closure Repeat until no more attributes can be added9A more complex exampleStudentGrade (SID, name, email, CID, grade) SID → name, emailemail→SIDemail→SID SID, CID → grade(Not a good design, and we will see why later)10Example of computing closure F includes: SID → name, email email → SID SID, CID → grade{CIDemail}+=?{ CID, email}= ? email → SID Add SID; closure is now { CID, email, SID } SID → name, email Add name, email; closure is now { CID, email, SID, name } SID, CID → grade Add grade; closure is now all the attributes in StudentGrade11Using attribute closureGiven a relation R and set of FD’s F Does another FD X → Y follow from F? Compute X+with respect to F If Y ⊆ X+, then X → Y follow from F Is K a key of R? Compute K+with respect to F If K+contains all the attributes of R, K is a super key Still need to verify that K is minimal (how?)12Rules of FD’s Armstrong’s axioms Reflexivity: If Y ⊆ X, then X → Y Augmentation: If X → Y, then XZ → YZ for any Z Transitivity: If X → Y and Y → Z, then X → Z Rules derived from axioms Splitting: If X → YZ, then X → Y and X → Z Combining: If X → Y and X → Z, then X → YZ313Using rules of FD’sGiven a relation R and set of FD’s F Does another FD X → Y follow from F? Use the rules to come up with a proof Example:• F includes:SID → name, email; email → SID; SID, CID → grade• CID, email → grade?email → SID (given in F)CID, email → CID, SID (augmentation)SID, CID → grade (given in F)CID, email → grade (transitivity)14Non-key FD’s Consider a non-trivial FD X → Y where X is not a super key Since X is not a super key, there are some attributes (say Z) that are not functionally determined by XThat b is always associated with a is recorded by multiple rows:redundancy, update anomaly, deletion anomalyXYZabc1abc2………15Example of redundancy StudentGrade (SID, name, email, CID, grade) SID → name, emailSID name email CID grade142 Bart [email protected] CPS116 B-142Btb t@[email protected] Milhouse [email protected] CPS116 B+857 Lisa [email protected] CPS116 A+857 Lisa [email protected] CPS130 A+456 Ralph [email protected] CPS114 C…… … … …16DecompositionSID name email142 Bart [email protected]@fox.comSID CID grade142 CPS116 B-142CPS114BSID name email CID grade…… … … … Eliminates redundancy To get back to the original relation:[email protected] Lisa [email protected] Ralph [email protected]…… …142CPS114B123 CPS116 B+857 CPS116 A+857 CPS130 A+456 CPS114 C…… …17Unnecessary decompositionSID name email142 Bart [email protected] Milhouse [email protected] Lisa [email protected] Ralph [email protected]…… …SID name142 BartSID email142 [email protected] Fine: join returns the original relation Unnecessary: no redundancy is removed, and now SID is stored twice!123 Milhouse857 Lisa456 Ralph……123 [email protected] [email protected] [email protected]……18Bad decompositionSID CID grade142 CPS116 B-142 CPS114 B123 CPS116 B+857 CPS116 A+857 CPS130 A+456CPS114CSID CID142 CPS116142 CPS114123CPS116SID grade142 B-142 B123BSID grade142 B-142 B123B Association between CID and grade is lost Join returns more rows than the original relation456CPS114C…… …123CPS116857 CPS116857 CPS130456 CPS114……123B+857 A+857 A+456 C……123B+857 A+456 C……419Lossless join decomposition Decompose relation R into relations S and T attrs(R) = attrs(S) ∪ attrs(T) S = πattrs(S)( R ) T = πattrs(T)( R )Th d i i ill jid iiifThe decomposition is a lossless join decompositionif, given known constraints such as FD’s, we can guarantee that R = S ! T Any decomposition gives R ⊆ S ! T (why?) A lossy decomposition is one with R ⊂ S ! T20Loss? But I got more rows! “Loss” refers not to the loss of tuples, but to the loss of information Or, the ability to distinguish different original relationsNllSID CID gradeSID CID gradeNo way to tellwhich is the original relation142 CPS116 B-142 CPS114 B123 CPS116 B+857 CPS116 A+857 CPS130 A+456 CPS114 C…… …SID CID142 CPS116142 CPS114123 CPS116857 CPS116857 CPS130456
View Full Document