1Relational Database Design TheoryCPS 116Introduction to Database Systems2Announcements (Tue. Sep. 9) Homework #1 due in one week Need a help session this Friday or next Monday? Course project description available today Choice of “standard” or “open” One- to three-person team (approval needed beyond 3) Two milestones + demo/report Milestone #1 due in ~5½ weeks, right after fall break3MotivationSID name CID142 Bart CPS116142 Bart CPS114857 Lisa CPS116857 Lisa CPS130…… … How about a systematic approach to detecting and removing redundancy in designs? Dependencies, decompositions, and normal forms24Functional dependencies A functional dependency (FD) has the form X → Y, where X and Y are sets of attributes in a relation R X → Y means that whenever two tuples in R agree on all the attributes in X, they must also agree on all attributes in YXYZabca??………XYZabcab?………Must be b Could be anything5FD examplesAddress (street_address, city, state, zip) street_address, city, state → zip6Keys redefined using FD’sA set of attributes K is a key for a relation R if K → all (other) attributes of R That is, Kis a “super key”,py No proper subset of K satisfies the above condition That is, K is minimal37Reasoning with FD’sGiven a relation R and a set of FD’s F Does another FD follow from F? Are some of the FD’s in Fredundant (i.e., they follow (, yfrom the others)? Is K a key of R? What are all the keys of R?8Attribute closure Given R, a set of FD’s F that hold in R, and a set of attributes Z in R:The closure of Z (denoted Z+) with respect to F is the set of all attributes {A1, A2, …} functionally didbZ(h i ZAA)determined by Z(that is, Z →A1A2…) Algorithm for computing the closure Start with closure = Z If X → Y is in F and X is already in the closure, then also add Y to the closure Repeat until no more attributes can be added9A more complex exampleStudentGrade (SID, name, email, CID, grade)(Not a good design, and we will see why later)410Example of computing closure F includes: SID → name, email email → SID SID, CID → grade{CIDemail}+=?{ CID, email}= ? email → SID Add SID; closure is now { CID, email, SID } SID → name, email Add name, email; closure is now { CID, email, SID, name } SID, CID → grade Add grade; closure is now all the attributes in StudentGrade11Using attribute closureGiven a relation R and set of FD’s F Does another FD X → Y follow from F? Compute X+with respect to F If Y ⊆ X+, then X → Y follow from F Is K a key of R?12Rules of FD’s Armstrong’s axioms Reflexivity: If Y ⊆ X, then X → Y Augmentation: If X → Y, then XZ → YZ for any Z Transitivity: If X → Y and Y → Z, then X → Z Rules derived from axioms Splitting: If X → YZ, then X → Y and X → Z Combining: If X → Y and X → Z, then X → YZ) Using these rules, you can prove or disprove an FD given a set of FDs513Non-key FD’s Consider a non-trivial FD X → Y where X is not a super key Since X is not a super key, there are some attributes (say Z) that are not functionally determined by XThat b is always associated with a is recorded by multiple rows:redundancy, update anomaly, deletion anomalyXYZabc1abc2………14Example of redundancy StudentGrade (SID, name, email, CID, grade) SID → name, emailSID name email CID grade142 Bart [email protected] CPS116 B-142Btb t@[email protected] Milhouse [email protected] CPS116 B+857 Lisa [email protected] CPS116 A+857 Lisa [email protected] CPS130 A+456 Ralph [email protected] CPS114 C…… … … …15DecompositionSID name email142 Bart [email protected]@fox.comSID CID grade142 CPS116 B-142CPS114BSID name email CID grade…… … … … Eliminates redundancy To get back to the original relation:[email protected] Lisa [email protected] Ralph [email protected]…… …142CPS114B123 CPS116 B+857 CPS116 A+857 CPS130 A+456 CPS114 C…… …616Unnecessary decompositionSID name email142 Bart [email protected] Milhouse [email protected] Lisa [email protected] Ralph [email protected]…… …SID name142 BartSID email142 [email protected] Fine: join returns the original relation Unnecessary: no redundancy is removed, and now SID is stored twice!123 Milhouse857 Lisa456 Ralph……123 [email protected] [email protected] [email protected]……17Bad decompositionSID CID grade142 CPS116 B-142 CPS114 B123 CPS116 B+857 CPS116 A+857 CPS130 A+456CPS114CSID CID142 CPS116142 CPS114123CPS116SID grade142 B-142 B123BSID grade142 B-142 B123B456CPS114C…… …123CPS116857 CPS116857 CPS130456 CPS114……123B+857 A+857 A+456 C……123B+857 A+456 C……18Lossless join decomposition Decompose relation R into relations S and T attrs(R) = attrs(S) ∪ attrs(T) S = πattrs(S)( R ) T = πattrs(T)( R )Th d i i ill jid iiifThe decomposition is a lossless join decompositionif, given known constraints such as FD’s, we can guarantee that R = S ! T Any decomposition gives R ⊆ S ! T (why?) A lossy decomposition is one with R ⊂ S ! T719Loss? But I got more rows! “Loss” refers not to the loss of tuples, but to the loss of information Or, the ability to distinguish different original relationsNllSID CID gradeSID CID gradeNo way to tellwhich is the original relation142 CPS116 B-142 CPS114 B123 CPS116 B+857 CPS116 A+857 CPS130 A+456 CPS114 C…… …SID CID142 CPS116142 CPS114123 CPS116857 CPS116857 CPS130456 CPS114……SID grade142 B-142 B123 B+857 A+456 C……142 CPS116 B142 CPS114 B-123 CPS116 B+857 CPS116 A+857 CPS130 A+456 CPS114 C…… …20Questions about decomposition When to decomposeHow to come up with a correct decomposition (i eHow to come up with a correct decomposition (i.e., lossless join decomposition)21An answer: BCNF A relation R is in Boyce-Codd Normal Form if For every non-trivial FD X → Y in R, X is a super key That is, all FDs follow from “key → other attributes” When to decompose As long as some relation is not in BCNF How to come up with a correct decomposition Always decompose on a BCNF violation (details next))Then it is guaranteed to be a lossless join decomposition!822BCNF decomposition algorithm Find a BCNF violation That is, a non-trivial FD X → Y in R where X is not a super key of R Decompose R into R1and R2, where R1has
View Full Document