Functional Dependencies and NormalizationAdministriviaHomework 5: A Web ApplicationHomework 5: DB-based web appHomework 5: BearTunesReview: NormalizationDecomposing a RelationProblems Due to RedundancyPossible Efficiency ProblemsProblems Losing DataFunctional Dependencies help us…Where do FDs come from?Rules of InferenceAttribute ClosureBack to ReduncancyNormal FormsBoyce-Codd Normal Form (BCNF)Third Normal Form (3NF)Decomposition of a Relation SchemaExample (same as before)Slide 21Problems with DecompositionsLossless Decomposition (example)Lossy Decomposition (example)Lossless Join DecompositionsMore on Lossless DecompositionSlide 27Dependency Preserving DecompositionDependency Preserving Decompositions (Contd.)Decomposition into BCNFBCNF and Dependency PreservationWhat Does 3NF Achieve?Decomposition into 3NFMinimal Cover for a Set of FDsSummary of Schema RefinementFunctional Dependenciesand NormalizationR&G Chapter 19Lecture 26Science is the knowledge of consequences, and dependence of one fact upon another. Thomas Hobbes (1588-1679)Administrivia•Homework 5 available–Due a week from next Tuesday•Final exam 4 weeks from yesterdayHomework 5: A Web Application•In the beginning, web sites were only files–Web site reflected file system–Documents were static, usually HTML•Then came dynamic web applications–Generate HTML on the fly–Used CGI-Bin, Java servlets, executing code to emit HTML–Often programming code embedded in HTML•Most websites today are dynamic, web apps–Involve a programming language, and some kind of repository for holding information, i.e., a database.Homework 5: DB-based web app•Uses Ruby-on-Rails–very popular–super-quick development–lots of built-in features for web app dev•Has Postgres database at the backend–all data stored in 5 relations–Ruby moves data to/from database–Rails translates Ruby to HTMLHomework 5: BearTunes•Web front-end for music database•You will implement code in Ruby and RHTML•ExamplesReview: Normalization•Functional DependenciesX -> Y, if values of X are same, Y must be sameIf Y includes all attributes, X is a candidate key•Help us evaluate Design Tradeoffs:–Too few relations -> redundancy–Too many relations -> inefficiency–Way too many relations -> loss of information•If too much redundancy, decompose relationDecomposing a Relation•Redundancy can be removed by “chopping” the relation into pieces.•FD’s are used to drive this process.R W is causing the problems, so decompose SNLRWH into what relations?S N L R H123-22-3666 Attishoo 48 8 40231-31-5368 Smiley 22 8 30131-24-3650 Smethurst 35 5 30434-26-3751 Guldu 35 5 32612-67-4134 Madayan 35 8 40R W8 105 7Hourly_Emps2WagesProblems Due to Redundancy•Consider dependency X -> Y, with X not a candidate key•Update anomaly: –Several tuples can have same value for X. Can accidentally violate dependency when updating some records.•Insertion anomaly: –Can accidentally violate dependency when inserting new records, no way to ensure that new record doesn’t conflict with existing record.•Deletion anomaly: –Can lose information about dependency if delete last record having a certain value of X.Possible Efficiency Problems•Decompose too much, might need to rejoin tables to answer common queries•Also might need to join tables to check FDssid zip 53666 53688 53650 950029500295001Problems Losing Data•Decompose/Normalize too much, and can lose informationFunctional Dependencies help us…•…find candidate keys•…know when a table has redundancy•…know when a decomposition is efficientWhere do FDs come from?•They are properties of the real world•When designing database, must find them out based on the characteristics of real world things being modelled.•Database instance can show what FDs do not hold, but can’t really show what FDs are true•Given a set of FDs, can infer other FDs. •F+ = closure of F is the set of all FDs that are implied by F. (includes “trivial dependencies”)Rules of Inference•Armstrong’s Axioms (X, Y, Z are sets of attributes):–Reflexivity: If X Y, then X Y –Augmentation: If X Y, then XZ YZ for any Z–Transitivity: If X Y and Y Z, then X Z•These are sound and complete inference rules for FDs!–i.e., using AA you can compute all the FDs in F+ and only these FDs.•Some additional rules (that follow from AA):–Union: If X Y and X Z, then X YZ–Decomposition: If X YZ, then X Y and X ZAttribute Closure•Computing closure can be expensive. –Size of closure is exponential in # attrs!•Typically, just want to check if a given FD is in F+ •An efficient check:–Compute attribute closure of X (denoted X+) wrt F. X+ = Set of all attributes A such that X A is in F+•X+ := X•Repeat until no change: if there is an fd U V in F such that U is in X+, then add V to X+–Check if Y is in X+–Approach can also be used to find the keys of a relation.•If all attributes of R are in the closure of X then X is a superkey for R.•Q: How to check if X is a “candidate key”?Back to ReduncancyS N L R W H123-22-3666 Attishoo 48 8 10 40231-31-5368 Smiley 22 8 10 30131-24-3650 Smethurst 35 5 7 30434-26-3751 Guldu 35 5 7 32612-67-4134 Madayan 35 8 10 40Hourly_EmpsQ: Why was R W problematic, but S W not?Normal Forms•Q1: when is any refinement needed??!•A: If relation is in a normal form (BCNF, 3NF etc.):–we know that certain problems are avoided/minimized. –helps decide whether decomposing a relation is useful.•Role of FDs in detecting redundancy:–Consider a relation R with 3 attributes, ABC. •No (non-trivial) FDs hold: There is no redundancy here.•Given A B: If A is not a key, then several tuples could have the same A value, and if so, they’ll all have the same B value!•1st Normal Form – all attributes are atomic•1st 2nd (of historical interest) 3rd Boyce-Codd …Boyce-Codd Normal Form (BCNF)•Reln R with FDs F is in BCNF if, for all X A in F+–A X (called a trivial FD), or–X is a superkey for R.•In other words: “R is in BCNF if the only non-trivial FDs over R are key constraints.”•If R in BCNF, then every field of every tuple records information that cannot be inferred using FDs alone.•But, sometimes BCNF too restrictiveThird Normal Form (3NF)•Reln R with FDs F is in 3NF if,
View Full Document