Slide 1DatabasesRelational Database DesignMovies Database SchemaSlide 5Slide 6Slide 7Slide 8DesiredataApproachFunctional DependenciesFunctional DependenciesFunctional DependenciesDefinitionsApproachBCNF: Boyce-Codd Normal FormBCNF and RedundancyApproachBCNFOutline1. ClosureAdditional rulesExample2. Closure of an attribute setComputing the closure for AExampleUses of attribute set closures3. Extraneous Attributes4. Canonical CoverOutlineLoss-less DecompositionsDependency-preserving DecompositionsDependency-preserving DecompositionsOutlineBCNFAchieving BCNF SchemasExample 1Example 2-1Example 2-2Example 3OutlineBCNF may not preserve dependenciesBCNF may not preserve dependenciesOutline3NF3NF3NF and RedundancyDecomposing into 3NFOutlineBCNF and redundancyMulti-valued DependenciesOutline4NFComparing the normal formsDatabase design processRecapRecapRecapCMSC424: NormalizationInstructor: Amol Deshpande [email protected]Data Models◦Conceptual representation of the dataData Retrieval◦How to ask questions of the database◦How to answer those questionsData Storage◦How/where to store data, how to access itData Integrity◦Manage crashes, concurrency◦Manage semantic inconsistenciesDatabasesWhere did we come up with the schema that we used ?◦E.g. why not store the actor names with movies ?If from an E-R diagram, then:◦Did we make the right decisions with the E-R diagram ?Goals:◦Formal definition of what it means to be a “good” schema.◦How to achieve it.Relational Database DesignMovie(title, year, length, inColor, studioName, producerC#)StarsIn(movieTitle, movieYear, starName)MovieStar(name, address, gender, birthdate)MovieExec(name, address, cert#, netWorth)Studio(name, address, presC#)Movies Database SchemaMovie(title, year, length, inColor, studioName, producerC#, starName)<StarsIn merged into above>MovieStar(name, address, gender, birthdate)MovieExec(name, address, cert#, netWorth)Studio(name, address, presC#)Changed to:Is this a good schema ???Is this a good schema ???Title Year Length inColor StudioName prodC# StarNameStar wars 1977 121 Yes Fox 128 HamillStar wars 1977 121 Yes Fox 128 FisherStar wars 1977 121 Yes Fox 128 H. FordKing Kong 2005 187 Yes Universal 150 WattsKing Kong 1933 100 no RKO 20 FayIssues:1.Redundancy higher storage, inconsistencies (“anomalies”) update anomalies, insertion anamolies2.Need nulls Unable to represent some information without using nulls How to store movies w/o actors (pre-productions etc) ?Movie(title, year, length, inColor, studioName, producerC#, starName)Issues:3. Avoid sets - Hard to represent- Hard to queryMovie(title, year, length, inColor, studioName, producerC#, starNames)Title Year Length inColor StudioName prodC# StarNamesStar wars 1977 121 Yes Fox 128 {Hamill, Fisher, H. ford}King Kong 2005 187 Yes Universal 150 WattsKing Kong 1933 100 no RKO 20 FayName AddressFox Address1Studio2 Address1Universial Address2This process is also called “decomposition”Issues:4. Requires more joins (w/o any obvious benefits)5. Hard to check for some dependenciesWhat if the “address” is actually the presC#’s address ? No easy way to ensure that constraint (w/o a join).Split Studio(name, address, presC#) into:Studio1 (name, presC#) Studio2(name, address)???Name presC#Fox 101Studio2 101Universial 102Smaller schemas always good ????Smaller schemas always good ????movieTitle starNameStar Wars HamillKing Kong WattsKing Kong FayeIssues:6. “joining” them back results in more tuples than what we started with(King Kong, 1933, Watts) & (King Kong, 2005, Faye) This is a “lossy” decompositionWe lost some constraints/information The previous example was a “lossless” decomposition.Decompose StarsIn(movieTitle, movieYear, starName) into:StarsIn1(movieTitle, movieYear) StarsIn2(movieTitle, starName) ???movieTitle movieYearStar wars 1977King Kong 1933King Kong 2005Smaller schemas always good ????Smaller schemas always good ????No setsCorrect and faithful to the original design◦Avoid lossy decompositionsAs little redundancy as possible◦To avoid potential anomaliesNo “inability to represent information”◦Nulls shouldn’t be required to store informationDependency preservation◦Should be possible to check for constraintsDesiredataNot always possible. We sometimes relax these for: simpler schemas, and fewer joins during queries.Not always possible. We sometimes relax these for: simpler schemas, and fewer joins during queries.1. We will encode and list all our knowledge about the schema◦Functional dependencies (FDs) SSN name (means: SSN “implies” length)◦If two tuples have the same “SSN”, they must have the same “name” movietitle length ???? Not true. ◦But, (movietitle, movieYear) length --- True.2. We will define a set of rules that the schema must follow to be considered good◦“Normal forms”: 1NF, 2NF, 3NF, BCNF, 4NF, …◦A normal form specifies constraints on the schemas and FDs3. If not in a “normal form”, we modify the schema ApproachLet R be a relation schema and R and RThe functional dependency holds on R iff for any legal relations r(R), whenever two tuples t1 and t2 of r have same values for , they have same values for . t1[] = t2 [] t1[ ] = t2 [ ] Example:On this instance, A B does NOT hold, but B A does hold.Functional Dependencies141 53 7A BFunctional DependenciesDifference between holding on an instance and holding on all legal relationTitle Year holds on this instanceIs this a true functional dependency ? No.Two movies in different years can have the same name.Can’t draw conclusions based on a single instanceNeed to use domain knowledge to decide which FDs holdTitle Year Length inColor StudioName prodC# StarNameStar wars 1977 121 Yes Fox 128 HamillStar wars 1977 121 Yes Fox 128 FisherStar wars 1977 121 Yes Fox 128 H. FordKing Kong 1933 100 no RKO 20 FayFunctional dependencies and keys◦A key constraint is a specific form of a FD.◦E.g. if A is a superkey for R, then:A R◦Similarly for candidate keys and primary keys.Deriving FDs◦A set of FDs may imply other FDs◦ e.g. If A B, and B C, then clearly A C◦We will see a formal method for inferring this laterFunctional Dependencies1. A relation instance r satisfies a set of functional
View Full Document