Hiram CPSC 356 - Introduction to Normalization

Unformatted text preview:

Introduction to NormalizationBuilding a SchemaWhat Makes A Good Schema?Tracking Real Estate StaffRedundancy-caused AnomaliesSolving Redundancy ProblemsHow to Decompose?Functional DependenciesExamples of Functional DependenciesWhat are the dependencies?Finding Dependencies in DataCharacteristics of Functional Dependencies for NormalizationKeys & Functional DependencyManipulating Functional DependenciesArmstrong’s Inference Rules for Manipulating DependenciesAdditional Inference RulesWhen are two sets of FDs equivalent?Finding the ClosureClosure ExampleEquivalence TestFinding a KeyWhat is Normalization?Introduction to NormalizationCPSC 356 DatabaseEllen WalkerHiram CollegeBuilding a Schema•Start with a list of all attributes, considered as if you had a giant flat database (one relation) with all possible information in one place•Divide the attributes into multiple relations–Intuitively–According to formal rules (normalization)•This is a formalized alternative to the algorithms we learned beforeWhat Makes A Good Schema?1. Each relation should have clear semantics, i.e. can be easily described in a few words2. Try to avoid redundancy (to minimize storage space, but also to avoid anomalies)3. Avoid a design that encourages too many NULL values in a relation. NULL can be ambiguous: N/A vs. unknown vs. not-yet-entered, etc.4. Don’t split related attributes so that the relationship between them is lost (e.g. make sure LastName and UserID are both in the same relation)Tracking Real Estate Staf•Consider a single relation for real estate•It contains branch name, branch number, staf name, staf number, staf salary, etc.–One entry for each staf member of each branch–Branch information is repeated for diferent staf (REDUNDANCY!)–Staf information is repeated if they work in multiple branches (REDUNDANCY!)•This is an example of what NOT to doRedundancy-caused Anomalies•Insertion Anomalies–A branch with no staf has many NULLs–Entering a new staf member has NULL branch info•But branch number and staf number are both part of primary key! (Why)•Deletion Anomalies–When the last staf member at a branch is deleted, the branch info is lost•Update Anomalies–If we make a change in branch info once, it must be changed in all copies (for all staf).Solving Redundancy Problems•Decompose the relation into multiple relations•Use Foreign Keys so the complete relation can be reconstructed through a join–Branch: has branch number & branch info–Staf: has staf number, staf info & branch number as foreign key•Foreign Keys are exactly the attributes that are in the primary key of the other relation•Insertion, deletion & update anomalies are gone! –Consider: add branch with no staf, remove last staf member, update branch infoHow to Decompose?•Decompositions are not (always) intuitively obvious•Codd discovered mathematical properties (called Normal Forms) that describe “goodness” of decomposition•First, Second, Third normal forms decrease redundancy without loss of information•BCNF, Fourth and Fifth normal forms potentially introduce information loss (we will see…)•To understand normal forms, start with functional dependenciesFunctional Dependencies•If A and B are attributes, and every value of A is associated with exactly one value of B (so knowing A predicts B), then B is functionally dependent on A (We write this as: A->B) •Functional dependency is based on the semantics (meaning) of the attributes.•A->B and B->A are two diferent constraints–Email -> first name is a valid dependency–First name -> Email is not a valid dependencyExamples of Functional Dependencies•US Zip Code -> State•US Area Code -> State•Email -> Firstname, lastname•HotelNo, RoomNo -> Price•JobTitle, ServiceLength -> SalaryWhat are the dependencies?•item place customer-name•ring Kay jewelers prince charming•ring walmart miss piggy•Place -> item?•Item -> place?• oil walmart tin manFinding Dependencies in Data•If a value of attribute A is associated with two or more values of B, then it is not true that A->B.•If a value of attribute A is associated with exactly one value of B, then it might be true that A->B. •Only when every possible value of attribute A is associated with exactly one value of B is it true that A->B.Characteristics of Functional Dependencies for Normalization•For any given values of the attributes on the left, there is exactly one possible attribute on the right•No future data will ever invalidate the dependency •Dependency is nontrivial -- no attributes from the left are repeated on the rightKeys & Functional Dependency•Remember, a candidate key is a subset of attributes that is (guaranteed) unique for every tuple•Therefore, a valid candidate key determines all other attributes in the tuple•Therefore, there is a functional dependency from the candidate key to all other non-key attributes of the relation.•(Since the primary key is a candidate key, these arguments can also be made for primary keys)Manipulating Functional Dependencies•Given a set of dependencies, derive more dependencies using inference rules•The closure X+ of a set of dependencies is the set of all possible dependencies that can be derived from it.Armstrong’s Inference Rules for Manipulating Dependencies1. if Y is a subset of X, then X -> YAlternatively: X,Y -> X(Reflexive)1. If X->Y then X,Z->Y,Z(Augmentation)2. If X->Y and Y->Z then X->Z(Transitive)Additional Inference Rules4. A->A (Self-determination)5. If A->B,C then A->B and A->C (Decomposition)6. If A->B and A->C then A->B,C (Union)7. If A->B and C->D then A,C -> B,D (Composition)When are two sets of FDs equivalent?•When we can use inference rules to transform A to B , then A and B are equivalent•Problem: it might take a long time to find the right set of inference rules•What we need is a “standard form” of FD’s - then we can just compareFinding the Closure•F is a set of functional dependencies (e.g. the obvious ones from primary keys) We want to find X+, which is the set of all attributes that are dependent on X (based on F).X+ = Xrepeat for each dependency Y->Z in F do if Y is a subset of X+ then X+ = X+ union Zuntil no more can be added to X+Closure Example•F is the following set of dependencies:A->B,C C->D A,D -> F•What is A+ (all attributes that can be derived from A)?–Initialize A+ = A–Because A->B,C add B,C to A+–Because C is in A+ and


View Full Document

Hiram CPSC 356 - Introduction to Normalization

Download Introduction to Normalization
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Introduction to Normalization and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Introduction to Normalization 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?