1 Lectures 6-7: Database Design Friday, April 9 and Monday April 12, 2010 Dan Suciu -- 444 Spring 20102 Outline • Design theory: 3.1-3.4 Dan Suciu -- 444 Spring 20103 Schema Refinements = Normal Forms • 1st Normal Form = all tables are flat • 2nd Normal Form = obsolete • Boyce Codd Normal Form = will study • 3rd Normal Form = see book Dan Suciu -- 444 Spring 20104 First Normal Form (1NF) • A database schema is in First Normal Form if all tables are flat Name GPA Courses Alice 3.8 Bob 3.7 Carol 3.9 Math DB OS DB OS Math OS Student Name GPA Alice 3.8 Bob 3.7 Carol 3.9 Student Course Math DB OS Student Course Alice Math Carol Math Alice DB Bob DB Alice OS Carol OS Takes Course May need to add keys5 Relational Schema Design Person buys Product name price name ssn Conceptual Model: Relational Model: plus FD’s Normalization: Eliminates anomalies Dan Suciu -- 444 Spring 20106 Data Anomalies When a database is poorly designed we get anomalies: Redundancy: data is repeated Updated anomalies: need to change in several places Delete anomalies: may lose data when we don’t want Dan Suciu -- 444 Spring 20107 Relational Schema Design Anomalies: • Redundancy = repeat data • Update anomalies = Fred moves to “Bellevue” • Deletion anomalies = Joe deletes his phone number: what is his city ? Recall set attributes (persons with several phones): Name SSN PhoneNumber City Fred 123-45-6789 206-555-1234 Seattle Fred 123-45-6789 206-555-6543 Seattle Joe 987-65-4321 908-555-2121 Westfield One person may have multiple phones, but lives in only one city8 Relation Decomposition Break the relation into two: Name SSN City Fred 123-45-6789 Seattle Joe 987-65-4321 Westfield SSN PhoneNumber 123-45-6789 206-555-1234 123-45-6789 206-555-6543 987-65-4321 908-555-2121 Anomalies have gone: • No more repeated data • Easy to move Fred to “Bellevue” (how ?) • Easy to delete all Joe’s phone number (how ?) Name SSN PhoneNumber City Fred 123-45-6789 206-555-1234 Seattle Fred 123-45-6789 206-555-6543 Seattle Joe 987-65-4321 908-555-2121 Westfield9 Relational Schema Design (or Logical Design) Main idea: • Start with some relational schema • Find out its functional dependencies • Use them to design a better relational schema Dan Suciu -- 444 Spring 201010 Functional Dependencies • A form of constraint – hence, part of the schema • Finding them is part of the database design • Also used in normalizing the relations Dan Suciu -- 444 Spring 201011 Functional Dependencies Definition: If two tuples agree on the attributes then they must also agree on the attributes Formally: A1, A2, …, An B1, B2, …, Bm A1, A2, …, An B1, B2, …, Bm Dan Suciu -- 444 Spring 201012 When Does an FD Hold Definition: A1, ..., Am B1, ..., Bn holds in R if: ∀t, t’ ∈ R, (t.A1=t’.A1 ∧ ... ∧ t.Am=t’.Am ⇒ t.B1=t’.B1 ∧ ... ∧ t.Bn=t’.Bn ) A1 ... Am B1 ... nm if t, t’ agree here then t, t’ agree here t t’ R13 Examples EmpID Name, Phone, Position Position Phone but not Phone Position An FD holds, or does not hold on an instance: EmpID Name Phone Position E0045 Smith 1234 Clerk E3542 Mike 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer Dan Suciu -- 444 Spring 201014 Example Position Phone EmpID Name Phone Position E0045 Smith 1234 Clerk E3542 Mike 9876 ← Salesrep E1111 Smith 9876 ← Salesrep E9999 Mary 1234 Lawyer Dan Suciu -- 444 Spring 201015 Example EmpID Name Phone Position E0045 Smith 1234 → Clerk E3542 Mike 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 → Lawyer but not Phone Position Dan Suciu -- 444 Spring 201016 Example FD’s are constraints: • On some instances they hold • On others they don’t name category color department price Gizmo Gadget Green Toys 49 Tweaker Gadget Green Toys 99 Does this instance satisfy all the FDs ? name color category department color, category price Dan Suciu -- 444 Spring 201017 Example name category color department price Gizmo Gadget Green Toys 49 Tweaker Gadget Black Toys 99 Gizmo Stationary Green Office-supp. 59 What about this one ? name color category department color, category price Dan Suciu -- 444 Spring 201018 An Interesting Observation If all these FDs are true: name color category department color, category price Then this FD also holds: name, category price Why ?? Dan Suciu -- 444 Spring 201019 Goal: Find ALL Functional Dependencies • Anomalies occur when certain “bad” FDs hold • We know some of the FDs • Need to find all FDs, then look for the bad ones Dan Suciu -- 444 Spring 201020 Armstrong’s Rules (1/3) Is equivalent to Splitting rule and Combing rule A1 ... Am B1 ... Bm A1, A2, …, An B1, B2, …, Bm A1, A2, …, An B1 A1, A2, …, An B2 . . . . . A1, A2, …, An Bm Dan Suciu -- 444 Spring 201021 Armstrong’s Rules (2/3) Trivial Rule Why ? A1 … Am where i = 1, 2, ..., n A1, A2, …, An Ai Dan Suciu -- 444 Spring 201022 Armstrong’s Rules (3/3) Transitive Closure Rule If and then Why ? A1, A2, …, An B1, B2, …, Bm B1, B2, …, Bm C1, C2, …, Cp A1, A2, …, An C1, C2, …, Cp Dan Suciu -- 444 Spring 201023 A1 … Am B1 … Bm C1 ... Cp Dan Suciu -- 444 Spring 201024 Example (continued) Start from the following FDs: Infer the following FDs: 1. name color 2. category department 3. color, category price Inferred FD Which Rule did we apply ? 4. name, category name 5. name, category color 6. name, category category 7. name, category color, category 8. name, category price25 Example (continued) Answers: Inferred FD Which Rule did we apply ? 4. name, category name Trivial rule 5. name, category color Transitivity on 4, 1 6. name, category category Trivial rule 7. name, category color, category Split/combine on 5, 6 8. name, category price Transitivity on 3, 7 1. name color 2. category department 3. color, category price THIS IS TOO HARD ! Let’s see an easier way.26 Closure of a set of Attributes Given a set of attributes A1, …, An The closure, {A1, …, An}+ = the set of attributes B
View Full Document