DOC PREVIEW
GSU CIS 8040 - A simple guide to five normal forms in relational database theory.William Kent

This preview shows page 1-2 out of 6 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

COMPUTING PRACTICES ASlMPLE GUIDE TO FIVE NORMAL FORMS IN RELATIONAL DATABASE THEORY W|LL|AM KErr International Business Machines Corporation Author's Present Address: William Kent, International Business Machines Corporation, General Products Division, Santa Teresa Laboratory, San Jose, CA Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1983 ACM 0001-0782/83/ 0200-0120 75¢ 1. INTRODUCTION The normal forms defined in relational database theory represent guidelines for record design. The guidelines cor- responding to first through fifth normal forms are pre- sented, in terms that do not require an understanding of relational theory. The design guidelines are meaningful even if a relational database system is not used. We pres- ent the guidelines without referring to the concepts of the relational model in order to emphasize their generality and to make them easier to understand. Our presentation conveys an intuitive sense of the intended constraints on record design, although in its informality it may be impre- cise in some technical details. A comprehensive treatment of the subject is provided by Date [4]. The normalization rules are designed to prevent up- date anomalies and data inconsistencies. With respect to performance trade-offs, these guidelines are biased to- ward the assumption that all nonkey fields will be up- dated frequently. They tend to penalize retrieval, since data which may have been retrievable from one record in an unnormalized design may have to be retrieved from several records in the normalized form. There is no obli- gation to fully normalize all records when actual perform- ance requirements are taken into account. 2. FIRST NORMAL FORM First normal form [1] deals with the "shape" of a record type. Under first normal form, all occurrences of a record type must contain the same number of fields. First normal form excludes variable repeating fields and groups. This is not so much a design guideline as a matter of defini- tion. Relational database theory does not deal with rec- ords having a variable number of fields. 3. SECOND AND THIRD NORMAL FORMS Second and third normal forms [2, 3, 7] deal with the relationship between nonkey and key fields. Under sec- ond and third normal forms, a nonkey field must provide a fact about the key, the whole key, and nothing but the SUMMARY: The concepts behind the five principal normal forms in relational database theory are presented in simple terms. 120 Communications of the ACM February 1983 Volume 26 Number 2key. In addition, the record must satisfy first normal form. We deal now only with "single-valued" facts. A single- valued fact could be a one-to-many relationship such as the department of an employee or a one-to-one relation- ship such as the spouse of an employee. Thus, the phrase "Y is a fact about X" signifies a one-to-one or one-to- many relationship between Y and X. In the general case, Y might consist of one or more fields and so might X. In the following example, QUANTITY is a fact about the combination of PART and WAREHOUSE. 3.1 Second Normal Form Second normal form is violated when a nonkey field is a fact about a subset of a key. It is only relevant when the key is composite, i.e., consists of several fields. Consider the following inventory record. I PART I WAREHOUSE QUANTITY ] WAREHOUSE-ADDRESS ..... key .............. The key here consists of the PART and WAREHOUSE fields together, but WAREHOUSE-ADDRESS is a fact about the WAREHOUSE alone. The basic problems with this design are: • The warehouse address is repeated in every record that refers to a part stored in that warehouse. • If the address of the warehouse changes, every record referring to a part stored in that warehouse must be up- dated. • Because of the redundancy, the data might become in- consistent, with different records showing different ad- dresses for the same warehouse. • If at some point in time there are no parts stored in the warehouse, there may be no record in which to keep the warehouse's address. To satisfy second normal form, the record shown above should be decomposed into (replaced by) the two records: .... key ................... [WAREHOUSE[WAREHOUSE-ADDRESS ..... key ..... When a data design is changed in this way, i.e., replacing unnormalized records with normalized records, the process is referred to as normalization. The term "normalization" is sometimes used relative to a particular normal form. Thus, a set of records may be normalized with respect to second normal form but not with respect to third. The normalized design enhances the integrity of the data by minimizing redundancy and inconsistency, but at some possible performance cost for certain retrieval appli- cations. Consider an application that wants the addresses of all warehouses stocking a certain part. In the unnor- malized form, the application searches one record type. With the normalized design, the application has to search two record types and connect the appropriate pairs. 3.2 Third Normal Form Third normal form is violated when a nonkey field is a fact about another nonkey field, as in [ EMPLOYEE I DEPARTMENT I LOCATION ..... key ...... The EMPLOYEE field is the key. If each department is located in one place, then the LOCATION field is a fact about the DEPARTMENT--in addition to being a fact about the EMPLOYEE. The problems with this design are the same as those caused by violations of second normal form. • The department's location is repeated in the record of every employee assigned to that department. • If the location of the department changes, every such record must be updated. • Because of the redundancy, the data might become in- consistent, e.g., different records showing different loca- tions for the same department. • If a department has no employees, there may be no record in which to keep the department's location. To satisfy third normal form, the record shown above should be decomposed into the two records: EMPLOYEE DEPARTMENT DEPARTMENT LOCATION .... key .......... key ...... To summarize, a record is in second and third


View Full Document

GSU CIS 8040 - A simple guide to five normal forms in relational database theory.William Kent

Download A simple guide to five normal forms in relational database theory.William Kent
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view A simple guide to five normal forms in relational database theory.William Kent and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view A simple guide to five normal forms in relational database theory.William Kent 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?