New version page

# SJSU CMPE 226 - NORMALIZATION

Documents in this Course

28 pages

20 pages

10 pages

28 pages

43 pages

23 pages

42 pages

20 pages

36 pages

24 pages

37 pages

15 pages

3 pages

4 pages

71 pages

3 pages

4 pages

52 pages

5 pages

14 pages

33 pages

2 pages

42 pages

20 pages

8 pages

11 pages

5 pages

43 pages

19 pages

8 pages

8 pages

26 pages

3 pages

3 pages

14 pages

49 pages

3 pages

11 pages

6 pages

2 pages

4 pages

18 pages

3 pages

5 pages

2 pages

8 pages

2 pages

6 pages

24 pages

2 pages

45 pages

15 pages

5 pages

6 pages

8 pages

38 pages

3 pages

3 pages

45 pages

15 pages

## This preview shows page 1-2-3 out of 8 pages.

View Full Document
Do you want full access? Go Premium and unlock all 8 pages.
Do you want full access? Go Premium and unlock all 8 pages.
Do you want full access? Go Premium and unlock all 8 pages.

Unformatted text preview:

NORMALIZATIONBy MATTIE MACGREGORFIRST NORMAL FORM (1NF)UNIQUE IDENTIFIERSRELATIONSHIPSSECOND NORMAL FORM (2NF)REFINING RELATIONSHIPSTHIRD NORMAL FORM (3NF)LOGICAL DATA MODELING METHODOLOGYPHYSICAL DATABASE DESIGNNORMALIZATIONBy MATTIE MACGREGORThe aim of normalization is to remove certain kinds of data redundancy, and therefore avoid update anomalies.Update anomalies are difficulties with the insert, update and delete operations on a database due to the data structure.FIRST NORMAL FORM (1NF)An entity is said to be in first normal form (1NF) when all attributes are single-valued.To apply the first normal form to an entity, verify that each attribute in the entity has a single value for each instance of the entity. If any attribute has repeating values, it is not in 1NF..The data model with two entitiesUNIQUE IDENTIFIERSEach entity must have a unique identifier. An identifier is an attribute of an entity that meets the following rules:- It is unique across all instances of the entity. - It has a non-NULL value for each instance of the entity for the lifetime of the instance. - It has a value that never changes for the lifetime of the instance. The unique identifier is important because it designates which instance of an entity is being dealt with. Identifier selection is critical because it is also used to model relationships. If an identifier for an entity doesn't meet one of the above rules, it could affect the whole data model.Designers often invent an identifying attribute that has no other meaning except to serve as an identifying attribute. Because this attribute is invented and is completely unrelatedto the entity, the designer has full control over it and a guarantee that it meets the rules of unique identifiers.A unique identifier is diagrammed as an underlined or italicized attribute..The entities with unique identifiersRELATIONSHIPSEntity identifiers enable their relationships to be modeled.A relationship describes a binary association between two entities. A relationship may also exist between an entity and itself. Such a relationship is a recursive relationship. Each entity within a relationship describes and is described by the other.Each side of the relationship has two components: a name and a degree. Each side of the relationship has a name that describes the relationship. Degree, also referred to as cardinality, states how many instances of the describing entity must describe one instanceof the described entity.Degree is expressed using two different values: "one and only one" (1) and "one or many" (M).Sometimes it is helpful to express a relationship verbally. One way of doing so is to plug the various components of a direction of the relationship into the formula:Entity_A has [one and only one | one or many] Entity_BIn data models, a relationship can be shown by drawing a line between the two entities. Degree is expressed with a straight line for "one and only one" relationships or crows feetfor "one or many" relationships.A relationship between two entitiesSECOND NORMAL FORM (2NF)An entity is said to be in the second normal form (2NF) if it is already in 1NF and all non-identifying attributes are dependent on the entity's entire unique identifier.If any attribute is not dependent entirely upon the entity's unique identifier, that attribute has been misplaced and must be removed. Normalize these attributes either by finding the entity where it belongs or by creating an additional entity where the attribute should be placed..The data model with a new entity.TYPES OF RELATIONSHIPSThere are three main kinds of relationships, and it is important to determine both directions of the relationship when modeling the relationships between entities:- 1-to-1: Both sides of the relationship have a degree of one and only one. (1-to-1 relationships are rare.)- 1-to-M: One of the sides has a degree of one or many and the other side has a degree of one and only one.- M-to-M: Both sides of the relationship are one or many relationships.REFINING RELATIONSHIPSA 1-to-1 relationship may imply that two entities are the same entity. If they are, fold them into a single entity.There is often data to be captured about M-to-M relationships.Entity relationships in the data model.The data model below is another representation of this M-to-M relationship..The real relationship between these three entities in the data model.All many-to-many relationships should be resolved using the following technique:1. Create a new entity (also referred to as a junction entity). Name it appropriately. If you cannot think of an appropriate name for the junction entity, name it by combining the names of the two related entities.2. Relate the new entity to the two original entities. Each of the original entities should have a one-to-many relationship with the junction entity.3. If the new entity does not have an obvious unique identifier, inherit the identifying attributes from the original entities into the junction entity and make them together the unique identifier for the new entity..The data model in 2NF.In almost all cases, additional attributes belong in the new junction entity. If not, the many-to-many relationship still needs to be resolved. Otherwise, a problem will exist translating the data model into a physical schema.THIRD NORMAL FORM (3NF)An entity is said to be in the third normal form if it is already in 2NF and no non-identifying attributes are dependent on any other non-identifying attributes.Attributes that are dependent on other non-identifying attributes are normalized by moving both the dependent attribute and the attribute on which it is dependent into a new entity..The data model in 3NF.When a data model is in 3NF it has been normalized.There are other normal forms that have some value. For most design purposes, however, the third normal form is sufficient to guarantee a proper design.LOGICAL DATA MODELING METHODOLOGYTo review the methodology for completing a logical data model:1. Identify and model the entities. 2. Identify and model the relationships between the entities. 3. Identify and model the attributes. 4. Identify unique identifiers for each entity. 5. Normalize the data model. In practice, the process is rarely so linear. It is often tempting and appropriate to jump around between entities, relationships, attributes and unique identifiers.It is not as important to follow a strict process as it is to discover and capture all of the information necessary to correctly model

View Full Document