NormalizationBasicsBenefitsBrief Description of Normal FormsAdditional InformationAnomaliesTipsAdditional TipsQuestions to askExamples of Bad DatabasesProblems with the databaseIn PracticeOverall topicsFinal Motto:ReferencesNormalizationBy Albert Lin2BasicsProcess of efficiently organizing data in a database.GoalsEliminate redundant dataEnsure data dependency sensibility3BenefitsFaster sorting and index creationLarger clustered indexesNarrow and compact indexesFewer indexes per table.Fewer null valuesLess inconsistencies4Brief Description of Normal Forms1st Normal FormNo repeating groups2nd Normal FormEach column depends entirely on the primary key.3rd Normal FormEach column depends directly on the primary key.4th Normal FormIsolate independent multiple relationships.5th Normal FormIsolate Semantically related multiple relationships.Boyce-Codd Normal FormNon Trivial dependencies between candidate keys.Optimal Normal FormLimited to simple FactsDomain-Key Normal FormNo modification anomolies5Additional Information1st Normal FormAddresses the structure of an isolated table.2nd, 3rd and Boyce Codd Normal FormAddresses 1-1 and 1-many relationships.4th and 5th Normal Form Many-Many relationships.These forms are cumulative.6AnomaliesInsertion AnomaliesDuplication of data enforced impossible due to entity integrityDeletion AnomaliesLeads to undesired loss of data.Modification AnomaliesModification of multiple rows can lead to inconsistencies.Databases that are sufficiently normalized can reduce anomalies from occurring.7TipsMake a table for each list.Use non-meaningful primary keysEliminate Repeating GroupsEliminate Columns not dependent on Primary KeyEach table should have an identifierShould only store data for a single type of entity.Avoid null columns8Additional TipsRecords are free, new fields are expensive.Know when data requires duplicationUse referential integrity The higher normal form generally results in faster data retrieval.9Questions to askWhat data do you need?What are you going to do with the data?How are the data related?What is the future of the data?10Examples of Bad DatabasesSID Class Time LocationProf_ID0015 CS166 10:30MH226 342151205 CS146 12:00MH225 251230025 CS166 1:30 MH222 342150015 CS151 8:00 MH222 242152531 CS156 9:00 MH422 251241205 CS120 10:30DH251 124122522 CS046 8:00 MH224 342151523 CS140 5:00 MH422 251230015 CS140 3:00 MH422 3421511Problems with the databaseIn a large scale database, information would be repeated numerous times, resulting in redundant dataRetrieval of data would be difficult and longIndex creation would be difficult.12In PracticeMany databases are “de-normalized” to some degree.This is due to performance issues.It may require fewer joins and result in faster retrievals.However, before doing “de-normalization” performance issues must exist and de-normalization must dramatically improve it before introducing a suboptimal designA de-normalized table can be harder to update.13Overall topicsNormalization helps organization and speed of organizing a database, which can help a company produce a database system that is quick and easily accessible.The higher the normal form, the less chance anomalies will arise.In case of performance issues, de-normalization can be done in order to increase performance.Final Motto: Strive for Single Themed Tables.15ReferencesLitt, Steve. “Normalization”. 1996 http://www.troubleshooters.com/littstip/ltnorm.html“Rules of Data Normalization”. 2005. http://www.datamodel.org/NormalizationRules.html“Normalization” http://msdn.microsoft.com/library/default.asp?url=/library/en-us/createdb/cm_8_des_02_2oby.aspChung,Luke. “Database Normalization Tips”. FMS. 2001. http://www.fmsinc.com/tpapers/genaccess/databasenorm.htmlJanert, Phillip. “Practical Database”. IEEE. 2004. http://www.devx.com/ibm/Article/20859Reus, Bernhard. “Databases”. University of Sussex. 2004. http://www.informatics.sussex.ac.uk/users/bernhard/db2005/Slides/dbXIII.pdfWyllys, R.E. “Steps in Normalization” University of Texas at Austin. 2003.
View Full Document