Unformatted text preview:

Database table design Single table vs. multiple tablesSlide 2Good things for single table design!Are there any disadvantages with this single table design?Why does the primary key consist of two fields?Redundancy!Problems due to RedundancyWhy waste time?Update anomaliesINSERTION anomalyDeletion anomalyUpdate anomalySlide 13To summarizeHow to address these issues?How to break information into multiple tables?Any information loss due to decomposing of a single big relation into multiple relations?Welcome to CSCI342Database table designSingle table vs. multiple tablesSen Zhang•Why does any nontrivial relational database have many tables?–E-R model design: multiple constructs on ERD will be mapped to multiple tables.–This slides will give you a less formal, but more intuitive explanation.Good things for single table design!•We would like to consider to organize all information into one single flat table.•The good things are obvious for such a single table design: –Simple and straightforward: one for all.–Everything can be found within one table.•You can issue simple select command to retrieve almost all information you need based on this one big single table. •Of course, it can be easily done, because you do not need to look at another table.•So, single table design simplifies query answering.Are there any disadvantages with this single table design?Ssn Name courseid Course name instructorinstructorID8766 John 109 C language James1118766 John 242 Database Stephen1121334 Bill 242 Database Stephen1121111 david 230 Image processingJames 1111345 Bill 242 Database Stephen112What this table tells us is the information about which courses have been enrolled by which students; other fields tell us other satellite information such as instructor’s name for each course. Underscored fields SSN and courseid together indicates a compound key of the table!Why does the primary key consist of two fields?•Neither SSN nor COURSEID alone can suffice to serve as the primary key of the table.–SSN alone does not have a unique value for each row. –COURSEID alone neither.–So we have to find a new primary key - in this case it has to be a compound key since no single attribute can uniquely identify a row. The new primary key is a compound key (COURSEID + SSN).Redundancy!•First of all, it contains redundant data. •For example, not only a student’s ID information, but also his/her name has to be repeated for every course he/she enrolls.•Similarly, the same situation exists for every course.Problems due to Redundancy•Every repetition of the same information is wasting storage space and liable to produce inconsistencies. –Waste space: it is obvious! The wasted space can be easily calculated!–Easily cause inconsistencies. For example, C Language could appear as c language in a different row, but they are supposed to be exactly same. Since same information appear in multiple places, it demands more effort to keep the same information consistent! –Also waste time!Why waste time?•OLTPs are designed for optimal transaction speed. •When a consumer makes a purchase online, they expect the transactions to occur instantaneously. •A database design should record the new data, changes by affecting the least information.Update anomalies•Furthermore, redundant data is the main cause of insertion, deletion, and updating anomalies, what together are called update anomalies. •Update anomalies are problems that arise when information is inserted, deleted, or updated. –Insertion anomaly –Update anomaly –Deletion anomalyINSERTION anomaly•With the primary key including courseid, we cannot enter a new student until they have at least one course to study. We are not allowed NULLs in the primary key so we must have a pair of key value in both SSN and COUSEID before we can create a new record. –For example, a new student (1234) who just enrolls in the college but has not registered with any courses yet cannot be added into the table until he/she registers the first course. The primary key is a compound key (ssn# + courseid#).•This is known as the insertion anomaly. It is difficult to insert new records into the database. •On a practical level, it also means that it is difficult to keep the data up to date.Deletion anomaly•If a course is enrolled by only students, and that student needs to be deleted, then not only is the information about student but also information about the course will disappear. (But what we desire is that, any course should be recorded somewhere , even no any student enrolls it.)•For example, If all of the records for student `8766 ' were deleted from the table, we would inadvertently lose all of the information on the course ‘109’ C Language. Because the only student who registers 109 is 8766, if 8766 is deleted, 109 disappears. Again this problem arises from the need to have a compound primary key. Because we cannot simply keep 109 by replacing 8766 with NULL, remember, SSN and course ID both contribute to key, which does not allow NULL.•This would be the same for any student who was studying only one course and the course was deleted, the student which is supposed to be kept in the table has to be deleted.Update anomaly–If the student 8766’s name was misspelled and we want to update his name, multiple rows (depends how many courses he/she has enrolled with wrong names!) would have to be updated with this new information. •Update anomalies–Traditionally, “update” is an umbrella name for update as well as insert and delete.–The above anomalies are mainly analyzed from the point of view of the needs to update students information. –All the above anomalies happen as well if it is courses that are of concern.To summarize•Why there are insert anomaly?•it may not be possible to store some information unless some other information is stored as well (because both keys have to be known together, but usually not!). •Why there are delete anomaly? •It may not be possible to delete some information without losing some other information as well. •Why there are update anomaly?•if one copy of such repeated data is updated, an inconsistency is created unless all copies are similarly updated.How to address these issues?•To minimize redundancy and address all issues due to redundancy, we can consider to decompose the relation into multiple relations: –Split the table into multiple tables (three table here: one for


View Full Document

Oneonta CSCI 242 - LECTURE NOTE

Download LECTURE NOTE
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view LECTURE NOTE and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view LECTURE NOTE 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?