Announcements• Today– Relational Data Model (Chapter 5)•Read– Chapter 5 and Sections 6.0-6.5•Exam– Monday Oct 16, in classMisc. Thoughts on Indexes• For a frequently updated file, B+Tree index is almost always superior to a sorted file– For small amount of space, we get all the advantages of sorted files plus efficient insertion and deletion• On average B+trees are ~ 67% full• Bulk loading index Vs incremental construction• B+trees Vs other multilevel indexes– MLI blocks typically sequentially allocated, B+Treeusually not• Key Compression – internal nodes of B+Treeneed not store full values• Indexes on multiple fieldsData Model• A data model is an abstraction that describes how data is represented and used– Examples: • Hierarchical data model (file systems, LDAP)• Network data model (legacy DBMSes)• Relational data model (most modern DBMSes)• Object data model (data + programs)• A DBMS typically supports one data model• The process of data modelinginvolves creating an instanceof a data model for some enterpriseExample LDAP directory(an instance of hierarchical data model)The hierarchical model has difficulty representing 1 to many relationshipsExample Instance of Relational Data ModelRelation• The key construct for representing data in RDM is the relation(informally, a table)A relation schema describes a relation• R(A1, A2, ..., An)• EMPLOYEE(Fname, Mint, Lname, SSN,...,Dno)The relation nameAttributes (ie, columns) of the relationSchema for relation ofdegree (or airity) nFormal Definitions - Schema•The Schema (or description) of a Relation:– Denoted by R(A1, A2, .....An)– R is the name of the relation– The attributes of the relation are A1, A2, ..., An• Example:CUSTOMER (Cust-id, Cust-name, Address, Phone#)– CUSTOMER is the relation name– Defined over the four attributes: Cust-id, Cust-name, Address, Phone#• Each attribute has a domain or a set of valid values. – For example, the domain of Cust-id is 6 digit numbers.Formal Definitions - Domain•A domain has a logical definition:– Example: “USA_phone_numbers” are the set of 10 digit phone numbers valid in the U.S.• A domain also has a data-type or a format defined for it.– The USA_phone_numbers may have a format: (ddd)ddd-dddd where each d is a decimal digit.– Dates have various formats such as year, month, date formatted as yyyy-mm-dd, or as dd mm,yyyy etc.• The attribute name designates the role played by a domain in a relation:– Used to interpret the meaning of the data elements corresponding to that attribute– Example: The domain Date may be used to define two attributes named “Invoice-date” and “Payment-date” with different meaningsFormal Definitions - Tuple•A tuple is an ordered set of values (enclosed in angled brackets ‘< … >’)• Each value is derived from an appropriate domain.• A row in the CUSTOMER relation is a 4-tuple and would consist of four values, for example:– <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">– This is called a 4-tuple as it has 4 values– A tuple (row) in the CUSTOMER relation.• A relation is a set of such tuples (rows)Relation• The key construct for representing data in RDM is the relation(informally, a table)Cartesian Product• (ignoring NULL values) each valid tuple in R is an element of the Cartesian product of R’s domains. A relation state is a subset of the CP• Example: – D1 = {“red”, “green”, “blue”} D2 = {“small”, “large}– D1 x D2 = { (“red”, “small”), (“red”, ”large”), (“green”, “small”, (“green”, “large”),(“blue”, “small”), (“blue”, “large”) }tuplesPicture + Questions• How big is the Cartesian product of N domains?• How many possible relation states are there?Formal Definitions - Summary• Formally,– Given R(A1, A2, .........., An)– r(R) ⊂ dom (A1) X dom (A2) X ....X dom(An)• R(A1, A2, …, An) is the schema of the relation•R is the name of the relation• A1, A2, …, An are the attributes of the relation• r(R): a specific state (or "value" or “population”) of relation R – this is a set of tuples (rows)– r(R) = {t1, t2, …, tn} where each ti is an n-tuple– ti = <v1, v2, …, vn> where each vj element-of dom(Aj)Formal Definitions - Example• Let R(A1, A2) be a relation schema:– Let dom(A1) = {0,1}– Let dom(A2) = {a,b,c}• Then: dom(A1) X dom(A2) is all possible combinations:{<0,a> , <0,b> , <0,c>, <1,a>, <1,b>, <1,c> } • The relation state r(R) ⊂ dom(A1) X dom(A2)• For example: r(R) could be {<0,a> , <0,b> , <1,c> }– this is one possible state (or “population” or “extension”) r of the relation R, defined over A1 and A2.– It has three 2-tuples: <0,a> , <0,b> , <1,c>Notation• Notation:– We refer to component values of a tuple t by:• t[Ai] or t.Ai• This is the value vi of attribute Ai for tuple t– Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes Au, Av, ..., Aw, respectively in tDefinition SummaryState of the RelationPopulated TableSchema of a RelationTable DefinitionTupleRowDomainAll possible Column ValuesAttributeColumn HeaderRelationTableFormal TermsInformal TermsCharacteristics Of Relations• Ordering of tuples in a relation r(R):– The tuples are not considered to be ordered, even though they appear to be in the tabular form.• Ordering of attributes in a relation schema R (and of values within each tuple):– We will consider the attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> to be ordered .• (However, a more general alternative definition of relation does not require this ordering).ConstraintsConstraints• A key aspect of RDM is the ability to impose constraintson the database state• A constraint on a single relation places restrictions on valid relation states– Examples: • two students can’t have same student ID number–Example of key constraint• Student name cannot be NULL – Domain constraints (implicit)Key Constraints• Superkey of R: – Is a set of attributes SK of R with the following condition:• No two tuples in any valid relation state r(R) will have the same value for SK• That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠t2[SK]• This condition must hold in any valid state r(R)• Key of R:– A "minimal" superkey– That is, a key is a superkey K such that removal of any attribute from K
View Full Document