Lecture 12: Overview of Post-Relational Development
Oct 12, 2007
ChengXiang Zhai
CS511 Advanced Database Management Systems

Outline
- Evolution of data models
- Post-relational research topics

Nine Historical Epochs
- Hierarchical (IMS): late 1960s and 1970s
- Network (CODASYL): 1970s
- Relational: 1970s and early 1980s
- Entity-relationship: 1970s
- Extended relational: early 1980s
- Semantic: late 1970s and 1980s
- Object-oriented: late 1980s and early 1990s
- Object-relational: late 1980s and early 1990s
- Semi-structured (XML): late 1990s to present

Pre-Relational Era
- IMS (hierarchical data model): lessons
  - L1: Physical and logical data independence are highly desirable
  - L2: Tree-structured data models are very restrictive
  - L3: It is a challenge to provide sophisticated logical reorganization of tree-structured data
  - L4: A record-at-a-time user interface forces the programmer to do manual query optimization, and this is often hard
- CODASYL: lessons
  - L5: Networks are more flexible than hierarchies but more complex
  - L6: Loading and recovering networks is more complex than hierarchies

Relational Era
- Resolution of relational vs. CODASYL is settled by
  - The success of the VAX
  - The non-portability of CODASYL engines
  - The complexity of IMS logical data bases
- Lessons
  - L7: Set-at-a-time languages are good, regardless of the data model, since they offer much improved physical data independence
  - L8: Logical data independence is easier with a simple data model than with a complex one
  - L9: Technical debates are usually settled by the elephants of the marketplace, and often for reasons that have little to do with the technology
  - L10: Query optimizers can beat all but the best record-at-a-time DBMS application programmers

The Entity-Relationship Era
- Proposed in the mid-1970s by Peter Chen
- Never gained acceptance as the underlying data model implemented by a DBMS
  - No query language
  - Overshadowed by the relational model
  - Looked too much like a cleaned-up version of CODASYL
- But widely successful for DB schema design
  - DB design using normalization was dead in the water
  - It was straightforward to convert an ER diagram into a set of tables in 3rd normal form
- Lessons
  - L11: Functional dependencies are too difficult for mere mortals to understand. Another reason for KISS (Keep it simple, stupid)

Extended Relational (R++) Era
- Beginning in the early 1980s
- A sizeable collection of papers of the following template
  - Consider an application, call it X
  - Try to implement X on a relational DBMS
  - Show why the queries are difficult or why poor performance is observed
  - Add a new feature to the relational model to correct the problem
- Valuable contributions
  - Set-valued attributes (e.g., available colors of an item)
  - Aggregation: tuple reference as a data type (e.g., supply(PT, SR, qty, price), where PT and SR are pointers to tuples)
  - Generalization (inheritance)
- Lessons
  - L12: Unless there is a big performance or functionality advantage, new constructs will go nowhere

The Semantic Data Model (SDM) Era
- Early 1980s
- Motivation: the relational data model is semantically impoverished; it can't easily express a class of data of interest
- Define more general classes, allowing multiple inheritance
- Most SDMs are very complex and were generally paper proposals
- Have the same problems as the R++ work

Object-Oriented (OO) Era
- Beginning in the mid-1980s
- Motivation: impedance mismatch between relational DBs and languages like C++
  - DBs have their own naming systems, data type systems, and conventions for returning data as results
  - Need conversions between DB conventions and programming language conventions
  - Like gluing an apple onto a pancake
- As a result, persistent programming languages have attracted much attention

Persistent Programming Language
- Characteristics
  - Variables can represent disk-based data as well as main-memory data
  - DB search criteria are language constructs
- Early prototypes (late 1970s): Pascal-R, Rigel
- Cleaner than SQL embedding
- However, the compiler must be extended with DBMS-oriented functionality; not very successful
- No technology transfer

Object-Oriented Data Bases
- In the mid-1980s, C++ triggered a resurgence of interest in persistent programming languages
  - Research systems: Garden, Exodus
  - Startups: Ontologic, Object Design, Versant
- General goal: persistent C++
  - Extend C++ as a data model
  - Any C++ structure can be persisted
  - Support relationships
- Application market domain: engineering DBs
  - Typically open a large object (e.g., an electronic circuit), process it exclusively, and close it
  - No need for a declarative query language; only need to reference objects
  - No fancy transaction management is needed; one user at a time
  - Performance has to be competitive with conventional C++

Current Status of OODB
- Market never got very large; too many vendors competing for a niche market
- The OODB vendors have either failed or repositioned their companies to offer something else
  - E.g., Object Design is now Excelon and selling XML services
- Reasons for the failure
  - For their own market: absence of leverage, no standard, relink the world
  - For competing with relational DBs: lack of transactions; low-level record-at-a-time access (with the exception of O2, which embedded a declarative language, i.e., OQL, into a programming language)
- Lesson
  - L13: Packages will not sell to users unless they are in major pain

The Object-Relational Era
- Motivated by the need for handling geographic data
- Question: how to extend a relational DB to handle new data types?
- The object-relational proposal: add the following to SQL (Postgres)
  - User-defined data types
  - User-defined operators
  - User-defined functions
  - User-defined access methods
- Commercially successful: Postgres, Illustra (acquired by Informix)
- Lessons
  - L14: The major benefits of OR are twofold: putting code in the database (thereby blurring the distinction between code and data) and user-defined access methods
  - L15: Widespread adoption of new technology requires either standards and/or an elephant pushing hard

Semi-Structured Data
- Motivation: abundance of semi-structured data; exchange format
- Early system: Lore
- Current standards: XML Schema, XQuery
- Two major points
  - Schema last
  - Complex network-oriented data
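
A minimal sketch of what "schema last" can mean in practice, assuming plain Python dictionaries stand in for semi-structured records. The function name `infer_schema` and the sample records are hypothetical and are not taken from Lore, XML Schema, or XQuery. Data is stored first with no declared schema, and a description of its shape is derived afterward from whatever records exist.

```python
from collections import defaultdict


def infer_schema(records):
    """Derive a field -> set-of-type-names mapping from already stored records."""
    schema = defaultdict(set)
    for record in records:
        for field, value in record.items():
            schema[field].add(type(value).__name__)
    return dict(schema)


# Hypothetical records: nothing is declared up front, and fields vary per record.
records = [
    {"name": "Alice", "email": "alice@example.org"},
    {"name": "Bob", "phone": 5550100},
    # A nested record hints at the complex, network-oriented shape of such data.
    {"name": "Carol", "email": "carol@example.org", "advisor": {"name": "Alice"}},
]

# The schema is computed after the data exists, not before it is loaded.
schema = infer_schema(records)
```

In a schema-first relational design the three records above would need a table definition before loading, with nulls or extra tables for the optional fields. Here the structure is discovered from the data itself.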