Lecture 14: Overview of Post-Relational Development
Oct. 13, 2006
ChengXiang Zhai
CS511 Advanced Database Management Systems

New Challenges in Databases
  - Traditional relational data -> New data type
  - Traditional RDBMS functions -> New data/info management functions
  - Traditional users -> New users

New Kinds of Data
  Kinds of data:
  - Text data
  - Multimedia data
  - Scientific data
  - Sensor data
  - Log data
  - Personal data
  - Web, email, blog
  Related topics:
  - Ranking in DB
  - Schema lean/last
  - Semi-structured data model
  - Complex object indexing
  - Stream data
  - Data mining
  - Data integration
  - Internet computing applications

New Users
  - Everyone

New Functions
  - Information integration
  - Navigation
  - Ranking
  - Pattern finding (data mining)
  - Decision support
  These call for:
  - A new, more general data model and architecture (object-oriented)
  - New algorithms (adding intelligence to the DB)

New Computing Environment
  - Distributed computing
  - Networks, Internet
  - Mobile devices (cell phones, PDAs)
  These lead to:
  - Distributed DB
  - Peer-to-peer (P2P) DB
  - Mobile DB

Web Changes Everything
  Observations:
  - Publishing of data is almost free: many are simultaneously producer and consumer
  - The Web is becoming a huge database
    - of distributed data online (published by everyone)
    - of autonomous databases online
  Trends:
  - Static HTML pages -> dynamic pages presenting a DB
  - HTML -> XML, for better describing structured data
  (Slide from Kevin Chang's presentation)

Web Changes Everything: What Is Needed?
  - Content producers: tools for building huge data stores
  - Content consumers: tools for discovering and querying info on the Web
  (Slide from Kevin Chang's presentation)

Database Technology Timeline
  [Timeline figure, running from simple data management to global enterprise management. Eras shown: early 80s, late 80s, early/mid 90s, late 90s, 21st century. Stages shown, in the order extracted: prerelational; early relational; simple OLTP (simple transactions, on-line backup and recovery); client-server relational; active database (stored procedures, triggers); enterprise-capable relational; data warehouse and high-end OLTP (scalable OLTP, parallel query, partitioning, cluster support, row-level locking, high availability); packaged vertical applications; support for all types of data (extensibility, objects); Internet computing (middleware: messaging, queues, events; Java, CORBA; Web interfaces).]
  (Slide from Anil Nori's presentation)

Current State of DBMSs
  OLTP applications:
  - Large amounts of data
  - Simple data, simple queries and updates
  - Update statement from a debit/credit transaction:
      UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid
  - Typically update intensive
  - Large number of concurrent users (transactions)
  Data warehousing applications:
  - Large amounts of data
  - Simple data, but complex querying
  - Typically read intensive
  - Large number of users
  (Slide from Anil Nori's presentation)

Current State of DBMSs
  These applications require:
  - Large numbers of users and transactions
  - High performance
  - High availability (7x24 operations)
  - Scalability
  - High levels of security
  - Administrative support
  - Good utilities
  (Slide from Anil Nori's presentation)

Internet Applications: Challenges (Transaction Processing, Data Warehousing)
  [Table on larger user populations; the row and column assignment is not recoverable from the extraction. Row labels: Users, Size, Network, Systems, Systems management, Usage, Operations, Importance. Cell values, in the order extracted: trained, self-service, analysts, independent, integrated, gigabytes, simple, intelligent, global, terabytes, batch, hours, local, every employee, immediate, useful, business-critical.]
  (Slide from Anil Nori's presentation)

Internet Applications: Challenges (Information Management)
  [Table; the row and column assignment is not recoverable from the extraction. Row labels: Type, Applications, Operation, Access, Management, Content, Delivery, Availability. Cell values, in the order extracted: tabular, heterogeneous, generic, open, standalone, integrated, personalized site, lots of read-only, low TCO, mission critical, direct APIs, proprietary, read/write, e-commerce apps, search, occasional, 24x7.]
  (Slide from Anil Nori's presentation)

Internet Challenges
  Availability, scalability, security:
  - Need near-100% availability
  - Must be easy to manage
  - Replication, hot standby, foolproof system
  - Number of users is orders of magnitude higher
  - Global users
  - Managing millions of users
  - Encryption
  Performance:
  - Internet user expectations
  - Speed vs. correctness (e.g., search engines vs. blade/cartridge/extender)
  - Availability vs. correctness
  (Slide from Anil Nori's presentation)

Selected Current Topics
  - Text Database and Information Retrieval
  - Ranking in Databases
  - Data Integration
  - P2P Databases
  - Data Warehousing/OLAP
  - Data Mining
  - Stream Data Processing
  - Web Services
  - Semi-Structured Data/XML

Today's Topic
  - Evolution of data models
  - Object-oriented DBs vs. object-relational DBs
  - XML revolution

Nine Historical Epochs
  - Hierarchical (IMS): late 1960's and 1970's
  - Network (CODASYL): 1970's
  - Relational: 1970's and early 1980's
  - Entity-relationship: 1970's
  - Extended relational: early 1980's
  - Semantic: late 1970's and 1980's
  - Object-oriented: late 1980's and early 1990's
  - Object-relational: late 1980's and early 1990's
  - Semi-structured (XML): late 1990's to present

Pre-Relational Era
  IMS (hierarchical data model), lessons:
  - L1: Physical and logical data independence are highly desirable.
  - L2: Tree-structured data models are very restrictive.
  - L3: It is a challenge to provide sophisticated logical reorganization of tree-structured data.
  - L4: A record-at-a-time user interface forces the programmer to do manual query optimization, and this is often hard.
  CODASYL, lessons:
  - L5: Networks are more flexible than hierarchies but more complex.
  - L6: Loading and recovering networks is more complex than loading and recovering hierarchies.

Relational Era
  The relational vs. CODASYL debate was settled by:
  - The success of the VAX
  - The non-portability of CODASYL engines
  - The complexity of IMS logical data bases
  Lessons:
  - L7: Set-at-a-time languages are good, regardless of the data model, since they offer much improved physical data independence.
  - L8: Logical data independence is easier with a simple data model than with a complex one.
  - L9: Technical debates are usually settled by the elephants of the marketplace, and often for reasons that have little to do with the technology.
  - L10: Query optimizers can beat all but the best record-at-a-time DBMS application programmers.
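
Lessons L4, L7, and L10 contrast record-at-a-time navigation with set-at-a-time queries. The sketch below is one way to see the difference, using Python's built-in sqlite3 module as a stand-in for a relational engine. The sample rows, the `bid` column, the index name, and the function names are all hypothetical, chosen only for illustration; the `accounts`, `aid`, and `abalance` names follow the debit/credit statement on the earlier slide.

```python
import sqlite3

# Hypothetical sample data: (aid, bid, abalance). Not from the lecture.
ACCOUNTS = [(1, 10, 500), (2, 10, 50), (3, 20, 900), (4, 20, 20)]


def record_at_a_time(records, threshold):
    """Visit records one by one; the access path is fixed in the program."""
    matches = []
    for aid, _bid, abalance in records:  # programmer-chosen full scan
        if abalance >= threshold:
            matches.append(aid)
    return sorted(matches)


def set_at_a_time(conn, threshold):
    """State what is wanted; the engine decides how to fetch it."""
    rows = conn.execute(
        "SELECT aid FROM accounts WHERE abalance >= ? ORDER BY aid",
        (threshold,),
    )
    return [aid for (aid,) in rows]


def build_database(records):
    """Load the sample rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "aid INTEGER PRIMARY KEY, bid INTEGER, abalance INTEGER)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", records)
    return conn


conn = build_database(ACCOUNTS)
before = set_at_a_time(conn, 100)

# A physical change: add an index. The query text stays the same.
conn.execute("CREATE INDEX idx_balance ON accounts (abalance)")
after = set_at_a_time(conn, 100)

print(record_at_a_time(ACCOUNTS, 100), before, after)
```

In the record-at-a-time version, switching to a different access path means rewriting the loop. In the set-at-a-time version, the index is added without touching the query, which is the physical data independence that L7 refers to.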