Lecture 13: Project Topics
Oct 6, 2006
ChengXiang Zhai
CS511 Advanced Database Management Systems

Traditional DBMS Topics
[Figure: DBMS architecture diagram with the following labels]
- User, Web Forms, Applications, DBA
- query, transaction
- Query Parser, Query Rewriter, Query Optimizer, Query Executor
- Transaction Manager, Lock Manager, Logging & Recovery
- Files & Access Methods, Buffer Manager, Storage Manager
- Main Memory: Buffers, Lock Tables
- Storage

The Next Database Revolution (Gray 04)
- Object Relational
- Web Services
- Queues, Transactions, Workflows
- Cubes and Online Analytic Processing
- Data Mining
- Column Stores
- Text, Temporal, and Spatial Data Access
- Semi-Structured Data
- Stream Processing
- Publish-Subscribe and Replication
- Late Binding in Query Plans
- Massive Memory, Massive Latency
- Smart Objects: Databases Everywhere
- Self-Managing and Always Up

Some Text Mining Results on SIGMOD Abstracts
(Thanks to Qiaozhu Mei for producing the results)

Major Themes between 1975 and 1986
[Figure: theme labels]
- Benchmarks
- Lock crash recovery
- Query Interface
- Indexing space utilization
- Critical utilities
- Dependencies
- Database machine
- Memory resource management
- Network Model
- Storage Allocation
- Data partitions
- Normal forms
- Data models
- Objects
- Logical Data

Major Themes between 1997 and 2006
[Figure: theme labels]
- Association rule learning
- Query plan optimization
- P2P skyline
- Spatial data mining data mining
- Enterprise applications
- XML documents
- Websites information integration
- Synopses approximate answers
- K nearest neighbors
- Retrieval similarity
- Data streams
- Clustering
- Aggregation approximation algorithm
- Time series data
- Networked data
- View materialization
- Data cube
- Indexing

Theme Life Cycles: Cold Topics
Bursting more than 10 years ago
[Figure: chart with vertical axis from 0 to 1200. Legend: Relational Model, Algebra; OODB, Object Database; Logging, Recovery]

Theme Life Cycles: Post-Bursting Topics
Bursting within the last 10 years
[Figure: chart with vertical axis from 0 to 1200. Legend: View Maintenance; Clustering; Data Cube; Sampling]

Theme Life Cycles: Bursting Topics
Currently bursting
[Figure: chart with vertical axis from 0 to 1800. Legend: Sensor data; XML data; Web data; Data Streams; Ranking, Top-K]

New Challenges in Databases
[Figure: diagram contrasting traditional and new aspects]
- Traditional Relational Data; New Data Type
- Traditional RDBMS Functions; New Data/Info Management Functions
- Traditional Users; New Users

New Kinds of Data
- Text data
- Multimedia data
- Stream/Sensor data
- Log data
- Personal data
- Web, Email, Blog

New Users
- Everyone

New Functions
- Information integration
- Navigation
- Ranking
- Pattern finding (data mining)
- Decision support

Topic 1: Database Generation
- RDBMS is quite useful, so let's create more relational data
- What are some useful databases to generate?
- Some specific ideas:
  - UIUC white pages: from the UIUC domain, extract (name, url, email)
  - UIUC yellow pages: from the UIUC domain, extract (college, dept, building)
  - UIUC event DB: from the UIUC domain, generate (event, time, place, URL)
  - Master CS course DB: from all the CS dept websites, extract course information

Topic 2: Integrated Access to Structured and Unstructured Data
- Information often exists in both structured form and unstructured form
- How can we access information in a unified way?
- Ideas:
  - How to integrate DBLP, CiteSeer, and Google Scholar and allow a user to search all of them together?
  - How to integrate UIUC DCS web pages with DCS databases (e.g., courses, people)?

Topic 3: Schema-Last or No-Schema Search
- Standard DB assumes that the user knows the schema very well
- What if a user doesn't know much about the schema?
- Can we support keyword queries on databases?
- How
  can we assist users in formulating structured queries?
- Ideas:
  - Start with keywords; as we retrieve tuples, we may figure out which part of the schema the user should use to refine the results
  - Pick a DB and develop a keyword-based search system

Topic 4: Ranking DB Objects
- In many DB applications, a user is interested in ranking objects
- How to allow a user to define preferences?
- How to improve the ranking accuracy?
- How to exploit the ranking preferences to do query optimization?
- Ideas:
  - Build a wrapper on top of existing web DBs and support more user-friendly ranking of items
  - E.g., book domain, realtors, cars, computers

Topic 5: Navigation Support
- Declarative queries are good if you know exactly what you want
- In exploratory search, a user may want to navigate in the data space
- How can we extend SQL to support navigation?
  - Allow a user to specify/configure the navigation needs
  - Implement navigation operators on top of SQL
- Ideas:
  - Pick a complex DB (e.g., in the biology domain)
  - Provide high-level navigation support

Topic 6: Expand a DB
- Given some columns in the current DB, add new columns
- Ideas:
  - Expand a product table with reviews of products found on the Web: (product name, vendor, review)
  - Find UIUC CS alumni contact information: (name, graduation year, contact information)
  - Expand course catalogue: (course title, instructors, students)

Topic 7: Best-K Queries
- Top-K queries score each tuple independently
- There is no way to capture global preferences/constraints
  - How to minimize redundancy?
  - Find books covering AI, DB, and OS with minimum total cost
- Generalize Top-K to Best-K queries
  - A decision-theoretic view of DB query
  - Extend SQL to support Best-K queries
  - Find efficient algorithms to answer Best-K queries

Topic 8: Interactive ER Graph Mining
- Fuzzy entity-relation graphs are quite common
  - Entities:
    people, organizations (extracted from text)
  - Relations: is-friend-of, work-for (extracted from text)
- A regular RDBMS can only support limited query capabilities
- How to support complex functions?
  - Compare(People A, People B)
  - Path(People A, People B, is-friend-of)
- How to design a mining language? Extend SQL?
- How to implement such a mining language?

Topic 9: Community Web Portal
- People form communities
- People in the same community often share similar information needs/interests
- How