Misc Topics
Amol Deshpande
CMSC424

Topics Today
  - Database system architectures
      - Client-server
      - Parallel and distributed systems
  - Object-oriented, object-relational, XML
  - OLAP
  - Data warehouses
  - Information retrieval

Database System Architectures
  - Centralized single-user
  - Client-server architectures
      - Connected over a network, typically
      - Back end: manages the database
      - Front end(s): forms, report writers, sqlplus
      - How they talk to each other
          - ODBC: interface standard for talking to the server in C
          - JDBC: in Java
      - Transaction servers vs. data servers

Parallel Databases
  - Why?
      - More transactions per second, or less time per query
      - Throughput vs. response time
      - Speedup vs. scaleup
  - Database operations are embarrassingly parallel
      - E.g., consider a join between R and S on R.b = S.b
  - But perfect speedup doesn't happen
      - Start-up costs
      - Interference
      - Skew

Parallel Databases
  - Shared-nothing vs. shared-memory vs. shared-disk
  - Communication between processors
      - Shared memory: extremely fast
      - Shared disk: disk interconnect is very fast
      - Shared nothing: over a LAN, so slowest
  - Scalability
      - Shared memory: not beyond 32 or 64 processors or so (the memory bus is the bottleneck)
      - Shared disk: not very scalable (the disk interconnect is the bottleneck)
      - Shared nothing: very, very scalable
  - Notes
      - Shared memory: cache coherency is an issue
      - Shared disk: transactions are complicated; natural fault tolerance
      - Shared nothing: distributed transactions are complicated (deadlock detection etc.)
  - Main use
      - Shared memory: low degrees of parallelism
      - Shared disk: not used very often
      - Shared nothing: everywhere

Distributed Systems
  - Over a wide area network
  - Typically not done for
    performance reasons. For that, use a parallel system.
  - Done because of necessity
      - Imagine a large corporation with offices all over the world
      - Also for redundancy and for disaster recovery reasons
  - Lots of headaches
      - Especially if trying to execute transactions that involve data from multiple sites
          - Keeping the databases in sync
              - 2-phase commit for transactions: uniformly hated
          - Autonomy issues
              - Even within an organization, people tend to be protective of their unit/department
          - Locks / deadlock management
      - Works better for query processing
          - Since we are only reading the data

Next
  - Object-oriented, object-relational, XML

Motivation
  - Relational model
      - Clean and simple
      - Great for much enterprise data
      - But there are lots of applications where it is not sufficiently rich
          - Multimedia, CAD, storing set data, etc.
  - Object-oriented models in programming languages
      - Complicated, but very useful
          - Smalltalk, C++, now Java
      - Allow
          - Complex data types
          - Inheritance
          - Encapsulation
  - People wanted to manage objects in databases

History
  - In the 1980s and 90s, DB researchers recognized the benefits of objects
  - Two research thrusts
      - OODBMS: extend C++ with transactionally persistent objects
          - Niche market
          - CAD etc.
      - ORDBMS: extend relational DBs with object features
          - Much more common
          - Efficiency + extensibility
          - SQL:99 support
  - Postgres: first ORDBMS
      - Berkeley research project
      - Became Illustra, which was absorbed into Informix, which was bought by IBM

Object-Relational Example
  - Create user-defined types (UDTs)
        CREATE TYPE BarType AS (
            name CHAR(20),
            addr CHAR(20)
        );
        CREATE TYPE BeerType AS (
            name CHAR(20),
            manf CHAR(20)
        );
        CREATE TYPE MenuType AS (
            bar   REF BarType,
            beer  REF BeerType,
            price FLOAT
        );
  - Create tables of UDTs
        CREATE TABLE Bars OF BarType;
        CREATE TABLE Beers OF BeerType;
        CREATE TABLE Sells OF MenuType;

Example
  - Querying
        SELECT * FROM Bars;
  - Produces tuples such as
        BarType('Joe''s Bar', 'Maple St')
  - Another query
        SELECT bb.name(), bb.addr()
        FROM Bars bb;
  - Inserting tuples
        SET newBar = BarType();
        newBar.name('Joe''s Bar');
        newBar.addr('Maple St');
        INSERT INTO Bars VALUES(newBar);

Example
  - UDTs can be used as types of attributes in a table
        CREATE TYPE
        AddrType AS (
            street CHAR(30),
            city   CHAR(20),
            zip    INT
        );
        CREATE TABLE Drinkers (
            name    CHAR(30),
            addr    AddrType,
            favBeer BeerType
        );
  - Find the beers served by Joe
        SELECT ss.beer()->name
        FROM Sells ss
        WHERE ss.bar()->name = 'Joe''s Bar';

An Alternative: OODBMS
  - Persistent OO programming
      - Imagine declaring a Java object to be "persistent"
      - Everything reachable from that object will also be persistent
      - You then write plain old Java code, and all changes to the persistent objects are stored in a database
      - When you run the program again, those persistent objects have the same values they used to have
  - Solves the "impedance mismatch" between programming languages and query languages
      - E.g., converting between Java and SQL types, handling rowsets, etc.
      - But this programming style doesn't support declarative queries
          - For this reason, OODBMSs haven't proven popular
  - OQL: a declarative language for OODBMSs
      - Was only implemented by one vendor, in France (Altair)

OODBMS
  - Currently a niche market
      - Engineering, spatial databases, physics, etc.
  - Main issues
      - Navigational access
          - Programs specify "go to this object, follow this pointer"
      - Not declarative
  - Though advantageous when you know exactly what you want, not a good idea in general
      - Kinda similar argument as network databases vs. relational databases

Summary (cont.)
  - ORDBMS offers many new features
      - But it is not clear how to use them
      - Schema design techniques not well understood
          - No good logical design theory for non-1st-normal-form
      - Query processing techniques still in research phase
      - A moving target for OR DBAs
  - OODBMS
      - Has its advantages
      - Niche market
      - Lots of similarities to XML as well

XML
  - Extensible Markup Language
  - Derived from SGML (Standard Generalized Markup Language)
      - Similar to HTML, but HTML is not extensible
      - Extensible = can add new tags etc.
  - Emerging as the wire format (data interchange format)

XML
        <bank-1>
            <customer>
                <customer-name> Hayes </customer-name>
                <customer-street> Main </customer-street>
                <customer-city> Harrison </customer-city>
                <account>
                    <account-number> A-102 </account-number>
                    <branch-name> Perryridge </branch-name>
                    <balance> 400
                    </balance>
                </account>
                <account> ... </account>
            </customer>
        </bank-1>

Attributes
  - Elements can have attributes
        <account acct-type = "checking">
            <account-number> A-102 </account-number>
            <branch-name> Perryridge </branch-name>
            <balance> 400 </balance>
        </account>
  - Attributes are specified by name=value pairs inside the starting tag of an element
  - An element may have several attributes, but each attribute name can only occur once
        <account acct-type = "checking" monthly-fee = "5">

Attributes vs. Subelements
  - Distinction between subelement and attribute
      - In the context of
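
The attribute rules above can be checked with a small script. This is a minimal sketch, assuming Python and its standard-library `xml.etree.ElementTree` parser, neither of which appears in the slides. The names `BANK_XML` and `is_well_formed` are hypothetical, and the sample document combines the bank-1 customer with the two-attribute account tag from the slides into one element for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document: the bank-1 customer, with the account element
# carrying the two attributes shown on the Attributes slide.
BANK_XML = """
<bank-1>
  <customer>
    <customer-name>Hayes</customer-name>
    <customer-street>Main</customer-street>
    <customer-city>Harrison</customer-city>
    <account acct-type="checking" monthly-fee="5">
      <account-number>A-102</account-number>
      <branch-name>Perryridge</branch-name>
      <balance>400</balance>
    </account>
  </customer>
</bank-1>
""".strip()

root = ET.fromstring(BANK_XML)
account = root.find("customer/account")

# Attributes live inside the start tag and are exposed as a name -> value mapping.
attributes = dict(account.attrib)

# Subelements are child nodes, each with its own tag and text content.
subelements = {child.tag: child.text.strip() for child in account}


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# An attribute name may occur only once per element, so this is rejected.
duplicate_ok = is_well_formed('<account acct-type="checking" acct-type="savings"/>')

print(attributes)
print(subelements)
print(duplicate_ok)
```

Attributes come back as an unordered name-to-value mapping, while subelements keep their document order and can themselves be nested, which is one practical side of the attribute/subelement distinction.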