Misc Topics
Amol Deshpande
CMSC424

Topics
- Today
  - Database system architectures
    - Client-server
    - Parallel and Distributed Systems
  - Object Oriented, Object Relational
  - XML
  - OLAP
  - Data Warehouses
  - Information Retrieval

Database System Architectures
- Centralized single-user
- Client-Server Architectures
  - Connected over a network typically
  - Back-end: manages the database
  - Front-end(s): forms, report-writers, sqlplus
  - How do they talk to each other?
    - ODBC: interface standard for talking to the server in C
    - JDBC: in Java
- Transaction servers vs. data servers

Parallel Databases
- Why?
  - More transactions per second, or less time per query
  - Throughput vs. Response Time
  - Speedup vs. Scaleup
- Database operations are embarrassingly parallel
  - E.g. consider a join between R and S on R.b = S.b
- But, perfect speedup doesn’t happen
  - Start-up costs
  - Interference
  - Skew

Parallel Databases
- Shared-nothing vs. shared-memory vs.
shared-disk

Parallel Databases
- Communication between processors
  - Shared Memory: Extremely fast
  - Shared Disk: Disk interconnect is very fast
  - Shared Nothing: Over a LAN, so slowest
- Scalability?
  - Shared Memory: Not beyond 32 or 64 or so (memory bus is the bottleneck)
  - Shared Disk: Not very scalable (disk interconnect is the bottleneck)
  - Shared Nothing: Very very scalable
- Notes
  - Shared Memory: Cache-coherency an issue
  - Shared Disk: Transactions complicated; natural fault-tolerance
  - Shared Nothing: Distributed transactions are complicated (deadlock detection etc)
- Main use
  - Shared Memory: Low degrees of parallelism
  - Shared Disk: Not used very often
  - Shared Nothing: Everywhere

Distributed Systems
- Over a wide area network
- Typically not done for performance reasons
  - For that, use a parallel system
- Done because of necessity
  - Imagine a large corporation with offices all over the world
  - Also, for redundancy and for disaster recovery reasons
- Lot of headaches
  - Especially if trying to execute transactions that involve data from multiple sites
    - Keeping the databases in sync
      - 2-phase commit for transactions uniformly hated
    - Autonomy issues
      - Even within an organization, people tend to be protective of their unit/department
    - Locks/Deadlock management
  - Works better for query processing
    - Since we are only reading the data

Next…
- Object oriented, Object relational, XML

Motivation
- Relational model:
  - Clean and simple
  - Great for much enterprise data
  - But lot of applications where not sufficiently rich
    - Multimedia, CAD, for storing set data etc
- Object-oriented models in programming languages
  - Complicated, but very
useful
  - Smalltalk, C++, now Java
  - Allow
    - Complex data types
    - Inheritance
    - Encapsulation
- People wanted to manage objects in databases.

History
- In the 1980’s and 90’s, DB researchers recognized benefits of objects.
- Two research thrusts:
  - OODBMS: extend C++ with transactionally persistent objects
    - Niche Market
    - CAD etc
  - ORDBMS: extend Relational DBs with object features
    - Much more common
    - Efficiency + Extensibility
    - SQL:99 support
- Postgres: first ORDBMS
  - Berkeley research project
  - Became Illustra, became Informix, bought by IBM

Object-Relational: Example
- Create User Defined Types (UDT)
    CREATE TYPE BarType AS (name CHAR(20), addr CHAR(20));
    CREATE TYPE BeerType AS (name CHAR(20), manf CHAR(20));
    CREATE TYPE MenuType AS (bar REF BarType, beer REF BeerType, price FLOAT);
- Create Tables of UDTs
    CREATE TABLE Bars OF BarType;
    CREATE TABLE Beers OF BeerType;
    CREATE TABLE Sells OF MenuType;

Example
- Querying:
    SELECT * FROM Bars;
- Produces “tuples” such as:
    BarType('Joe''s Bar', 'Maple St.')
- Another query:
    SELECT bb.name(), bb.addr()
    FROM Bars bb;
- Inserting tuples:
    SET newBar = BarType();
    newBar.name('Joe''s Bar');
    newBar.addr('Maple St.');
    INSERT INTO Bars VALUES(newBar);

Example
- UDTs can be used as types of attributes in a table
    CREATE TYPE AddrType AS (street CHAR(30), city CHAR(20), zip INT);
    CREATE TABLE Drinkers (name CHAR(30), addr AddrType, favBeer BeerType);
- Find the beers served by Joe:
    SELECT ss.beer()->name
    FROM Sells ss
    WHERE ss.bar()->name = 'Joe''s Bar';

An Alternative: OODBMS
- Persistent OO programming
  - Imagine declaring a Java object to be “persistent”
  - Everything reachable from that object will also be persistent
  - You then write plain old Java code, and all changes to the persistent objects are stored in a database
  - When you run the program again, those persistent objects have the same values they used to have!
- Solves the “impedance mismatch” between programming languages and query languages
  - E.g.
converting between Java and SQL types, handling rowsets, etc.
- But this programming style doesn’t support declarative queries
  - For this reason (??), OODBMSs haven’t proven popular
- OQL: A declarative language for OODBMSs
  - Was only implemented by one vendor in France (Altair)

OODBMS
- Currently a Niche Market
  - Engineering, spatial databases, physics etc…
- Main issues:
  - Navigational access
    - Programs specify go to this object, follow this pointer
  - Not declarative
- Though advantageous when you know exactly what you want, not a good idea in general
  - Kinda similar argument
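
The contrast between navigational and declarative access can be illustrated with a small sketch. This is only an illustration in Python, not how any particular OODBMS works: the classes mirror the Bars/Beers/Sells example from the earlier slides, but the flat SQLite schema, the helper names (`beers_at_navigational`, `beers_at_declarative`, `build_example`) and all data other than Joe's Bar on Maple St. are assumptions made up for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Bar:
    name: str
    addr: str


@dataclass
class Beer:
    name: str
    manf: str


@dataclass
class Sells:
    bar: Bar      # direct object reference, playing the role of a pointer
    beer: Beer
    price: float


def beers_at_navigational(menu, bar_name):
    """Navigational style: the program itself walks objects and pointers."""
    result = []
    for entry in menu:                      # go to this object ...
        if entry.bar.name == bar_name:      # ... follow its bar pointer ...
            result.append(entry.beer.name)  # ... then its beer pointer
    return sorted(result)


def beers_at_declarative(conn, bar_name):
    """Declarative style: state what is wanted and let the DBMS find it."""
    rows = conn.execute(
        "SELECT beer FROM Sells WHERE bar = ? ORDER BY beer", (bar_name,)
    ).fetchall()
    return [row[0] for row in rows]


def build_example():
    """Build the same hypothetical data as objects and as an in-memory table."""
    joes = Bar("Joe's Bar", "Maple St.")
    sues = Bar("Sue's Bar", "River Rd.")   # hypothetical second bar
    lager = Beer("Lager", "BrewCo")        # hypothetical beers
    ale = Beer("Ale", "BrewCo")
    stout = Beer("Stout", "BrewCo")
    menu = [
        Sells(joes, lager, 3.0),
        Sells(joes, ale, 3.5),
        Sells(sues, stout, 4.0),
    ]

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO Sells VALUES (?, ?, ?)",
        [(s.bar.name, s.beer.name, s.price) for s in menu],
    )
    return menu, conn


if __name__ == "__main__":
    menu, conn = build_example()
    print(beers_at_navigational(menu, "Joe's Bar"))
    print(beers_at_declarative(conn, "Joe's Bar"))
```

Both functions return the same beers for Joe's Bar. The navigational version hard-codes the access path in the program, while the declarative version leaves the choice of access path to the database system, which is the trade-off the slide points at.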