Slide titles:
XML
XML for Representing Data
XML vs Other Data Models
Semi-structured Data Explained
Semistructured Data Explained
XML Data vs. E/R, ODL, Relational
Data Sharing with XML: Easy
Exporting Relational Data to XML
Export Data Grouped by Companies
The DTD
Export Data by Products
Which One Do We Choose?
Storing XML Data
XML Query Languages
An Example of XML Data
XPath
XPath: Simple Expressions
XPath: Restricted Kleene Closure
XPath: Text Nodes
XPath: Wildcard
XPath: Attribute Nodes
XPath: Qualifiers
XPath: More Qualifiers
XPath: Summary
XPath: More Details
The Root and the Root
XQuery
FLWR (“Flower”) Expressions
FOR vs. LET
Collections in XQuery
Sorting in XQuery
If-Then-Else
Existential Quantifiers
Universal Quantifiers
Other Stuff in XQuery

XML
May 1st, 2002

XML for Representing Data

Relational table persons:

    name    phone
    John    3634
    Sue     6343
    Dick    6363

XML:

    <persons>
      <row> <name>John</name> <phone>3634</phone> </row>
      <row> <name>Sue</name>  <phone>6343</phone> </row>
      <row> <name>Dick</name> <phone>6363</phone> </row>
    </persons>

[Figure: the same data as a tree. The root persons has three row children; each row has a name child and a phone child, with leaves “John” 3634, “Sue” 6343, “Dick” 6363.]

XML vs Other Data Models
•XML is self-describing
•Schema elements become part of the data
–Relational schema: persons(name, phone)
–In XML, <persons>, <name>, <phone> are part of the data and are repeated many times
•Consequence: XML is much more flexible
•XML = semistructured data

Semi-structured Data Explained
•Missing attributes:

    <person> <name>John</name> <phone>1234</phone> </person>
    <person> <name>Joe</name> </person>                          no phone!

•Repeated attributes:

    <person> <name>Mary</name> <phone>2345</phone> <phone>3456</phone> </person>      two phones!

Semistructured Data Explained
•Attributes with different types in different objects:

    <person>
      <name> <first>John</first> <last>Smith</last> </name>      structured name!
      <phone>1234</phone>
    </person>

•Nested collections (no 1NF)
•Heterogeneous collections:
–<db> contains both <book>s and <publisher>s

XML Data vs. E/R, ODL, Relational
•Q: Is XML better or worse?
•A: It serves different purposes
–E/R, ODL, relational models:
•For centralized processing, when we control the data
–XML:
•Data sharing between different systems
•We do not have control over the entire data
•E.g. on the Web
•Do NOT use XML to model your data! Use E/R, ODL, or relational instead.

Data Sharing with XML: Easy
[Figure labels: Data source (e.g. relational database), XML, Web, Application]

Exporting Relational Data to XML
•Product(pid, name, weight)
•Company(cid, name, address)
•Makes(pid, cid, price)
[Figure labels: product, makes, company]

Export Data Grouped by Companies

    <db>
      <company>
        <name> GizmoWorks </name>
        <address> Tacoma </address>
        <product> <name> gizmo </name> <price> 19.99 </price> </product>
        <product> … </product>
        …
      </company>
      <company>
        <name> Bang </name>
        <address> Kirkland </address>
        <product> <name> gizmo </name> <price> 22.99 </price> </product>
        …
      </company>
      …
    </db>

Redundant representation of products

The DTD

    <!ELEMENT db (company*)>
    <!ELEMENT company (name, address, product*)>
    <!ELEMENT product (name, price)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT address (#PCDATA)>
    <!ELEMENT price (#PCDATA)>

Export Data by Products

    <db>
      <product>
        <name> Gizmo </name>
        <manufacturer>
          <name> GizmoWorks </name>
          <price> 19.99 </price>
          <address> Tacoma </address>
        </manufacturer>
        <manufacturer>
          <name> Bang </name>
          <price> 22.99 </price>
          <address> Kirkland </address>
        </manufacturer>
        …
      </product>
      <product>
        <name> OneClick </name>
        …
    </db>

Redundant representation of companies

Which One Do We Choose?
•The structure of the XML data is determined by agreement with our partners, or dictated by committees
•XML data is often nested, irregular, etc.
•No normal forms for XML

Storing XML Data
•We get lots of XML data from the Web; how do we store it?
•Ideally: convert it to relational data and store it in an RDBMS
•Much harder than exporting relations to XML (why?)
•DB vendors are currently working on tools for loading XML data into an RDBMS

XML Query Languages
•XPath
•XML-QL
•XQuery

An Example of XML Data

    <bib>
      <book>
        <publisher> Addison-Wesley </publisher>
        <author> Serge Abiteboul </author>
        <author>
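
The export grouped by companies shown earlier can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used in the slides: the function name export_by_company, the pid and cid values, and the weight value are assumptions made for the example, while the relation schemas and the company names, addresses, and prices come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows. The slides give only the schemas and a few values;
# the ids and the weight below are made up for illustration.
products = [(1, "gizmo", 10.0)]                      # Product(pid, name, weight)
companies = [(1, "GizmoWorks", "Tacoma"),            # Company(cid, name, address)
             (2, "Bang", "Kirkland")]
makes = [(1, 1, 19.99), (1, 2, 22.99)]               # Makes(pid, cid, price)


def export_by_company(products, companies, makes):
    """Build a <db> tree with one <company> per row and its products nested inside."""
    product_name = {pid: name for pid, name, _weight in products}
    db = ET.Element("db")
    for cid, cname, address in companies:
        company = ET.SubElement(db, "company")
        ET.SubElement(company, "name").text = cname
        ET.SubElement(company, "address").text = address
        # Join Makes with Product; a product made by several companies
        # is repeated under each of them (the redundancy noted in the slides).
        for pid, make_cid, price in makes:
            if make_cid == cid:
                product = ET.SubElement(company, "product")
                ET.SubElement(product, "name").text = product_name[pid]
                ET.SubElement(product, "price").text = f"{price:.2f}"
    return db


db = export_by_company(products, companies, makes)
xml_text = ET.tostring(db, encoding="unicode")
print(xml_text)
```

The element order inside each company (name, address, then zero or more product elements) follows the DTD given in the slides. Grouping by products instead would swap the roles of the two loops and repeat each company under every product it makes.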