Duke CPS 296.1 - An introduction to XML

An introduction to XML
CPS 296.1
Topics in Database Systems

From HTML to XML
• HTML describes the presentation of the content
    <h1>Bibliography</h1>
    <p><i>Foundations of Databases</i>
    Abiteboul, Hull, and Vianu
    <br>Addison Wesley, 1995
    …
• XML describes only the content
    <bibliography>
      <book>
        <title>Foundations of Databases</title>
        <author>Abiteboul</author>
        <author>Hull</author>
        <author>Vianu</author>
        <publisher>Addison Wesley</publisher>
        <year>1995</year>
      </book>
      …
    </bibliography>
→ Separation of content from presentation allows the content to be presented easily in different looks

Other nice features of XML
• Portability: Just like HTML, you can ship XML data across any platform
  – Relational data requires heavy-weight protocols, e.g., JDBC
• Flexibility: You can represent any information (structured, semi-structured, documents, …)
  – Relational data is best suited for structured data
• Extensibility: Since data describes its own schema, you can change it easily
  – Relational schema is rigid and difficult to change
• “Publishability”: We need new data models, query languages, query processing and optimization techniques

XML terminology
• Tag names: book, title, author
• Start tags: <book>, <title>, <author>
• End tags: </book>, </title>, </author>
• An element is enclosed by a pair of start and end tags: <book>…</book>
  – Elements can be nested: <book>…<title>…</title>…</book>
  – Empty elements: <is_textbook></is_textbook>
    • Can be abbreviated: <is_textbook/>
• Elements can also have attributes: <book ISBN="…" price="80.00">
    <bibliography>
      <book ISBN="10" price="80.00">
        <title>Foundations of Databases</title>
        <is_textbook/>
        <author>Abiteboul</author>
        <author>Hull</author>
        <author>Vianu</author>
        <publisher>Addison Wesley</publisher>
        <year>1995</year>
      </book>
      …
    </bibliography>

Well-formed XML documents
A well-formed XML document
• Follows XML lexical conventions
  – Wrong: <section>We show that x < 0…</section>
  – Right: <section>We show that x &lt; 0…</section>
• Contains a single root element
• Has tags that are properly matched and elements that are properly nested
  – Right: <section>…<subsection>…</subsection>…</section>
  – Wrong: <section>…<subsection>…</section>…</subsection>

More XML features
• Comments: <!-- Comments here… -->
• CDATA: <![CDATA[Tags: <book>, …]]>
• IDs and references
    <person id="o12"><name>Homer</name>…</person>
    <person id="o34"><name>Marge</name>…</person>
    <person id="o56" father="o12" mother="o34"><name>Bart</name></person>
    …
• Namespaces allow external schemas and qualified names
    <book xmlns:myCitationStyle="http://…/mySchema">
      <myCitationStyle:title>…</myCitationStyle:title>
      <myCitationStyle:author>…</myCitationStyle:author>
      …
    </book>
• Processing instructions for apps: <? …java applet… ?>
• And more…

Valid XML documents
• A valid XML document conforms to a Document Type Definition (DTD)
  – A DTD is optional
• A DTD specifies
  – A grammar for the document
  – Constraints on structures and values of elements, attributes, etc.
    <?xml version="1.0"?>
    <!DOCTYPE book [
      <!ELEMENT book (title, author*, publisher?, section+)>
      <!ATTLIST book ISBN CDATA #REQUIRED>
      <!ATTLIST book year CDATA #IMPLIED>
      <!ELEMENT title (#PCDATA)>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT section (#PCDATA | title | section)*>
    ]>
    …

Data models for XML
• Graph and tree models used in research
  – Semistructured model of Lore and TSIMMIS (Stanford)
  – Ordered tree model of YAT (INRIA)
• Document Object Model (DOM)
  – Object-oriented programming interface for XML
• XML Infoset
• Data models for various XML query languages
  – Data model defined by XML Query Working Group for XPath and XQuery

Semistructured data model of Lore
• Graph-based, unordered, edge-labeled

Ordered tree model of YAT
• Tree-based, ordered, node-labeled, with references

Data model of XML Query WG
• Conceptually, also tree-based, ordered, and with references
• Functional notation (think ML); no explicit data structure
• Example
    <book price="10.50">
      <title>…</title>
      <author>…</author>
      <author>…</author>
      …
    </book>
  is represented as
    e1 = elemNode(qnameValue(null, "book"), {a1}, [e2, e3, e4])
    a1 = attrNode(qnameValue(null, "price"), decimalValue(10.50))
    e2 = elemNode(qnameValue(null, "title"), …)
    e3 = elemNode(qnameValue(null, "author"), …)
    e4 = elemNode(qnameValue(null, "author"), …)

Query languages for XML
[Figure: query languages (UnQL; XPath, XQL; XML-QL; Lorel; YATL; XSLT, Quilt (XQuery)) placed by data model (simple graphs, idealized XML, real XML) against expressive power (navigation, selection; SPJ, regexp; SPJ, regexp, grouping; OQL, regexp; OQL, conditional, recursion)]
For a more accurate comparison see Bonifati and Ceri, SIGMOD Record, 2000

XML-QL
• Deutsch, Fernandez, Florescu, Levy, and Suciu. “XML-QL: A Query Language for XML.” WWW, 1999
• Data model: a (totally) ordered or a (totally) unordered graph
• Query language
  – WHERE clause to bind variables and test predicates
  – CONSTRUCT clause to build output XML structures
• Features
  – XML patterns, regexp path expressions
  – Joins on multiple input sources
  – Skolem functions for grouping

XML-QL: XML patterns
• Retrieve the titles of the books written by Abiteboul before 2000
    WHERE
      <bib>
        <book year=$y ISBN=$isbn>
          <title>$t</title>
          <author><lastname>Abiteboul</lastname></author>
        </book>
      </bib> IN "bib.xml",
      $y < 2000
    CONSTRUCT
      <resultBook ISBN=$isbn>
        <resultTitle>$t</resultTitle>
      </resultBook>
  – Pattern in WHERE: scan bib.xml and match the pattern to obtain all ($y, $isbn, $t) bindings
  – Predicate $y < 2000: select those bindings that pass the predicate
  – CONSTRUCT: construct output for each ($y, $isbn, $t) binding obtained in WHERE

XML-QL: joins
• Retrieve all reviews for the books written by Abiteboul
    WHERE
      <bib>
        <book ISBN=$isbn>
          <author><lastname>Abiteboul</lastname></author>
        </book>
      </bib> IN "bib.xml",
      <reviews>
        <review ISBN=$isbn></review> ELEMENT_AS $e
      </reviews> IN "reviews.xml"
    CONSTRUCT
      $e

XML-QL: outerjoins (slide 1)
• Retrieve the titles of the books written by Abiteboul, together with their reviews, if any
    WHERE
      <bib>
        <book ISBN=$isbn>
          <title>$t</title>
          <author><lastname>Abiteboul</lastname></author>
        </book>
      </bib> IN "bib.xml",
      <reviews>
        <review ISBN=$isbn></review> ELEMENT_AS $e
      </reviews> IN "reviews.xml"
    CONSTRUCT
      <resultBookWithReview ISBN=$isbn>
        <title>$t</title>
        $e
      </resultBookWithReview>
• What is wrong?
  – If a book has no review, it will not be in the output
  – Book title appears multiple times if there are multiple reviews

XML-QL: outerjoins (slide 2)
• Use nested queries with outerjoin semantics
    WHERE
      <bib><book
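
The two problems noted on the first outerjoin slide (a book with no review disappears, and a title repeats once per review) can be reproduced outside XML-QL. The following is a minimal Python sketch, not the XML-QL solution from the slides: it uses the standard library's xml.etree.ElementTree, and the sample documents, the second book title, the review texts, and the helper names books_by, inner_join, and nested_outer_join are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two books by Abiteboul, reviews for only one of them.
BIB = """<bib>
  <book ISBN="1"><title>Foundations of Databases</title>
    <author><lastname>Abiteboul</lastname></author></book>
  <book ISBN="2"><title>Another Book</title>
    <author><lastname>Abiteboul</lastname></author></book>
</bib>"""

REVIEWS = """<reviews>
  <review ISBN="1">Thorough.</review>
  <review ISBN="1">Dense.</review>
</reviews>"""


def books_by(bib, lastname):
    """Yield every <book> element that has an author with the given last name."""
    for book in bib.iter("book"):
        if any(ln.text == lastname for ln in book.findall("author/lastname")):
            yield book


def inner_join(bib, reviews, lastname):
    """One output row per (book, review) pair with equal ISBN, like the flat query."""
    rows = []
    for book in books_by(bib, lastname):
        for review in reviews.iter("review"):
            if review.get("ISBN") == book.get("ISBN"):
                rows.append((book.get("ISBN"), book.findtext("title"), review.text))
    return rows


def nested_outer_join(bib, reviews, lastname):
    """One output row per book; its reviews are collected in a (possibly empty) list."""
    rows = []
    for book in books_by(bib, lastname):
        matched = [r.text for r in reviews.iter("review")
                   if r.get("ISBN") == book.get("ISBN")]
        rows.append((book.get("ISBN"), book.findtext("title"), matched))
    return rows


bib = ET.fromstring(BIB)
reviews = ET.fromstring(REVIEWS)
inner = inner_join(bib, reviews, "Abiteboul")
outer = nested_outer_join(bib, reviews, "Abiteboul")
```

With this sample data, the flat join returns two rows for ISBN 1 (the title is repeated) and nothing for ISBN 2, whereas the nested version returns exactly one entry per book, with an empty review list for ISBN 2.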

