1 XPath and XQuery
CPS 116
Introduction to Database Systems

2 Announcements (Thu. Oct. 2)
• Deadline for Homework #2 non-Gradiance part extended to next Tuesday
• Gradiance part is still due today!
• Midterm next Thursday in class
• Sample midterm (from last year) available
  • Sample solution will be available next Tuesday
• Project milestone #1 due in 2 weeks!

3 Query languages for XML
• XPath
  • Path expressions with conditions
  • Building block of other standards (XQuery, XSLT, XLink, XPointer, etc.)
• XQuery
  • XPath + full-fledged SQL-like query language
• XSLT
  • XPath + transformation templates

4 Example DTD and XML
<?xml version="1.0"?>
<!DOCTYPE bibliography [
  <!ELEMENT bibliography (book+)>
  <!ELEMENT book (title, author*, publisher?, year?, section*)>
  <!ATTLIST book ISBN CDATA #REQUIRED>
  <!ATTLIST book price CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT publisher (#PCDATA)>
  <!ELEMENT year (#PCDATA)>
  <!ELEMENT i (#PCDATA)>
  <!ELEMENT content (#PCDATA|i)*>
  <!ELEMENT section (title, content?, section*)>
]>
<bibliography>
  <book ISBN="ISBN-10" price="80.00">
    <title>Foundations of Databases</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
    <section>…</section>
    …
  </book>
  …
</bibliography>

5 XPath
• XPath specifies path expressions that match XML data by navigating down (and occasionally up and across) the tree
• Example
  • Query: /bibliography/book/author
    • Like a UNIX path
  • Result: all author elements reachable from root via the path /bibliography/book/author

6 Basic XPath constructs
  /        separator between steps in a path
  name     matches any child element with this tag name
  *        matches any child element
  @name    matches the attribute with this name
  @*       matches any attribute
  //       matches any descendant element or the current element itself
  .        matches the current element
  ..       matches the parent element

7 Simple XPath examples
• All book titles
    /bibliography/book/title
• All book ISBN numbers
    /bibliography/book/@ISBN
• All title elements, anywhere in the document
    //title
• All section titles, anywhere in the document
    //section/title
• Authors of bibliographical entries (suppose there are articles, reports, etc. in addition to books)
    /bibliography/*/author

8 Predicates in path expressions
[condition] matches the “current” element if condition evaluates to true on the current element
• Books with price lower than $50
    /bibliography/book[@price<50]
  • XPath will automatically convert the price string to a numeric value for comparison
• Books with author “Abiteboul”
    /bibliography/book[author='Abiteboul']
• Books with a publisher child element
    /bibliography/book[publisher]
• Prices of books authored by “Abiteboul”
    /bibliography/book[author='Abiteboul']/@price

9 More complex predicates
Predicates can have and’s and or’s
• Books with price between $40 and $50
    /bibliography/book[40<=@price and @price<=50]
• Books authored by “Abiteboul” or those with price lower than $50
    /bibliography/book[author="Abiteboul" or @price<50]

10 Predicates involving node-sets
    /bibliography/book[author='Abiteboul']
• There may be multiple authors, so author in general returns a node-set (in XPath terminology)
• The predicate evaluates to true as long as it evaluates to true for at least one node in the node-set, i.e., at least one author is “Abiteboul”
• Tricky query
    /bibliography/book[author='Abiteboul' and author!='Abiteboul']
  • Will it return any books?

11 XPath operators and functions
Frequently used in conditions:
  x + y, x - y, x * y, x div y, x mod y
  contains(x, y)     true if string x contains string y
  count(node-set)    counts the number of nodes in node-set
  position()         returns the “context position” (roughly, the position of the current node in the node-set containing it)
  last()             returns the “context size” (roughly, the size of the node-set containing the current node)
  name()             returns the tag name of the current element

12 More XPath examples
• All elements whose tag names contain “section” (e.g., “subsection”)
    //*[contains(name(), 'section')]
• Title of the first section in each book
    /bibliography/book/section[position()=1]/title
  • A shorthand: /bibliography/book/section[1]/title
• Title of the last section in each book
    /bibliography/book/section[position()=last()]/title
• Books with fewer than 10 sections
    /bibliography/book[count(section)<10]
• All elements whose parent’s tag name is not “book”
    //*[name()!='book']/*

13 A tricky example
• Suppose that price is a child element of book, and there may be multiple prices per book
• Books with some price in range [20, 50]
  • How about:
      /bibliography/book[price >= 20 and price <= 50]
  • Correct answer:
      /bibliography/book[price[. >= 20 and . <= 50]]

14 De-referencing IDREFs
• id(identifier) returns the element with the given identifier
• Suppose that books can reference other books
    <section><title>Introduction</title>
      XML is a hot topic these days; see <bookref ISBN="ISBN-10"/> for more details…
    </section>
• Find all references to books written by “Abiteboul” in the book with “ISBN-10”
    /bibliography/book[@ISBN='ISBN-10']//bookref[id(@ISBN)/author='Abiteboul']
  Or simply:
    id("ISBN-10")//bookref[id(@ISBN)/author="Abiteboul"]

15 General XPath location steps
• Technically, each XPath query consists of a series of location steps separated by /
• Each location step consists of
  • An axis: one of self, attribute, parent, child, ancestor†, ancestor-or-self†, descendant, descendant-or-self, following, following-sibling, preceding†, preceding-sibling†, and namespace
  • A node-test: either a name test (e.g., book, section, *) or a type test (e.g., text(), node(), comment()), separated from the axis by ::
  • Zero or more predicates (or conditions) enclosed in square brackets
† These reverse axes produce result node-sets in reverse document order; others (forward axes) produce node-sets in document order

16 Example of verbose syntax
Verbose (axis, node test, predicate):
    /child::bibliography/child::book[attribute::ISBN='ISBN-10']/descendant-or-self::node()/child::title
Abbreviated:
    /bibliography/book[@ISBN='ISBN-10']//title
• child is the default axis
• // stands for /descendant-or-self::node()/

17 One more example
• Which of the following queries correctly find the third author in the entire input
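
Aside: the "at least one node" semantics from slides 10 and 13 can be sketched in Python. This is only an illustration under stated assumptions, not part of the lecture: it uses the standard-library xml.etree.ElementTree, which supports only a small XPath subset and requires paths relative to the root element, so comparisons such as != and >= are modeled by hand with any(). The second book (ISBN-20), the helper names, and the price lists are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample data: the first book follows the slide-4 example; the second
# (ISBN-20, single author) is a hypothetical addition for contrast.
DOC = """
<bibliography>
  <book ISBN="ISBN-10" price="80.00">
    <title>Foundations of Databases</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
  </book>
  <book ISBN="ISBN-20" price="45.00">
    <title>Hypothetical Solo Book</title>
    <author>Abiteboul</author>
  </book>
</bibliography>
"""
root = ET.fromstring(DOC)


def titles(books):
    """Return the title text of each book element."""
    return [b.findtext("title") for b in books]


# book[author='Abiteboul']: true if at least one author child matches.
# ElementTree paths are relative to root (the bibliography element).
has_abiteboul = root.findall("./book[author='Abiteboul']")


def tricky(book):
    """Model book[author='Abiteboul' and author!='Abiteboul'].

    Each comparison against a node-set is existential, so the two
    conditions may be satisfied by two different author nodes.
    """
    names = [a.text for a in book.findall("author")]
    return (any(n == "Abiteboul" for n in names)
            and any(n != "Abiteboul" for n in names))


tricky_books = [b for b in root.findall("./book") if tricky(b)]


def naive_in_range(prices):
    """Model book[price >= 20 and price <= 50] over child price values."""
    return any(p >= 20 for p in prices) and any(p <= 50 for p in prices)


def correct_in_range(prices):
    """Model book[price[. >= 20 and . <= 50]]: one price must satisfy both."""
    return any(20 <= p <= 50 for p in prices)


print(titles(has_abiteboul))
print(titles(tricky_books))
print(naive_in_range([10.0, 60.0]), correct_in_range([10.0, 60.0]))
```

Under these assumptions, the tricky query keeps only books that have "Abiteboul" plus at least one other author. The naive price query accepts a book priced at 10 and 60 even though neither price lies in [20, 50], which is why slide 13 nests the conditions inside price[...].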