DOC PREVIEW
Duke CPS 116 - XML-Relational Mapping

This preview shows page 1-2 out of 5 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1XML-Relational MappingCPS 116Introduction to Database Systems2Announcements (October 28) Homework #2 has been graded Homework #3 out last Thursday; due next Thursday Project milestone #2 due in 2 weeksj Check your email for feedback3Approaches to XML processing Text files (!) Specialized XML DBMS Lore (Stanford), Strudel (AT&T), Timber (Michigan), MonetDB/XQuery (CWI, Netherlands), Tamino (Software AG), eXist, Sedna, Apache XIndice, XML:DB API initiative… Still a long way to go Object-oriented DBMS ObjectStore, ozone, … Not as mature as relational DBMS Relational (and object-relational) DBMS Middleware and/or object-relational extensions4Mapping XML to relational Store XML in a CLOB (Character Large OBject) column Simple, compact Full-text indexing can help (often provided by DBMS vendors as object-relational “extensions”) Poor integration with relational query processingUd iUpdates are expensive Alternatives? Schema-oblivious mapping:well-formed XML → generic relational schema• Node/edge-based mapping for graphs• Interval-based mapping for trees• Path-based mapping for trees Schema-aware mapping:valid XML → special relational schema based on DTD5Node/edge-based: schema Element(eid, tag) Attribute(eid, attrName, attrValue) Attribute order does not matter ElementChild(eid, pos, child)ifi h d i f hildKey: (eid, attrName)Keys: (eid, pos), (child) pos specifies the ordering of children child references either Element(eid) or Text(tid) Text(tid, value) tid cannot be the same as any eid) Need to “invent” lots of id’s) Need indexes for efficiency, e.g., Element(tag), Text(value)6Node/edge-based: example<bibliography><book ISBN=”ISBN-10” price=”80.00”><title>Foundations of Databases</title><author>Abiteboul</author><author>Hull</author><author>Vianu</author><publisher>Addison Wesley</publisher><year>1995</year></book>…</bibliography>Element ElementChildeid pos childe0 1 e1e1 1 e2e1 2 e3e1 3 e4e1 4 e5eid tage0 bibliographye1 booke2 titlee3 authore4 authorAttributeTe x te1 5 e6e1 6 e7e2 1 t0e3 1 t1e4 1 t2e5 1 t3e6 1 t4e7 1 t5e5 authore6 publishere7 yeartid valuet0 Foundations of Databasest1 Abiteboult2 Hullt3 Vianut4 Addison Wesleyt5 1995eid attrName attrValuee1 ISBN ISBN-10e1 price 8027Node/edge-based: simple paths //title SELECT eid FROM Element WHERE tag = ‘title’;//section/title SELECT e2.eidFROM Element e1, ElementChild c, Element e2WHERE e1.tag = ‘section’AND e2.tag = ‘title’AND e1.eid = c.eidAND c.child = e2.eid;)Path expression becomes joins! Number of joins is proportional to the length of the path expression8Node/edge-based: more complex paths //bibliography/book[author=“Abiteboul”]/@price SELECT a.attrValueFROM Element e1, ElementChild c1,Element e2, Attribute aWHERE e1.tag = ‘bibliography’AND e1.eid = c1.eid AND c1.child = e2.eidAND e2 tag =‘book’AND e2.tag = bookAND e2.eid = a.eidAND a.attrName = ‘price’;AND EXISTS (SELECT * FROM ElementChild c2,Element e3, ElementChild c3, Text tWHERE e2.eid = c2.eid AND c2.child = e3.eidAND e3.tag = ‘author’AND e2.eid = c3.eid AND c3.child = t.tidAND t.value = ‘Abiteboul’)9Node/edge-based: descendent-or-self //book//title Requires SQL3 recursion WITH ReachableFromBook(id) AS((SELECT eid FROM Element WHERE tag = ‘book’)UNION ALL((SELECT c.childFROM ReachableFromBook r, ElementChild cWHERE r.eid = c.eid))SELECT eidFROM ElementWHERE eid IN (SELECT * FROM ReachableFromBook)AND tag = ‘title’;10Interval-based: schema Element(left, right, level, tag) left is the start position of the element right is the end position of the element level is the nesting depth of the element (strictly speaking, unnecessary) Key is left Text(left, right, level, value) Key is left Attribute(left, attrName, attrValue) Key is (left, attrName)11Interval-based: example1<bibliography>2<book ISBN=”ISBN-10” price=”80.00”>3<title>4Foundations of Databases</title>56<author>7Abiteboul</author>89<author>10Hull</author>1112<author>13Vianu</author>1415<publisher>16Addison Wesley</publisher>1718<year>191995</year>20</book>21…</bibliography>999bibliographybook1,999,12,21,2) Where did ElementChild go? E1 is the parent of E2 iff:[E1.left, E1.right] ⊃ [E2.left, E2.right], andE1.level = E2.level –1title author author author publisher year3,5,3 6,8,3 9,11,3 12,14,3 15,17,3 18,20,312Interval-based: queries //section/title SELECT e2.leftFROM Element e1, Element e2WHERE e1.tag = ‘section’ AND e2.tag = ‘title’AND e1.left < e2.left AND e2.right < e1.rightAND e1.level = e2.level-1;;)Path expression becomes “containment” joins!• Number of joins is proportional to path expression length //book//title SELECT e2.leftFROM Element e1, Element e2WHERE e1.tag = ‘book’ AND e2.tag = ‘title’AND e1.left < e2.left AND e2.right < e1.right;)No recursion!313Summary of interval-based mapping Path expression steps become containment joins No recursion needed for descendent-or-self Comprehensive XQuery-SQL translation is possible14A path-based mappingLabel-path encoding Element(pathid, left, right, …), Path(pathid, path), … path is a label path starting from the root Why are left and right still needed?To preserve structureElementPathppathid left right ...1 1 999 ...2 2 21 ...3 3 5 ...4 6 8 ...4 9 11 ...4 12 14 ...... ... ... ...pathid path1 /bibliography2 /bibliography/book3 /bibliography/book/title4 /bibliography/book/author... ...15Label-path encoding: queries Simple path expressions with no conditions//book//title Perform string matching on Path Join qualified pathid’s with Element//book[publisher=‘Prentice Hall’]/title//book[publisher Prentice Hall ]/title  Evaluate //book/title Evaluate //book/publisher[text()=‘Prentice Hall’] How to ensure title and publisher belong to the same book?)Path expression with attached conditions needs to be broken down, processed separately, and joined back16Another path-based mappingDewey-order encoding Each component of the id represents the order of the child within its parent Unlike label-path, this encoding is “lossless”bibliographybooktitle author author author publisher year11.11.1.1 1.1.2 1.1.3 1.1.4 1.1.5 1.1.6Element(dewey_pid, tag)Te x t (dewey_pid, value)Attribute(dewey_pid, attrName, attrValue)17Dewey-order encoding: queries


View Full Document

Duke CPS 116 - XML-Relational Mapping

Documents in this Course
Part I

Part I

8 pages

XSLT

XSLT

4 pages

XSLT

XSLT

8 pages

Part I

Part I

8 pages

XSLT

XSLT

8 pages

Load more
Download XML-Relational Mapping
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view XML-Relational Mapping and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view XML-Relational Mapping 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?