DOC PREVIEW
Duke CPS 116 - Relational Mapping

This preview shows page 1-2-3 out of 9 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1XML-Relational MappingCPS 116Introduction to Database Systems2Announcements (October 28) Homework #2 has been graded Homework #3 out last Thursday; due next Thursday Project milestone #2 due in 2 weeksj Check your email for feedback3Approaches to XML processing Text files (!) Specialized XML DBMS Lore (Stanford), Strudel (AT&T), Timber (Michigan), MonetDB/XQuery (CWI, Netherlands), Tamino (Software AG), eXist, Sedna, Apache XIndice, XML:DB API initiative… Still a long way to go Object-oriented DBMS ObjectStore, ozone, … Not as mature as relational DBMS Relational (and object-relational) DBMS Middleware and/or object-relational extensions24Mapping XML to relational Store XML in a CLOB (Character Large OBject) column Simple, compact Full-text indexing can help (often provided by DBMS vendors as object-relational “extensions”) Alternatives? Schema-oblivious mapping:well-formed XML → generic relational schema• Node/edge-based mapping for graphs• Interval-based mapping for trees• Path-based mapping for trees Schema-aware mapping:valid XML → special relational schema based on DTD5Node/edge-based: schema Element(eid, tag) Attribute(eid, attrName, attrValue) Attribute order does not matter ElementChild(eid, pos, child)ifi h d i f hildKey:Keys: pos specifies the ordering of children child references either Element(eid) or Text(tid) Text(tid, value) tid cannot be the same as any eid) Need to “invent” lots of id’s) Need indexes for efficiency, e.g., Element(tag), Text(value)6Node/edge-based: example<bibliography><book ISBN=”ISBN-10” price=”80.00”><title>Foundations of Databases</title><author>Abiteboul</author><author>Hull</author><author>Vianu</author><publisher>Addison Wesley</publisher><year>1995</year></book>…</bibliography>Element ElementChildeid pos childe0 1 e1e1 1 e2e1 2 e3e1 3 e4e1 4 e5eid tage0 bibliographye1 booke2 titlee3 authore4 authorAttributeTe x te1 5 e6e1 6 e7e2 1 t0e3 1 t1e4 1 t2e5 1 t3e6 1 t4e7 1 t5e5 authore6 publishere7 yeartid valuet0 Foundations of Databasest1 Abiteboult2 Hullt3 Vianut4 Addison Wesleyt5 1995eid attrName attrValuee1 ISBN ISBN-10e1 price 8037Node/edge-based: simple paths //title SELECT eid FROM Element WHERE tag = ‘title’;//section/title SELECT e2.eidFROM Element e1, ElementChild c, Element e2WHERE e1.tag = ‘section’AND e2.tag = ‘title’AND e1.eid = c.eidAND c.child = e2.eid;)Path expression becomes8Node/edge-based: more complex paths //bibliography/book[author=“Abiteboul”]/@price SELECT a.attrValueFROM Element e1, ElementChild c1,Element e2, Attribute aWHERE e1.tag = ‘bibliography’AND e1.eid = c1.eid AND c1.child = e2.eidAND e2 tag =‘book’AND e2.tag = bookAND e2.eid = a.eidAND a.attrName = ‘price’;AND EXISTS (SELECT * FROM ElementChild c2,Element e3, ElementChild c3, Text tWHERE e2.eid = c2.eid AND c2.child = e3.eidAND e3.tag = ‘author’AND e2.eid = c3.eid AND c3.child = t.tidAND t.value = ‘Abiteboul’)9Node/edge-based: descendent-or-self //book//title Requires SQL3 recursion WITH ReachableFromBook(id) AS((SELECT eid FROM Element WHERE tag = ‘book’)UNION ALL((SELECT c.childFROM ReachableFromBook r, ElementChild cWHERE r.eid = c.eid))SELECT eidFROM ElementWHERE eid IN (SELECT * FROM ReachableFromBook)AND tag = ‘title’;410Interval-based: schema Element(left, right, level, tag) left is the start position of the element right is the end position of the element level is the nesting depth of the element (strictly speaking, unnecessary) Key is Text(left, right, level, value) Key is Attribute(left, attrName, attrValue) Key is11Interval-based: example1<bibliography>2<book ISBN=”ISBN-10” price=”80.00”>3<title>4Foundations of Databases</title>56<author>7Abiteboul</author>89<author>10Hull</author>1112<author>13Vianu</author>1415<publisher>16Addison Wesley</publisher>1718<year>191995</year>20</book>21…</bibliography>999bibliographybook1,999,12,21,2) Where did ElementChild go? E1 is the parent of E2 iff:title author author author publisher year3,5,3 6,8,3 9,11,3 12,14,3 15,17,3 18,20,312Interval-based: queries //section/title SELECT e2.leftFROM Element e1, Element e2WHERE e1.tag = ‘section’ AND e2.tag = ‘title’AND e1.left < e2.left AND e2.right < e1.rightAND e1.level = e2.level-1;;)Path expression becomes “containment” joins!• Number of joins is proportional to path expression length //book//title SELECT e2.leftFROM Element e1, Element e2WHERE e1.tag = ‘book’ AND e2.tag = ‘title’AND e1.left < e2.left AND e2.right < e1.right;)No recursion!513Summary of interval-based mapping Path expression steps become containment joins No recursion needed for descendent-or-self Comprehensive XQuery-SQL translation is possible14A path-based mappingLabel-path encoding Element(pathid, left, right, …), Path(pathid, path), … path is a label path starting from the root Why are left and right still needed?ElementPathpathid left right ...1 1 999 ...2 2 21 ...3 3 5 ...4 6 8 ...4 9 11 ...4 12 14 ...... ... ... ...pathid path1 /bibliography2 /bibliography/book3 /bibliography/book/title4 /bibliography/book/author... ...15Label-path encoding: queries Simple path expressions with no conditions//book//title Perform string matching on Path Join qualified pathid’s with Element//book[publisher=‘Prentice Hall’]/title//book[publisher Prentice Hall ]/title616Another path-based mappingDewey-order encoding Each component of the id represents the order of the child within its parent Unlike label-path, this encoding is “lossless”bibliographybooktitle author author author publisher year11.11.1.1 1.1.2 1.1.3 1.1.4 1.1.5 1.1.6Element(dewey_pid, tag)Te x t (dewey_pid, value)Attribute(dewey_pid, attrName, attrValue)17Dewey-order encoding: queries Examples://title//section/title//book//title//b k[ bli h ‘P ti H ll’]/titl//book[publisher=‘Prentice Hall’]/title Works similarly as interval-based mapping• Except parent/child and ancestor/descendant relationship are checked by prefix matching Serves a different purpose from label-path encoding Any advantage over interval-based mapping?18Schema-aware mapping Idea: use DTD to design a better schema Basic approach: elements of the same type go into one table Tag name → table name Attributes → columns•If one


View Full Document

Duke CPS 116 - Relational Mapping

Documents in this Course
Part I

Part I

8 pages

XSLT

XSLT

4 pages

XSLT

XSLT

8 pages

Part I

Part I

8 pages

XSLT

XSLT

8 pages

Load more
Download Relational Mapping
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Relational Mapping and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Relational Mapping 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?