Lecture 10
XML
Wednesday, October 18, 2006

XML Outline
• XML (4.6, 4.7)
  – Syntax
  – Semistructured data
  – DTDs

Additional Readings on XML
Main source: www.w3.org (but hard to read)
• http://www.w3.org/XML/
Strongly recommended readings:
• http://www.w3.org/XML/1999/XML-in-10-points
• www.zvon.org/xxl/XMLTutorial/General/book_en.html
For XPath and XQuery:
• http://www.galaxquery.org/

XML
• A flexible syntax for data
• Used in:
  – Data exchange
  – Flexible databases: e.g. property lists
  – Configuration files: e.g. Web.Config
  – Document markup: e.g. XHTML
• Roots: SGML - a very nasty language
We will study only XML as data

XML for Data Exchange
• Relational data does not have a syntax
  – I can’t “give” you my relational database
  – Examples of syntaxes: CSV (comma-separated-values), ASN.1
• XML = syntax for data
  – But XML is not relational: semistructured
• Usage:
  – Export: Database → XML
  – Transport/transform XML
  – Import: XML → Databases or application

XML for Databases
• Relational databases have rigid schema
  – Schema evolution is costly
• XML is flexible: semistructured data
  – Store data in XML
• Warning: not normal form!
  Not even 1NF
  – Don’t try this at home

From HTML to XML
HTML describes the presentation

HTML
<h1> Bibliography </h1>
<p> <i> Foundations of Databases </i>
    Abiteboul, Hull, Vianu
    <br> Addison Wesley, 1995
<p> <i> Data on the Web </i>
    Abiteboul, Buneman, Suciu
    <br> Morgan Kaufmann, 1999

XML Syntax
<bibliography>
  <book> <title> Foundations… </title>
         <author> Abiteboul </author>
         <author> Hull </author>
         <author> Vianu </author>
         <publisher> Addison Wesley </publisher>
         <year> 1995 </year>
  </book>
  …
</bibliography>
XML describes the content

XML Terminology
• tags: book, title, author, …
• start tag: <book>, end tag: </book>
• elements: <book>…</book>, <author>…</author>
• elements are nested
• empty element: <red></red> abbrv. <red/>
• an XML document: single root element
well formed XML document: if it has matching tags

More XML: Attributes
<book price = "55" currency = "USD">
  <title> Foundations of Databases </title>
  <author> Abiteboul </author>
  …
  <year> 1995 </year>
</book>

Attributes vs. Elements
<book price = "55" currency = "USD">
  <title> Foundations of DBs </title>
  <author> Abiteboul </author>
  …
  <year> 1995 </year>
</book>
attributes are alternative ways to represent data
<book>
  <title> Foundations of DBs </title>
  <author> Abiteboul </author>
  …
  <year> 1995 </year>
  <price> 55 </price>
  <currency> USD </currency>
</book>

Comparison
Elements            Attributes
Ordered             Unordered
May be repeated     Must be unique
May be nested       Must be atomic

XML vs. HTML
• What are the differences between XML and HTML?
In class

More XML: Oids and References
<person id="o555"> <name> Jane </name> </person>
<person id="o456"> <name> Mary </name>
                   <mother idref="o555"/>
</person>
oids and references in XML are just syntax
Are just keys / foreign keys design by someone who didn’t take 444
Don’t use them: use your own foreign keys instead.

More XML: CDATA Section
• Syntax: <![CDATA[ .....any text here...]]>
• Example:
<example> <![CDATA[ some text here </notAtag> <>]]></example>

More XML: Entity References
• Syntax: &entityname;
• Example: <element> this is less than &lt; </element>
• Some entities:
  &lt;      <
  &gt;      >
  &amp;     &
  &apos;    '
  &quot;    "
  &#38;     Unicode char

More XML: Processing Instructions
• Syntax: <?target argument?>
• Example:
<product> <name> Alarm Clock </name>
          <?ringBell 20?>
          <price> 19.99 </price>
</product>
• What do they mean?

More XML: Comments
• Syntax: <!-- .... Comment text... -->
• Yes, they are part of the data model !!!

XML Namespaces
• name ::= [prefix:]localpart
<book xmlns:isbn="www.isbn-org.org/def">
  <title> … </title>
  <number> 15 </number>
  <isbn:number> …. </isbn:number>
</book>
Callout on www.isbn-org.org/def: means nothing as URL; just a unique name

XML Namespaces
• syntactic: <number>, <isbn:number>
• semantic: provide URL for schema
<tag xmlns:mystyle = "http://…">
  …
  <mystyle:title> … </mystyle:title>
  <mystyle:number> …
</tag>
Callout on <mystyle:title> and <mystyle:number>: belong to this namespace

XML Semantics: a Tree!
<data>
  <person id="o555">
    <name> Mary </name>
    <address>
      <street>Maple</street> <no> 345 </no> <city> Seattle </city>
    </address>
  </person>
  <person>
    <name> John </name>
    <address>Thailand</address>
    <phone>23456</phone>
  </person>
</data>
[Figure: tree for the document above. Root element node data has two person children. The first person has attribute node id = o555 and element children name (text Mary) and address (children street, no, city with text Maple, 345, Seattle). The second person has element children name (text John), address (text Thai) and phone (text 23456). Legend: element node, text node, attribute node.]
Order matters !!!

XML Data
• XML is self-describing
• Schema elements become part of the data
  – Relational schema: persons(name,phone)
  – In XML <persons>, <name>, <phone> are part of the data, and are repeated many times
• Consequence: XML is much more flexible
• XML = semistructured data

Mapping Relational Data to XML Data
<persons>
  <row> <name>John</name> <phone> 3634</phone> </row>
  <row> <name>Sue</name> <phone> 6343</phone> </row>
  <row> <name>Dick</name> <phone> 6363</phone> </row>
</persons>
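The mapping from the relation persons(name, phone) to XML shown on the last slide can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration, not part of the lecture: the helper names rows_to_xml and xml_to_rows are hypothetical, and the sketch assumes every column value is a plain string.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table_name, columns, rows):
    """Export: one <row> element per tuple, one child element per column.

    rows_to_xml is a hypothetical helper written for this sketch.
    """
    root = ET.Element(table_name)
    for row in rows:
        row_el = ET.SubElement(root, "row")
        for col, value in zip(columns, row):
            ET.SubElement(row_el, col).text = str(value)
    return root


def xml_to_rows(root, columns):
    """Import: read each <row> back into a tuple, in document order."""
    return [
        tuple(row_el.findtext(col) for col in columns)
        for row_el in root.findall("row")
    ]


columns = ["name", "phone"]
table = [("John", "3634"), ("Sue", "6343"), ("Dick", "6363")]

persons = rows_to_xml("persons", columns, table)
xml_text = ET.tostring(persons, encoding="unicode")
print(xml_text)

# Parse the serialized text again to check that nothing was lost.
round_trip = xml_to_rows(ET.fromstring(xml_text), columns)
print(round_trip)
```

Note that the column names name and phone appear once per row in the output, which is the "schema elements become part of the data" point from the XML Data slide.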