Rutgers University CS 336 - Lecture Notes

This preview shows page 1-2-3 out of 8 pages.


Principles of Information and Database Management
198:336
Week 8 – Mar 28
Matthew Stone

XML – Motivations
Semi-structured data
– Relaxing traditional schema
– Storing more complex objects
Standardized data
– Using reference schemas for interoperability
– “Meta-data” – language for data description
Web data
– Supported in protocols for information exchange

Outline
XML – overview
XML data representations
XML and standardization
– XML namespaces
– XML resource description framework
XML and the web
– XHTML
– Cascading style sheets and XSLT

XML
eXtensible Markup Language
– “File format” for giving partial structure to text documents.
– Based on the use of paired tags to give a tree structure to the document.

Tags in XML
Work like parentheses…
    [(5 + 7) * 3]²
But make category of structure explicit
    power(product(sum(5,7), 3), 2)

Tree visualization
[Figure: tree rooted at power, with children base and exponent. base holds product, whose children are sum (value 5, value 7) and value 3; exponent holds value 2.]

Basic tag syntax
<tag> – open a tag
</tag> – close a tag
Example becomes
    <power>
      <base>
        <product>
          <sum><value>5</value><value>7</value></sum>
          <value>3</value>
        </product>
      </base>
      <exponent><value>2</value></exponent>
    </power>

Storing data in XML
Relational data
– Combines schema and tuples together
Example
– Schema
    student(id:integer, name:string, email:string)
– Tuple
    (65, “Teddy Salad”, “[email protected]”)

Storing relational data in XML
In XML, encode table
    <student>…</student>

Storing relational data in XML
Then columns…
    <student>
      <id> … </id>
      <name> … </name>
      <email> … </email>
    </student>

Storing relational data in XML
Then values…
    <student>
      <id>65</id>
      <name>Teddy Salad</name>
      <email>[email protected]</email>
    </student>

Storing relational data in XML
For whole tables, just repeat
    <tableOfStudents>
      <student>
        <id>64</id>
        <name>Anne Elk</name>
        <email>[email protected]</email>
      </student>
      <student>
        <id>65</id>
        <name>Teddy Salad</name>
        …

Storing data in XML
Text data
– Elements can be freeform text
– Elements can be further “marked up” to indicate presentation or structure

Storing text data in XML: the basics
    <text>
    Elk: Yes, well you may well ask me what is my theory.
    Presenter: I am asking.
    Elk: Good for you. My word yes. Well Chris, what is it that it is – this theory of mine. Well, this is what it is – my theory that I have, that is to say, which is mine, is mine.
    </text>

Storing text data in XML: markup
    <drama>
      <line>
        <player>Elk</player>
        <content>Yes, well you may well ask me what is my theory.</content>
      </line>
      <line>
        <player>Presenter</player>
        <content>I <loud>am</loud> asking.</content>
      </line>
      <line>
        <player>Elk</player>
        <content>Good for you. My word yes. Well Chris, what is it that it is – this theory of mine. Well, this is what it is – my theory that I have, that is to say, which is mine, is mine.</content>
      </line>
    </drama>

Storing data in XML
Mix – partly well-defined, partly open-ended
– Example: product descriptions
– Name, description – formatted text
– Nutrition information – content FDA requires

Storing mixed data in XML
    <product>
      <info>
        <name>California trail mix</name>
        <description>We mix sweet <loud>ripe</loud> fruit with <loud>premium</loud> nuts to bring you the taste of <loud>pure energy</loud>…</description>
      </info>
      <nutrition>
        <servings><size>1/4 cup</size><per>about 27</per></servings>
        <calories><total>120</total><fat>25</fat></calories>
        …
      </nutrition>
    </product>

Describing data
DTDs – “document type definitions”
– Original proposal for XML
– Describes possible patterns of elements
– Grammar with regular expression syntax

DTD examples
    <!ELEMENT loud (#PCDATA) >
    <!ELEMENT description (#PCDATA | loud)* >
    <!ELEMENT name (#PCDATA) >
    <!ELEMENT info (name, description) >

DTDs
Not very specific
– Don’t constrain types of values
– Don’t indicate links to standards
– Can only see one layer of structure at a time

XML Schema
Give a template for a document – as more XML!
– Complicated syntax, but powerful.

XML Schema examples
Loud
    <element name="loud" type="string" />
Name
    <element name="name" type="string" />

Hey, what’s all that junk?
XML also has empty tags
    <foo></foo>
is the same as
    <foo />

Hey, what’s all that junk?
XML also has attributes on opening tags
    <tag attribute="value" >

Hey, what’s all that junk?
So
    <element name="loud" type="string" />
Defines an empty element
    <element name="loud" type="string"></element>
– Whose name attribute has value “loud”
– Whose type attribute has value “string”

XML Schema Examples
Description
    <element name="description">
      <complexType mixed="true">
        <choice minOccurs="0" maxOccurs="unbounded">
          <element name="loud" type="string" />
        </choice>
      </complexType>
    </element>

XML Schema Examples
Easier to define your own types
    <complexType name="descriptionType" mixed="true">
      <choice minOccurs="0" maxOccurs="unbounded">
        <element name="loud" type="string" />
      </choice>
    </complexType>

XML Schema Examples
Info
    <complexType name="infoType">
      <sequence>
        <element name="name" type="string" />
        <element name="description" type="descriptionType" />
      </sequence>
    </complexType>
    <element name="info" type="infoType" />

What’s the point?
Even with semi-structured data
– You can check that your data falls in a specific range of possibilities
– Validation
Problems:
What about files created by scripts?

Standardization
What schema are you using?
– Does your element <name> mean the same thing as my element <name>?
– If your license gives me <permission action="copy" /> do I really know what I can do with your data?

Key principle
Need a way to uniquely identify tokens as instances of known concepts.
Compare: UPC codes, ISBN numbers

Solution
Use URLs/URIs
– Uniform resource locators
– Uniform resource identifiers
Build on the existing infrastructure to avoid clashing names on the web.

Example
The official DTD for XHTML 1.0 strict
– A standard for describing hypertext web documents as XML
lives here
    http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
(a URL)

Example
A standard reference for the concepts associated with XHTML is this URI
    http://www.w3.org/1999/xhtml
Using this “namespace” means your intended meaning for your document is what is spelled out there.

Using namespaces
    <tag1 xmlns:ns="URI">
      ….
      <ns:tag2 … />
    </tag1>
Declared using xmlns attribute
Used using “:” syntax

Metadata
Data about data
– We’ve seen one example: schemas
– If you are building a document that respects a particular XML Schema, you can say so
    <product
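
The "Using namespaces" slide can be illustrated with a small Python sketch. It is not part of the original notes: it assumes Python's standard `xml.etree.ElementTree` parser, reuses the slide's placeholder names `tag1` and `tag2`, and fills in the XHTML namespace URI quoted earlier in the notes as the example URI. The unprefixed second `tag2` is a hypothetical addition, included only to show that the two names no longer clash.

```python
import xml.etree.ElementTree as ET

# Namespace URI quoted in the notes for XHTML concepts.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical document: tag1/tag2 are the slide's placeholder names.
# The prefix "ns" is declared with xmlns:ns and used with the ":" syntax.
DOC = f'<tag1 xmlns:ns="{XHTML_NS}"><ns:tag2 /><tag2 /></tag1>'


def child_tags(xml_text):
    """Return the expanded tag name of each child of the root element."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


tags = child_tags(DOC)
print(tags)

# Looking up the prefixed element: the prefix-to-URI mapping is supplied
# by the caller, so the prefix used here need not match the document's.
root = ET.fromstring(DOC)
found = root.find("x:tag2", {"x": XHTML_NS})
print(found is not None)
```

The parser reports the prefixed element under its full URI rather than under the prefix `ns`, which is the sense in which the URI, and not the local spelling of the tag, identifies the concept.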

