UH COSC 3480 - Semistructured Data, Extensible Markup Language, Document Type Definitions

Jeff Ullman: Introduction to XML

Slide titles:
  XML
  Semistructured Data
  Graphs of Semistructured Data
  Example: Data Graph
  XML
  Well-Formed and Valid XML
  Well-Formed XML
  Tags
  Example: Well-Formed XML
  XML and Semistructured Data
  Example
  DTD Structure
  DTD Elements
  Example: DTD
  Element Descriptions
  Example: Element Description
  Use of DTD's
  Example (a)
  Example (b)
  Attributes
  Example: Attributes
  Example: Attribute Use
  ID's and IDREF's
  Creating ID's
  Creating IDREF's
  Example: ID's and IDREF's
  The DTD
  Example Document
  Empty Elements
  Example: Empty Element

XML
Semistructured Data
Extensible Markup Language
Document Type Definitions

Semistructured Data
Another data model, based on trees.
Motivation: flexible representation of data.
  o Often, data comes from multiple sources with differences in notation, meaning, etc.
Motivation: sharing of documents among systems and databases.

Graphs of Semistructured Data
Nodes = objects.
Labels on arcs (attributes, relationships).
Atomic values at leaf nodes (nodes with no arcs out).
Flexibility: no restriction on:
  o Labels out of a node.
  o Number of successors with a given label.

Example: Data Graph
[Figure: data graph starting at a node labeled "root". Arc labels: bar, beer, beer, manf, manf, servedAt, name, name, name, addr, prize, year, award. Leaf values: Bud, A.B., Gold, 1995, Maple, Joe's, M'lob. Callouts: "The bar object for Joe's Bar", "The beer object for Bud", "Notice a new kind of data."]

XML
XML = Extensible Markup Language.
While HTML uses tags for formatting (e.g., "italic"), XML uses tags for semantics (e.g., "this is an address").
Key idea: create tag sets for a domain (e.g., genomics), and translate all data into properly tagged XML documents.

Well-Formed and Valid XML
Well-Formed XML allows you to invent your own tags.
  o Similar to labels in semistructured data.
Valid XML involves a DTD (Document Type Definition), a grammar for tags.

Well-Formed XML
Start the document with a declaration, surrounded by <?xml … ?>.
Normal declaration is:
    <?xml version = "1.0" standalone = "yes" ?>
  o "Standalone" = "no DTD provided."
Balance of document is a root tag surrounding nested tags.

Tags
Tags, as in HTML, are normally matched pairs, as <FOO> … </FOO>.
Tags may be nested arbitrarily.
XML tags are case sensitive.

Example: Well-Formed XML
    <?xml version = "1.0" standalone = "yes" ?>
    <BARS>
      <BAR><NAME>Joe's Bar</NAME>
        <BEER><NAME>Bud</NAME>
          <PRICE>2.50</PRICE></BEER>
        <BEER><NAME>Miller</NAME>
          <PRICE>3.00</PRICE></BEER>
      </BAR>
      <BAR> …
    </BARS>
Each <NAME> … </NAME> pair is a NAME subobject; each <BEER> … </BEER> pair is a BEER subobject.

XML and Semistructured Data
Well-Formed XML with nested tags is exactly the same idea as trees of semistructured data.
We shall see that XML also enables nontree structures, as does the semistructured data model.

Example
The <BARS> XML document is:
[Figure: tree for the <BARS> document. The root BARS has BAR children (one drawn in full, the rest shown as ". . ."). The BAR drawn in full has a NAME child (Joe's Bar) and two BEER children, each with a NAME and a PRICE (Bud 2.50, Miller 3.00).]

DTD Structure
    <!DOCTYPE <root tag> [
      <!ELEMENT <name> (<components>)>
      . . . more elements . . .
    ]>

DTD Elements
The description of an element consists of its name (tag), and a parenthesized description of any nested tags.
  o Includes order of subtags and their multiplicity.
Leaves (text elements) have #PCDATA (Parsed Character DATA) in place of nested tags.

Example: DTD
    <!DOCTYPE BARS [
      <!ELEMENT BARS (BAR*)>
      <!ELEMENT BAR (NAME, BEER+)>
      <!ELEMENT NAME (#PCDATA)>
      <!ELEMENT BEER (NAME, PRICE)>
      <!ELEMENT PRICE (#PCDATA)>
    ]>
A BARS object has zero or more BAR's nested within.
A BAR has one NAME and one or more BEER subobjects.
A BEER has a NAME and a PRICE.
NAME and PRICE are text.

Element Descriptions
Subtags must appear in order shown.
A tag may be followed by a symbol to indicate its multiplicity.
  o * = zero or more.
  o + = one or more.
  o ? = zero or one.
Symbol | can connect alternative sequences of tags.

Example: Element Description
A name is an optional title (e.g., "Prof."), a first name, and a last name, in that order, or it is an IP address:
    <!ELEMENT NAME ((TITLE?, FIRST, LAST) | IPADDR)>

Use of DTD's
1. Set standalone = "no".
2. Either:
   a) Include the DTD as a preamble of the XML document, or
   b) Follow DOCTYPE and the <root tag> by SYSTEM and a path to the file where the DTD can be found.

Example (a)
    <?xml version = "1.0" standalone = "no" ?>
    <!DOCTYPE BARS [
      <!ELEMENT BARS (BAR*)>
      <!ELEMENT BAR (NAME, BEER+)>
      <!ELEMENT NAME (#PCDATA)>
      <!ELEMENT BEER (NAME, PRICE)>
      <!ELEMENT PRICE (#PCDATA)>
    ]>
    <BARS>
      <BAR><NAME>Joe's Bar</NAME>
        <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER>
        <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER>
      </BAR>
      <BAR> …
    </BARS>
The <!DOCTYPE … ]> block is the DTD; the <BARS> element that follows it is the document.

Example (b)
Assume the BARS DTD is in file bar.dtd.
    <?xml version = "1.0" standalone = "no" ?>
    <!DOCTYPE BARS SYSTEM "bar.dtd">
    <BARS>
      <BAR><NAME>Joe's Bar</NAME>
        <BEER><NAME>Bud</NAME>
          <PRICE>2.50</PRICE></BEER>
        <BEER><NAME>Miller</NAME>
          <PRICE>3.00</PRICE></BEER>
      </BAR>
      <BAR> …
    </BARS>
The DOCTYPE line says to get the DTD from the file bar.dtd.

Attributes
Opening tags in XML can have attributes.
In a DTD, <!ATTLIST E . . . > declares an attribute for element E, along with its datatype.

Example: Attributes
Bars can have an attribute kind, a character string describing the bar.
    <!ELEMENT BAR (NAME, BEER*)>
    <!ATTLIST BAR kind CDATA #IMPLIED>
CDATA is the character string type; no tags.
#IMPLIED means the attribute is optional; the opposite is #REQUIRED.

Example: Attribute Use
In a document that allows BAR tags, we might see:
    <BAR kind = "sushi">
      <NAME>Akasaka</NAME>
      <BEER><NAME>Sapporo</NAME>
        <PRICE>5.00</PRICE></BEER>
      ...
    </BAR>
Note attribute values are quoted.

ID's and IDREF's
Attributes can be pointers from one object to another.
  o Compare to HTML's NAME = "foo" and HREF = "#foo".
Allows the structure of an XML document to be a general graph, rather than just a tree.

Creating ID's
Give an element E an attribute A of type ID.
When using tag <E > in an XML document, give its attribute A a unique
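
The element descriptions in the BARS DTD can be checked mechanically, because a content model such as (NAME, BEER+) is in effect a regular expression over the sequence of child tags. The following is a minimal Python sketch of that idea and is not part of the original slides. The names CONTENT_MODELS, model_to_regex, and check are hypothetical, chosen only for illustration. It assumes the content models are typed in by hand rather than read from a DOCTYPE, and it covers only sequences, the | alternative, and the *, +, ? multiplicities; attributes, ID's, IDREF's, and text mixed with subtags are not handled.

```python
import re
import xml.etree.ElementTree as ET

# Content models copied by hand from the BARS DTD (an assumption of this
# sketch: they are not parsed out of a <!DOCTYPE ...> declaration).
CONTENT_MODELS = {
    "BARS": "(BAR*)",
    "BAR": "(NAME, BEER+)",
    "NAME": "(#PCDATA)",
    "BEER": "(NAME, PRICE)",
    "PRICE": "(#PCDATA)",
}


def model_to_regex(model):
    """Turn a DTD content model into a regex over 'TAG;' tokens."""
    if model.strip() == "(#PCDATA)":
        # Text-only leaf: the sequence of child tags must be empty.
        return re.compile("")
    # Each tag name becomes a group that matches the name plus ';'.
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group()) + ";)",
        model,
    )
    # A comma means "followed by", which is plain concatenation in a regex.
    # The symbols * + ? | and parentheses already mean the same thing.
    pattern = re.sub(r"[\s,]+", "", pattern)
    return re.compile(pattern)


def check(element, models=CONTENT_MODELS):
    """Return a list of problems found in element and its descendants."""
    problems = []
    model = models.get(element.tag)
    if model is None:
        problems.append(f"<{element.tag}> is not declared")
    else:
        # Encode the child tags in document order, e.g. 'NAME;BEER;BEER;'.
        children = "".join(child.tag + ";" for child in element)
        if not model_to_regex(model).fullmatch(children):
            problems.append(f"<{element.tag}> children do not match {model}")
    for child in element:
        problems.extend(check(child, models))
    return problems


DOCUMENT = """<BARS>
  <BAR>
    <NAME>Joe's Bar</NAME>
    <BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME><PRICE>3.00</PRICE></BEER>
  </BAR>
</BARS>"""

# An empty list means no problems were found.
print(check(ET.fromstring(DOCUMENT)))
```

Removing both BEER elements from the BAR in DOCUMENT, or placing them before NAME, should make check report that the children of BAR do not match (NAME, BEER+). For real documents a validating XML parser is the appropriate tool; this sketch only illustrates how order and multiplicity in an element description constrain the subtags.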

