Contents: XML; HTML and XML, I; HTML and XML, II; HTML and XML, III; XML-related technologies; Example XML document; Overall structure; XML building blocks; Elements and attributes; Well-formed XML; Entities; XML declaration; Processing instructions; Comments; CDATA; Names in XML; Namespaces; Namespaces and URIs; Namespace syntax; Review of XML rules; Another well-structured example; XML as a tree; Valid XML; Mixed content; Example XML document, revised; Viewing XML; Extended document standards; Vocabulary; The End

Jan 14, 2019

XML
eXtensible Markup Language

HTML and XML, I
- XML stands for eXtensible Markup Language
- HTML is used to mark up text so it can be displayed to users
- XML is used to mark up data so it can be processed by computers
- HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>)
- XML describes only content, or “meaning”
- HTML uses a fixed, unchangeable set of tags
- In XML, you make up your own tags

HTML and XML, II
- HTML and XML look similar, because they are both SGML languages (SGML = Standard Generalized Markup Language)
  - Both HTML and XML use elements enclosed in tags (e.g. <body>This is an element</body>)
  - Both use tag attributes (e.g. <font face="Verdana" size="+1" color="red">)
  - Both use entities (&lt;, &gt;, &amp;, &quot;, &apos;)
- More precisely,
  - HTML is defined in SGML
  - XML is a (very small) subset of SGML

HTML and XML, III
- HTML is for humans
  - HTML describes web pages
  - You don’t want to see error messages about the web pages you visit
  - Browsers ignore and/or correct as many HTML errors as they can, so HTML is often sloppy
- XML is for computers
  - XML describes data
  - The rules are strict and errors are not allowed
    - In this way, XML is like a programming language
  - Current versions of most browsers can display XML
    - However, browser support of XML is spotty at best

XML-related technologies
- DTD (Document Type Definition) and XML Schemas are used to define legal XML tags and their attributes for particular purposes
- CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser
- XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another
- DOM (Document Object Model), SAX (Simple API for XML), and JAXP (Java API for XML Processing) are all APIs for XML parsing

Example XML document

    <?xml version="1.0"?>
    <weatherReport>
      <date>7/14/97</date>
      <city>North Place</city>, <state>NX</state>
      <country>USA</country>
      High Temp: <high scale="F">103</high>
      Low Temp: <low scale="F">70</low>
      Morning: <morning>Partly cloudy, Hazy</morning>
      Afternoon: <afternoon>Sunny &amp; hot</afternoon>
      Evening: <evening>Clear and Cooler</evening>
    </weatherReport>

From: XML: A Primer, by Simon St. Laurent

Overall structure
- An XML document may start with one or more processing instructions (PIs) or directives:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/css" href="ss.css"?>

- Following the directives, there must be exactly one element, called the root element, containing all the rest of the XML:

    <weatherReport>
      ...
    </weatherReport>

XML building blocks
- Aside from the directives, an XML document is built from:
  - elements: high in <high scale="F">103</high>
  - tags, in pairs: <high scale="F"> ... </high>
  - attributes: scale="F" in <high scale="F">103</high>
  - entities: &amp; in <afternoon>Sunny &amp; hot</afternoon>
  - character data, which may be:
    - parsed (processed as XML); this is the default
    - unparsed (all characters stand for themselves)

Elements and attributes
- Attributes and elements are somewhat interchangeable
- Example using just elements:

    <name>
      <first>David</first>
      <last>Matuszek</last>
    </name>

- Example using attributes:

    <name first="David" last="Matuszek"></name>

- You will find that elements are easier to use in your programs; this is a good reason to prefer them
- Attributes often contain metadata, such as unique IDs
- Generally speaking, browsers display only elements (values enclosed by tags), not tags and attributes

Well-formed XML
- Every element must have both a start tag and an end tag, e.g. <name> ... </name>
  - But empty elements can be abbreviated: <break />
- XML tags are case sensitive
- XML tags may not begin with the letters xml, in any combination of cases
- Elements must be properly nested, e.g. not <b><i>bold and italic</b></i>
- Every XML document must have one and only one root element
- The values of attributes must be enclosed in single or double quotes, e.g. <time unit="days">
- Character data cannot contain < or &

Entities
- Five special characters must be written as entities:
    &amp; for & (almost always necessary)
    &lt; for < (almost always necessary)
    &gt; for > (not usually necessary)
    &quot; for " (necessary inside double quotes)
    &apos; for ' (necessary inside single quotes)
- These entities can be used even in places where they are not absolutely required
- These are the only predefined entities in XML

XML declaration
- The XML declaration looks like this:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

- The XML declaration is optional in XML 1.0 and browsers do not require it, but it is good practice to include it (so include it!)
- If present, the XML declaration must come first; not even whitespace may precede it
- Note that the brackets are <? and ?>
- version is required in the declaration; "1.0" is by far the most common value (XML 1.1 exists but is rarely used)
- encoding can be "UTF-8" or "UTF-16" (both are Unicode encodings; UTF-8 is a superset of ASCII), or something else, or it can be omitted
- standalone tells whether the document relies on a separate (external) DTD

Processing instructions
- PIs (Processing Instructions) may occur anywhere in the XML document (but usually first)
- A PI is a command to the program processing the XML document to handle it in a certain way
- XML documents are typically processed by more than one program
- Programs that do not recognize a given PI should just ignore it
- General format of a PI: <?target instructions?>
- Example: <?xml-stylesheet type="text/css" href="mySheet.css"?>

Comments
- <!-- This is a comment in both HTML and XML -->
- Comments can be put almost anywhere in an XML document (but not inside a tag or before the XML declaration)
- Comments are useful for:
  - Explaining the structure of an XML document
  - Commenting out parts of the XML during development and testing
- Comments are not elements and do not have an end tag
- The blanks after <!-- and before --> are optional
- The character sequence -- cannot occur in the comment
- The closing bracket must be -->
- Comments are not displayed by browsers, but can be seen by anyone who looks at the source code

CDATA
- By default, all text inside an XML document is
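
The well-formedness rules and predefined entities covered earlier in these slides can be tried out with any XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` and `xml.sax.saxutils` modules (the slides themselves do not use Python, and the helper name `is_well_formed` is hypothetical, introduced only for illustration):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(text):
    """Return True if `text` parses as a well-formed XML document.

    Hypothetical helper: it only checks well-formedness, not validity
    against a DTD or schema.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One root element, quoted attribute values, matching start and end tags.
good = '<name first="David" last="Matuszek"></name>'

# Improperly nested elements, as in the slide's counterexample.
bad_nesting = "<b><i>bold and italic</b></i>"

# A bare & in character data is not allowed...
bad_entity = "<afternoon>Sunny & hot</afternoon>"
# ...so it is written as the predefined entity &amp; instead.
good_entity = "<afternoon>" + escape("Sunny & hot") + "</afternoon>"

# Tags are case sensitive, so </name> does not close <Name>.
bad_case = "<Name>David</name>"

if __name__ == "__main__":
    samples = [
        ("good", good),
        ("bad_nesting", bad_nesting),
        ("bad_entity", bad_entity),
        ("good_entity", good_entity),
        ("bad_case", bad_case),
    ]
    for label, doc in samples:
        print(label, is_well_formed(doc))
```

Unlike a browser handling sloppy HTML, the parser in this sketch rejects each malformed sample outright instead of trying to repair it, which is the strictness the slides describe.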