RELAX NG
Jan 13, 2019

Caveat
- I did not have a RELAX NG validator when I wrote these slides
- Therefore, if an example appears to be wrong, it probably is

What is RELAX NG?
- RELAX NG is a schema language for XML
  - It is an alternative to DTDs and XML Schemas
  - It is based on earlier schema languages, RELAX and TREX
- It is not a W3C standard, but is an OASIS standard
  - OASIS is the Organization for the Advancement of Structured Information Standards
  - ebXML (Electronic Business using XML) is a joint effort of OASIS and UN/CEFACT (United Nations Centre for Trade Facilitation and Electronic Business)
  - OASIS developed the highly popular DocBook DTD for describing books, articles, and technical documents
- RELAX NG has also been adopted as an ISO/IEC standard (ISO/IEC 19757-2)

Design goals
- Simple and easy to learn
- Uses XML syntax
  - But there is also a compact (non-XML) syntax
- Does not change the information set of an XML document
  - (I'm not sure what this means)
- Supports XML namespaces
- Treats attributes uniformly with elements, so far as possible
- Has unrestricted support for unordered content
- Has unrestricted support for mixed content
- Has a solid theoretical basis
- Can make use of a separate datatyping language (such as W3C XML Schema Datatypes)

RELAX NG tools
- Jing
  - An open source validator written in Java
- Trang
  - Translates RNG compact syntax into RNG syntax
  - Translates RNG or RNG compact syntax into DTDs
- Sun's MSV
  - Another validator
- DTDinst
  - Translates from DTDs into RNG (RELAX NG) syntax or RNG compact syntax
- Sun's RELAX NG Converter
  - Translates DTDs into RNG syntax, but not well
  - Translates an XML Schema subset into RNG syntax, imperfectly

Basic structure
- A RELAX NG specification is written in XML, so it obeys all XML rules
  - The RELAX NG specification has one root element
  - The document it describes also has one root element
- The root element of the specification is <element>
- If the root element of your document is <book>, then the RELAX NG specification begins:
    <element name="book" xmlns="http://relaxng.org/ns/structure/1.0">
  and ends:
    </element>
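
To make the "begins ... and ends" description concrete, here is a minimal sketch of a complete specification for a document whose root element is book. The <text/> content is an assumption made purely for illustration (the slide does not say what a book contains); only the root element and its namespace come from the slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema: a book element that holds only character data -->
<element name="book" xmlns="http://relaxng.org/ns/structure/1.0">
  <text/>
</element>
```

Under that assumption, a document such as <book>Some title</book> should match this pattern, while a document with any other root element should not.
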
Data elements
- RELAX NG makes a clear separation between:
  - the structure of a document (which it describes)
  - the datatypes used in the document (which it gets from somewhere else, such as from XML Schemas)
- For starters, we will use the two elements that RELAX NG itself defines:
  - <text></text> (usually written <text/>)
    - Plain character data, not containing other elements
  - <empty></empty> (usually written <empty/>)
    - Does not contain anything
- Other datatypes, such as <double>...</double>, are not defined in RELAX NG
  - To inherit datatypes from XML Schemas, use:
      datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
    as an attribute of the root element

Defining tags
- To define a tag (and specify its content), use:
    <element name="myElement">
      <!-- Content goes here -->
    </element>
- Example: The DTD
    <!ELEMENT name (firstName, lastName)>
    <!ELEMENT firstName (#PCDATA)>
    <!ELEMENT lastName (#PCDATA)>
- Translates to:
    <element name="name">
      <element name="firstName"> <text/> </element>
      <element name="lastName"> <text/> </element>
    </element>
- Note: As in the DTD, the components must occur in order

RELAX NG describes patterns
- Your RELAX NG document specifies a pattern that matches your valid XML documents
- For example, the pattern:
    <element name="name">
      <element name="firstName"> <text/> </element>
      <element name="lastName"> <text/> </element>
    </element>
- Will match the XML:
    <name>
      <firstName>David</firstName>
      <lastName>Matuszek</lastName>
    </name>

Easy tags
- <zeroOrMore> ... </zeroOrMore>
  - The enclosed content occurs zero or more times
- <oneOrMore> ... </oneOrMore>
  - The enclosed content occurs one or more times
- <optional> ... </optional>
  - The enclosed content occurs once or not at all
- <choice> ... </choice>
  - Any one of the enclosed elements may occur
- <!-- ... -->
  - An XML comment; not a container, and may not contain two consecutive hyphens

Example
    <element name="addressList">
      <zeroOrMore>
        <element name="name">
          <element name="firstName"> <text/> </element>
          <element name="lastName"> <text/> </element>
        </element>
        <element name="address">
          <choice>
            <element name="email"> <text/> </element>
            <element name="USPost"> <text/> </element>
          </choice>
        </element>
      </zeroOrMore>
    </element>

Enumerations
- The <value>...</value> pattern matches a specified value
- Example:
    <element name="gender">
      <choice>
        <value>male</value>
        <value>female</value>
      </choice>
    </element>
- The contents of <value> are subject to whitespace normalization:
  - Leading and trailing whitespace is removed
  - Internal sequences of whitespace characters are collapsed to a single blank

More about data
- Remember: To inherit datatypes from XML Schemas, add this attribute to the root element:
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
- You can access the inherited types with the <data> tag, for instance, <data type="double"/>
- The <data> pattern must match the entire content of the enclosing tag, not just part of it
    <element name="illegalUse">  <!-- Don't do this! -->
      <data type="double"/>
      <element name="moreStuff"> <text/> </element>
    </element>
- If you don't specify a datatype library, RELAX NG defines the following for you (along with <text/> and <empty/>):
  - string: No whitespace normalization is done
  - token: Whitespace is normalized before comparison (leading and trailing whitespace removed, internal runs collapsed to a single blank)

<group>
- <group> ... </group> is used as "fat parentheses"
- Example:
    <choice>
      <element name="name"> <text/> </element>         <!-- choice 1 -->
      <group>                                          <!-- choice 2 -->
        <element name="firstName"> <text/> </element>
        <element name="lastName"> <text/> </element>
      </group>
    </choice>

Attributes
- Attributes are defined practically the same way as elements:
    <attribute name="attributeName">...</attribute>
- Example:
    <element name="name">
      <attribute name="title"> <text/> </attribute>
      <element name="firstName"> <text/> </element>
      <element name="lastName"> <text/> </element>
    </element>
- Matches:
    <name title="Dr.">
      <firstName>David</firstName>
      <lastName>Matuszek</lastName>
    </name>

More about attributes
- With attributes, as with elements, you can use <optional>, <choice>, and <group>
- It doesn't make sense to use <oneOrMore> or <zeroOrMore> with attributes
- In keeping with the usual XML rules:
  - The order in which you list elements is significant
  - The order in which you list attributes is not significant

Still more about attributes
- <attribute name="attributeName"> <text/> </attribute>
  can be (and usually is) abbreviated as
  <attribute name="attributeName"/>
- However, <element name="elementName"> <text/> </element>
  cannot be abbreviated as
  <element name="elementName"/>
- If an element has no attributes and no content, you must use <empty/> explicitly

<list>
- <list> pattern </list> matches a whitespace-separated list of tokens, and applies the pattern to those tokens
- Example: A floating point number and some integers
    <element name="vector">
      <list>
        <data type="float"/>
        <oneOrMore>
          <data type="int"/>
        </oneOrMore>
      </list>
    </element>

<interleave>
- <interleave> ... </interleave> allows the contained elements to occur in any order
- <interleave> is more sophisticated than you might expect
  - If a contained element can occur more than once, the various instances do not need to