SAX
Jan 13, 2019

Outline
- SAX, DOM, and XOM
- Difference between SAX and DOM
- Callbacks
- Simple SAX program
- The Sample class, I
- The Sample class, II
- The Handler class, I
- The Handler class, II
- Results
- More results
- Factories
- Parser factories
- Getting a parser
- Declaring which handler to use
- SAX handlers
- Class DefaultHandler
- ContentHandler methods, I
- ContentHandler methods, II
- ContentHandler methods, III
- ContentHandler methods, IV
- Attributes, I
- Attributes, II
- ContentHandler methods, V
- ContentHandler methods, VI
- Example
- Whitespace
- Handling ignorable whitespace
- Error Handling, I
- Error Handling, II
- Error Handling, III
- The End

SAX, DOM, and XOM
- SAX and DOM are standards for XML parsers: program APIs to read and interpret XML files
  - DOM is a W3C standard
  - SAX is an ad-hoc (but very popular) standard
  - SAX was developed by David Megginson and is open source
- There are various implementations available
- Java implementations are provided as part of JAXP (Java API for XML Processing)
  - JAXP is included as a package in Java 1.4 and Java 5
  - JAXP is available separately for Java 1.3
- XOM is a new parser by Elliotte Rusty Harold
- Unlike many XML technologies, XML parsers are relatively easy to use

Difference between SAX and DOM
- DOM reads the entire XML document into memory and stores it as a tree data structure
- SAX reads the XML document and calls one of your methods for each element or block of text that it encounters
- Consequences:
  - DOM provides "random access" into the XML document
  - SAX provides only sequential access to the XML document
  - DOM is slower and holds the whole document in memory, so it is impractical for very large XML documents
  - SAX is fast and requires very little memory, so it can be used for huge documents (or large numbers of documents)
  - This makes SAX much more popular for web sites
  - Some DOM implementations have methods for changing the XML document in memory; SAX implementations do not

Callbacks
- SAX works through callbacks: you call the parser, and it calls methods that you supply
- Your program:
    main(...)
    startDocument(...)
    startElement(...)
    characters(...)
    endElement( )
    endDocument( )
- The SAX parser:
    parse(...)

Simple SAX program
- The following program is adapted from CodeNotes® for XML by Gregory Brill, pages 158-159
- The program consists of two classes:
  - Sample: this class contains the main method; it
    - Gets a factory to make parsers
    - Gets a parser from the factory
    - Creates a Handler object to handle callbacks from the parser
    - Tells the parser which handler to send its callbacks to
    - Reads and parses the input XML file
  - Handler: this class contains handlers for three kinds of callbacks:
    - startElement callbacks, generated when a start tag is seen
    - endElement callbacks, generated when an end tag is seen
    - characters callbacks, generated for the contents of an element

The Sample class, I
    import javax.xml.parsers.*; // for both SAX and DOM
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;

    // For simplicity, we let exceptions propagate out of main
    // In "real life" this is poor programming practice
    public class Sample {
        public static void main(String args[]) throws Exception {
            // Create a parser factory
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Tell factory that the parser must understand namespaces
            factory.setNamespaceAware(true);
            // Make the parser
            SAXParser saxParser = factory.newSAXParser();
            XMLReader parser = saxParser.getXMLReader();

The Sample class, II
- In the previous slide we made a parser, of type XMLReader
            // Create a handler
            Handler handler = new Handler();
            // Tell the parser to use this handler
            parser.setContentHandler(handler);
            // Finally, read and parse the document
            parser.parse("hello.xml");
        }
    } // end of Sample class
- You will need to put the file hello.xml:
  - In the same directory, if you run the program from the command line
  - Or where it can be found by the particular IDE you are using

The Handler class, I
    import org.xml.sax.*;          // Attributes, SAXException
    import org.xml.sax.helpers.*;  // DefaultHandler

    public class Handler extends DefaultHandler {
- DefaultHandler is an adapter class that defines these methods and others as do-nothing methods, to be overridden as desired
- We will
  define three very similar methods to handle (1) start tags, (2) contents, and (3) end tags; our methods will just print a line
- Each of these three methods could throw a SAXException
        // SAX calls this method when it encounters a start tag
        public void startElement(String namespaceURI,
                                 String localName,
                                 String qualifiedName,
                                 Attributes attributes)
                throws SAXException {
            System.out.println("startElement: " + qualifiedName);
        }

The Handler class, II
        // SAX calls this method to pass in character data
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            System.out.println("characters: \"" +
                               new String(ch, start, length) + "\"");
        }

        // SAX calls this method when it encounters an end tag
        public void endElement(String namespaceURI,
                               String localName,
                               String qualifiedName)
                throws SAXException {
            System.out.println("Element: /" + qualifiedName);
        }
    } // End of Handler class

Results
- If the file hello.xml contains:
    <?xml version="1.0"?>
    <display>Hello World!</display>
- Then the output from running java Sample will be:
    startElement: display
    characters: "Hello World!"
    Element: /display

More results
- Now suppose the file hello.xml contains:
    <?xml version="1.0"?>
    <display>
        <i>Hello</i> World!
    </display>
- Notice that the root element, <display>, now contains a nested element <i> and some whitespace (including newlines)
- The result will be:
    startElement: display
    characters: ""          // empty string
    characters: "
    "                       // newline
    characters: " "         // spaces
    startElement: i
    characters: "Hello"
    Element: /i
    characters: "World!"
    characters: "
    "                       // another newline
    Element: /display

Factories
- SAX uses a parser factory
- A factory is an alternative to constructors
- Factories allow the programmer to:
  - Decide whether or not to create a new object
  - Decide what kind (subclass, implementation) of object to create
- Trivial example:
    class TrustMe {
        private TrustMe() { } // private constructor
        public static TrustMe makeTrust() { // factory method
            if ( /* test of some sort */ )
                return new TrustMe();
            return null; // otherwise, decline to create an object
        }
    }

Parser factories
To create a SAX parser factory,
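
As a recap of the Sample and Handler slides, the callback flow can be tried without a separate hello.xml file. The following is only a minimal sketch, not the program from the slides. It assumes the XML is supplied as an in-memory string, and the names SaxStringDemo, RecordingHandler, and trace are hypothetical, introduced here for illustration. It uses the same factory call as the Sample class, but it collects the callback lines in a list instead of printing them directly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class (not part of the original slides).
class SaxStringDemo {

    // Hypothetical handler: records one line of text per callback,
    // using the same labels as the Handler class in the slides.
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String namespaceURI, String localName,
                                 String qualifiedName, Attributes attributes) {
            events.add("startElement: " + qualifiedName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.add("characters: \"" + new String(ch, start, length) + "\"");
        }

        @Override
        public void endElement(String namespaceURI, String localName,
                               String qualifiedName) {
            events.add("Element: /" + qualifiedName);
        }
    }

    // Parses the given XML text and returns the recorded callback lines.
    // Checked exceptions are wrapped so that callers stay simple.
    static List<String> trace(String xml) {
        try {
            // Same factory steps as in the Sample class
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            SAXParser saxParser = factory.newSAXParser();

            RecordingHandler handler = new RecordingHandler();
            // Read from a string instead of a file (an assumption of this sketch)
            saxParser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        for (String event : trace("<display><i>Hello</i> World!</display>")) {
            System.out.println(event);
        }
    }
}
```

A SAX parser is allowed to deliver the text of one element in more than one characters call, so code that needs the whole text should accumulate the pieces rather than rely on a single callback.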