Duke CPS 116 - SAX & DOM



SAX & DOM
CPS 116
Introduction to Database Systems


Announcements (October 27)

- Homework #3 due next Tuesday
- Project milestone #2 due Nov. 10


SAX & DOM

- Both are APIs for XML processing
- SAX (Simple API for XML)
  - Started out as a Java API, but now exists for other languages too
- DOM (Document Object Model)
  - Language-neutral API with implementations in Java, C++, etc.
- JAXP (Java API for XML Processing)
  - Bundled with standard JDK
  - Includes SAX, DOM parsers and XSLT transformers


SAX processing model

- Serial access
  - XML document is processed as a stream
  - Only one look at the data
  - Cannot go back to an earlier portion of the document
- Event-driven
  - A parser generates events as it goes through the document (e.g., start of the document, end of an element, etc.)
  - Application defines event handlers that get invoked when events are generated


SAX events

Most frequently used events:

- startDocument
- endDocument
- startElement
- endElement
- characters
  - Generated whenever the parser has processed a chunk of character data (without generating other kinds of events)
  - Warning: The parser may generate multiple characters events for one piece of text

Example document:

    <?xml version="1.0"?>
    <bibliography>
      <book ISBN="ISBN-10" price="80.00">
        <title>Foundations of Databases</title>
        …
      </book>
      …
    </bibliography>

Events generated for the parts shown, in document order:

    startDocument
    startElement    (bibliography)
    startElement    (book)
    startElement    (title)
    characters      (Foundations of Databases)
    endElement      (title)
    endElement      (book)
    endElement      (bibliography)
    endDocument

Whitespace may come up as characters or ignorableWhitespace, depending on whether a DTD is present.


A simple SAX example

- Print out text contents of title elements

    import java.io.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;
    import javax.xml.parsers.*;

    public class SaxExample extends DefaultHandler {

        public static void main(String[] argv) throws Exception {
            String fileName = argv[0];
            // Create a SAX parser:
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            // Parse the document with this event handler:
            DefaultHandler handler = new SaxExample();
            saxParser.parse(new File(fileName), handler);
            return;
        }
        … …


A simple SAX example (cont'd)

        // Non-null only while we are inside a title element.
        private StringBuffer titleStringBuffer = null;

        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            if (qName.equals("title"))
                titleStringBuffer = new StringBuffer();
        }

        public void endElement(String uri, String localName,
                               String qName) {
            if (qName.equals("title")) {
                System.out.println(titleStringBuffer.toString());
                titleStringBuffer = null;
            }
        }

        public void characters(char[] ch, int start, int length) {
            if (titleStringBuffer != null)
                titleStringBuffer.append(ch, start, length);
        }

Notes:

- uri and localName are only relevant when namespaces are involved
- Assuming no namespace processing, qName is the tag name
- Warning: This code does not handle data with the //title[//title] pattern (a title nested inside another title)


A common mistake

What is wrong with the following?

        private String titleString = null;

        public void endElement(String uri, String localName,
                               String qName) {
            // Print the last chunk of characters seen before </title>:
            if (qName.equals("title"))
                System.out.println(titleString);
        }

        public void characters(char[] ch, int start, int length) {
            titleString = new String(ch, start, length);
        }


A more complex SAX example

- Print out the text contents of top-level section titles in books, i.e., //book/section/title
  - Old code would print out all titles, e.g., //book/title, //book//section/title
  - For simplicity, assume that if we have the pattern //book/section/title//book/section/title, we print the higher-level title element
- Idea: maintain as state the path from the root

        // Requires: import java.util.ArrayList;
        private ArrayList<String> path = new ArrayList<String>();
        private int pathLengthWhenOutputIsActivated;


A more complex SAX example (cont'd)

        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            path.add(qName); // Maintain the path.
            if (path.size() >= 3 &&
                path.get(path.size()-1).equals("title") &&
                path.get(path.size()-2).equals("section") &&
                path.get(path.size()-3).equals("book")) {
                // path matches //book/section/title:
                if (titleStringBuffer == null) {
                    pathLengthWhenOutputIsActivated = path.size();
                    titleStringBuffer = new StringBuffer();
                }
            }
        }


A more complex SAX example (cont'd)

        public void endElement(String uri, String localName,
                               String qName) {
            // The path-length check prevents premature output in case
            // the title has subelements.
            if (titleStringBuffer != null &&
                path.size() == pathLengthWhenOutputIsActivated) {
                // Closing the element that activated output buffering:
                System.out.println(titleStringBuffer.toString());
                titleStringBuffer = null;
            }
            path.remove(path.size()-1); // Maintain the path.
        }

        public void characters(char[] ch, int start, int length) {
            if (titleStringBuffer != null)
                titleStringBuffer.append(ch, start, length);
        }

Would it work if we change the path-length check to qName.equals("title")?


DOM processing model

- XML is parsed by a parser and converted into an in-memory DOM tree
- DOM API allows an application to
  - Construct a DOM tree from an XML document
  - Traverse and read a DOM tree
  - Construct a new, empty DOM tree from scratch
  - Modify an existing DOM tree
  - Copy subtrees from one DOM tree to another
  - etc.


DOM Nodes

- A DOM tree is made up of Nodes
- Most frequently used types of Nodes:
  - Document: root of the DOM tree
    - Not the same as the root element of the XML document
  - DocumentType: corresponds to the DOCTYPE declaration in an XML document
  - Element: corresponds to an XML element
  - Attr: corresponds to an attribute of an XML element
  - Text: corresponds to a chunk of text


DOM example

    <?xml version="1.0"?>
    <!DOCTYPE …>
    <bibliography>
      <book ISBN="ISBN-10" price="80.00">
        <title>Foundations of Databases</title>
        <author>Abiteboul</author>
        <author>Hull</author>
        <author>Vianu</author>
        …
      </book>
      <book ISBN="ISBN-20" price="40.00">…</book>
      …
    </bibliography>

[Figure: DOM tree for the document above, made up of Document, DocumentType, Element, Attr, and Text nodes]

Whitespace between tags is also parsed as Text.


Node interface

- n.getNodeType() returns the type of Node n
- n.getChildNodes() returns a NodeList containing Node n's children
  - For example, subelements are children of an Element; DocumentType is a child of the Document
- d.getDocumentElement() returns the root Element of Document d
- e.getNodeName() returns the tag name of Element e
- e.getAttributes() returns a NamedNodeMap (hash table) containing the attributes of Element e
  - Attributes are not considered children!
- a.getNodeName()
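
The Node interface calls listed above can be tried out with a small sketch like the following. It is not from the slides: the class name DomSketch, its helper methods, and the inline document are hypothetical, and it assumes the JAXP DOM parser bundled with the JDK. It also uses getNodeValue(), a standard Node method, to read attribute values.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name; an illustration, not part of the original slides.
class DomSketch {

    // Small inline document modeled on the bibliography example.
    // It has no whitespace between tags, so no whitespace-only Text nodes.
    static final String XML =
        "<bibliography>"
        + "<book ISBN=\"ISBN-10\" price=\"80.00\">"
        + "<title>Foundations of Databases</title>"
        + "<author>Abiteboul</author>"
        + "</book>"
        + "</bibliography>";

    // Parse an XML string into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Collect the tag names of n's Element children, skipping Text
    // and other node types.
    static List<String> childElementNames(Node n) {
        List<String> names = new ArrayList<>();
        NodeList children = n.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE)
                names.add(child.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document d = parse(XML);
        Element root = d.getDocumentElement();
        System.out.println("root element: " + root.getNodeName());

        // The first child of the root is the book Element in this document.
        Node book = root.getChildNodes().item(0);
        System.out.println("child elements of book: " + childElementNames(book));

        // Attributes are reached through getAttributes(), not getChildNodes().
        NamedNodeMap attrs = book.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            System.out.println(a.getNodeName() + " = " + a.getNodeValue());
        }
    }
}
```

Running it with a document that does contain whitespace between tags would show extra Text children in getChildNodes(), which is why childElementNames filters on getNodeType().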

