Penn CIT 597 - DOM



DOM

SAX and DOM

- SAX and DOM are standards for XML parsers: program APIs to read and interpret XML files
  - DOM is a W3C standard
  - SAX is an ad hoc, but very popular, standard
- There are various implementations available
- Java implementations are provided in JAXP (Java API for XML Processing)
  - JAXP is included as a package in Java 1.4
  - JAXP is available separately for Java 1.3
- Unlike many XML technologies, SAX and DOM are relatively easy

Difference between SAX and DOM

- DOM reads the entire XML document into memory and stores it as a tree data structure
- SAX reads the XML document and sends an event for each element that it encounters
- Consequences:
  - DOM provides "random access" into the XML document
  - SAX provides only sequential access to the XML document
  - DOM is comparatively slow and requires large amounts of memory, so it is often impractical for very large XML documents
  - SAX is fast and requires very little memory, so it can be used for huge documents (or large numbers of documents)
    - This makes SAX much more popular for web sites
  - Some DOM implementations have methods for changing the XML document in memory; SAX implementations do not

Simple DOM program, I

This program is adapted from CodeNotes for XML by Gregory Brill, page 128.

    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    public class SecondDom {
        public static void main(String args[]) {
            try {
                // ...Main part of program goes here...
            } catch (Exception e) {
                e.printStackTrace(System.out);
            }
        }
    }

Simple DOM program, II

- First we need to create a DOM parser, called a "DocumentBuilder"
- The parser is created, not by a constructor, but by calling a static factory method
  - This is a common technique in advanced Java programming
  - The use of a factory method makes it easier if you later switch to a different parser

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();

Simple DOM program, III

The next step is to load in the XML file. Here is the XML file, named hello.xml:

    <?xml version="1.0"?>
    <display>Hello World</display>

To read this file in, we add the following line to
our program:

    Document document = builder.parse("hello.xml");

Notes:
- document contains the entire XML file (as a tree); it is the Document Object Model
- If you run this from the command line, your XML file should be in the same directory as your program
- An IDE may look in a different directory for your file; if you get a java.io.FileNotFoundException, this is probably why

Simple DOM program, IV

The following code finds the content of the root element and prints it:

    Element root = document.getDocumentElement();
    Node textNode = root.getFirstChild();
    System.out.println(textNode.getNodeValue());

This code should be mostly self-explanatory; we'll get into the details shortly. The output of the program is:

    Hello World

Reading in the tree

- The parse method reads in the entire XML document and represents it as a tree in memory
  - For a large document, parsing could take a while
  - If you want to interact with your program while it is parsing, you need to parse in a separate thread
    - Once parsing starts, you cannot interrupt or stop it
    - Do not try to access the parse tree until parsing is done
- An XML parse tree may require up to ten times as much memory as the original XML document
  - If you have a lot of tree manipulation to do, DOM is much more convenient than SAX
  - If you don't have a lot of tree manipulation to do, consider using SAX instead

Structure of the DOM tree

- The DOM tree is composed of Node objects
- Node is an interface
  - Some of the more important subinterfaces are Element, Attr, and Text
    - An Element node may have children
    - Attr and Text nodes are leaves
  - Additional types are Document, ProcessingInstruction, Comment, Entity, CDATASection, and several others
- Hence, the DOM tree is composed entirely of Node objects, but the Node objects can be downcast into more specific types as needed

Operations on Nodes, I

The results returned by getNodeName(), getNodeValue(), getNodeType(), and getAttributes() depend on the subtype of the node, as follows:

              getNodeName()       getNodeValue()       getNodeType()    getAttributes()
    Element   tag name            null                 ELEMENT_NODE     NamedNodeMap
    Text      "#text"             text contents        TEXT_NODE        null
    Attr      name of attribute   value of attribute   ATTRIBUTE_NODE   null

Distinguishing Node types

Here's an easy way to tell what kind of a node you are dealing with:

    switch (node.getNodeType()) {
        case Node.ELEMENT_NODE:
            Element element = (Element) node;
            // ...
            break;
        case Node.TEXT_NODE:
            Text text = (Text) node;
            // ...
            break;
        case Node.ATTRIBUTE_NODE:
            Attr attr = (Attr) node;
            // ...
            break;
        default:
            // ...
    }

Operations on Nodes, II

- Tree-walking operations that return a Node:
  - getParentNode()
  - getFirstChild()
  - getNextSibling()
  - getPreviousSibling()
  - getLastChild()
- Tests that return a boolean:
  - hasAttributes()
  - hasChildNodes()

Operations for Elements

- String getTagName()
  Returns the name of this Element's tag
- boolean hasAttribute(String name)
  Returns true if this Element has the named attribute
- String getAttribute(String name)
  Returns the (String) value of the named attribute
- boolean hasAttributes()
  Returns true if this Element has any attributes
  This method is actually inherited from Node; it returns false if it is applied to a Node that isn't an Element
- NamedNodeMap getAttributes()
  Returns a NamedNodeMap of all the Element's attributes
  This method is actually inherited from Node; it returns null if it is applied to a Node that isn't an Element

NamedNodeMap

- The node.getAttributes() operation returns a NamedNodeMap
  - Because NamedNodeMaps are used for other kinds of nodes elsewhere in the DOM API, the contents are treated as general Nodes, not specifically as Attrs
- Some operations on a NamedNodeMap are:
  - getNamedItem(String name) returns (as a Node) the attribute with the given name
  - getLength() returns (as an int) the number of Nodes in this NamedNodeMap
  - item(int index) returns (as a Node) the index-th item
    - This operation lets you conveniently step through all the nodes in the NamedNodeMap
    - Java does not guarantee the order in which nodes are returned

Operations on Texts

Text is a subinterface of CharacterData and inherits the following operations (among others):

- public String getData() throws DOMException
  Returns the text contents of this Text node
- public int getLength()
  Returns the number of Unicode characters in the text
- public String substringData(int offset, int count) throws DOMException
  Returns a substring of the text contents

Operations on Attrs

- String getName()
  Returns the name of this attribute
- Element getOwnerElement()
  Returns the Element node this attribute is attached to, or null if this attribute is not in use
- boolean getSpecified()
  Returns true if this attribute was explicitly given a value in the original document
- String getValue()
  Returns the
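
The pieces covered in these slides (the factory-created parser, getDocumentElement, the node-type switch, tree walking with getFirstChild and getNextSibling, and stepping through a NamedNodeMap) can be combined into one small program. What follows is only a sketch under stated assumptions: the class name DomWalkSketch and the helpers parse, rootText, and outline are hypothetical names chosen for illustration, and the XML is parsed from an in-memory string instead of hello.xml so that the example runs without any file on disk.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical class name; not part of the original slides.
public class DomWalkSketch {

    // Build a parser through the static factory method, then parse an XML string.
    // Assumption: a string is used here in place of the slides' hello.xml file.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return builder.parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Same steps as "Simple DOM program, IV": root element, first child, node value.
    static String rootText(String xml) {
        Element root = parse(xml).getDocumentElement();
        Node textNode = root.getFirstChild();
        return textNode.getNodeValue();
    }

    // Return an indented outline of the Element and Text nodes under the root.
    static String outline(String xml) {
        StringBuilder out = new StringBuilder();
        outline(parse(xml).getDocumentElement(), 0, out);
        return out.toString();
    }

    private static void outline(Node node, int depth, StringBuilder out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                Element element = (Element) node;
                out.append(indent).append("Element ").append(element.getTagName());
                // Attribute order is not guaranteed when there are several attributes.
                NamedNodeMap attrs = element.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Attr attr = (Attr) attrs.item(i);
                    out.append(" ").append(attr.getName()).append("=").append(attr.getValue());
                }
                out.append("\n");
                break;
            case Node.TEXT_NODE:
                Text text = (Text) node;
                out.append(indent).append("Text ").append(text.getData()).append("\n");
                break;
            default:
                break; // other node types are skipped in this sketch
        }
        // Tree walking: visit each child in document order.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            outline(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootText("<?xml version=\"1.0\"?><display>Hello World</display>"));
        System.out.print(outline("<a x=\"1\"><b>hi</b></a>"));
    }
}
```

Note that the sample strings contain no whitespace between tags. In an indented XML file, the line breaks and spaces between elements typically show up as extra Text nodes in the tree, so getFirstChild on the root may return whitespace and not the node you expect.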

