XML and Java: Developing Web Applications, Second Edition

4.2 DOM Basics

We touched briefly on DOM in Chapter 2 and on generating a DOM tree from scratch in Chapter 3. This section gives you more tips on DOM programming.

DOM provides a set of methods to access DOM trees. A node in a DOM tree implements one of the following interfaces: Document, ProcessingInstruction, Comment, DocumentType, Notation, Entity, Element, Text, CDATASection, or EntityReference. All these interfaces are derived from the Node interface (see Figure 2.3 in Chapter 2). Therefore, the basic structural methods, such as those for accessing and updating parent and child nodes, are defined in the Node interface. In this section, we show how the methods defined in Node are used, and we look at some specific interfaces.

4.2.1 Accessing and Updating the Status of a Node

First, we cover the four methods for obtaining and updating information about a node, such as the node type and the node name. In the following explanation, assume that node is a variable pointing to an object that implements the Node interface or an interface derived from it.

  • node.getNodeType(): Obtains the node type.

    This method returns an integer that represents the type of this node. For example, if the node is an Element, an integer denoted by Node.ELEMENT_NODE is returned.

    If you want to check whether a node is a Text node, be sure to check not only Node.TEXT_NODE but also Node.CDATA_SECTION_NODE. In many cases, CDATASection nodes should be treated as Text nodes. So you should write something like this:

    if (node.getNodeType() == Node.TEXT_NODE
        || node.getNodeType() == Node.CDATA_SECTION_NODE) {
        :
    }
    

    Alternatively, you can use Java's runtime type identification (RTTI) mechanism, the instanceof operator. The following is equivalent to the previous fragment because CDATASection is derived from Text.

    if (node instanceof Text) {
        :
    }
    
  • node.getNodeName(): Obtains the name of the node.

    This method returns the name of the node. The name depends on the node type: For Element, it is the tag name; for Attr, it is the attribute name; and so on. Table 4.1 summarizes the getNodeType() and getNodeName() methods.

  • node.getNodeValue(): Obtains the value of the node.

    This method returns the value of the node. The value is type-dependent. If the node is an Attr, the value of the attribute is returned, and if it is a Text or a CDATASection, the value is the text string. The value is also defined for ProcessingInstruction and Comment. For other node types, the value is null.

  • node.setNodeValue(newValue): Updates the node value.

    This method updates the value defined in getNodeValue().
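The three accessor methods above can be tried together in a short sketch. The class name NodeInfoDemo, its helper methods, and the sample XML string are assumptions made for illustration, and the sketch assumes the default JAXP parser, which keeps CDATA sections as separate CDATASection nodes unless coalescing is enabled.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

final class NodeInfoDemo {
    // Hypothetical helper: parses an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Formats the type, name, and value of a node.
    // Text and CDATASection nodes are both reported as "text".
    static String describe(Node node) {
        String kind = (node instanceof Text) ? "text" : "type " + node.getNodeType();
        return kind + ": " + node.getNodeName() + " = " + node.getNodeValue();
    }

    public static void main(String[] args) {
        Document doc = parse("<given>John<![CDATA[ny]]></given>");
        Node given = doc.getDocumentElement();
        System.out.println(describe(given));
        for (Node c = given.getFirstChild(); c != null; c = c.getNextSibling()) {
            System.out.println(describe(c));
        }
    }
}
```

For the Element, getNodeValue() returns null, while the two character-data children report "#text" and "#cdata-section" as their names.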

Table 4.1. Summary of getNodeType() and getNodeName() Methods

DOM INTERFACE          getNodeType()                     getNodeName()
---------------------  --------------------------------  -------------------------
Element                Node.ELEMENT_NODE                 Qualified name of element
Attr                   Node.ATTRIBUTE_NODE               Qualified name of attribute
Text                   Node.TEXT_NODE                    "#text"
CDATASection           Node.CDATA_SECTION_NODE           "#cdata-section"
EntityReference        Node.ENTITY_REFERENCE_NODE        Entity name
Entity                 Node.ENTITY_NODE                  Entity name
ProcessingInstruction  Node.PROCESSING_INSTRUCTION_NODE  PI target
Comment                Node.COMMENT_NODE                 "#comment"
Document               Node.DOCUMENT_NODE                "#document"
DocumentType           Node.DOCUMENT_TYPE_NODE           Name of the root element
DocumentFragment       Node.DOCUMENT_FRAGMENT_NODE       "#document-fragment"
Notation               Node.NOTATION_NODE                Notation name

4.2.2 Accessing Structural Information

Because DOM is tree-structured, any structural information is given by parent-child relationships. The Node interface provides a set of methods to access this information:

  • node.getParentNode()

  • node.hasChildNodes()

  • node.getFirstChild()

  • node.getLastChild()

  • node.getPreviousSibling()

  • node.getNextSibling()

Throughout this subsection, we use the following small XML fragment as an example.

<name>
    <given>John</given>
    <family>Doe</family>
</name>

The name element has five children:

  • Text node that has "\n "

  • Element node of the name given

  • Text node that has "\n "

  • Element node of the name family

  • Text node that has "\n"

Note that the whitespace used for indentation is represented as Text nodes, and that "\n" represents an end-of-line character. The given element and the family element each have one Text node as their only child. See Figure 4.1.

Figure 4.1. DOM nodes representing the sample fragment

graphics/04fig01.gif
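The child structure described above can be checked with a small sketch. The class name ChildListDemo and its helpers are hypothetical, and the sketch assumes a non-validating parser with no DTD, so that whitespace-only text is kept in the tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ChildListDemo {
    // The sample fragment from the text, with four-space indentation.
    static final String XML =
            "<name>\n    <given>John</given>\n    <family>Doe</family>\n</name>";

    // Hypothetical helper: parses an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns one label per child: the tag name for an Element,
    // the content with newlines made visible for any other node.
    static List<String> childLabels(Node parent) {
        List<String> labels = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                labels.add("Element " + c.getNodeName());
            } else {
                labels.add("Text \"" + c.getNodeValue().replace("\n", "\\n") + "\"");
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        Node name = parse(XML).getDocumentElement();
        for (String label : childLabels(name)) {
            System.out.println(label);
        }
    }
}
```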

Obtaining the Parent Node

The node.getParentNode() method returns the parent node of this node. If the name element is the root element in a document, getParentNode() applied to the name element returns the Document node. Figure 4.2 demonstrates how to use this method.

Figure 4.2. Using getParentNode()

graphics/04fig02.gif
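As a sketch of walking upward with getParentNode(), the hypothetical helper pathTo() below collects node names until it reaches a node whose parent is null, which is the Document node. The class name and the sample XML are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ParentDemo {
    // Hypothetical helper: parses an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a slash-separated path of node names by following
    // getParentNode() from the given node up to the Document node.
    static String pathTo(Node node) {
        StringBuilder path = new StringBuilder(node.getNodeName());
        for (Node p = node.getParentNode(); p != null; p = p.getParentNode()) {
            path.insert(0, p.getNodeName() + "/");
        }
        return path.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<name><given>John</given></name>");
        Node text = doc.getDocumentElement().getFirstChild().getFirstChild();
        System.out.println(pathTo(text));
    }
}
```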

Checking Whether a Node Has One or More Child Nodes

There are several ways to check whether a node has a child:

  • node.hasChildNodes()

  • node.getFirstChild() != null

  • node.getLastChild() != null

  • node.getChildNodes().getLength() > 0

Any of these work well for this purpose, but we recommend the first one for readability.
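A minimal sketch that runs all four checks on the same node follows. The class name HasChildrenDemo and the element names are assumptions for illustration.

```java
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class HasChildrenDemo {
    // Hypothetical helper: creates an empty Document with the default JAXP parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates the four child checks from the text, in the order listed there.
    static boolean[] checks(Node node) {
        return new boolean[] {
            node.hasChildNodes(),
            node.getFirstChild() != null,
            node.getLastChild() != null,
            node.getChildNodes().getLength() > 0
        };
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element given = doc.createElement("given");
        System.out.println(Arrays.toString(checks(given)));
        given.appendChild(doc.createTextNode("John"));
        System.out.println(Arrays.toString(checks(given)));
    }
}
```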

Obtaining the Next and Previous Siblings

You can use the following two methods to go back and forth among child nodes:

  • node.getPreviousSibling(): Returns the previous sibling

  • node.getNextSibling(): Returns the next sibling

Figure 4.3 shows how to use these two methods.

Figure 4.3. Using getPreviousSibling() and getNextSibling() to move among siblings

graphics/04fig03.gif
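Because indentation shows up as Text nodes, the sibling of an element is often whitespace rather than the next element. The sketch below shows one way to skip over such nodes. The helper names nextElementSibling() and previousElementSibling() are hypothetical and are not part of the Node interface described in this section.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SiblingDemo {
    // Hypothetical helper: parses an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Follows getNextSibling() until an Element is found; returns null if there is none.
    static Element nextElementSibling(Node node) {
        for (Node s = node.getNextSibling(); s != null; s = s.getNextSibling()) {
            if (s instanceof Element) {
                return (Element) s;
            }
        }
        return null;
    }

    // Follows getPreviousSibling() until an Element is found; returns null if there is none.
    static Element previousElementSibling(Node node) {
        for (Node s = node.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s instanceof Element) {
                return (Element) s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse("<name>\n    <given>John</given>\n    <family>Doe</family>\n</name>");
        Node given = doc.getElementsByTagName("given").item(0);
        System.out.println(given.getNextSibling().getNodeName());
        System.out.println(nextElementSibling(given).getTagName());
    }
}
```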

Processing Children in the Order in Which They Appear

Suppose you want to visit the five children of the name element in the order in which they appear:

  1. Text node that has "\n "

  2. Element node of the name given

  3. Text node that has "\n "

  4. Element node of the name family

  5. Text node that has "\n"

To iterate on the children of a node, you have two choices:

  • Use getFirstChild() and getNextSibling(). This is simple and straightforward.

    for (Node child = node.getFirstChild();
         child != null;
         child = child.getNextSibling()) {
        // ...process child
    }
    
  • Obtain a NodeList, and use an index to access the children.

    NodeList nodeList = node.getChildNodes();
    for (int i = 0;  i < nodeList.getLength();  i++) {
        Node child = nodeList.item(i);
        // ...process child
    }
    

Figure 4.4 illustrates how this is done.

Figure 4.4. Order of processing children

graphics/04fig04.gif

Processing Children in Reverse Order

There are two ways to process children in reverse order:

  • Use getLastChild() and getPreviousSibling()

    for (Node child = node.getLastChild();
         child != null;
         child = child.getPreviousSibling()) {
        // ...process child
    }
    
  • Obtain a NodeList by calling getChildNodes(), and use item() to access each item in the NodeList.

    NodeList nodeList = node.getChildNodes();
    for (int i = nodeList.getLength() - 1;  i >= 0;  i--) {
        Node child = nodeList.item(i);
        // ...process child
    }
    

4.2.3 Inserting, Detaching, and Replacing a Child Node

We have looked at the ways to examine various parts of a DOM tree. Now we shift our attention to modifying the tree. Structural changes in a DOM tree are always made by inserting, detaching, or replacing a child node. In this section, we cover the following methods:

  • doc.createXXX(...) (doc is an instance of Document)

  • node.insertBefore(newChild, refChild)

  • node.appendChild(newChild)

  • node.replaceChild(newChild, oldChild)

  • node.removeChild(oldChild)

Some types of nodes (Text, CDATASection, Comment, Notation, and ProcessingInstruction) never have child nodes. Others may have constraints regarding allowed child node types. Violations of these rules generate DOMExceptions.
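The following sketch shows what such a violation looks like in practice: it tries to append an Element to a Text node and reports the code carried by the resulting DOMException. The class and method names are assumptions for illustration. The DOM specification assigns HIERARCHY_REQUEST_ERR to this kind of violation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Text;

final class HierarchyErrorDemo {
    // Hypothetical helper: creates an empty Document with the default JAXP parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Tries to give a Text node a child. Returns the DOMException code,
    // or -1 if no exception was thrown.
    static short appendToText() {
        Document doc = newDocument();
        Text text = doc.createTextNode("John");
        try {
            text.appendChild(doc.createElement("given"));
            return -1;
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) {
        boolean rejected = appendToText() == DOMException.HIERARCHY_REQUEST_ERR;
        System.out.println("Rejected with HIERARCHY_REQUEST_ERR: " + rejected);
    }
}
```

DOMException is an unchecked exception, so the compiler does not force you to catch it. Code that modifies trees built from untrusted input may still want to handle it explicitly.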

Creating New Nodes

As described in Chapter 3, we use factory methods such as createElement() of the Document interface to create a new node.

Element newElement = doc.createElement("address");
newElement.appendChild(doc.createTextNode("1234 Orange Ave."));
// We discuss appendChild() later.

The newElement just created does not belong to a DOM tree rooted by doc. The owner of newElement is doc, and there is no parent of newElement because it is not yet appended to any node. Although newElement.getOwnerDocument() returns doc, newElement.getParentNode() returns null.

Another way to create a node is to duplicate other nodes.

Element newElement2 = (Element) newElement.cloneNode(false);

This statement creates a shallow copy. The node newElement2 is a copy of newElement, including copies of its attributes, but descendants of newElement are not copied. Even if newElement is a child of some node, newElement2 has no parent node.

Element newElement2 = (Element) newElement.cloneNode(true);

This statement performs a deep copy. The node newElement2 is a copy of newElement together with all of its descendants. As with a shallow copy, newElement2 has no parent node. Figure 4.5 shows the differences between a shallow copy and a deep copy.

Figure 4.5. Behavior of cloneNode()

graphics/04fig05.gif
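The difference between the two kinds of copy can be observed with a short sketch. The class name CloneDemo, the helper buildAddress(), and the type attribute are assumptions for illustration. Note that cloneNode() is declared to return Node, so the result is cast to Element.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CloneDemo {
    // Hypothetical helper: creates an empty Document with the default JAXP parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds <address type="home">1234 Orange Ave.</address> as the root element of doc.
    static Element buildAddress(Document doc) {
        Element address = doc.createElement("address");
        address.setAttribute("type", "home");
        address.appendChild(doc.createTextNode("1234 Orange Ave."));
        doc.appendChild(address);
        return address;
    }

    public static void main(String[] args) {
        Element address = buildAddress(newDocument());
        Element shallow = (Element) address.cloneNode(false); // attributes only
        Element deep = (Element) address.cloneNode(true);     // attributes and descendants
        System.out.println("shallow has children: " + shallow.hasChildNodes());
        System.out.println("deep has children: " + deep.hasChildNodes());
        System.out.println("shallow attribute: " + shallow.getAttribute("type"));
        System.out.println("deep parent: " + deep.getParentNode());
    }
}
```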

Inserting a Child Node

To insert a child node, use node.insertBefore(newChild, refChild). It adds newChild to node as a child before refChild, which is a child of node. If refChild is null, newChild is added as the last child of node. Figure 4.6 illustrates how this is done.

Figure 4.6. Using insertBefore() to insert a new child

graphics/04fig06.gif
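A minimal sketch of both uses of insertBefore(), with a real reference child and with null, follows. The class name, the helpers, and the element names middle and suffix are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class InsertBeforeDemo {
    // Hypothetical helper: creates an empty Document with the default JAXP parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds <name><given/><family/></name> without whitespace Text nodes.
    static Element buildName(Document doc) {
        Element name = doc.createElement("name");
        name.appendChild(doc.createElement("given"));
        name.appendChild(doc.createElement("family"));
        return name;
    }

    // Joins the node names of all children, in document order.
    static String childNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            names.add(c.getNodeName());
        }
        return String.join(",", names);
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element name = buildName(doc);
        Node family = name.getLastChild();
        name.insertBefore(doc.createElement("middle"), family); // before an existing child
        name.insertBefore(doc.createElement("suffix"), null);   // null: add as the last child
        System.out.println(childNames(name));
    }
}
```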

Adding a Child Node

To add a child node, use node.appendChild(newChild) to add newChild as the last child of node. This is equivalent to node.insertBefore(newChild, null) and is illustrated in Figure 4.7.

Figure 4.7. Using appendChild() to add a child

graphics/04fig07.gif

Replacing a Child Node

To replace an old child with a new child, use node.replaceChild(newChild, oldChild) to replace oldChild, which must be a child node of node, with newChild. The node oldChild is detached from the DOM tree; that is, oldChild.getParentNode() returns null. Figure 4.8 demonstrates its use.

Figure 4.8. Using replaceChild() to replace a child

graphics/04fig08.gif
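The sketch below replaces one child and then inspects the node that was replaced. The class name, the helper, and the element name surname are assumptions for illustration. The replaceChild() method returns the replaced node, which still belongs to the same owner document.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class ReplaceChildDemo {
    // Hypothetical helper: creates an empty Document with the default JAXP parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element name = doc.createElement("name");
        Element family = doc.createElement("family");
        name.appendChild(doc.createElement("given"));
        name.appendChild(family);

        // Swap <family> for a new <surname> element.
        Node replaced = name.replaceChild(doc.createElement("surname"), family);

        System.out.println("last child: " + name.getLastChild().getNodeName());
        System.out.println("returned node: " + replaced.getNodeName());
        System.out.println("old child's parent: " + family.getParentNode());
    }
}
```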

Detaching a Child Node

Use node.removeChild(oldChild) to detach oldChild, which must be a child node of node, from the children of node. See Figure 4.9.

Figure 4.9. Using removeChild() to detach a child

graphics/04fig09.gif
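A common use of removeChild() is stripping the whitespace-only Text nodes discussed in Section 4.2.2. The sketch below is one way to do it, and the class and method names are hypothetical. The next sibling is saved before each removal because a detached node no longer has siblings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class RemoveChildDemo {
    // Hypothetical helper: parses an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Detaches every whitespace-only Text child of parent and returns how many were removed.
    static int removeWhitespaceText(Node parent) {
        int removed = 0;
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before detaching
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
                removed++;
            }
            child = next;
        }
        return removed;
    }

    public static void main(String[] args) {
        Document doc = parse("<name>\n    <given>John</given>\n    <family>Doe</family>\n</name>");
        Node name = doc.getDocumentElement();
        System.out.println("removed: " + removeWhitespaceText(name));
        System.out.println("remaining children: " + name.getChildNodes().getLength());
    }
}
```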

4.2.4 DOM Tree and Attributes

An Element node may have attributes in addition to child nodes. Attributes can be accessed through the Attr interface, which is derived from the Node interface. We now discuss how to obtain attribute values or Attr nodes.

The following attribute-manipulating methods are available on an Element. All of them are defined in the Element interface except getAttributes(), which Element inherits from Node.

  • NamedNodeMap getAttributes()

  • String getAttribute(String name)

  • void setAttribute(String name, String value)

  • void removeAttribute(String name)

  • Attr getAttributeNode(String name)

  • Attr setAttributeNode(Attr newAttr)

  • Attr removeAttributeNode(Attr oldAttr)

To obtain an attribute whose name is already known, use getAttribute(String name) or getAttributeNode(String name). The former returns the value of the named attribute as a string, and the latter returns the Attr node of the attribute. To obtain a list of all the attributes of an element, use getAttributes() as follows.

NamedNodeMap map = element.getAttributes();
for (int i = 0;  i < map.getLength();  i++) {
   Attr attr = (Attr)map.item(i);
   // ...process attr
}

Alternatively, the NamedNodeMap interface has the getNamedItem(String name) method to get an attribute by its name.
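Both access styles, by index and by name, are sketched below. The class name, the helper attributesOf(), and the sample person element are assumptions for illustration. The attributes are copied into a sorted map because a NamedNodeMap does not promise any particular order.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

final class AttributeListDemo {
    // Hypothetical helper: parses an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Copies every attribute of the element into a map sorted by attribute name.
    static Map<String, String> attributesOf(Element element) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap map = element.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Attr attr = (Attr) map.item(i);
            result.put(attr.getName(), attr.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Element person = parse("<person lang=\"en\" id=\"p1\"/>").getDocumentElement();
        System.out.println(attributesOf(person));
        // Lookup by name through the NamedNodeMap; null means the attribute is absent.
        System.out.println(person.getAttributes().getNamedItem("id").getNodeValue());
        System.out.println(person.getAttributes().getNamedItem("missing"));
    }
}
```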

How to Check Whether an Element Has an Attribute

Observe that we cannot use the getAttribute() method of the Element interface to check whether an element has an attribute. As an example, suppose that we want to check the existence of an id attribute in the elem element.

String value = elem.getAttribute("id");

If the elem element has no id attribute, getAttribute("id") returns "" (a zero-length string) rather than null. The same call also returns "" if the id attribute is present with a value of "". We cannot distinguish between these two cases.

To check the existence of an attribute, use elem.getAttributeNode(name) != null or elem.getAttributes().getNamedItem(name) != null.
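The ambiguity and the two recommended checks can be seen side by side in a short sketch. The class name, the helper hasAttr(), and the item element are assumptions for illustration. The sketch also calls hasAttribute(), which the Element interface has offered since DOM Level 2 and which may be a simpler choice where that level is available.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class AttributeExistsDemo {
    // Hypothetical helper: creates an empty Document with the default JAXP parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Existence check based on getAttributeNode(), as recommended in the text.
    static boolean hasAttr(Element elem, String name) {
        return elem.getAttributeNode(name) != null;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element withEmptyId = doc.createElement("item");
        withEmptyId.setAttribute("id", "");
        Element withoutId = doc.createElement("item");

        // getAttribute() cannot tell the two elements apart.
        System.out.println(withEmptyId.getAttribute("id").equals(withoutId.getAttribute("id")));

        // The Attr-based checks can.
        System.out.println(hasAttr(withEmptyId, "id") + " " + hasAttr(withoutId, "id"));
        System.out.println(withEmptyId.getAttributes().getNamedItem("id") != null);
        System.out.println(withEmptyId.hasAttribute("id") + " " + withoutId.hasAttribute("id"));
    }
}
```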
