Sunday, January 19, 2014
The Document Object Model (DOM) API for XML is memory intensive compared to the SAX parser, because the DOM approach loads the entire contents of an XML file into a tree structure in memory and then iterates through the tree to read the content. If the XML content is large, the SAX parser approach is recommended. Refer to SAX Parser for an example implementation of a SAX parser.
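To make the "load everything, then walk the tree" idea concrete, here is a minimal sketch that parses a small XML string into a Document and counts its element nodes recursively. The class name DomTreeWalkSketch, its helper methods, and the CATALOG wrapper element are assumptions made for illustration; only the standard JAXP and org.w3c.dom calls are real API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration
class DomTreeWalkSketch {

    // Parse an XML string into an in-memory DOM tree
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Walk the tree recursively and count the element nodes
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count += countElements(children.item(i));
        }
        return count;
    }

    public static void main(String[] args) {
        // CATALOG is an assumed wrapper element for this sample
        String xml = "<CATALOG><CD id=\"1\"><TITLE>Empire Burlesque</TITLE></CD></CATALOG>";
        Document doc = parse(xml);
        System.out.println("Root element: " + doc.getDocumentElement().getNodeName()); // Root element: CATALOG
        System.out.println("Element count: " + countElements(doc)); // Element count: 3
    }
}
```

Because the whole tree stays in memory, it can be walked as many times as needed, which is the convenience that costs memory on large inputs.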

The DOM parser is typically advantageous when we need to modify XML documents, since the whole tree is available in memory to edit.
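As a rough sketch of what modifying a document through DOM can look like, the example below parses a string, changes the text of the first TITLE element, sets an extra attribute on its parent, and serializes the tree back to a string with the standard Transformer API. The class name DomModifySketch, the method retitle, and the "edited" attribute are hypothetical names chosen for this illustration, and the input is assumed to contain at least one TITLE element.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name, used only for this illustration
class DomModifySketch {

    // Change the first TITLE, mark its parent as edited, and return the serialized XML.
    // Assumes the input contains at least one TITLE element.
    static String retitle(String xml, String newTitle) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // Edit the tree in place
            Element title = (Element) doc.getElementsByTagName("TITLE").item(0);
            title.setTextContent(newTitle);
            Element parent = (Element) title.getParentNode();
            parent.setAttribute("edited", "true"); // "edited" is an illustrative attribute

            // Serialize the modified tree back to text
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<CD id=\"1\"><TITLE>Empire Burlesque</TITLE></CD>";
        System.out.println(retitle(xml, "Empire Burlesque (Remastered)"));
    }
}
```

With SAX the same change would mean re-emitting the document event by event, whereas here the edit is a couple of method calls on nodes that are already in memory.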

A sample implementation of a DOM parser is listed below. Here we read the XML file and create a Document object in memory. Then we iterate through the tree and extract the required elements and attributes. It is typical practice to use a POJO to store the contents for application use.

Simple Java program implementation of DOM parser

This is the input XML file we are interested in parsing. We need the id attribute and the TITLE and ARTIST elements.

<!-- Root element name is assumed; the original listing did not show it -->
<CATALOG>
 <CD id="1">
  <TITLE>Empire Burlesque</TITLE>
  <ARTIST>Bob Dylan</ARTIST>
 </CD>
 <CD id="2">
  <TITLE>Hide your heart</TITLE>
  <ARTIST>Bonnie Tyler</ARTIST>
 </CD>
 <CD id="3">
  <TITLE>Greatest Hits</TITLE>
  <ARTIST>Dolly Parton</ARTIST>
 </CD>
 <CD id="4">
  <TITLE>Still got the blues</TITLE>
  <ARTIST>Gary Moore</ARTIST>
  <COMPANY>Virgin records</COMPANY>
 </CD>
</CATALOG>

We create a POJO to hold the required data for application use.

package com.sourcetricks.MyDomParser;

// Simple POJO holding the fields we extract for each CD
public class CD {
 private String id;
 private String title;
 private String artist;
 public String getId() {
  return id;
 }
 public void setId(String id) {
  this.id = id;
 }
 public String getTitle() {
  return title;
 }
 public void setTitle(String title) {
  this.title = title;
 }
 public String getArtist() {
  return artist;
 }
 public void setArtist(String artist) {
  this.artist = artist;
 }
 // Print all fields, one per line
 public void print() {
  System.out.println("ID = " + id);
  System.out.println("Title = " + title);
  System.out.println("Artist = " + artist);
 }
}

This is the main application program. It reads the XML file, parses it, and iterates over the tree to read the required content. Finally we print the POJOs.
package com.sourcetricks.MyDomParser;

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;

public class MyDomParser {
 public static void main(String[] args) {

  List<CD> cdList = new ArrayList<CD>();
  // Read the XML file; try-with-resources closes the stream when done
  File inputFile = new File("resources/input-data.xml");
  try (InputStream inputStream = new FileInputStream(inputFile)) {
   // Setup the parser
   DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
   DocumentBuilder builder = builderFactory.newDocumentBuilder();
   // Parse the XML file
   Document doc = builder.parse(inputStream);
   // Get all CD elements
   NodeList cdElements = doc.getElementsByTagName("CD");
   for ( int i = 0; i < cdElements.getLength(); i++ ) {
    Node currentNode = cdElements.item(i);
    // Seen the CD tag
    if ( currentNode instanceof Element ) {
     // Store in a pojo
     CD cd = new CD();
     // Read attribute of CD element
     cd.setId(((Element) currentNode).getAttribute("id"));
     // Child elements under CD
     NodeList childNodes = currentNode.getChildNodes();
     for ( int j = 0; j < childNodes.getLength(); j++ ) {
      Node childNode = childNodes.item(j);
      if ( childNode instanceof Element ) {
       if ( childNode.getNodeName().equalsIgnoreCase("title") ) {
        cd.setTitle(childNode.getTextContent());
       }
       else if ( childNode.getNodeName().equalsIgnoreCase("artist") ) {
        cd.setArtist(childNode.getTextContent());
       }
       // Include other elements as needed
      }
     }
     // Keep the populated POJO
     cdList.add(cd);
    }
   }
  } catch (Exception e) {
   e.printStackTrace();
  }
  // Print contents of CD list
  for ( CD c : cdList ) {
   c.print();
  }
 }
}

This is the output.
ID = 1
Title = Empire Burlesque
Artist = Bob Dylan
ID = 2
Title = Hide your heart
Artist = Bonnie Tyler
ID = 3
Title = Greatest Hits
Artist = Dolly Parton
ID = 4
Title = Still got the blues
Artist = Gary Moore
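
The instanceof Element checks in the program above matter because getChildNodes() also returns the whitespace between tags as text nodes. The following sketch, under the assumption of a default (non-validating) parser and an indented input like the sample file, counts all child nodes of the root against the ones that are elements. The class name DomWhitespaceSketch and the method countRootChildren are hypothetical names for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration
class DomWhitespaceSketch {

    // Returns {total child nodes, child nodes that are elements} for the root element
    static int[] countRootChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            int elements = 0;
            for (int i = 0; i < children.getLength(); i++) {
                if (children.item(i) instanceof Element) {
                    elements++;
                }
            }
            return new int[] { children.getLength(), elements };
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Indented input: each line break plus indentation becomes a text node
        String xml = "<CD id=\"1\">\n"
                + "  <TITLE>Empire Burlesque</TITLE>\n"
                + "  <ARTIST>Bob Dylan</ARTIST>\n"
                + "</CD>";
        int[] counts = countRootChildren(xml);
        System.out.println("Child nodes: " + counts[0]);      // Child nodes: 5
        System.out.println("Element children: " + counts[1]); // Element children: 2
    }
}
```

Without the element check, the loop would also visit the three whitespace text nodes, whose node name is #text rather than TITLE or ARTIST.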


