On the desktop and in the enterprise, Java and XML are a winning combination. In brief, Java is portable code and XML is portable data. Developing in Java gives you the ability to deploy code on many different platforms, while XML supplies a highly portable data format for exchanging data between application components and applications themselves.
XML's popularity reaches into the J2ME world. This chapter describes XML parsers that are available for MIDP environments. This is fun stuff, right at the raw edge of J2ME development. Standards are on the way, but they haven't arrived yet. See http://jcp.org/en/jsr/detail?id=172 for details on an XML parsing and web services JSR.
XML is the Extensible Markup Language. An XML file is some collection of data that is demarcated by tags. XML files are structured and highly portable.
Let's consider the Jargoneer application from Chapter 2 again. In that application, the MIDP device talks to an intermediate server. This server retrieves an HTML page from the Jargon File server, performs the parsing, and sends a distilled version of the data down to the MIDP device. Figure 14-1 shows this architecture.
What, exactly, could get sent from the intermediate server to the MIDP device? The Jargoneer example application actually sends flat text, but there are many other possibilities. The next simplest technique for exchanging data between a server and a device would be to use a properties file, like this:
word: grok
pronunciation: /grok/
type: vt.
meaning: [from the novel "Stranger in ...
This works fine and is probably all you would need for simple applications. You'd have to write a class that could parse this input (MIDP doesn't include java.util.Properties), but that wouldn't be too bad.
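Such a class can be quite small. The following is a minimal sketch, not code from the Jargoneer application: the class name SimpleProperties and its parse() method are hypothetical, and it assumes one "key: value" pair per line. It sticks to classes that CLDC provides (String and Hashtable) and avoids generics.

```java
import java.util.Hashtable;

/** Hypothetical helper: parses "key: value" lines into a Hashtable. */
class SimpleProperties {
  static Hashtable parse(String text) {
    Hashtable table = new Hashtable();
    int start = 0;
    while (start < text.length()) {
      // Find the end of the current line.
      int end = text.indexOf('\n', start);
      if (end < 0) end = text.length();
      String line = text.substring(start, end).trim();
      // Split at the first colon; lines without a key are skipped.
      int colon = line.indexOf(':');
      if (colon > 0) {
        String key = line.substring(0, colon).trim();
        String value = line.substring(colon + 1).trim();
        table.put(key, value);
      }
      start = end + 1;
    }
    return table;
  }
}
```

Calling SimpleProperties.parse() on the text received from the server yields a table in which get("word") returns the word being defined.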
However, chances are excellent that some parts of your application are already speaking XML, and it would likely simplify your life considerably if your MIDlet could parse XML instead of having its own specific data format. Furthermore, using XML validation during the development cycle may be a big help in flushing out bugs.
As an XML file, then, the same information would probably look like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<jargon-definition>
  <word>grok</word>
  <pronunciation>/grok/</pronunciation>
  <type>vt.</type>
  <meaning>[from the novel "Stranger in ...</meaning>
</jargon-definition>
This simple XML document illustrates some important points. First, tags mark off every piece of data (element) in the document. In essence, every element has a name. Matching start and end tags are used to clearly separate elements. For example, the start tag <word> and the end tag </word> surround the word itself. Also note that elements may be nested. The jargon-definition element is simply a collection of other elements. Any of the other elements could contain further nested elements.
Element tags may also contain attributes. An alternate way of writing the previous XML file looks like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<jargon-definition word="grok" pronunciation="/grok/" type="vt.">
  [from the novel "Stranger in ...
</jargon-definition>
It's up to you exactly how you structure your XML data. Usually it depends on the structure of your application and the systems with which you will be exchanging data.
SAX is the Simple API for XML, a standard API for Java applications that want to parse XML data. The API is documented online at http://www.megginson.com/SAX/, but SAX-compliant parsers usually include the SAX API as part of their software. The current version of SAX is 2.0, but the small parsers covered in this chapter are only at the 1.0 level if they implement SAX at all.
SAX 1.0 revolves around the org.xml.sax.Parser interface. Parser has a method, parse(), that parses through an entire XML document, spitting out events to listening objects. Typically, your application will implement a DocumentHandler that receives notification about start tags, end tags, element data, and other important events. A SAX 1.0 application looks something like this:
try {
  Parser p = new SAXParser(); // Create a specific parser implementation.
  // Create some DocumentHandler named handler.
  p.setDocumentHandler(handler);
  // parse() needs the document; in is an InputStream that supplies it.
  p.parse(new InputSource(in));
}
catch (Exception e) {
  // Handle exceptions.
}
The call to parse() proceeds until the document has been fully parsed. During the parse, callback methods in the registered DocumentHandler are invoked. In these methods, you'll process the data from the XML document.
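To make the callbacks concrete, here is a rough sketch of a handler for the jargon-definition document shown earlier. It is written for a desktop Java SE JDK rather than MIDP, on the assumption that JAXP's SAXParser.getParser() supplies a SAX 1.0 Parser; a small MIDP parser would be created differently. The class name JargonHandler is hypothetical. It extends HandlerBase, the SAX 1.0 convenience class that implements DocumentHandler with empty methods.

```java
import java.io.StringReader;
import java.util.Hashtable;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

/** Hypothetical handler: collects the text of each leaf element by name. */
class JargonHandler extends HandlerBase {
  private final Hashtable<String, String> fields = new Hashtable<String, String>();
  private final StringBuffer text = new StringBuffer();
  private String current; // Name of the most recently opened element.

  public void startElement(String name, AttributeList attributes) {
    current = name;
    text.setLength(0);
  }

  public void characters(char[] ch, int start, int length) {
    // Element data may arrive in several pieces, so accumulate it.
    text.append(ch, start, length);
  }

  public void endElement(String name) {
    // Only an element that closes right after it opened is a leaf.
    if (name.equals(current)) fields.put(name, text.toString().trim());
    current = null;
  }

  static Hashtable<String, String> parse(String xml) throws Exception {
    // Java SE assumption: JAXP hands back a SAX 1.0 Parser.
    Parser p = SAXParserFactory.newInstance().newSAXParser().getParser();
    JargonHandler handler = new JargonHandler();
    p.setDocumentHandler(handler);
    p.parse(new InputSource(new StringReader(xml)));
    return handler.fields;
  }
}
```

The parse() call returns only after endDocument, so the table is complete by the time the caller sees it.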
SAX 1.0 is not MIDP-compliant straight out of the box. The Parser interface includes a setLocale() method that references the java.util.Locale class, a class that is missing in the MIDP platform.
Another standard API, the Document Object Model (DOM), takes a different approach to XML parsing. With DOM, the parser creates an internal model of a document as it is parsed. After parsing is complete, an application can examine the entire document. DOM is described further at http://www.w3.org/TR/DOM-Level-2-Core/. Although none of the parsers described in this chapter implement DOM directly, some of them do follow the DOM paradigm of creating an internal representation of a parsed document.
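For comparison with the event-driven style, the following sketch shows the DOM paradigm on a desktop Java SE JDK, using the JAXP DocumentBuilder. It does not run on MIDP and does not use any of the parsers covered in this chapter. The class name DomSketch and the method textOf() are hypothetical, and the method assumes the element of interest contains only text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: builds a DOM tree, then looks up one element's text. */
class DomSketch {
  static String textOf(String xml, String tag) throws Exception {
    // The whole document is parsed into memory before we look at it.
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
    // With the model built, any part of the document can be examined.
    NodeList matches = doc.getElementsByTagName(tag);
    if (matches.getLength() == 0) return null;
    Node text = matches.item(0).getFirstChild();
    return (text == null) ? "" : text.getNodeValue();
  }
}
```

Note that nothing is available to the application until the entire tree exists, which is what makes this style convenient and also memory hungry.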
A more recent XML parser API standard is XmlPull, which is documented at http://www.xmlpull.org/. XmlPull is implemented by kXML 2, which is presented later.
XML documents may also make reference to a Document Type Definition (DTD) or an XML Schema; these are files that describe the contents of a particular kind of XML document. We could, for example, write a DTD that specified the contents of a jargon-definition document. This is part of the power of XML, and it's the reason XML is sometimes called self-describing data.
Given a document, you can determine if it conforms to its DTD, which is a great way to determine if part of your system is producing data that's unreadable by the rest of your system. In XML terms, a document that follows the rules of its DTD or schema is valid. In the J2SE and J2EE worlds, parsers may be validating or non-validating. The J2ME world is too small to support XML document validation, so all of the parsers we'll discuss in this chapter are non-validating.
Even though you won't be able to perform validation on a MIDP device, you may well want to use validating parsers during your development and test cycle. For example, you might write code that emulates the MIDP client, having it request data from your server and validate the results. This helps flush out bugs in the server code before you make the switch over to the MIDP client software.
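A development-time check of this kind might look like the sketch below. It runs on a desktop Java SE JDK, not on the device. The DTD is an assumption made up for illustration, requiring the four elements of the jargon-definition example in order; the class name DtdCheck is hypothetical as well. The DTD is supplied as an internal subset so the example needs no external files.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical test helper: validates a document body against a sample DTD. */
class DtdCheck {
  // Assumed DTD for illustration only.
  static final String DOCTYPE =
      "<!DOCTYPE jargon-definition [\n"
      + "  <!ELEMENT jargon-definition (word, pronunciation, type, meaning)>\n"
      + "  <!ELEMENT word (#PCDATA)>\n"
      + "  <!ELEMENT pronunciation (#PCDATA)>\n"
      + "  <!ELEMENT type (#PCDATA)>\n"
      + "  <!ELEMENT meaning (#PCDATA)>\n"
      + "]>\n";

  static boolean isValid(String body) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(true); // Ask for a validating parser.
    DocumentBuilder builder = factory.newDocumentBuilder();
    final boolean[] valid = { true };
    builder.setErrorHandler(new ErrorHandler() {
      public void warning(SAXParseException e) { }
      // Validity violations are reported as recoverable errors.
      public void error(SAXParseException e) { valid[0] = false; }
      public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
    });
    try {
      builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
    } catch (SAXException e) {
      return false; // Not even well-formed.
    }
    return valid[0];
  }
}
```

A test harness could fetch the server's response, pass it to isValid(), and fail the build when the result is false.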
Common sense, as always, takes you a long way. As you contemplate the use of XML in your MIDP application, keep four things in mind:
Keep the documents small. If you're sending some 100KB document down to the MIDP device and only using a few elements, it's time to rethink your server-side strategy. You can probably transform the document at the server and just send what you need to the device. Keep in mind that network connectivity is likely to be slow and there's not much memory on the device.
Don't use comments in the XML that you send to the MIDP device, except perhaps as a debugging aid during the development cycle. Comments will only make the document longer, which implies a slower download and more memory usage on the device.
Choose a parser that fits your needs. Some of the parsers we'll examine build an entire model of a document in memory as the document is parsed. This is like writing a blank check to the supplier of the XML document. If the server sends you a 1MB file, these types of parsers will attempt to read through the whole thing, right up until they run out of memory. On the other hand, if you know the size of the files you'll be parsing, and they are small enough, you might choose a model-building parser, as it is slightly easier to use than the other types of parsers.
Consider using a compact representation of the XML document. Representing XML documents in ASCII or Unicode is not efficient, and there are various schemes for more compact representations. The WAP forum, in fact, has defined a standard for binary coded XML called WBXML. A parser called SWX can handle WBXML and is appropriate for small platforms like MIDP: http://www.trantor.de/wbxml/. For another approach to this problem, see the JXME project at http://jxme.jxta.org/.
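One way to put a limit on the blank check mentioned above is to cap how much input a parser is allowed to read. The following is a sketch of that idea, not a feature of any parser discussed in this chapter: the class name LimitedInputStream is hypothetical, and it relies only on java.io classes that CLDC includes. It wraps the stream coming from the server and throws an IOException once more than a chosen number of bytes has been read.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper: fails once more than maxBytes have been read. */
class LimitedInputStream extends InputStream {
  private final InputStream in;
  private int remaining;

  LimitedInputStream(InputStream in, int maxBytes) {
    this.in = in;
    this.remaining = maxBytes;
  }

  public int read() throws IOException {
    int b = in.read();
    // Count only real bytes; end of stream (-1) never trips the limit.
    if (b >= 0 && --remaining < 0) {
      throw new IOException("Document exceeds size limit");
    }
    return b;
  }

  public void close() throws IOException {
    in.close();
  }
}
```

Handing a model-building parser a LimitedInputStream instead of the raw connection stream means an oversized document ends in an exception the application can handle rather than in an out-of-memory condition.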
Finally, you may be concerned about the performance of a small XML parser. This is a valid concern, especially on a small device that has a relatively slow processor. For a fascinating comparison of XML parser performance, see http://www.extreme.indiana.edu/~aslom/exxp/. With small documents, the small parsers can hold their own or outperform larger parsers.
As with other potentially time-consuming operations, parsing should be done in its own thread so the user interface doesn't freeze.
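A bare-bones way to arrange this is sketched below. The names ParseTask, ParseListener, and BackgroundParser are hypothetical; ParseTask stands in for whatever parser call your application makes. The sketch uses only Thread and Runnable, both of which are available in CLDC.

```java
/** Hypothetical stand-in for the application's parsing work. */
interface ParseTask {
  Object parse() throws Exception;
}

/** Hypothetical callback through which the user interface learns the outcome. */
interface ParseListener {
  void parseFinished(Object result);
  void parseFailed(Exception e);
}

/** Runs a ParseTask on its own thread and reports back to a listener. */
class BackgroundParser implements Runnable {
  private final ParseTask task;
  private final ParseListener listener;

  BackgroundParser(ParseTask task, ParseListener listener) {
    this.task = task;
    this.listener = listener;
  }

  /** Starts the worker thread and returns immediately. */
  Thread start() {
    Thread t = new Thread(this);
    t.start();
    return t;
  }

  public void run() {
    Object result;
    try {
      result = task.parse(); // The slow part happens off the UI thread.
    } catch (Exception e) {
      listener.parseFailed(e);
      return;
    }
    listener.parseFinished(result);
  }
}
```

The listener methods are called on the worker thread, so a listener that updates the display should do so in a way that is safe from another thread.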