10.2 Servlet

More and more XML documents are likely to be exchanged in business transactions among companies in the near future. Let's consider a dynamic XML-based service that exchanges XML documents by replacing HTML with XML. As the first step, we introduce Servlet to process XML documents.[2]
First, in Section 10.2.1, we describe a stock quote servlet as an example of a one-way service, which returns an XML document as an HTTP GET response. Then, in Section 10.2.2, we describe a bookstore servlet as an example of a request-and-response service, which receives an XML document from an HTTP POST request and then returns an XML document as its response. Finally, in Section 10.2.3, we discuss state management in servlets as a generic programming issue. In this section, we use application-specific XML documents. As you read this chapter, take into account that these techniques can be naturally extended to XML messaging (see Chapter 12) and Web services (see Chapter 13).

10.2.1 Returning XML Documents from a Servlet

Assume you are the CEO of a small start-up company that wants to begin a B2B service using the Web as soon as possible to defeat its rival companies. What is the easiest and quickest way to start such a service? One possibility is to provide a one-way service that returns an XML document as an HTTP response. Specifically, the service receives a request as an HTTP GET request, which does not contain an XML document but contains some HTTP request parameters. Then it processes the request parameters and finally returns an XML document as its response. This kind of one-way service is very easy to build, and writing clients that connect to it takes little effort. For example, we could write a Java client for this service very easily. This ease of use may increase the business opportunities of the company.

Stock Quote Service

Figure 10.1 shows a simple example of a one-way service. This is a stock quote service that returns the current stock price of a specified company in an XML document. The company's name, such as IBM, is given as the request parameter of HTTP GET (a non-XML request).[3]
Figure 10.1. Stock quote service
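To illustrate the point that a client for a one-way service takes little code, here is a minimal sketch of such a client using only the standard java.net classes. It is not the StockQuoteClient class that accompanies the book; the class name QuoteClientSketch, the helper names, and the two-argument command line (servlet URL, then company name) are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical client sketch; not the book's StockQuoteClient.
final class QuoteClientSketch {

    // Builds the GET URL, percent-encoding the company name so that
    // characters such as '&' cannot break the query string.
    static String buildQuoteUrl(String servletUrl, String company) {
        try {
            return servletUrl + "?company=" + URLEncoder.encode(company, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Sends the GET request and returns the raw bytes of the response body.
    static byte[] fetch(String url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("GET");
        try (InputStream in = con.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: servlet URL, args[1]: company name (assumed command line)
        byte[] xml = fetch(buildQuoteUrl(args[0], args[1]));
        System.out.write(xml);
        System.out.flush();
    }
}
```

The response bytes are written out unchanged, so the client does not have to guess the character encoding; a real client would more likely hand the stream to an XML processor.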
The StockQuoteServlet.java servlet, shown in Listing 10.1, is an example implementation of the stock quote service running as a servlet.

Listing 10.1 A servlet for a stock quote service, chap10/stockquote/StockQuoteServlet.java
package chap10.stockquote;
import java.io.Writer;
import java.io.IOException;
import java.io.StringReader;
import java.io.OutputStreamWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import chap10.EscapeString;
public class StockQuoteServlet extends HttpServlet {
private static final String NAMESPACE_URI =
"http://www.example.com/xmlbook2/chap10/stockquote";
SAXParserFactory factory;
public void init() throws ServletException {
// Creates an instance of SAXParserFactory
factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
}
public void doGet(HttpServletRequest req,
HttpServletResponse res)
throws ServletException, IOException
{
// Gets the parameter named "company"
String company = req.getParameter("company");
// Escapes the string to prevent cross-site scripting attack
company = EscapeString.escape(company);
// Gets the stock price of the company
int price = getPrice(company);
// Creates a response XML document from a template
String xml =
("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<StockQuote" +
" company=\"" + company + "\"" +
" xmlns=\"" + NAMESPACE_URI + "\">" +
" <price>" + price + "</price>" +
"</StockQuote>");
// Checks the response XML is well-formed
// by calling the parser
try {
SAXParser parser = factory.newSAXParser();
InputSource input =
new InputSource(new StringReader(xml));
parser.parse(input, new DefaultHandler());
} catch (ParserConfigurationException e) {
throw new ServletException(e);
} catch (SAXException e) {
throw new ServletException(e);
}
// Sets the Content-Type header
res.setContentType("application/xml; charset=utf-8");
// Creates a writer with the encoding parameter as "UTF-8"
Writer out = new OutputStreamWriter(res.getOutputStream(),
"UTF-8");
// Sends the response XML to the client
out.write(xml);
out.flush();
}
int getPrice(String company) {
// Pseudo implementation of getPrice
byte[] data = (company == null ? "" : company).getBytes();
int price = 0;
for (int i = 0; i < data.length; i++)
price += data[i] & 0xff;
price = 150 - price % 100;
return price;
}
}
Note that we will describe the cross-site scripting attack, which is mentioned in the comment, later in this section. Let's run StockQuoteClient, a standalone Java client for StockQuoteServlet, with the following command:

R:\samples>java chap10.stockquote.StockQuoteClient http://demohost:8080/xmlbook2/chap10/StockQuoteServlet?company=IBM

Listing 10.2 shows the result. The actual output may not contain line breaks, but we inserted them for readability.

Listing 10.2 The result of StockQuoteServlet
<?xml version="1.0" encoding="UTF-8" ?>
<StockQuote company="IBM"
xmlns="http://www.example.com/xmlbook2/chap10/stockquote">
<price>134</price>
</StockQuote>

This is an example of a one-way service that is implemented as a servlet. Regarding how a servlet works on the server side, see The Mechanism of a Servlet Container. Let's see the details of the program.

The Mechanism of a Servlet Container

A servlet container manages servlets on one or a few Java VM processes running on a server machine (see Figure 10.2).

Figure 10.2. The mechanism of a servlet container
Once the Java VM process starts, it lasts as long as the server is working. Thus, servlets do not have the overhead of CGI, which starts a new process for every HTTP request. An HTTP server and a servlet container are connected by a container-specific interface or protocol. A servlet container dispatches a thread to handle an HTTP request. Then a servlet is called as an application program to handle the request. The overhead to start a thread is less than that of a process. Also, a servlet container stores instances of servlets and threads into a pool and reuses them whenever needed.

Program Details

Let's take a closer look at StockQuoteServlet.java. A servlet is defined as a subclass of the HttpServlet class.
public class StockQuoteServlet extends HttpServlet {
A servlet's init() method is called only once, when the instance of the servlet is initialized. The developer of a servlet can put any one-time setup code in this method, for example, creating a database connection for later use. As for the lifecycle of a servlet, see Lifecycle of a Servlet. In this servlet, the init() method creates an instance of the SAXParserFactory class, which is used later to create SAXParser objects. The coding is not very different from that of standalone Java applications.
SAXParserFactory factory;
public void init() throws ServletException {
// Creates an instance of SAXParserFactory
factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
}
The doGet() method is called whenever an HTTP GET request arrives. There are two parameters for this method: the HttpServletRequest object and the HttpServletResponse object. These are abstractions of HTTP request and response. For example, the HttpServletRequest object has a getReader() method to get a Reader object from which a servlet reads the content of an HTTP request.
public void doGet(HttpServletRequest req, HttpServletResponse res)
throws ServletException, IOException
{
...
}
A servlet provides a doPost() method to handle HTTP POST requests as well. If you want to write a servlet that can handle both GET and POST requests, you can provide a common method called by both the doGet() and doPost() methods, or override the service() method. Note that the service() method calls the doGet() and doPost() methods according to the type of HTTP request as the default behavior. Let's see the details of the doGet() method of StockQuoteServlet. First, the servlet gets the value of the parameter named company. This value represents the name of a company, such as IBM, to get its stock price. Then, the value is escaped as an XML string to prevent cross-site scripting attacks. Finally, the servlet gets a stock price for the specified company by calling the getPrice() method.
// Gets the parameter named "company"
String company = req.getParameter("company");
// Escapes the string to prevent cross-site scripting attack
company = EscapeString.escape(company);
// Gets a stock price of the company
int price = getPrice(company);
In a real application, the getPrice() method should have the code to get a stock price online, for example, by connecting to the backend database system. In this sample, however, the method returns a pseudo stock price.
int getPrice(String company) {
// Pseudo implementation of getPrice
byte[] data = (company == null ? "" : company).getBytes();
int price = 0;
for (int i = 0; i < data.length; i++)
price += data[i] & 0xff;
price = 150 - price % 100;
return price;
}
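As a quick check of the pseudo implementation, the arithmetic can be run outside the servlet. The sketch below copies the same computation into a standalone class (the class name PseudoPriceDemo is an assumption) and works through the company name IBM, which gives the price shown in Listing 10.2.

```java
// Standalone copy of the pseudo price arithmetic; the class name is hypothetical.
final class PseudoPriceDemo {

    // Same computation as getPrice() in Listing 10.1:
    // sum the bytes of the name, then map the sum into the range 51..150.
    static int getPrice(String company) {
        byte[] data = (company == null ? "" : company).getBytes();
        int sum = 0;
        for (byte b : data) {
            sum += b & 0xff;
        }
        return 150 - sum % 100;
    }

    public static void main(String[] args) {
        // 'I' = 73, 'B' = 66, 'M' = 77, so the sum is 216;
        // 216 % 100 = 16 and 150 - 16 = 134.
        System.out.println(getPrice("IBM"));  // prints 134
        // A missing name is treated as an empty string: 150 - 0 = 150.
        System.out.println(getPrice(null));   // prints 150
    }
}
```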
Next, the servlet creates a response XML document by embedding the values into a template XML document.
// Creates a response XML document from a template
String xml =
("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<StockQuote" +
" company=\"" + company + "\"" +
" xmlns=\"" + NAMESPACE_URI + "\">" +
" <price>" + price + "</price>" +
"</StockQuote>");
Before sending back the XML document as an HTTP response, we make sure that the XML document is well-formed according to the discussion in Chapter 3, Section 3.4.2. We use a SAXParser object for the check. If the document is not well-formed, the SAXParser object throws a SAXException exception. Thus, the servlet can detect that the generated XML document is wrong. The servlet rethrows the exception caused by parsing, enclosing it with a ServletException exception, and sends back an error report to the client. Note that a more appropriate way of reporting the error to the client is to use an XML document that represents the error. It may increase the opportunity that the machine client can handle the error automatically. However, we do not adopt this idea here to keep the example as simple as possible.
// Checks the response XML is well-formed
// by calling a parser
try {
SAXParser parser = factory.newSAXParser();
InputSource input =
new InputSource(new StringReader(xml));
parser.parse(input, new DefaultHandler());
} catch (ParserConfigurationException e) {
throw new ServletException(e);
} catch (SAXException e) {
throw new ServletException(e);
}
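The same well-formedness check can be tried outside a servlet container. The following is a minimal sketch, assuming a hypothetical helper class WellFormedCheck that returns a boolean instead of rethrowing the exception. It also shows why the check is useful for template-based generation: an unescaped "<" inside the company attribute makes the document ill-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; the servlet in Listing 10.1 rethrows instead of returning false.
final class WellFormedCheck {

    // Returns true if the parser accepts the string as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // The parser reports a well-formedness violation as a SAXException
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string with default settings
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<StockQuote company=\"IBM\"><price>134</price></StockQuote>";
        // "<" is not allowed inside an attribute value
        String bad = "<StockQuote company=\"A<B\"><price>134</price></StockQuote>";
        System.out.println(isWellFormed(good)); // prints true
        System.out.println(isWellFormed(bad));  // prints false
    }
}
```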
If the XML document is well-formed, the servlet sets the HTTP Content-Type header as application/xml; charset=utf-8 and then creates a writer object by specifying the encoding parameter as UTF-8. Finally, the servlet sends the XML document back to the client as an HTTP response.
// Sets the Content-Type header
res.setContentType("application/xml; charset=utf-8");
// Creates a writer with the encoding parameter as "UTF-8"
Writer out = new OutputStreamWriter(res.getOutputStream(),
"UTF-8");
// Sends the response XML to the client
out.write(xml);
out.flush();
Now that we have shown the details of StockQuoteServlet, we hope you understand how easy it is to write such a one-way service. Next let's look at cross-site scripting, one of the typical security vulnerabilities in Web applications and one that most developers are not aware of.

Lifecycle of a Servlet

A servlet is generated, executed, and destroyed according to the following lifecycle.
Deployment of Servlets

Servlets should be deployed to a servlet container before they are in use. A file called a deployment descriptor is used to give the servlet container the information about how to deploy the servlets. A deployment descriptor defines a Web application, which is a set of servlets and JSPs that are typically related to each other. Servlets and JSPs in the same Web application are deployed under the same URL. Note that this Web application is not a generic term used in other parts of this book. The Web application used here is Servlet-specific terminology. We use italic for the Servlet-specific Web application to make a clear distinction. A deployment descriptor consists of a servlet name (which corresponds to a servlet definition), a class name for the servlet, init parameters, and mapping to URLs. A mapping to URLs is given by a URL pattern, which is either an exact path (such as /snoop) or a simple wildcard pattern (such as *.xml); it is not a general regular expression. Here is an example deployment descriptor:
This deployment descriptor indicates that a servlet named snoop is created from the chap10.SnoopServlet class, and the corresponding Web application's URL followed by /snoop is mapped to this servlet. In the same manner, the Web application's URL followed by any string ending with .xml is mapped to the servlet named hello, for example, http://www.example.com/xmlbook2/chap10/index.xml, where http://www.example.com/xmlbook2/chap10/ is the Web application's URL.

Countermeasure for Cross-Site Scripting

Listing 10.1 contained a comment about cross-site scripting. Cross-site scripting is a security vulnerability of Web applications in which a script embedded in a parameter of an HTTP request ends up in the response. The CrossSiteScriptingServlet.java servlet, shown in Listing 10.3, is an example of such a vulnerability. Because the CrossSiteScriptingServlet class has the cross-site scripting vulnerability, the servlet example is disabled on the accompanying CD-ROM by default. Although you can try the example by enabling the definition of the servlet CrossSiteScriptingServlet in web.xml, you must not put the servlet on a public Web server.

Listing 10.3 A servlet that has a cross-site scripting vulnerability, chap10/CrossSiteScriptingServlet.java
package chap10;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
/**
* *** WARNING ***
* Care should be taken when you deploy this servlet on your server.
* This servlet demonstrates the cross-site scripting attack,
* therefore it has a security vulnerability by nature.
*/
public class CrossSiteScriptingServlet extends HttpServlet {
public void doGet(HttpServletRequest req,
HttpServletResponse res)
throws ServletException, IOException
{
// Gets the parameter named "parameter"
String parameter = req.getParameter("parameter");
// Creates a response HTML document from a template
String html =
("<!DOCTYPE html PUBLIC " +
"\"-//W3C//DTD HTML 4.01//EN\">" +
"<HTML lang=\"en\">" +
"<HEAD>"+
"<TITLE>An example of cross-site" +
" scripting vulnerability</TITLE>"+
"<META http-equiv=\"Content-Type\"" +
// Leading space separates the content attribute from http-equiv
" content=\"text/html; charset=us-ascii\">" +
"</HEAD>" +
"<BODY><P>" +
// Includes the parameter here
parameter +
"</P></BODY>" +
"</HTML>");
// Sends the response HTML to the client
res.setContentType("text/html; charset=utf-8");
res.getWriter().print(html);
}
}
The servlet CrossSiteScriptingServlet embeds the value of parameter, obtained from a GET request, into a response HTML document as is. The developer of this servlet assumes the following URL is used to access the servlet. In this case, there is no problem because the string Hello! is just shown on the Web browser.

http://demohost:8080/xmlbook2/chap10/CrossSiteScriptingServlet?parameter=Hello!

However, what happens if someone accesses the servlet with the following URL and the Web browser supports scripting through the SCRIPT tag (for example, JavaScript)? Note that the original URL is just one line but is wrapped for printing.
http://demohost:8080/xmlbook2/chap10/CrossSiteScriptingServlet
?parameter=<SCRIPT%20language="JavaScript">alert('Hello!')</SCRIPT>
The Web browser will pop up a window with the message Hello! This is the result of executing the following script. Needless to say, the script originally came from the HTTP request and was embedded in the response HTML document.
<SCRIPT language="JavaScript">alert('Hello!')</SCRIPT>
You may wonder why this is a problem because in this case, a script embedded by a user is just executed on the user's Web browser. It seems to be the user's responsibility. The essence of the cross-site scripting problem is that the script is executed in the context of the page you are browsing. Let us assume that a malicious attacker publishes a Web page with a URL link to a Web site that you trust and the URL contains a malicious script. If you click the URL link on the malicious Web page, the script is executed in the context of the Web page you trust. Such a malicious script has the privilege to access the Web browser's cookies, which should be accessed only by the Web server that provides the Web page. Therefore, if a cookie has critical information from a security point of view, an attacker may steal the information. On many shopping Web sites, once a user logs on to the server, the session information is stored in a cookie. A malicious attacker can capture the session and steal the user's personal information. This type of security vulnerability is called cross-site scripting, a serious attack for end users.

You might think that the cross-site scripting attack does not affect the servlet described in this section because the servlet does not process any HTML documents, only XML documents. However, the XML documents created by the servlet can be embedded as a part of other HTML documents generated by other servlets or JSPs. Also, some Web browsers can execute the scripts in the XML documents without checking that the Content-Type header indicates not an HTML document but an XML document. Therefore, we show how to prevent a cross-site scripting attack.

One of the simplest ways to prevent a cross-site scripting attack is to escape the characters having a special meaning in HTML and XML, like "<" in parameter values. Listing 10.4 shows EscapeString.java, which escapes the characters.

Listing 10.4 Escaping special characters, chap10/EscapeString.java
package chap10;
final public class EscapeString {
private EscapeString() {}
public static String escape(String string) {
StringBuffer buf = new StringBuffer();
// A missing request parameter arrives as null; treat it as empty
if (string == null) return "";
int length = string.length();
for (int i = 0; i < length; i++) {
char ch;
switch (ch = string.charAt(i)) {
case '<':
buf.append("&lt;");
break;
case '>':
buf.append("&gt;");
break;
case '"':
buf.append("&quot;");
break;
case '\'':
// Numeric reference; understood by both HTML and XML processors
buf.append("&#39;");
break;
case '&':
buf.append("&amp;");
break;
default:
buf.append(ch);
break;
}
}
return buf.toString();
}
}
You can see that this code replaces each special character with an equivalent character or entity reference. For example, "<" is replaced by "&lt;". Because of this replacement, a Web browser receiving an HTML or XML document treats the malicious script as a string that is embedded in the enclosing element instead of executing it as a script. We can improve CrossSiteScriptingServlet by replacing the code to get parameters with the following code:
// Gets the parameter named "parameter", escaping the characters
String parameter = EscapeString.escape(req.getParameter("parameter"));
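The effect of escaping on the attack string shown earlier can be checked with a small standalone program. This is a sketch that re-implements the replacement in a hypothetical class named EscapeDemo; the choice of the numeric reference for the apostrophe is an assumption of the sketch.

```java
// Hypothetical standalone demo of escaping the five special characters.
final class EscapeDemo {

    // Replaces characters that have a special meaning in HTML and XML.
    static String escape(String s) {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '<':  buf.append("&lt;");   break;
                case '>':  buf.append("&gt;");   break;
                case '"':  buf.append("&quot;"); break;
                case '\'': buf.append("&#39;");  break; // assumption: numeric reference
                case '&':  buf.append("&amp;");  break;
                default:   buf.append(ch);       break;
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        String attack = "<SCRIPT language=\"JavaScript\">alert('Hello!')</SCRIPT>";
        // The browser now receives text, not a SCRIPT element
        System.out.println("<P>" + escape(attack) + "</P>");
    }
}
```

With the escaped value, the only markup left in the paragraph is the P element written by the servlet itself, so the browser displays the script as text.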
Even though the way to prevent the cross-site scripting vulnerability is quite easy, a lot of servlets, JSPs, and CGIs in the world still have this vulnerability. Application developers need to be more careful about the cross-site scripting attack in programming.

10.2.2 Receiving XML Documents

In this section, we describe a request-and-response service that receives XML documents. Specifically, the service receives an XML document in an HTTP POST request, processes it, and returns an XML document as its response. We introduce this service in this section because it should be a natural extension of the one-way service described in Section 10.2.1. In general, an XML document is embedded in the payload of an HTTP POST request and is transferred to others. Alternatively, an XML document can be sent as a parameter of an HTML form. However, the former approach is better than the latter because a Multipurpose Internet Mail Extensions (MIME) media type can be written in a Content-Type header according to the type of the document (in this case, the type should be application/xml or something appropriate). This approach improves interoperability between a Web client and a servlet.

A Bookstore Service

We use an online bookstore service like Amazon.com as an example in this section. Unlike Amazon.com, we assume this is an example of B2B instead of B2C; therefore, we do not use a Web browser as a Web client that sends a purchase order to this bookstore service. We use a Java program instead. The BookStoreServlet class is a servlet implementation of the bookstore service. It provides a virtual shopping cart so that a client can order books interactively. We also provide the BookStoreClient class as a client for the servlet. The BookStoreServlet and BookStoreClient objects communicate with each other to order books, as shown in Figure 10.3.

Figure 10.3. BookStoreServlet and BookStoreClient in the bookstore service
Listings 10.5 and 10.6 are the XML documents processed by the servlet BookStoreServlet.

Listing 10.5 XML document, chap10/data/addItem.xml
<?xml version="1.0" encoding="UTF-8"?>
<addItem
xmlns="http://www.example.com/xmlbook2/chap10/bookstore/BookStore">
<item id="0201123456" count="3"/>
<item id="0209876543" count="2"/>
</addItem>
Listing 10.6 XML document, chap10/data/order.xml
<?xml version="1.0" encoding="UTF-8"?>
<order
xmlns="http://www.example.com/xmlbook2/chap10/bookstore/BookStore"
/>
The addItem element is used to add a book order request to a shopping cart. Its child element, item, contains a book's ISBN and the number of books to order. The order element is used to issue an actual order for the books in the shopping cart. Listing 10.7 shows the result of calling the servlet BookStoreServlet by running the BookStoreClient class. The first argument of the BookStoreClient class indicates the URL of the servlet BookStoreServlet, the second argument indicates the URL of the addItem.xml file, and the third argument indicates the URL of the order.xml file.

Listing 10.7 The result of running the BookStoreClient class
R:\samples>java chap10.bookstore.BookStoreClient http://demohost:8080/xmlbook2/chap10/bookstore/BookStoreServlet data\addItem.xml data\order.xml
REQUEST: data\addItem.xml
<?xml version="1.0" encoding="UTF-8"?>
<addItem
xmlns="http://www.example.com/xmlbook2/chap10/bookstore/BookStore">
<item id="0201123456" count="3"/>
<item id="0209876543" count="2"/>
</addItem>
The client will send the above request.
Hit enter to proceed==>
RESPONSE:
<?xml version="1.0" encoding="UTF-8"?>
<addItemResponse xmlns="http://www.example.com/xmlbook2/chap10/bookstore/BookStore">
<item count="3" id="0201123456"/>
<item count="2" id="0209876543"/>
</addItemResponse>
The client has received the above response
Hit enter to proceed==>
REQUEST: data\order.xml
<?xml version="1.0" encoding="UTF-8"?>
<order
xmlns="http://www.example.com/xmlbook2/chap10/bookstore/BookStore"
/>
The client will send the above request.
Hit enter to proceed==>
RESPONSE:
<?xml version="1.0" encoding="UTF-8"?>
<orderResponse xmlns="http://www.example.com/xmlbook2/chap10/bookstore/BookStore">
<item count="2" id="0209876543"/>
<item count="3" id="0201123456"/>
</orderResponse>
The client has received the above response
Hit enter to proceed==>

Two kinds of book orders are added to the shopping cart by the addItem request. They are actually ordered by the order request.
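To make the structure of the addItem request concrete, the following sketch reads the item elements with a namespace-aware DOM parser and collects them into a map from ISBN to count. It is not taken from BookStoreServlet; the class name AddItemReader, the method readItems(), and the use of a LinkedHashMap as a stand-in for the shopping cart are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical reader for the addItem request; not the book's BookStoreServlet.
final class AddItemReader {

    static final String NS =
        "http://www.example.com/xmlbook2/chap10/bookstore/BookStore";

    // Returns a map from ISBN (the id attribute) to the requested count.
    static Map<String, Integer> readItems(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagNameNS(NS, "item");
            Map<String, Integer> cart = new LinkedHashMap<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                int count = Integer.parseInt(item.getAttribute("count"));
                // Repeated ISBNs are added up rather than overwritten
                cart.merge(item.getAttribute("id"), count, Integer::sum);
            }
            return cart;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot read addItem request", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
            "<addItem xmlns=\"" + NS + "\">" +
            "<item id=\"0201123456\" count=\"3\"/>" +
            "<item id=\"0209876543\" count=\"2\"/>" +
            "</addItem>";
        System.out.println(readItems(xml)); // prints {0201123456=3, 0209876543=2}
    }
}
```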
Here are three major functions to implement BookStoreServlet:
The second function has already been described in Section 10.2.1. In the rest of this section, we describe how to implement the first and third functions.

Receiving Request XML Documents from Clients

You may wonder why receiving request XML documents is so difficult that it is worth describing here. It may seem that you could just feed an input stream object, which can be obtained by calling the HttpServletRequest#getReader() method, to an XML processor. Actually, such a simple way of reading XML documents works in most applications. The problem, however, is not as simple as we might expect when we consider handling international character sets. In more general terms, we need to consider the interoperability issues. In the rest of this section, we describe a general way of maintaining interoperability, providing a set of generic libraries to receive XML documents contained in HTTP requests in the standard way prescribed in Requests For Comments (RFCs).

We need to handle an HTTP Content-Type header appropriately to receive an XML document from a Web client. Otherwise, we cannot ensure the interoperability between a Web client and a servlet. This process is more difficult than you would expect if you want to be exact. The HTTP Content-Type header is specified in the MIME specification. RFC 3023: XML Media Types specifies media types for XML documents.[4] We describe the right way to handle media types and charset parameters compliant with RFC 3023.
We overview what RFC 3023 specifies just to help you understand it. See the RFC for the exact and detailed specification.
A charset can be specified in an HTTP Content-Type header as well as an encoding declaration in an XML document in the HTTP payload. You may feel that it is redundant or confusing. RFC 3023 shows how to resolve charsets when both of them or one of them is specified. We use the following examples to describe how to resolve charsets.
When we implement a servlet that handles media types compliant with the RFC 3023 specification, we need to consider how to create a SAX InputSource object from an InputStream of an HTTP request and pass it to the XML processor. XmlMimeEntityHandler.java, shown in Listing 10.8, is a generic library to handle the processing described earlier. The getInputSource() method creates an InputSource object from an InputStream object according to a Content-Type header in the HTTP POST request received by the servlet. Also, the method confirms that the media type in a Content-Type header is an XML media type.

Listing 10.8 A library class that handles XML MIME entities, chap10/XmlMimeEntityHandler.java
package chap10;
import java.io.Reader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import javax.mail.internet.ContentType;
import javax.mail.internet.ParseException;
import org.xml.sax.InputSource;
final public class XmlMimeEntityHandler {
private XmlMimeEntityHandler() {}
public static InputSource getInputSource(String ctype,
InputStream in)
throws XmlMimeEntityException
{
// Creates ContentType
ContentType contentType = null;
try {
[21] contentType = new ContentType(ctype);
} catch (ParseException e) {
throw new XmlMimeEntityException(e.getMessage());
}
[26] // Checks primary type
[27] String primaryType = contentType.getPrimaryType();
[28] if (!"text".equals(primaryType) &&
[29] !"application".equals(primaryType))
[30] throw new XmlMimeEntityException(ctype);
[31]
[32] // Checks sub type
[33] String subType = contentType.getSubType();
[34] if (!"xml".equals(subType) && !subType.endsWith("+xml"))
[35] throw new XmlMimeEntityException(ctype);
// Gets charset parameter
String charset = contentType.getParameter("charset");
[39] if (charset == null) { // no charset
[40] // MIME type "text/*" omitted charset should be treated
[41] // as us-ascii
[42] if ("text".equals(contentType.getPrimaryType()))
[43] charset = "us-ascii";
[44] }
InputSource input;
if (charset == null) { // application/xml omitted charset
[48] input = new InputSource(in);
} else {
// Creates a reader with the Java charset
Reader reader = null;
try {
[53] reader = new InputStreamReader(in, charset);
} catch (UnsupportedEncodingException e) {
throw new XmlMimeEntityException(e.getMessage());
}
[57] input = new InputSource(reader);
}
return input;
}
}
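The charset decision made by getInputSource() can be isolated from JavaMail and from the servlet API. The sketch below is a simplified stand-in, not the book's library: the class name XmlCharsetSketch and the method resolveCharset() are assumptions, parameters are split naively on ";" (quoted values containing ";" are not handled), and, unlike Listing 10.8, the media type is compared case-insensitively. It returns the charset to decode with, or null when the decision should be left to the XML processor.

```java
import java.util.Locale;

// Simplified, hypothetical stand-in for the charset logic in Listing 10.8.
final class XmlCharsetSketch {

    // Returns the charset named by the Content-Type value, "us-ascii" for
    // text/* without a charset, or null for application/* without a charset
    // (the XML processor then uses the encoding declaration).
    // Throws IllegalArgumentException if the media type is not an XML type.
    static String resolveCharset(String contentType) {
        if (contentType == null) {
            throw new IllegalArgumentException("missing Content-Type");
        }
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);
        int slash = mediaType.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException(contentType);
        }
        String primary = mediaType.substring(0, slash);
        String sub = mediaType.substring(slash + 1);
        if (!primary.equals("text") && !primary.equals("application")) {
            throw new IllegalArgumentException(contentType);
        }
        if (!sub.equals("xml") && !sub.endsWith("+xml")) {
            throw new IllegalArgumentException(contentType);
        }
        // Looks for an explicit charset parameter
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // strip quotes
                }
                return value;
            }
        }
        return primary.equals("text") ? "us-ascii" : null;
    }

    public static void main(String[] args) {
        System.out.println(resolveCharset("text/xml"));                       // prints us-ascii
        System.out.println(resolveCharset("application/xml"));                // prints null
        System.out.println(resolveCharset("application/xml; charset=utf-8")); // prints utf-8
    }
}
```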
First, the method creates an instance of the ContentType class from a given Content-Type header value (line 21). The ContentType class, which is part of the JavaMail API, is an abstraction of the value of the Content-Type header. Then, the method checks the ContentType object (lines 26-35). At this moment, if the charset parameter is omitted and the primary type is text, us-ascii is used as the charset (lines 39-44). If the charset still cannot be determined, an InputSource object is created directly from the InputStream object (line 48). In this case, the XML processor determines the charset according to the encoding declaration in the XML document. If the charset is determined, an InputStreamReader object is created with that charset (line 53), and then an InputSource object is created from the InputStreamReader object (line 57). In this case, the XML processor handles the input as a Unicode string. Before looking at how a servlet invokes the XmlMimeEntityHandler.getInputSource() method, let's see GenericDOMServlet.java, shown in Listing 10.9.

Listing 10.9 A generic servlet that handles an input XML document as a DOM tree, chap10/GenericDOMServlet.java
package chap10;
import java.io.InputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import chap10.XmlMimeEntityHandler;
import chap10.XmlMimeEntityException;
public abstract class GenericDOMServlet extends HttpServlet {
static final String NAMESPACE_URI =
"http://www.example.com/xmlbook2/chap10/GenericDOMServlet";
DocumentBuilderFactory factory;
public void init() throws ServletException {
factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
}
public void doPost(HttpServletRequest req,
HttpServletResponse res)
throws ServletException, IOException
{
DocumentBuilder parser = newDocumentBuilder();
Document resDoc;
try {
// Gets Content-Type header
String ctypeValue = req.getContentType();
// Gets an input source
InputStream in = req.getInputStream();
InputSource input =
XmlMimeEntityHandler.getInputSource(ctypeValue, in);
// Parses the input here
Document reqDoc = parse(input);
// Creates an output document
resDoc = doProcess(req, res, reqDoc);
} catch (XmlMimeEntityException e) {
e.printStackTrace();
// Creates an output document
resDoc = newDocument();
Element root =
resDoc.createElementNS(NAMESPACE_URI, "error");
root.setAttribute("xmlns", NAMESPACE_URI);
String name = e.getClass().getName();
String message = e.getMessage();
Node node =
resDoc.createTextNode(name + ": " + message);
root.appendChild(node);
resDoc.appendChild(root);
}
if (resDoc != null) {
// Sets the Content-Type header
res.setContentType("application/xml; charset=utf-8");
// Serializes the DOM into bytes,
// and then sends it back to the client
OutputFormat formatter = new OutputFormat();
formatter.setPreserveSpace(true);
XMLSerializer serializer =
new XMLSerializer(res.getOutputStream(), formatter);
serializer.serialize(resDoc);
}
}
protected DocumentBuilder newDocumentBuilder()
throws ServletException
{
try {
return factory.newDocumentBuilder();
} catch (ParserConfigurationException e) {
throw new ServletException(e);
}
}
protected Document newDocument()
throws ServletException
{
DocumentBuilder parser = newDocumentBuilder();
return parser.newDocument();
}
protected Document parse(InputSource in)
throws ServletException, IOException
{
try {
DocumentBuilder parser = newDocumentBuilder();
return parser.parse(in);
} catch (SAXException e) {
throw new ServletException(e);
}
}
public abstract Document doProcess(HttpServletRequest req,
HttpServletResponse res,
Document reqDoc)
throws ServletException, IOException;
}
GenericDOMServlet is a generic servlet that receives XML documents from HTTP requests correctly by calling the XmlMimeEntityHandler.getInputSource() method. This servlet abstracts the application logic by defining an abstract method, doProcess(). An application of this servlet, such as BookStoreServlet, is expected to extend this servlet and implement the doProcess() method. (The source code for BookStoreServlet is shown later in this section.)
After parsing the received XML document in the doPost() method, GenericDOMServlet calls the doProcess() method of its subclass. The doProcess() method takes three arguments: HttpServletRequest, HttpServletResponse, and Document. The first and second arguments are the same as the arguments for the doPost() method. The third argument, Document, is the result of parsing the input XML document, which can be accessed as a DOM object. The return value of the doProcess() method is also a Document, which is sent back to the client as the response from the application.
Let's take a closer look at the code fragment that processes a received XML document in the doPost() method of GenericDOMServlet.
// Gets Content-Type header
String contentTypeValue = req.getContentType();
// Gets an input source
InputStream in = req.getInputStream();
InputSource input =
XmlMimeEntityHandler.getInputSource(contentTypeValue, in);
// Parses the input here
Document reqDoc = parser.parse(input);
First, the servlet gets the value of the Content-Type header by calling the HttpServletRequest#getContentType() method. Second, it gets an InputStream object by calling the HttpServletRequest#getInputStream() method to read raw bytes from the stream. Note that we do not obtain a Reader object from the HttpServletRequest#getReader() method because the encoding selected for that reader depends on the implementation of the servlet container. Third, it creates an InputSource object to be passed to the XML processor by calling the XmlMimeEntityHandler.getInputSource() method. Finally, it creates a DOM tree by passing the InputSource object to the XML processor. In this way, a subclass of this servlet can obtain a node value as a Unicode string from the DOM tree; that is, the servlet can accept multilanguage XML documents.
Processing a Shopping Cart
To implement a shopping cart, the servlet needs to hold a state across more than one HTTP request. The servlet can use the HttpSession class for that purpose. The HttpSession class is an API to manage HTTP sessions, which are typically implemented by using cookies. Figure 10.4 shows the mechanism to manage an HTTP session in Tomcat.
Figure 10.4. A servlet session using cookies
Tomcat adds the Set-Cookie header to its HTTP response to the first HTTP request from a client. Set-Cookie is a directive for the client to set the Cookie header in subsequent HTTP requests. Tomcat also adds Set-Cookie2, which is an extension of Set-Cookie. Set-Cookie contains a value of JSESSIONID, which uniquely identifies the HttpSession object managed by Tomcat on the server side. The client sends the received value back as the Cookie header in subsequent HTTP requests. When Tomcat receives an HTTP request, it gets the value of JSESSIONID from the Cookie header and retrieves the HttpSession object associated with that JSESSIONID from its local storage or memory. An HttpSession object can hold one or more objects (for example, a shopping cart object) as its attributes by using the HttpSession#setAttribute() method. This mechanism allows the servlet to use the same shopping cart object in subsequent HTTP requests.
Before looking at how the servlet accesses the HttpSession object, let's see the source code for BookStoreServlet.java, shown in Listing 10.10.
Listing 10.10 A servlet class for the bookstore service, chap10/bookstore/BookStoreServlet.java
package chap10.bookstore;
import java.io.IOException;
import java.util.Iterator;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import chap10.GenericDOMServlet;
public final class BookStoreServlet extends GenericDOMServlet
implements BookStoreApplication
{
public Document doProcess(HttpServletRequest req,
HttpServletResponse res,
Document reqDoc)
throws ServletException, IOException
{
// Gets a shopping cart from a session if it already exists.
// If it does not exist, a shopping cart is newly created.
ShoppingCartBean shoppingCart;
HttpSession session = req.getSession();
synchronized (session) {
shoppingCart = (ShoppingCartBean)
[29] session.getAttribute(SHOPPING_CART_BEAN);
[30] if (shoppingCart == null) {
[31] shoppingCart = new ShoppingCartBean();
[32] session.setAttribute(SHOPPING_CART_BEAN,
[33] shoppingCart);
}
}
[37] // Gets the root element of reqDoc
[38] Element reqRoot = reqDoc.getDocumentElement();
[39] // Gets the local name of the root element
[40] String localName = reqRoot.getLocalName();
// Prepares a DOM object for the response
Document resDoc = newDocument();
[45] // If the input XML document is "addItem"
[46] if ("addItem".equals(localName)) {
[47] Element resRoot =
[48] resDoc.createElementNS(NAMESPACE_URI,
[49] localName + "Response");
[50] resRoot.setAttribute("xmlns", NAMESPACE_URI);
[51]
[52] // For each "item" element in the request
[53] NodeList list =
[54] reqRoot.getElementsByTagNameNS(NAMESPACE_URI,
[55] "item");
[56] int length = list.getLength();
[57] for (int i = 0; i < length; i++) {
[58] Element elemItem = (Element)list.item(i);
[59] ItemBean item = new ItemBean(elemItem);
[60] // Adds the item into the shopping cart
[61] shoppingCart.addItem(item);
[62] Node node = resDoc.importNode(elemItem, true);
[63] resRoot.appendChild(node);
[64] }
[65] resDoc.appendChild(resRoot);
[66] return resDoc;
[67] }
[68] // If the input XML document is "order"
[69] if ("order".equals(localName)) {
[70] Element resRoot =
[71] resDoc.createElementNS(NAMESPACE_URI,
[72] localName + "Response");
[73] resRoot.setAttribute("xmlns", NAMESPACE_URI);
[74]
[75] // For each item in the shopping cart
[76] Iterator iterator = shoppingCart.getItems();
[77] while (iterator.hasNext()) {
[78] ItemBean item = (ItemBean)iterator.next();
[79] //
[80] // Orders the items here
[81] //
[82] Element elemItem =
[83] resDoc.createElementNS(NAMESPACE_URI, "item");
[84] elemItem.setAttribute("id", item.getId());
[85] elemItem.setAttribute("count", ""+item.getCount());
[86] resRoot.appendChild(elemItem);
[87] }
[88] resDoc.appendChild(resRoot);
[89] return resDoc;
[90] }
throw new ServletException("Unknown request: " + localName);
}
}
As we stated earlier, BookStoreServlet is a subclass of GenericDOMServlet and implements the doProcess() method, which is an abstract method of GenericDOMServlet. The servlet gets an HttpSession object by calling the HttpServletRequest#getSession() method.
HttpSession session = req.getSession();
The handler first gets a shopping cart object from the HttpSession object by calling its getAttribute() method (line 29). At the first HTTP request, the shopping cart object does not exist yet, so it is newly created and stored in the HttpSession object (lines 30-33). Second, the handler gets the local part of the document element of the request DOM tree (lines 37-40). The local part is either addItem or order. Third, it handles the request according to the local name. In the case of addItem, the handler converts each item element in the request DOM tree into an ItemBean object and adds it to the shopping cart object (lines 45-67). As described earlier, the handler can access the ItemBean object in subsequent HTTP requests. In the case of order, the handler gets all the ItemBean objects in the shopping cart object and orders the items (lines 68-90). (This example does not actually order books, of course.) Finally, the handler returns the response DOM tree (lines 66 and 89).
We described how a servlet can hold a state, such as a shopping cart, across multiple HTTP requests by using the HttpSession object.
The Client Communicating with the Servlet
The BookStoreClient class is a client program that communicates with the servlet BookStoreServlet. The BookStoreClient class expects a URL for the target servlet as the first argument, and URLs for the XML files to be sent as the subsequent arguments. It waits for keyboard input from a user before and after sending HTTP requests so that it can send the requests interactively. The following code fragment shows how the client creates a BookStoreClient object in the main() method.
// Creates an HTTP client,
// using the first parameter as the target URL
String url = args[0];
BookStoreClient domClient = new BookStoreClient(url);
The following code fragment shows how the client sends a request XML document. In addition to the request document itself, the BookStoreClient#send() method takes two arguments: the media type, application/xml, and the charset parameter, utf-8. They are set as the Content-Type header in the HTTP request.
// Sends the XML document
InputSource request = new InputSource(new FileReader(args[i]));
Document resDoc =
domClient.send(request, "application/xml", "utf-8");
Let's look at the details of the BookStoreClient#send() method. First, the client creates a DOM tree from the InputSource object.
// Creates a DOM tree from the InputSource
Document reqDoc = factory.newDocumentBuilder().parse(request);
Then, the client converts the DOM tree into an octet stream encoded in the specified charset. For information about the serialization of a DOM tree, refer to Chapter 3, Section 3.4.
// Converts the reqDoc into the specified charset
OutputFormat formatter = new OutputFormat("xml", charset, false);
formatter.setPreserveSpace(true);
ByteArrayOutputStream bout = new ByteArrayOutputStream();
XMLSerializer serializer = new XMLSerializer(bout, formatter);
serializer.serialize(reqDoc);
InputStream bin = new ByteArrayInputStream(bout.toByteArray());
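OutputFormat and XMLSerializer in the fragment above are Xerces classes. Where only the standard JDK is available, a similar DOM-to-bytes conversion can be sketched with the JAXP Transformer API instead. This is a swapped-in technique rather than the code used in this chapter, and the class name DomToBytes, its methods, and the sample namespace URI are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of the chapter's source code.
public class DomToBytes {

    // Builds a tiny request document (assumed namespace and content).
    public static Document sampleDocument() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();
        Element root =
            doc.createElementNS("http://www.example.com/demo", "addItem");
        root.appendChild(doc.createTextNode("caf\u00e9"));
        doc.appendChild(root);
        return doc;
    }

    // Serializes the DOM tree into bytes encoded in the given charset.
    public static byte[] toBytes(Document doc, String charset)
        throws Exception
    {
        // An identity transformer copies the DOM to the output unchanged
        Transformer transformer =
            TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.ENCODING, charset);
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(bout));
        return bout.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Document doc = sampleDocument();
        System.out.println(toBytes(doc, "utf-8").length
                           + " bytes as utf-8");
        System.out.println(toBytes(doc, "iso-8859-1").length
                           + " bytes as iso-8859-1");
    }
}
```

The point of the sketch is the same as in the fragment above: the charset is chosen at serialization time, so the bytes written to the HTTP request agree with the charset parameter announced in the Content-Type header.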
Finally, the client sends the XML document, receives a response XML document, and converts it into a DOM tree. These steps are almost the same as those of the servlet BookStoreServlet.
// Sends the data to the servlet
HttpURLConnection con = httpClient.send(bin, mimeType, charset);
// Receives a response from the server
String contentTypeValue = con.getContentType();
InputStream in = con.getInputStream();
InputSource input =
XmlMimeEntityHandler.getInputSource(contentTypeValue, in);
// Parses the response XML, and returns it
return factory.newDocumentBuilder().parse(input);
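To see why the charset parameter of the Content-Type header matters when the response is parsed, the following sketch parses bytes that carry no encoding declaration of their own. It is not the XmlMimeEntityHandler class of this chapter: ContentTypeInputSource and its two methods are hypothetical, and the sketch assumes that honoring an explicit charset parameter is all that is needed, ignoring the media-type defaults that a complete handler would have to consider.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not the chapter's XmlMimeEntityHandler.
public class ContentTypeInputSource {

    // Returns the charset parameter of a Content-Type value,
    // or null if the header or the parameter is absent.
    public static String charsetOf(String contentType) {
        if (contentType == null)
            return null;
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = param.substring(8).trim();
                // Strips optional double quotes around the value
                if (value.length() >= 2 && value.startsWith("\"")
                    && value.endsWith("\""))
                    value = value.substring(1, value.length() - 1);
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Wraps raw bytes in an InputSource; the XML processor decodes
    // them with the charset from the header when one is given.
    public static InputSource toInputSource(String contentType,
                                            InputStream in)
    {
        InputSource source = new InputSource(in);
        String charset = charsetOf(contentType);
        if (charset != null)
            source.setEncoding(charset);
        return source;
    }

    public static void main(String[] args) throws Exception {
        // The body has no XML declaration, so only the header
        // tells the parser that it is encoded in ISO-8859-1
        byte[] body = "<title>caf\u00e9</title>".getBytes("ISO-8859-1");
        InputSource source = toInputSource(
            "application/xml; charset=ISO-8859-1",
            new ByteArrayInputStream(body));
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(source);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Passing the raw byte stream together with the declared encoding leaves the decoding to the XML processor, which is the reason both the servlet and the client read an InputStream rather than a Reader.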
In the send() method, the client uses the HttpClient object as an HTTP client. The Content-Type header in the HTTP response can be obtained by calling the HttpURLConnection#getContentType() method. Listing 10.11 shows HttpClient.java, to be used by the BookStoreClient object.
Listing 10.11 A generic HTTP client that sends HTTP POST requests, chap10/HttpClient.java
package chap10;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.BufferedOutputStream;
import java.net.URL;
import java.net.HttpURLConnection;
public final class HttpClient {
// The target URL
final String url;
// On-memory storage for cookies
Cookie cookie = null;
// A constructor
public HttpClient(String url) { this.url = url; }
public HttpURLConnection send(InputStream in,
String mimeType,
String charset)
throws IOException
{
// Creates an HTTP connection
URL objURL = new URL(url);
HttpURLConnection con =
(HttpURLConnection)objURL.openConnection();
// Sets HTTP method
con.setRequestMethod("POST");
[30] // Sets Content-Type header
[31] if (mimeType != null) {
[32] String value =
[33] mimeType +
[34] (charset == null ? "" : "; charset=" + charset);
[35] con.setRequestProperty("Content-Type", value);
[36] }
[38] // Sets the received cookie
[39] if (cookie != null)
[40] con.setRequestProperty("Cookie",
[41] cookie.getCookieValue());
con.setDoOutput(true);
// Sends a request to the server
OutputStream out =
new BufferedOutputStream(con.getOutputStream());
byte[] buf = new byte[2048];
int length;
while ((length = in.read(buf)) != -1)
out.write(buf, 0, length);
out.flush();
out.close();
[55] // Creates Cookie from Set-Cookie header
[56] String value;
[57] if ((value = con.getHeaderField("Set-Cookie2")) != null)
[58] cookie = new Cookie(value);
[59] else if ((value = con.getHeaderField("Set-Cookie")) != null)
[60] cookie = new Cookie(value);
// Returns the HTTP response
return con;
}
}
The HttpClient object sends the data as an HTTP POST request by using an HttpURLConnection object. It also adds the Content-Type header to the HTTP request according to the given media type and charset parameter (lines 30-36).
The standard Java API class HttpURLConnection does not have a function to handle cookies, which are required to implement the shopping cart. In general, cookies need to be managed carefully from a privacy and security point of view, and the HttpURLConnection class leaves cookie management to the application as a matter of policy. We therefore prepare a Cookie class that abstracts the Cookie header. This class gets the value of the Set-Cookie header and creates the value to be set in the Cookie header. We do not show the source code for the Cookie class because space is limited; refer to the full source code on the CD-ROM. The HttpClient object uses the Cookie object in lines 38-41 and 55-60 of Listing 10.11: it sends the stored cookie, if any, with each request, and it stores the cookie received in the Set-Cookie2 or Set-Cookie header of each response.
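Because the Cookie class is not listed here, the following is only a rough idea of what such a class might look like. SketchCookie is a hypothetical stand-in that mirrors the constructor and the getCookieValue() method used in Listing 10.11. It assumes that keeping the first name=value pair of the header is enough, and it ignores attributes such as Path, Version, and expiry, as well as any privacy policy.

```java
// Hypothetical stand-in for the chapter's Cookie class.
public class SketchCookie {
    // The "name=value" pair to send back in the Cookie header
    private final String nameValue;

    // Takes the value of a Set-Cookie (or Set-Cookie2) header, e.g.
    // "JSESSIONID=0A1B2C3D; Path=/xmlbook2"
    public SketchCookie(String setCookieValue) {
        // Attributes follow the first semicolon; they are dropped here
        int semicolon = setCookieValue.indexOf(';');
        String first = (semicolon < 0)
            ? setCookieValue
            : setCookieValue.substring(0, semicolon);
        this.nameValue = first.trim();
    }

    // Returns the value to be set in the Cookie request header
    public String getCookieValue() {
        return nameValue;
    }

    public static void main(String[] args) {
        SketchCookie cookie =
            new SketchCookie("JSESSIONID=0A1B2C3D; Path=/xmlbook2");
        System.out.println("Cookie: " + cookie.getCookieValue());
    }
}
```

Sending back only the session identifier is sufficient for the server to find the matching HttpSession object, which is all the shopping cart example relies on.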
Because the HttpClient is just a sample program, we did not pay attention to privacy and security. However, when you develop a real application, you are responsible for implementing a policy to decide whether the client can send a cookie or not.
In this section, we described how to implement a request-and-response service as a servlet, using a bookstore service as an example.
10.2.3 Considerations for State Management
Care should be taken in stateful servlet programming because multiple threads may simultaneously call a method of a servlet, and a servlet instance is not always identical across the method calls. In this section, we discuss considerations for state management in servlet programming. This is not an XML-specific topic, but we believe it is helpful for readers.
Generally speaking, servlets need to hold their states for various reasons. If you write a multithreaded servlet, you must be careful how you manage shared resources. In such a case, we often use the synchronized block and the synchronized method to prevent simultaneous access to a shared resource. However, it can be difficult for application programmers to decide which parts of a program should be protected. Is there any technique to hold a shared state across multiple requests or multiple instances without using the synchronized block? A servlet's states are categorized into the following five patterns, according to the scope of the state (a scope is the range in which an object is accessible):
1. A state held in a local variable
2. A state held in an instance variable of the servlet
3. A state held in an HttpSession object
4. A state held in a ServletContext object
5. A state held in external storage, such as a database
In pattern 1, a state is held in a local variable. In this case, the state is an object obtained from the HttpServletRequest object (except for the HttpSession object) or newly created in a method. A typical example of the former is a parameter value of an HTTP request. In either case, objects are created and discarded within a method call. In other words, state objects must not be accessible outside the scope of the method call. Needless to say, this state does not need to be protected.
In pattern 2, a state is held in an instance variable of the servlet. Mutual exclusion is required for this pattern, as we described earlier. Furthermore, the number of servlet instances could be more than one for the same servlet definition in some cases (for the details, refer to The Number of Instances of Servlets). In this case, holding the state as an instance variable is meaningless.
In pattern 3, a state is held in an HttpSession object. In this case, a servlet can usually use the state with little extra care because requests in the same session normally arrive one at a time. Concurrent requests within one session are still possible (for example, from two browser windows), which is why Listing 10.10 synchronizes on the session object when it creates the shopping cart. Note that the state is discarded when the HTTP session finishes.
In pattern 4, a state is held in a ServletContext object, which corresponds to a Web application. In this case, mutual exclusion is required because an object stored in the ServletContext object can be simultaneously accessed by the instances of the servlets and JSPs that belong to the same Web application. The state held in the ServletContext object is discarded when the corresponding Web application terminates.
In pattern 5, a state is held in external storage, such as a database. Also, an entity bean in EJB is available for this purpose (see Chapter 11, Section 11.7). With this pattern, the state is held persistently; that is, the state can survive after the server terminates. However, there is significant overhead in accessing the state and in management cost.
Therefore, this pattern is appropriate when the state is accessed relatively infrequently compared with the states in patterns 1 through 4.
As a result, it is easy and safe to use a combination of patterns 1 (HttpServletRequest), 3 (HttpSession), and 5 (database) according to the requirement level. The decision points for choosing the pattern could be how long the state is used, how often the state is to be updated, how critical it would be if the state were lost, and so on. If the state is temporarily used, frequently updated, and not critical, you may choose pattern 1 or 3 instead of 5, and vice versa. Another typical case is to use pattern 1 or 3 while a user is editing the state and to move it to pattern 5 after the editing is done. For example, in typical B2C applications, user information and order information are held in a database, while pending user information or virtual shopping carts are held in HTTP request parameters or HttpSession objects.
The Number of Instances of Servlets
Usually, only one servlet instance is created for each servlet definition except for the following two cases:
In this section, we described a one-way servlet and a request-and-response servlet as basic techniques for exchanging XML documents among companies. Also, we discussed considerations for multiple threads in servlet programming. We can apply the same techniques to database access (see Chapter 11), XML messaging (see Chapter 12), and Web services (see Chapter 13).
You might feel that returning an XML document using Servlet is somewhat rigid. We described two ways to generate an XML document using Servlet in the examples: one is using a string template of an XML document embedded in the program, as shown in Listing 10.1; the other is serializing a DOM tree that is generated by the servlet, as shown in Listing 10.9. In either case, we need to recompile the servlet when we want to change the XML document that is generated. In the next section, we describe JavaServer Pages as a more flexible way to generate XML documents.