When the term Servlet is mentioned, it is almost always implied that the Servlet is an instance of HttpServlet[3]. The explanation is simple. The HyperText Transfer Protocol (HTTP)[4] is used for the vast majority of transactions on the World Wide Web: every Web page you visit is transmitted using HTTP, hence the http:// prefix. HTTP may not be the best protocol ever made, but it works and it is already widely used. Servlet support for HTTP transactions comes in the form of the javax.servlet.http.HttpServlet class.
[3] Note that at the time of writing there is only one protocol-specific servlet, and it is for HTTP. However, at least one JSR is looking to add additional protocol-specific servlets. In this particular case, the protocol is SIP (Session Initiation Protocol).
[4] Voracious readers are advised to read the current HTTP specification, http://www.ietf.org/rfc/rfc2616.txt. This book is not a substitute for the complete specification. However, this book does provide more than enough detail for the average Web developer.
Before showing an example of an HttpServlet, it is helpful to reiterate the basics of the HyperText Transfer Protocol. Many developers do not fully understand HTTP, yet that understanding is critical to understanding an HttpServlet. HTTP is a simple, stateless protocol. The protocol relies on a client, usually a Web browser, to make a request and a server to send a response. Connections last only long enough for one transaction. A transaction can be one or more request/response pairs. For example, a browser will send a request for an HTML page followed by a request for each image on that page. All of these requests and responses will be done over the same connection. The connection will then be closed at the end of the last response. The whole process is relatively simple and occurs each time a browser requests a resource from an HTTP server[5].
[5] HTTP 1.1 allows these "long-lived" connections automatically; in HTTP 1.0 you need to use the Connection: Keep-Alive header.
The first part of an HTTP transaction is when an HTTP client creates and sends a request to a server. An HTTP request in its simplest form is nothing more than a line of text specifying what resource a client would like to retrieve. The line of text is broken into three parts: the type of action, or method, that the client would like to do; the resource the client would like to access; and the version of the HTTP protocol that is being used. For example:
GET /index.html HTTP/1.0
The preceding is a completely valid HTTP request. The first word, GET, is a method defined by HTTP to ask a server for a specific resource; /index.html is the resource being requested from the server; HTTP/1.0 is the version of HTTP that is being used. When any device using HTTP wants to get a resource from a server, it would use something similar to the above line. Go ahead and try this by hand against Tomcat. Open up a telnet session with your local computer on port 80. From the command prompt this is usually accomplished with:
telnet 127.0.0.1 80
Something similar to Figure 2-2 should appear.

The telnet program has just opened a connection to Tomcat's Web server. Tomcat understands HTTP, so type[6] in the example HTTP statement. This HTTP request can be terminated by a blank line, so hit Enter a second time to place an additional blank line and finish the request[7].
[6] Microsoft's telnet input will not appear in the window as you type. To fix this, type LOCAL_ECHO and hit Return. Also note that if you are using Microsoft Windows XP, the telnet window is not cleared after it is connected.
[7] If you are using Microsoft Windows' default telnet program, be aware that the connection is live; that is, type in the full request correctly (even if it does not appear as you type it) and do not hit Backspace or Delete.
GET /jspbook/index.html HTTP/1.0
The content of index.html is returned from the Web Application mapped to /jspbook (the application we started last chapter), as shown in Figure 2-3.
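The same exchange can be sketched in Java with a plain socket. This is only a minimal illustration: the class name RawHttpGet and its two methods are hypothetical, and the sketch assumes Tomcat is listening on port 80 of the local machine with the /jspbook application deployed, as in the telnet session.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name and methods are illustrative only.
class RawHttpGet {

    // Builds a minimal HTTP/1.0 request: the request line plus the blank line that ends it.
    static String buildRequest(String resource) {
        return "GET " + resource + " HTTP/1.0\r\n\r\n";
    }

    // Sends the request over a plain socket and returns everything the server sends back.
    static String send(String host, int port, String request) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            StringBuilder response = new StringBuilder();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            // With HTTP/1.0 the server closes the connection after the response, ending the loop.
            while ((line = in.readLine()) != null) {
                response.append(line).append('\n');
            }
            return response.toString();
        }
    }

    public static void main(String[] args) {
        try {
            // Assumption: Tomcat on 127.0.0.1:80, same as the telnet example.
            System.out.println(send("127.0.0.1", 80, buildRequest("/jspbook/index.html")));
        } catch (IOException e) {
            System.out.println("Could not reach the server: " + e.getMessage());
        }
    }
}
```

The status line, the headers, and the page content all arrive as one stream of text, which is exactly what the telnet window displays.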

You just sent a basic HTTP request, and Tomcat returned an HTTP response. While usually done behind the scenes, all HTTP requests resemble the preceding. There are a few more methods to accompany GET, but before discussing those, let's take a closer look at what Tomcat sent back.
The first thing Tomcat returned was a line of text:
HTTP/1.1 200 OK
This is an HTTP status line. Every HTTP response starts with a status line. The status line consists of the HTTP version, a status code, and a reason phrase. The HTTP response code 200 means everything was fine; that is why Tomcat included the requested content with the response. If there had been some sort of issue with the request, a different response code would have been used. Another HTTP response code you are likely familiar with is 404, "Not Found". If you have ever followed a broken hyperlink, this is probably the code that was returned.
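To make the three parts of a status line concrete, the following sketch splits one into its version, status code, and reason phrase. The StatusLine class is hypothetical and is not part of the Servlet API; it exists only to illustrate the format.

```java
// Hypothetical value class for illustration; not part of the Servlet API.
class StatusLine {
    final String version;
    final int code;
    final String reasonPhrase;

    StatusLine(String version, int code, String reasonPhrase) {
        this.version = version;
        this.code = code;
        this.reasonPhrase = reasonPhrase;
    }

    // Splits a line such as "HTTP/1.1 200 OK" into version, status code, and reason phrase.
    static StatusLine parse(String line) {
        // A limit of 3 keeps any spaces inside the reason phrase, e.g. "Not Found".
        String[] parts = line.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not a status line: " + line);
        }
        String reason = parts.length == 3 ? parts[2] : "";
        return new StatusLine(parts[0], Integer.parseInt(parts[1]), reason);
    }

    public static void main(String[] args) {
        StatusLine status = parse("HTTP/1.1 200 OK");
        System.out.println(status.version + " | " + status.code + " | " + status.reasonPhrase);
    }
}
```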
HTTP Response Codes
In practice, you usually do not need to understand all of the specific HTTP response codes. JSP, Servlets, and Web servers usually take care of these codes automatically, but nothing stops you from sending specific HTTP response codes. Later on we will see examples of doing this with both Servlets and JSP. A complete list of HTTP response codes along with other HTTP information is available in the current HTTP specification, http://www.ietf.org/rfc/rfc2616.txt.
Along with the HTTP response code, Tomcat also sent back a few lines of information before the contents of index.html, as shown in Figure 2-4.

All of these lines are HTTP headers. HTTP uses headers to send meta-information with a request or response. A header is a colon-delimited name:value pair; that is, it contains the header's name, followed by a colon, followed by the header's value. Typical response headers include content-type descriptions, content length, a time-stamp, server information, and the date the content was last changed. This information helps a client figure out what is being sent, how big it is, and whether the data are newer than a previously seen response. An HTTP request will always contain a few headers[8]. Common request headers consist of the user-agent details and preferred formats, languages, and content encoding to receive. These headers help tell a server what the client is and how it would prefer to get back information. Understanding HTTP headers is important, but for now put the concept on hold until you learn a little more about Servlets. HTTP headers provide some very helpful functionality, but it is better to explain them further with some HttpServlet examples.
[8] There are no mandatory headers in HTTP 1.0; in HTTP 1.1 the only mandatory header is the Host header.
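As a rough illustration of the name:value format, the sketch below reads header lines into a map. The HeaderParser class is hypothetical, and the sample header values in main are made up for the example rather than taken from Tomcat's actual output. Header names are not case-sensitive in HTTP, so the map is built to ignore case.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; real servlet code reads headers through the Servlet API.
class HeaderParser {

    // Parses "name: value" lines until the first blank line, which ends the header section.
    static Map<String, String> parse(String[] lines) {
        // HTTP header names are case-insensitive, so look them up without regard to case.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : lines) {
            if (line.isEmpty()) {
                break;
            }
            // Split on the first colon only; values such as dates contain colons themselves.
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a name:value pair
            }
            headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        return headers;
    }

    public static void main(String[] args) {
        // Sample values are illustrative only.
        String[] lines = {
            "Content-Type: text/html",
            "Content-Length: 2208",
            "Date: Mon, 01 Jan 2001 12:00:00 GMT",
            ""
        };
        Map<String, String> headers = parse(lines);
        System.out.println(headers.get("content-type"));
        System.out.println(headers.get("Date"));
    }
}
```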
The first relatively widely used version of HTTP was HTTP 0.9. This had support for only one HTTP method, or verb; that was GET. As part of its execution, a GET request can provide a limited amount of information in the form of a query string[9]. However, the GET method is not intended to send large amounts of information. Most Web servers restrict the length of complete URLs, including query strings, to 255 characters. Excess information is usually ignored. For this reason GET methods are great for sending small amounts of information that you do not mind having visible in a URL. There is another restriction on GET; the HTTP specification defines GET as a "safe" method which is also idempotent[10]. This means that GET must only be used to execute queries in a Web application. GET must not be used to perform updates, as this breaks the HTTP specification.
[9] A query string is a list started by a question mark, ?, and followed by name-value pairs in the following format, paramName=paramValue, and with an ampersand, &, separating pairs, for example, /index.html?fname=bruce&lname=wayne&password=batman.
[10] An idempotent operation is an operation that, if run multiple times, has the same effect on state as running it once. A safe operation goes further and has no effect on state at all; that is, it is query only, not update.
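The query string format described in footnote 9 can be sketched as follows. The QueryStringBuilder class is hypothetical; the sketch uses java.net.URLEncoder so that characters such as spaces and ampersands inside a value cannot be confused with the separators, and it assumes Java 10 or later for the Charset overload of encode.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
class QueryStringBuilder {

    // Appends ?name=value&name=value to the resource, encoding each name and value.
    static String build(String resource, Map<String, String> params) {
        StringBuilder url = new StringBuilder(resource);
        char separator = '?'; // the first pair follows '?', later pairs follow '&'
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(URLEncoder.encode(param.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(param.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("fname", "bruce");
        params.put("lname", "wayne");
        System.out.println(build("/index.html", params));
    }
}
```

Everything produced this way ends up in the URL, which is why GET suits only small amounts of information that can safely be visible.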
To overcome these limitations, the HTTP 1.0 specification introduced the POST method. POST is similar to GET in that it may also have a query string, but the POST method can use a completely different mechanism for sending information. A POST sends an unlimited amount of information over a socket connection as part of the HTTP request. The extra information does not appear as part of a URL and is only sent once. For these reasons the POST method is usually used for sending sensitive[11] or large amounts of information, or when uploading files. Note that POST methods do not have to be idempotent. This is very important, as it now means applications have a way of updating data in a Web application. If an application needs to modify data, or add new data and is sending a request over HTTP, then the application must not use GET but must instead use POST. Notice that POST requests may be idempotent; that is, there is nothing to stop an application using POST instead of GET, and this is often done when a retrieval requires sending large amounts of data[12]. However, note that GET can never be used in place of POST if the HTTP request is nonidempotent.
[11] However, realize that the data are still visible to snoopers; they just do not appear in the URL.
[12] The other issue is that GET sends data encoded using the application/x-www-form-urlencoded MIME type. If the application needs to send data in some other format, say XML, then this cannot be done using GET; POST must be used. For example, SOAP mandates the use of POST for SOAP requests to cover this exact problem.
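To show how a POST carries its information in the request body instead of the URL, the sketch below builds the raw text of a form-style POST. The PostRequestBuilder class and the /jspbook/signup resource are hypothetical and used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class PostRequestBuilder {

    // Builds the raw text of a POST whose body carries form data.
    static String build(String resource, String body) {
        // Content-Length counts bytes, not characters.
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + resource + " HTTP/1.0\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"   // blank line separates the headers from the body
             + body;
    }

    public static void main(String[] args) {
        // "/jspbook/signup" is a made-up resource name.
        System.out.println(build("/jspbook/signup", "fname=bruce&lname=wayne"));
    }
}
```

The name-value pairs use the same format as a query string, but they travel after the headers, so they never appear in the URL.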
In the current HTTP version, 1.1, there are seven HTTP methods in total: GET, PUT, POST, TRACE, DELETE, OPTIONS, and HEAD. In practice only two of these methods are used, the two we have already talked about: GET and POST.
The other five methods are not very helpful to a Web developer. The HEAD method requests only the headers of a response. PUT is used to place documents directly to a server, and DELETE does the exact opposite. The TRACE method is meant for debugging. It returns an exact copy of a request to a client. Lastly, the OPTIONS method is meant to ask a server what methods and other options the server supports for the requested resource.
As far as this book is concerned, the HTTP methods will not be explained further. As will soon be shown, it is not important for a Servlet developer to fully understand exactly how to construct and use all the HTTP methods manually. HttpServlet objects take care of low-level HTTP functionality and translate HTTP methods directly into invocations of Java methods.
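The idea of translating an HTTP method into a Java method call can be sketched in plain Java. The class below is a deliberately simplified stand-in, not the real javax.servlet.http.HttpServlet: it models only GET and POST, and it passes a resource string where the real class works with request and response objects. The method names doGet and doPost mirror those of HttpServlet.

```java
// Simplified stand-in for illustration; NOT the real javax.servlet.http.HttpServlet.
class MethodDispatchSketch {

    // Called for GET requests.
    String doGet(String resource) {
        return "doGet handled " + resource;
    }

    // Called for POST requests.
    String doPost(String resource) {
        return "doPost handled " + resource;
    }

    // Looks at the HTTP method from the request line and calls the matching Java method.
    String service(String httpMethod, String resource) {
        switch (httpMethod) {
            case "GET":
                return doGet(resource);
            case "POST":
                return doPost(resource);
            default:
                // The other HTTP methods are left out of this sketch.
                return "not handled in this sketch: " + httpMethod;
        }
    }

    public static void main(String[] args) {
        MethodDispatchSketch servlet = new MethodDispatchSketch();
        System.out.println(servlet.service("GET", "/index.html"));
        System.out.println(servlet.service("POST", "/index.html"));
    }
}
```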
An HTTP server takes a request from a client and generates a response. Responses, like requests, consist of a response line, headers, and a body. The response line contains the HTTP version of the server, a response code, and a reason phrase. The reason phrase is some text that describes the response, and could be anything, although a recommended set of reason phrases is given in the specification. Response codes themselves are three-digit numbers that are divided into groups. Each group has a meaning as shown here:
1xx: Informational: Request received, continuing process.
2xx: Success: The action was successfully received, understood, and accepted.
3xx: Redirection: Further action must be taken in order to complete the request.
4xx: User-Agent Error: The request contains bad syntax or cannot be fulfilled.
5xx: Server Error: The server failed to fulfill an apparently valid request.
Each status code has an associated string (the reason phrase).
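Because the group is determined by the first digit, a response code can be classified with integer division. The following is a minimal sketch; the StatusCodes class is hypothetical, and the group names are the ones used in the list above.

```java
// Hypothetical helper for illustration only.
class StatusCodes {

    // Returns the group name for a three-digit HTTP response code.
    static String category(int code) {
        // Integer division by 100 leaves only the first digit of a three-digit code.
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "User-Agent Error";
            case 5: return "Server Error";
            default:
                throw new IllegalArgumentException("Not an HTTP response code: " + code);
        }
    }

    public static void main(String[] args) {
        for (int code : new int[] {200, 404, 500}) {
            System.out.println(code + ": " + category(code));
        }
    }
}
```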
The status code you'll see most often is 200. This means that everything has succeeded and you have a valid response. The others you are likely to see are:
401: you are not authorized to make this request
404: cannot find the requested URI
405: the HTTP method you have tried to execute is not supported by this URL (e.g., you have sent a POST and the URL will only accept GET)
500: Internal Server Error. You are likely to see this if the resource to where you are browsing (such as a Servlet) throws an exception.
If you send a request to a Servlet and get a 500 code, then the chances are that your Servlet has itself thrown an exception. To discover the root cause of this exception, you should check the application output logs. Tomcat's logs are stored in the /logs[13] directory of the Tomcat installation.
[13] Note that you can configure Tomcat to log output to the console window. This is often done during development because it is easier to read the console than open a log file. See the Tomcat documentation if you would like to do this.