This first chapter answers the question "What is a Servlet?", shows
typical uses for Servlets, compares Servlets to CGI programs and
explains the basics of the Servlet architecture and the Servlet
lifecycle. It also gives a quick introduction to HTTP and its
implementation in the HttpServlet class.
Servlets are modules of Java code that run in a server application
(hence the name "Servlets", similar to "Applets" on the client side)
to answer client requests. Servlets are not tied to a specific
client-server protocol but they are most commonly used with HTTP
and the word "Servlet" is often used to mean "HTTP Servlet".
Servlets make use of the Java standard extension classes in the packages
javax.servlet (the basic Servlet framework) and javax.servlet.http
(extensions of the Servlet framework for Servlets that answer HTTP
requests). Since Servlets are written
in the highly portable Java language and follow a standard framework,
they provide a means to create sophisticated server extensions that are
independent of both the server and the operating system.
Typical uses for HTTP Servlets include:
- Processing and/or storing data submitted by an HTML form.
- Providing dynamic content, e.g. returning the results of a
  database query to the client.
- Managing state information on top of the stateless HTTP, e.g.
  for an online shopping cart system which manages shopping carts
  for many concurrent customers and maps every request to the
  right customer.
The traditional way of adding functionality to a Web Server is the
Common Gateway Interface (CGI), a language-independent interface that
allows a server to start an external process which gets information
about a request through environment variables, the command line and its
standard input stream and writes response data to its standard output
stream. Each request is answered in a separate process by a separate
instance of the CGI program, or CGI script (as it is often called
because CGI programs are usually written in interpreted languages like
Perl).
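To make the CGI model concrete, here is a minimal sketch of a CGI-style
program written in Java. It assumes that the server passes the standard
CGI variables REQUEST_METHOD and QUERY_STRING in the environment; the
class name CgiHello and the helper method respond are made up for this
illustration. A server would start such a program in a new process for
every single request.

```java
import java.util.Map;

// Hypothetical CGI-style program: one process per request, input from the
// environment, output (headers, empty line, body) on standard output.
class CgiHello {

    // Builds the complete CGI response from the request's environment variables.
    static String respond(Map<String, String> env) {
        String method = env.getOrDefault("REQUEST_METHOD", "GET");
        String query = env.getOrDefault("QUERY_STRING", "");
        StringBuilder out = new StringBuilder();
        out.append("Content-Type: text/plain\r\n");
        out.append("\r\n"); // an empty line separates the headers from the body
        out.append("method=").append(method).append('\n');
        out.append("query=").append(query).append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        // The server sets up the environment before it starts the process.
        System.out.print(respond(System.getenv()));
    }
}
```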
Servlets have several advantages over CGI:
- A Servlet does not run in a separate process. This removes the
  overhead of creating a new process for each request.
- A Servlet stays in memory between requests. A CGI program (and possibly
  also an extensive runtime system or interpreter) needs to be loaded
  and started for each CGI request.
- There is only a single instance which answers all requests
  concurrently. This saves memory and allows a Servlet to easily manage
  persistent data.
- A Servlet can be run by a Servlet Engine in a restrictive Sandbox
  (just like an Applet runs in a Web Browser's Sandbox) which allows
  secure use of untrusted and potentially harmful Servlets.
A Servlet, in its most general form, is an instance of a class which
implements the javax.servlet.Servlet interface. Most Servlets, however,
extend one of the standard implementations of that interface, namely
javax.servlet.GenericServlet and javax.servlet.http.HttpServlet. In this
tutorial we'll be discussing only HTTP Servlets which extend the
javax.servlet.http.HttpServlet class.
In order to initialize a Servlet, a server application loads the Servlet
class (and possibly other classes which are referenced by the Servlet)
and creates an instance by calling the no-args constructor. Then it
calls the Servlet's init(ServletConfig config) method. The Servlet should
perform one-time setup procedures in this method and store the
ServletConfig object so that it can be retrieved later by calling the
Servlet's getServletConfig() method. This is handled by GenericServlet.
Servlets which extend GenericServlet (or its subclass HttpServlet) should
call super.init(config) at the beginning of the init method to make use
of this feature. The ServletConfig object contains Servlet parameters and
a reference to the Servlet's ServletContext. The init method is
guaranteed to be called only once during the Servlet's lifecycle. It does
not need to be thread-safe because the service method will not be called
until the call to init returns.
Once the Servlet is initialized, its service(ServletRequest req,
ServletResponse res) method is called for every request to the
Servlet. The method is called concurrently (i.e. multiple threads may
call this method at the same time) so it should be implemented in a
thread-safe manner. For the cases where a thread-safe implementation is
not possible, techniques for ensuring that the service method is not
called concurrently are described in section 4.1.
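As an illustration of the thread-safety requirement, the following sketch
models a single Servlet-like instance whose service method is called from
several threads at once. It deliberately does not use the javax.servlet
classes; HitCounter and its methods are hypothetical names. The shared
counter is an AtomicInteger, so concurrent calls cannot lose updates.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Servlet: one instance, many calling threads.
class HitCounter {
    // Shared state must be safe for concurrent access.
    private final AtomicInteger hits = new AtomicInteger();

    // Plays the role of service(): may run in many threads at the same time.
    int service() {
        return hits.incrementAndGet();
    }

    int total() {
        return hits.get();
    }

    // Simulates a server that answers requests in several threads.
    static int simulate(int threads, int requestsPerThread) {
        HitCounter servlet = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < requestsPerThread; j++) {
                    servlet.service();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this worker has sent all its requests
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return servlet.total();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 10_000)); // prints 80000
    }
}
```

With a plain int field and hits++ in place of the AtomicInteger, the total
could come out lower, because increments from different threads may
overwrite each other.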
When the Servlet needs to be unloaded (e.g. because a new version
should be loaded or the server is shutting down) the destroy()
method is called. There may still be threads that execute the service
method when destroy is called, so destroy has to be thread-safe. All
resources which were allocated in init should be released in destroy.
The destroy method is guaranteed to be called only once during the
Servlet's lifecycle.
[Figure: A typical Servlet lifecycle]
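The order of the lifecycle calls can be summarized in a small model. This
is a sketch, not the javax.servlet API: the interface MiniServlet and the
classes LoggingServlet and MiniServer are invented for illustration and
only mimic the sequence of calls (constructor, init once, service for
every request, destroy once).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified counterpart of the Servlet interface.
interface MiniServlet {
    void init();
    String service(String request);
    void destroy();
}

// A Servlet-like class that records the lifecycle calls it receives.
class LoggingServlet implements MiniServlet {
    final List<String> log = new ArrayList<>();

    public void init() { log.add("init"); }

    public String service(String request) {
        log.add("service:" + request);
        return "echo " + request;
    }

    public void destroy() { log.add("destroy"); }
}

// Hypothetical server that drives one Servlet through its lifecycle.
class MiniServer {
    static List<String> run(List<String> requests) {
        LoggingServlet servlet = new LoggingServlet(); // no-args constructor
        servlet.init();                                // once, before any request
        for (String request : requests) {
            servlet.service(request);                  // once per request
        }
        servlet.destroy();                             // once, when unloading
        return servlet.log;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b")));
        // prints [init, service:a, service:b, destroy]
    }
}
```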
Before we can start writing the first Servlet, we need to know some
basics of HTTP ("HyperText Transfer Protocol"), the protocol which
is used by a WWW client (e.g. a browser) to send a request to
a Web Server.
HTTP is a request-response oriented protocol. An HTTP request
consists of a request method, a URI, header fields and a body
(which can be empty). An HTTP response contains a status code,
header fields and a body.
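To illustrate the request structure, the sketch below splits a raw HTTP
request into its request method, URI, header fields and body. The class
SimpleRequest is hypothetical and handles only well-formed input. A
Servlet never has to parse requests like this, because the server does it
before the Servlet is called.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the four parts of an HTTP request.
class SimpleRequest {
    final String method;
    final String uri;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    SimpleRequest(String raw) {
        // An empty line separates the head (request line and headers) from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // Request line, e.g. "GET /index.html HTTP/1.0"
        String[] requestLine = lines[0].split(" ");
        method = requestLine[0];
        uri = requestLine[1];

        // Header fields have the form "Name: value".
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        SimpleRequest req = new SimpleRequest(
            "POST /order HTTP/1.0\r\n"
            + "Content-Type: application/x-www-form-urlencoded\r\n"
            + "Content-Length: 9\r\n"
            + "\r\n"
            + "item=book");
        System.out.println(req.method + " " + req.uri);         // prints POST /order
        System.out.println(req.headers.get("Content-Length"));  // prints 9
        System.out.println(req.body);                           // prints item=book
    }
}
```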
The service method of HttpServlet dispatches
a request to different Java methods for different HTTP request methods.
It recognizes the standard HTTP/1.1 methods and should not be overridden
in subclasses unless you need to implement additional methods. The
recognized methods are GET, HEAD, PUT, POST, DELETE, OPTIONS and TRACE.
Other methods are answered with a Bad Request HTTP error.
An HTTP method XXX is dispatched to a Java method doXxx, e.g.
GET -> doGet.
All these methods expect the parameters "(HttpServletRequest req,
HttpServletResponse res)". The methods doOptions and doTrace have
suitable default implementations and are usually not overridden. The
HEAD method (which is supposed to return the same header lines that a
GET method would return, but doesn't include a body) is performed by
calling doGet and ignoring any output that is written by this method.
That leaves us with the methods doGet, doPut, doPost and doDelete, whose
default implementations in HttpServlet return a Bad Request HTTP error.
A subclass of HttpServlet overrides one or more of these methods to
provide a meaningful implementation.
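The dispatching scheme can be modeled in a few lines. The sketch below is
not the real HttpServlet: MiniHttpServlet and HelloServlet are
hypothetical classes that take the HTTP method as a string and return the
response as text (status on the first line, body after it). OPTIONS and
TRACE are left out to keep the model short. As in the description above,
methods that are not overridden answer with a Bad Request error; the
status codes used by the version of the Servlet API you work with may
differ.

```java
// Hypothetical model of the dispatch performed by the service method.
class MiniHttpServlet {
    static final String BAD_REQUEST = "400 Bad Request";

    // Maps the HTTP request method to the matching doXxx method.
    final String service(String httpMethod) {
        switch (httpMethod) {
            case "GET":    return doGet();
            case "POST":   return doPost();
            case "PUT":    return doPut();
            case "DELETE": return doDelete();
            case "HEAD":   return statusLineOnly(doGet()); // run doGet, drop the body
            default:       return BAD_REQUEST;             // method not handled here
        }
    }

    // Default implementations: report an error until a subclass overrides them.
    String doGet()    { return BAD_REQUEST; }
    String doPost()   { return BAD_REQUEST; }
    String doPut()    { return BAD_REQUEST; }
    String doDelete() { return BAD_REQUEST; }

    // Keeps the first line (the status) and discards everything after it.
    private static String statusLineOnly(String response) {
        int end = response.indexOf('\n');
        return end >= 0 ? response.substring(0, end) : response;
    }
}

// A subclass overrides only the methods it wants to support.
class HelloServlet extends MiniHttpServlet {
    @Override
    String doGet() {
        return "200 OK\nHello, world";
    }

    public static void main(String[] args) {
        MiniHttpServlet servlet = new HelloServlet();
        System.out.println(servlet.service("GET"));  // prints 200 OK, then Hello, world
        System.out.println(servlet.service("HEAD")); // prints 200 OK
        System.out.println(servlet.service("POST")); // prints 400 Bad Request
    }
}
```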
The request data is passed to all methods through the first argument
of type HttpServletRequest (a subinterface of the more general
ServletRequest interface). The response can be created with methods of
the second argument of type HttpServletResponse (a subinterface of
ServletResponse).
When you request a URL in a Web Browser, the GET method is used for
the request. A GET request does not have a body (i.e. the body is
empty). The response should contain a body with the response data
and header fields which describe the body (especially Content-Type
and Content-Encoding).
When you submit an HTML form, either GET or POST can be used.
With a GET request the parameters are encoded in the URL, with
a POST request they are transmitted in the body. HTML editors
and upload tools use PUT requests to upload resources to a Web Server
and DELETE requests to delete resources.
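The difference between the two form methods can be sketched as follows.
The helper class FormEncoding is hypothetical; it uses java.net.URLEncoder
to encode the parameters in the format used for HTML form data and then
places the result either in the URL (GET) or in the body (POST). The path
/search and the parameter names are made-up examples.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormEncoding {

    // Encodes parameters as name=value pairs joined by '&'.
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is available on every Java platform.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    // GET: the encoded parameters become the query part of the URL.
    static String getRequestLine(String path, Map<String, String> params) {
        return "GET " + path + "?" + encode(params) + " HTTP/1.0";
    }

    // POST: the URL carries no parameters; they travel in the body.
    static String postBody(Map<String, String> params) {
        return encode(params);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "java servlets");
        params.put("page", "2");
        System.out.println(getRequestLine("/search", params));
        // prints GET /search?q=java+servlets&page=2 HTTP/1.0
        System.out.println(postBody(params));
        // prints q=java+servlets&page=2
    }
}
```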
The complete HTTP specifications can be found in RFCs 1945 (HTTP/1.0)
and 2068 (HTTP/1.1).