Asynchronous processing support in Servlet 3.0

Why asynchronous processing is the new foundation of Web 2.0

Even as a mid-level API ensconced in modern UI component-based Web frameworks and Web services technologies, the incoming Servlet 3.0 specification (JSR 315) will have groundbreaking impact on Java Web application development. Author Xinyu Liu explains in detail why asynchronous processing is foundational for the collaborative, multi-user applications that define Web 2.0. He also summarizes Servlet 3.0's other enhancements such as ease of configuration and pluggability. Level: Intermediate

The Java Servlet specification is the common denominator for most server-side Java Web technologies, including JavaServer Pages (JSP), JavaServer Faces (JSF), numerous Web frameworks, SOAP and RESTful Web services APIs, and newsfeeds. The servlets running underneath these technologies make them portable across all Java Web servers (servlet containers). Any proposed changes to this widely accepted API for handling HTTP communications will potentially affect all affiliated server-side Web technologies.

The upcoming Servlet 3.0 specification, which passed public review in January 2009, is a major release with important new features that will change the lives of Java Web developers for the better. Here's a list of what you can expect in Servlet 3.0:

  • Asynchronous support
  • Ease of configuration
  • Pluggability
  • Enhancements to existing APIs

Asynchronous support is Servlet 3.0's most significant enhancement, intended to make the server-side processing of Ajax applications much more efficient. In this article I'll focus on asynchronous support in Servlet 3.0, starting by explaining the connection and thread-consumption issues that underlie the need for asynchronous support. I'll then explain how real-world applications today utilize asynchronous processing in server push implementations like Comet, or reverse Ajax. Finally, I'll touch on Servlet 3.0's other enhancements such as pluggability and ease of configuration, leaving you with a good impression of Servlet 3.0 and its impact on Java Web development.

Asynchronous support: Background concepts

Web 2.0 technologies drastically change the traffic profile between Web clients (such as browsers) and Web servers. Asynchronous support introduced in Servlet 3.0 is designed to respond to this new challenge. In order to understand the importance of asynchronous processing, let's first consider the evolution of HTTP communications.

HTTP 1.0 to HTTP 1.1

A major improvement in the HTTP 1.1 standard is persistent connections. In HTTP 1.0, a connection between a Web client and server is closed after a single request/response cycle. In HTTP 1.1, a connection is kept alive and reused for multiple requests. Persistent connections reduce communication lag perceptibly, because the client doesn't need to renegotiate the TCP connection after each request.

Thread per connection

Figuring out how to make Web servers more scalable is an ongoing challenge for vendors. Thread per HTTP connection, which is based on HTTP 1.1's persistent connections, is a common solution vendors have adopted. Under this strategy, each HTTP connection between client and server is associated with one thread on the server side. Threads are allocated from a server-managed thread pool. Once a connection is closed, the dedicated thread is recycled back to the pool and is ready to serve other tasks. Depending on the hardware configuration, this approach can scale to a high number of concurrent connections. Experiments with high-profile Web servers have yielded numerical results revealing that memory consumption increases almost in direct proportion with the number of HTTP connections. The reason is that threads are relatively expensive in terms of memory use. Servers configured with a fixed number of threads can suffer the thread starvation problem, whereby requests from new clients are rejected once all the threads in the pool are taken.

On the other hand, for many Web sites, users request pages from the server only sporadically. This is known as a page-by-page model. The connection threads are idling most of the time, which is a waste of resources.

Thread per request

Thanks to the non-blocking I/O capability introduced in Java 4's New I/O APIs for the Java Platform (NIO) package, a persistent HTTP connection doesn't require that a thread be constantly attached to it. Threads can be allocated to connections only when requests are being processed. When a connection is idle between requests, the thread can be recycled, and the connection is placed in a centralized NIO select set to detect new requests without consuming a separate thread. This model, called thread per request, potentially allows Web servers to handle a growing number of user connections with a fixed number of threads. With the same hardware configuration, Web servers running in this mode scale much better than in the thread-per-connection mode. Today, popular Web servers -- including Tomcat, Jetty, GlassFish (Grizzly), WebLogic, and WebSphere -- all use thread per request through Java NIO. For application developers, the good news is that Web servers implement non-blocking I/O in a hidden manner, with no exposure whatsoever to applications through servlet APIs.

Meeting Ajax challenges

To offer a richer user experience with more-responsive interfaces, more and more Web applications use Ajax. Users of Ajax applications interact with the Web server much more frequently than in the page-by-page model. Unlike ordinary user requests, Ajax requests can be sent frequently by one client to the server. In addition, both the client and scripts running on the client side can poll the Web server regularly for updates. More simultaneous requests cause more threads to be consumed, which cancels out the benefit of the thread-per-request approach to a high degree.

Slow running, limited resources

Some slow-running back-end routines worsen the situation. For example, a request could be blocked by a depleted JDBC connection pool, or a low-throughput Web service endpoint. Until the resource becomes available, the thread could be stuck with the pending request for a long time. It would be better to place the request in a centralized queue waiting for available resources and recycle that thread. This effectively throttles the number of request threads to match the capacity of the slow-running back-end routines. It also suggests that at a certain point during request processing (when the request is stored in the queue), no threads are consumed for the request at all. Asynchronous support in Servlet 3.0 is designed to achieve this scenario through a universal and portable approach, whether Ajax is used or not. Listing 1 shows you how it works.

Listing 1. Throttling access to resources

@WebServlet(name="myServlet", urlPatterns={"/slowprocess"}, asyncSupported=true)
public class MyServlet extends HttpServlet {
   
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        AsyncContext aCtx = request.startAsync(request, response); 
        ServletContext appScope = request.getServletContext();
        ((Queue<AsyncContext>)appScope.getAttribute("slowWebServiceJobQueue")).add(aCtx);
    }
}

@WebServletContextListener
public class SlowWebService implements ServletContextListener {

    public void contextInitialized(ServletContextEvent sce) {
        Queue<AsyncContext> jobQueue = new ConcurrentLinkedQueue<AsyncContext>();
        sce.getServletContext().setAttribute("slowWebServiceJobQueue", jobQueue);
        // pool size matching Web services capacity
        Executor executor = Executors.newFixedThreadPool(10);
        while(true)
        {
            if(!jobQueue.isEmpty())
            {
                final AsyncContext aCtx = jobQueue.poll();
                executor.execute(new Runnable(){
                    public void run() {
                        ServletRequest request = aCtx.getRequest();
                        // get parameteres
                        // invoke a Web service endpoint
                        // set results
                        aCtx.forward("/result.jsp");
                    }                    
                });             
            }
        }
    }
    
    public void contextDestroyed(ServletContextEvent sce) {
    }
}

When the asyncSupported attribute is set to true, the response object is not committed on method exit. Calling startAsync() returns an AsyncContext object that caches the request/response object pair. The AsyncContext object is then stored in an application-scoped queue. Without any delay, the doGet() method returns, and the original request thread is recycled. In the ServletContextListener object, separate threads initiated during application launch monitor the queue and resume request processing whenever the resources become available. After a request is processed, you have the option of calling ServletResponse.getWriter().print(...), and then complete() to commit the response, or calling forward() to direct the flow to a JSP page to be displayed as the result. Note that JSP pages are servlets with an asyncSupported attribute that defaults to false.

In addition, the AsyncEvent and AsynListener classes in Servlet 3.0 give developers elaborate control of asynchronous lifecycle events. You can register an AsynListener through the ServletRequest.addAsyncListener() method. After the startAsync() method is called on the request, an AsyncEvent is sent to the registered AsyncListener as soon as the asynchronous operation has completed or timed out. The AsyncEvent also contains the same request and response objects as in the AsyncContext object.

Server push

A more interesting and vital use case for the Servlet 3.0 asynchronous feature is server push. GTalk, a widget that lets GMail users chat online, is an example of server push. GTalk doesn't poll the server frequently to check if a new message is available to display. Instead it waits for the server to push back new messages. This approach has two obvious advantages: low-lag communication without requests being sent, and no waste of server resources and network bandwidth.

Ajax allows a user to interact with a page even if other requests from the same user are being processed at the same time. A common use case is to have a browser regularly poll the server for updates of state changes without interrupting the user. However, high polling frequencies waste server resources and network bandwidth. If the server could actively push data to browsers -- in other words, deliver asynchronous messages to clients on events (state changes) -- Ajax applications would perform better and save precious server and network resources.

The HTTP protocol is a request/response protocol. A client sends a request message to a server, and the server replies with a response message. The server can't initiate a connection with a client or send an unexpected message to the client. This aspect of the HTTP protocol seemingly makes server push impossible. But several ingenious techniques have been devised to circumvent this constraint:

  • Service streaming (streaming) allows a server to send a message to a client when an event occurs, without an explicit request from the client. In real-world implementations, the client initiates a connection to the server through a request, and the response returns bits and pieces each time a server-side event occurs; the response lasts (theoretically) forever. Those bits and pieces can be interpreted by client-side JavaScript and displayed through the browser's incremental rendering ability.
  • Long polling, also known as asynchronous polling, is a hybrid of pure server push and client pull. It is based on the Bayeux protocol, which uses a topic-based publish-subscribe scheme. As in streaming, a client subscribes to a connection channel on the server by sending a request. The server holds the request and waits for an event to happen. Once the event occurs (or after a predefined timeout), a complete response message is sent to the client. Upon receiving the response, the client immediately sends a new request. The server, then, almost always has an outstanding request that it can use to deliver data in response to a server-side event. Long polling is relatively easier to implement on the browser side than streaming.
  • Passive piggyback: When the server has an update to send, it waits for the next time the browser makes a request and then sends its update along with the response that the browser was expecting.

Service streaming and long polling, implemented with Ajax, are known as Comet, or reverse Ajax. (Some developers call all interactive techniques reverse Ajax, including regular polling, Comet, and piggyback.)

Ajax improves single-user responsiveness. Server-push technologies like Comet improve application responsiveness for collaborative, multi-user applications without the overhead of regular polling.

The client aspect of the server push techniques -- such as hidden iframes, XMLHttpRequest streaming, and some Dojo and jQuery libraries that facilitate asynchronous communication -- are outside this article's scope. Instead, our interest is on the server side, specifically how the Servlet 3.0 specification helps implement interactive applications with server push.

1 2 Page 1
Page 1 of 2