Connection Close in HttpClient

I recently made a mistake using Java Jakarta Commons HttpClient. I decided to dig deeper into the issue.

My code uses HttpClient to send HTTP requests to another machine under very high load. Over time, I often saw “Too many open files” exceptions in the logs, although the problem sometimes recovered on its own. Using netstat, I found a lot of TCP connections in the CLOSE_WAIT state on the machine. So the problem was that the application did not close its connections.

My code is very similar to the example in the HttpClient tutorial.

HttpClient client = new HttpClient();
GetMethod httpget = new GetMethod("http://www.whatever.com/");
try {
    client.executeMethod(httpget);
    ...
} finally {
    httpget.releaseConnection();
}

The code calls releaseConnection at the end, as specified by the tutorial. But what does this method do? To understand it, we need to understand what is behind the HttpClient object. Each HttpClient has an HttpConnectionManager responsible for maintaining connections. If we don’t pass an HttpConnectionManager to the constructor, HttpClient instantiates a SimpleHttpConnectionManager by default. SimpleHttpConnectionManager maintains a single connection and can only be used by a single thread. Its main job is to keep the connection alive in case the next request goes to the same host. So the releaseConnection call above does not close the socket. If the next method to be executed targets a different host, the manager closes the prior connection at that time. Otherwise, it may reuse the connection.
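The reuse policy described above can be pictured with a toy model. This is only a sketch of the behavior as I understand it, not the Commons HttpClient source: the class SingleConnectionModel, its acquire and release methods, and the host names are all made up for illustration, and no real sockets are involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single-connection manager. NOT the Commons HttpClient implementation. */
class SingleConnectionModel {
    private String connectedHost; // host of the one cached connection, or null if none
    private final List<String> events = new ArrayList<>();

    /** Before a request: reuse the cached connection, or replace it if the host differs. */
    void acquire(String host) {
        if (host.equals(connectedHost)) {
            events.add("reuse " + host);
            return;
        }
        if (connectedHost != null) {
            // The old connection is only closed now, when a different host is requested.
            events.add("close " + connectedHost);
        }
        connectedHost = host;
        events.add("open " + host);
    }

    /** After a request: the connection is handed back but deliberately left open. */
    void release() {
        events.add("release");
    }

    List<String> events() {
        return events;
    }

    public static void main(String[] args) {
        SingleConnectionModel mgr = new SingleConnectionModel();
        mgr.acquire("a.example"); // hypothetical host names
        mgr.release();
        mgr.acquire("a.example");
        mgr.release();
        mgr.acquire("b.example");
        mgr.release();
        for (String event : mgr.events()) {
            System.out.println(event);
        }
    }
}
```

Note that release never produces a close event. In this model the only thing that closes a connection is a later request to a different host on the same manager.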

The mistake I made was creating HttpClient objects on demand instead of reusing a single instance, as the documentation recommends. Each discarded HttpClient left behind its own connection manager with a connection that was never closed. If the peer closes the socket first (by sending FIN), the connection stays in the CLOSE_WAIT state on my side until my application layer closes the socket. CLOSE_WAIT is a state that does not time out. (It is not TIME_WAIT.) It is the application’s responsibility to close it. So how do we force HttpClient to close the socket? Actually, the HttpConnectionManager interface does not define a way to close the socket. But SimpleHttpConnectionManager has had a shutdown method since 3.1. So one possible way to close the connection is as follows.


HttpConnectionManager mgr = client.getHttpConnectionManager();
if (mgr instanceof SimpleHttpConnectionManager) {
    ((SimpleHttpConnectionManager)mgr).shutdown();
}
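The CLOSE_WAIT behavior itself can be reproduced without HttpClient. Below is a minimal sketch using only java.net on the loopback interface. The class name CloseWaitDemo and its run helper are made up for illustration. The peer closes first, the local side sees end-of-stream, and the local socket stays open until the application closes it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class CloseWaitDemo {
    /**
     * Connects to a loopback server, lets the peer close first, and returns
     * {eofSeen, closedBeforeClose, closedAfterClose}.
     */
    static boolean[] run() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket local = new Socket(server.getInetAddress(), server.getLocalPort());
            Socket peer = server.accept();
            peer.close(); // the peer closes first, which sends FIN to the local side

            InputStream in = local.getInputStream();
            boolean eofSeen = in.read() == -1;        // blocks until the FIN arrives
            boolean closedBefore = local.isClosed();  // the local socket is still open here

            local.close();                            // only now is the local socket released
            boolean closedAfter = local.isClosed();
            return new boolean[] { eofSeen, closedBefore, closedAfter };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("EOF seen: " + r[0]);
        System.out.println("closed before close(): " + r[1]);
        System.out.println("closed after close(): " + r[2]);
    }
}
```

Between the read and the call to close, the local end of the connection is the one that should show up as CLOSE_WAIT in netstat. Nothing on the local side releases it until close is called.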

But why isn’t the problem deterministic? Shouldn’t it never recover once it starts to happen? The magic is Java garbage collection. I reproduced the effect by forcing a garbage collection, which cleaned up the CLOSE_WAIT connections. To be accurate, though, JVM garbage collection does not close sockets by itself; it only frees memory. It is the Socket object that closes the socket in its finalize method.

SimpleHttpConnectionManager is not thread safe. If you need to maintain a reusable HttpClient instance shared by multiple threads, you should use MultiThreadedHttpConnectionManager. For example,


protected static HttpClient m_client = null;
static {
    MultiThreadedHttpConnectionManager mgr = new MultiThreadedHttpConnectionManager();
    mgr.getParams().setDefaultMaxConnectionsPerHost(1000);
    mgr.getParams().setMaxTotalConnections(1000);
    m_client = new HttpClient(mgr);
}

12 thoughts on “Connection Close in HttpClient”

  2. Thanks a lot !
    This seems to be a very thorough explanation for a problem we’re facing right now. Will try your solution immediately ;)

  3. Hello Yun, That was a very good explanation indeed. I’m currently facing similar problems in one of my code snippets. I’ll give this a try and see if it works well. Thanks for the detailed explanation again, mate.

  4. Hello Yun,
    Thanks a lot for the explanation. I was facing the exact same problem, and your insight in this regard was very helpful.

    Also, I think the problem can be solved using the constructor –> public SimpleHttpConnectionManager(boolean alwaysClose).

    sample code:
    SimpleHttpConnectionManager smp = new SimpleHttpConnectionManager(true);
    HttpClient client = new HttpClient(smp);
    ..
    .
    .

    Regards
    Ravi Kiran

  5. I was looking into an issue regarding a post parse parameter error in WebLogic.
    In my app we have an implementation that reuses the HttpClient
    where we import a large data set.
    Steps:
    1. Send begin import, get status 200.
    2. Do import (write data to the output stream), get status 200.
    3. End import.

    During the 2 stages I am getting the parse exception.

    Do you see any issue with reusing the connection?

    Thanks,
    Raju

  6. Thank you very much. I’ve been wrestling with this issue for a couple of days -high volume/multi-threaded. thanks again.

  7. I wrote a program that downloads images (in threads) from a web server with HttpClient (3.x). During the download I always got the following exception:

    org.apache.commons.httpclient.HttpMethodDirector executeWithRetry

    Now it works! Thanks a lot!

  8. Thank you very much.

    Your research and analysis are quite helpful.
    I was having the same problem. Now, it’s fixed.
