I’m not even sure how I managed to figure out this problem, but I couldn’t find a problem report or solution on Google, so I thought I’d share it here so maybe someone else will find it.
I needed to deploy an old Java webapp on a Jetty running Java 1.4. This webapp used to run in a really old Tomcat install along with some other webapps (including proprietary code); I wanted to isolate it in its own container. Possibly this webapp would run under a more modern Java, but it’s been running under 1.4 for years, so I’m operating under the “only change one thing at a time” principle. (Once it is isolated in its own container, of course, it will be easier to change the JVM version for just this webapp.)
The first problem was getting Jetty to run under Java 1.4 at all. It turns out that, as documented, all of Jetty 6.1.x does run under Java 1.4 — but oddly, some of the sample/example webapps shipped with the binaries for the latest 6.1.x release, 6.1.26, are not compiled to work under Java 1.4. Delete all the sample/example webapps, though, and it works fine.
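For what it’s worth, the cleanup on my install amounted to something like the following. The install path and exact file names here are assumptions (I’m going from memory of a stock jetty-6.1.26 unpack), so list the directories first and check what’s actually there before deleting:

```shell
# Hypothetical cleanup of a fresh jetty-6.1.26 unpack; adjust paths to
# your own install, and look before you delete.
JETTY_HOME=/opt/jetty-6.1.26   # assumed install location
ls "$JETTY_HOME/webapps" "$JETTY_HOME/contexts"
# On a fresh unpack these directories only hold the shipped demo apps:
rm -rf "$JETTY_HOME"/webapps/* "$JETTY_HOME"/contexts/*
```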
So then I had my webapp running in Jetty and mostly working. But something odd was going on — sometimes the webapp would just ‘lock up’, waiting a few seconds in the middle of the response before returning the rest, or sometimes waiting at the beginning before returning any bytes at all.
Not being that familiar with Jetty (or Java servlet deployment in general), I honestly spent a couple of days trying different Google queries and reading random pages to try to figure out what might be causing this. It may be that not a lot of people have run into this, because not a lot of people are trying to run old webapps on a newly downloaded/configured Jetty 6.1.26 under Java 1.4.
First discovery was that most (but not necessarily all) of the problem exhibited only when the response size was greater than the Jetty “responseBufferSize”. (Not even sure how I figured that out, just trying/examining random things.)
Okay, so further reading revealed that when the response exceeds the responseBufferSize, that’s when Jetty will first ‘commit’ the response, sending some bytes to the client using HTTP 1.1 “chunked” encoding, so additional “chunks” are sent each time the buffer fills again.
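To make that concrete, here’s a rough sketch of my own (not Jetty’s actual code) of the framing that chunked encoding uses on the wire: each chunk is its byte length in hex, a CRLF, the bytes themselves, another CRLF, and the stream ends with a zero-length chunk. This is why Jetty can start sending before it knows the total response size:

```java
// Sketch of HTTP/1.1 chunked-encoding framing -- roughly what a server
// emits once the response buffer overflows and the response is committed
// without a known Content-Length. Illustration only, not Jetty internals.
public class ChunkedDemo {
    static String chunk(String body, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < body.length(); i += chunkSize) {
            String piece = body.substring(i, Math.min(body.length(), i + chunkSize));
            sb.append(Integer.toHexString(piece.length())).append("\r\n")  // chunk size in hex
              .append(piece).append("\r\n");                               // chunk data
        }
        sb.append("0\r\n\r\n");  // terminating zero-length chunk
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(chunk("Hello, chunked world!", 8));
    }
}
```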
Okay, but there’s no good reason that should result in several-second lockups when the buffer fills. Theoretically I could make the responseBufferSize really, really big and just hope it never gets exceeded — but that’s not the way Jetty is meant to operate. A response could always come along that’s bigger than whatever responseBufferSize I configure (within reason and available RAM), and I didn’t like that uncertainty.
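(For reference, the buffer size is set on the connector in jetty.xml. The sketch below is mine, going from the Jetty 6 connector API as I understand it; the port and the 64 KB value are just placeholders, so double-check the setter name against your own jetty.xml.)

```xml
<!-- Sketch: bumping the response buffer on a Jetty 6 connector.
     The 65536 value is an arbitrary example, not a recommendation. -->
<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.nio.SelectChannelConnector">
      <Set name="port">8080</Set>
      <Set name="responseBufferSize">65536</Set>
    </New>
  </Arg>
</Call>
```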
Eventually — and I have no idea quite how I thought to try this — I figured out the problem was the org.mortbay.jetty.nio.SelectChannelConnector configured in jetty.xml. Jetty 6.1.26 comes with a jetty.xml where, in the ‘addConnector’ configuration, an org.mortbay.jetty.nio.SelectChannelConnector is configured by default, but there’s also a commented-out older org.mortbay.jetty.bio.SocketConnector implementation, with the comment “Use this connector if NIO is not available.”
I know “nio” stands for “Non-blocking IO”. Apparently there is some problem with the combination of Java 1.4 (or the particular JVM I’m running, which is whatever was installed on this ancient machine — might be HotSpot, etc.), Jetty 6.1.26, and NIO: “non-blocking” turned into extreme blocking/deadlocking instead. One clue was a post I found via Google somewhere (which I think I’ve lost now) that said there were some known bugs with Jetty 6.1.x and NIO, and that recent 6.1.x releases had workarounds in place, but the underlying bugs themselves were not planned to be fixed in the 6.1 branch (which is the last that runs under Java 1.4). Apparently, in addition to the known bugs with workarounds, there might be some additional bugs that do not have working workarounds in the codebase.
Commenting out the nio.SelectChannelConnector configuration, and uncommenting the configuration for the older bio.SocketConnector implementation, seems to have my webapp functioning correctly and predictably now. Phew — that was like three days of my time figuring that one out.
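In jetty.xml terms, the change looks roughly like this. I’m reconstructing from memory of the stock 6.1.26 file, so the port and any extra `<Set>` options are placeholders — the point is just which connector class ends up uncommented:

```xml
<!-- Before: the default NIO connector, now commented out -->
<!--
<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.nio.SelectChannelConnector">
      <Set name="port">8080</Set>
    </New>
  </Arg>
</Call>
-->

<!-- After: the older blocking connector ("if NIO is not available") -->
<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.bio.SocketConnector">
      <Set name="port">8080</Set>
    </New>
  </Arg>
</Call>
```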