So once last week, and then once again this week, I got reports that our EZProxy server was timing out.
When it happened this week, I managed to investigate while the problem was still occurring, and noticed that the EZProxy process on the server was maxing out the CPU at ~100%. Since the EZProxy process normally doesn’t get above 10 or 20% of CPU in `top`, even during our peak times, something was up.
Looking at the EZProxy logs, I noticed a very high volume of requests that the “%r” LogFormat placeholder logged like this:
"GET http://proxy1.library.jhu.edu:80http://ib.adnxs.com/ttj?id=3018854&size=728x90&cb=[CACHEBUSTER]&referrer=[REFERRER_URL]&pubclick=[INSERT_CLICK_TAG] HTTP/1.0"
ib.adnxs.com seems to be related to serving ads, and these requests were coming from hundreds of different IPs. So my first guess was that this is some kind of botnet trying to register clicks on web ads for profit. (And that guess remains my assumption.)
I was still confused about exactly what that logged request meant: two URLs jammed together like that? What kind of request were these clients actually making?
Eventually OCLC EZProxy support was able to clarify that this is what’s logged when a client tries to make a standard HTTP Proxy request to EZProxy, as if EZProxy were a standard HTTP Proxy server. I.e.,
curl --proxy proxy1.library.jhu.edu:80 http://ib.adnxs.com/ttj...
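To make the distinction concrete: a client talking to a standard HTTP proxy puts a full URL in the request line (`GET http://host/path HTTP/1.0`), while a client talking directly to a web server sends only a path (`GET /path HTTP/1.0`). Here is a minimal sketch of telling the two apart. The function name `is_proxy_style` is my own invention for illustration, and it works on the raw request line as the client sends it, not on the jammed-together form EZProxy writes to its log.

```python
def is_proxy_style(request_line: str) -> bool:
    """Return True if a raw HTTP request line looks like a proxy request.

    Proxy requests carry an absolute URL as the request target
    (or use the CONNECT method); ordinary requests carry a path.
    """
    parts = request_line.split()
    if len(parts) != 3:
        return False  # not a well-formed "METHOD target VERSION" line
    method, target, _version = parts
    if method.upper() == "CONNECT":
        return True
    return target.lower().startswith(("http://", "https://"))
```

A normal EZProxy login request such as `GET /login?url=http://example.com HTTP/1.1` is not flagged, because its target starts with a path even though a URL appears in the query string.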
Now, EZProxy isn’t a standard HTTP Proxy server, so it does nothing with this kind of request. My guess is that some human or automated process noticed a DNS hostname involving the word ‘proxy’, and figured it was worth a try to sic a bot army on it. But it’s not accomplishing what it wanted to accomplish: this ain’t an open HTTP proxy, or even a standard HTTP proxy at all.
But the sheer volume of them was causing problems. Apparently EZProxy has to run a fair amount of logic just to determine that it can do nothing with such a request, enough that the volume of these requests was pushing EZProxy to 100% CPU utilization, even though it would do nothing with them.
It’s not such a large volume of traffic that it overwhelms the OS network stack or anything; if I block all the IP addresses involved in EZProxy config with `RejectIP`, then everything’s fine again, CPU utilization is back to <10%. It’s just EZProxy the app that is having trouble dealing with all these.
So first, I filed a feature/bug request with OCLC/EZProxy, asking for EZProxy to be fixed/improved here, so that if something tries making a standard HTTP Proxy request against it, it ignores the request in a less CPU-intensive way and can shrug off a higher volume of them.
Secondly, our local central university IT network security thinks they may have the tools to block anything that looks like a standard HTTP Proxy request at the network perimeter, before it even reaches our server. There is no legitimate reason for these requests, and nothing useful EZProxy can do with them.
If all this fails, I may need to write a cronjob script which regularly scans the EZProxy logs for lines that look like standard HTTP Proxy requests, notes the IPs, and then automatically adds the IPs to an EZProxy config file with `RejectIP` (restarting EZProxy to take effect). This is a pain, would have some delay before banning abusive clients (you don’t want to go restarting EZProxy every 60 seconds or anything), and would possibly end up banning legitimate users. Maybe they’re infected by malware, but they’d stay banned even after they got rid of the malware. Or maybe they got confused and accidentally configured EZProxy as an HTTP Proxy in their web browser, but again, they’d stay banned even after they fixed it.
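Here is a rough sketch of what the log-scanning half of such a script might look like. It rests on assumptions that would need checking against the real setup: that each log line begins with the client IP, that the “%r” request is logged in double quotes, and that proxy-style requests show up with two URLs jammed together as in the example above. The names `offending_ips`, `reject_directives`, and the `min_hits` threshold are made up for illustration, and the exact `RejectIP` syntax should be confirmed against the EZProxy documentation.

```python
import re
from collections import Counter

# Assumed log layout: client IP first, then (somewhere) the quoted request.
# Matches a request whose target is "http://<ezproxy-host:port>http://...",
# i.e. a second URL scheme appearing before the first path slash.
PROXY_REQUEST = re.compile(
    r'^(?P<ip>\S+) .*?"[A-Z]+ https?://[^/\s"]*?https?://'
)

def offending_ips(lines, min_hits=20):
    """Return sorted IPs with at least min_hits proxy-style requests."""
    hits = Counter()
    for line in lines:
        match = PROXY_REQUEST.match(line)
        if match:
            hits[match.group("ip")] += 1
    return sorted(ip for ip, count in hits.items() if count >= min_hits)

def reject_directives(ips):
    """Format one RejectIP config line per offending IP."""
    return [f"RejectIP {ip}" for ip in ips]
```

Requiring a minimum number of hits before banning is one way to lower the odds of catching a single confused user, though it does nothing about bans outliving the problem.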
I guess another alternative would be putting EZProxy behind an Apache or nginx reverse proxy, so we could write rules in the front-end web server to filter out these requests before they make it to EZProxy.
Or having a log scanning cronjob which actually blocks bad IPs with the OS iptables (perhaps using the ‘fail2ban’ tool), rather than with EZProxy `RejectIP` config (thus avoiding the need for an EZProxy restart when adding blocked IPs).
But the best solution would be EZProxy fixing itself to not take excessive CPU when under a high volume of HTTP Proxy requests, and simply ignoring them in a less CPU-intensive way. I have no idea how likely it is for OCLC to fix EZProxy like this.
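As a sketch of the iptables variant, assuming some earlier step has already produced a list of offending IPs: the function below only builds the command strings so they can be reviewed before anything is run as root. The name `iptables_drop_commands` is hypothetical, and the rules it emits are the simplest possible ones (drop everything from that source on the INPUT chain). A real fail2ban setup would use its own filter and jail configuration instead, and can also expire bans after a set time.

```python
import ipaddress

def iptables_drop_commands(ips):
    """Build (but do not run) one DROP rule per valid IP address."""
    commands = []
    for raw in ips:
        try:
            addr = ipaddress.ip_address(raw)
        except ValueError:
            continue  # skip anything that is not a literal IP address
        # IPv6 addresses need ip6tables rather than iptables.
        tool = "iptables" if addr.version == 4 else "ip6tables"
        commands.append(f"{tool} -I INPUT -s {addr} -j DROP")
    return commands
```

Validating each value with `ipaddress` before interpolating it into a command keeps junk from a malformed log line out of anything that later gets executed.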