HackerNews down, unwisely returning http 200 for outage message

So HackerNews is currently down. It happens, we’ll probably find out why when it returns.

For a while yesterday, all HN URLs were returning error messages from CloudFlare, apparently their CDN. But today, all HN URLs are returning an apparently intentional outage message, “Sorry for the downtime. We hope to be back soon.”

But they are returning an HTTP 200 “OK” status code with this message. From all URLs.

This seems like a big mistake. You are telling any interested software (such as, say, Google), “All is well, and this is the proper content for the URL you requested.” For every single URL. Google might index this content. Google might decide that since there are a bazillion URLs at your hostname that all have the same content, relevancy/PageRank decisions should be made based on this, probably harming your visibility (why would a search engine give good visibility to a big website with a million URLs that all say “Sorry for the downtime. We hope to be back soon”?). Etc.

I don’t know if Google in particular really does this; perhaps Google is smart enough to deal with improper 200’s for error pages, somehow using heuristics to guess that it’s really a temporary error that should be ignored. Not sure how that would work, but Google is often clever, I dunno.

But it’s still a bad idea. Don’t return 200s for error pages. Use an appropriate response code for temporary outages. Google itself seems to suggest using 503 Service Unavailable, which makes a lot of sense. If you can’t do this for some reason, perhaps you could use a 307 Temporary Redirect to an outage message — you’re saying it’s a ‘temporary’ redirect, which shouldn’t be considered long-term content by indexers and such. (A 301 Permanent Redirect, or a 404 Not Found, seems just as bad as a 200.)
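For illustration, here’s a minimal sketch of what an honest maintenance responder could look like — a hypothetical WSGI app, not anything HN or CloudFlare actually runs:

```python
# Hypothetical sketch: a WSGI maintenance app that returns 503 with a
# Retry-After hint during a planned outage, instead of a misleading 200.
def maintenance_app(environ, start_response):
    body = b"Sorry for the downtime. We hope to be back soon.\n"
    start_response("503 Service Unavailable", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Retry-After", "3600"),        # suggest retrying in an hour
        ("Cache-Control", "no-cache"),  # and please don't cache this page
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Browsers will happily render the body of a 503, so human visitors still see the outage message — while crawlers and other software agents get an accurate status.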

In HackerNews’s case, it may actually be CloudFlare returning the 200, through misconfiguration or a poorly thought-out feature on CloudFlare’s part. Either way, it seems like a bad idea.

Use HTTP response codes responsibly, and software agents consuming your web pages will be happier! And there are some software agents (like Google) you really want to keep happy.

Actually, now that I look at those headers — those cache control headers seem unwise too. Am I wrong, or is the response telling agents they can cache the “Sorry for the downtime” message for 10 years? That doesn’t seem wise either, does it?

$ curl -i https://news.ycombinator.com
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Mon, 06 Jan 2014 15:22:04 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=[omitted]; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.ycombinator.com; HttpOnly
Last-Modified: Mon, 06 Jan 2014 13:14:48 GMT
Vary: Accept-Encoding
Expires: Thu, 04 Jan 2024 13:14:48 GMT
Cache-Control: max-age=315352364
Cache-Control: public
CF-RAY: [omitted]

  <link rel="stylesheet" type="text/css" href="/news.css">
  <link rel="shortcut icon" href="/favicon.ico">
  <title>Hacker News</title>
    <table border="0" cellpadding="0" cellspacing="0" width="85%" bgcolor="#f6f6ef">
        <td bgcolor="#ff6600">
          <table border="0" cellpadding="0" cellspacing="0" width="100%" style="padding:2px">
              <td style="width:18px;padding-right:4px">
                <a href="http://ycombinator.com">
                  <img src="/y18.gif" width="18" height="18" style="border:1px #ffffff solid;" />
              <td style="line-height:12pt; height:10px;">
                <span class="pagetop"> <b><a href="/news">Hacker News</a></b>
      <tr style="height:10px"></tr>
          Sorry for the downtime. We hope to be back soon.
          <img src="s.gif" height="10" width="0" />
          <table width="100%" cellspacing="0" cellpadding="1">
              <td bgcolor="#ff6600"></td>
          <br />
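A quick back-of-the-envelope check on that max-age value (which, per the HTTP spec, is in seconds) confirms the roughly ten-year lifetime:

```python
# Convert the Cache-Control max-age from the response above into years.
max_age = 315352364                  # seconds, from Cache-Control
seconds_per_year = 365 * 24 * 3600   # 31,536,000
years = max_age / seconds_per_year
print(round(years, 4))               # 9.9998
```

That also lines up with the Expires header: Thu, 04 Jan 2024 is almost exactly ten years after the response’s Date of Mon, 06 Jan 2014.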

31 Responses to HackerNews down, unwisely returning http 200 for outage message

  1. Ben says:

    Odd. http://hackernews.org works just fine, although news.ycombinator.com is down, as you mention. A ghost of Christmas past wreaking havoc on Teh Intertubes?

  2. jrochkind says:

hackernews.org is apparently not the thing most people call Hacker News; I think news.ycombinator.com is the original thing people call Hacker News, and hackernews.org is something else.


  3. megan says:

    the -I (capital I) flag for curl will return just the status code and the headers, which makes the output much easier to handle when that’s all you care about

  4. stefan says:

If I had to guess (which I do, because I have no first-hand knowledge) I’d say that CloudFlare, if it sees a non-2xx reply from its upstream, will not replicate that out to its edge servers. Therefore, you’d end up with some kind of situation where all the users are still banging on the upstream servers with comments and logins and posts etc., but the upstreams would be unable to deal with them, leading to a less-than-optimal user experience. The 200 from the CDN edge is non-optimal, but acceptable, as is the caching change, as the next time the requests are made from the server they’ll be proper. All interested parties, at some point in the future, will re-load the page and see the “fixed” version of everything.

  5. Headers do seem to indicate caching for 10 years is OK.

  6. jrochkind says:

    So, when HN finally comes back up — is everyone going to have to do a hard-refresh in their browsers to actually see it up, instead of just seeing the cached outage message?

  7. When the site is back up you can simply flush the cached pages in Cloudflare’s management console.

    BTW perhaps they’re returning 200 so that they can display the problem to the site users. A 500 error would not allow that since the browser takes over.

  8. jrochkind says:

    Those cache headers aren’t just seen by cloudflare, they are being sent along with the outage message to actual user-agents. If a browser or intermediary cache has a cached copy of the outage message, and thinks the cached copy is good for 10 years — there’s nothing you can do in a cloudflare management console to change that, is there?

    I am not sure it’s true that all/most browsers won’t let you have custom content for a 504, but if it is, then I still say a 307 Temporary Redirect to an outage page would be preferable, as far as telling software agents accurate information.

  9. Ah, yes I think you’re right. Any clients that have seen the max-age may need to have a force refresh as you said.

  10. Returning 200 is the correct way of doing it.

    Server is online and responding to your request as author intended, even if it’s to show a message that they are down (which is not a ‘server/application’ level error).

    Arguing that is pretty silly.

  11. Also search engine crawlers do not care about expires header and I doubt that users will have to hard refresh non-static content.

    Even if this was an issue, all they have to do is set headers in past to fix it.

  12. jrochkind says:

    I was not expecting so much attention for this post, I guess it’s cause people are bored with HackerNews being down, more than that they are interested in the finer points of HTTP headers. And I guess with so much attention, it’s gonna attract some haters.

    But, well, you are mistaken on several things.

  13. Joey Blake says:

    Sure you weren’t expecting traffic. Just that “HACKER NEWS IS DOWN ZOMG!!!!” :-). But, say all of the hackernews pages get indexed by google as wrong and they drop from google altogether. It’s a site of links to other sites that changes perpetually. Does it really matter? I have never found a hacker news post in a google search result, ever. I GO to hackernews to see what is posted. My peers that read hacker news do the same, they go there. Where search indexing is completely irrelevant. I’d love to hear if it actually affects them long term. But I doubt it will. I keep refreshing.

  14. jrochkind says:

    yep, I think that’s so, others have posted that HackerNews doesn’t generally care about getting indexed by google anyway.

    I wasn’t trying to post an HN “gotcha”, so much as use it as an opportunity to discuss using proper HTTP statuses in order to make your site intelligible to software agents — an issue dear to my heart. Google is just usually (not always, just usually) the software agent website producers care about the most — but if you use proper HTTP status codes, you make your site generally more intelligible to all sorts of software agents, and the more people that do this, the more robust and powerful a web we have — even if it doesn’t necessarily immediately bring you more traffic or money or something, it’s just good web hygiene, sort of web public health, heh.

    Someone else on proggit brought up the example of http://www.downforeveryoneorjustme.com/, another thing that will report “It’s up” if you return 200, but would report ‘down’ if it returned a proper error code. Again, most web site producers probably don’t really care, and this particular service is not super exciting, but it’s an example of how good http status codes and software agent intelligibility make many ‘higher order’ services possible, and make the web better for everyone.

  15. Joey Blake says:

    Absolutely. Responses should be consistent. If a site is down, we should be able to know on all levels, if an there is an api somewhere in error, we should know why. Not an empty response. Definitely not an “everything is great” response. This is a good opportunity to discuss that. And link bait intention or not. This is an excellent example to get a discussion started.

  16. Josh says:

At http://KillSwit.ch we rely on HTTP headers (among other things soon to come) to determine the status of a site being up or down. Returning 200 isn’t a best practice by any means.

  17. Pingback: HackerNews down, unwisely returning http 200 for outage message | Enjoying The Moment

  18. Bob says:

    wow “Freedom Dumlao (@APIguy)” and others citing him. You should _really_ read the article you post yourself. Because you’re talking crap.

  19. Juanito says:

    So what? You’re making a big deal out of something which is not relevant.

    Get a life.

  20. Lal Krishna says:

    I had to do a hard refresh. I saw it was still down and thought it was odd after such a long time, so I Ctrl-F5 and here we are.

  21. yo says:

    You are right, this is so outrageous! HN should be ashamed in public forever for this instead of just having just privately messaged them about that!

  22. D says:

    It’s giving a 200 because it’s up just not the correct cache ur looking for dork face fix ur face

Just a heads up… http://www.downforeveryoneorjustme.com/ is shortened to http://isup.me/ (and yeah, the 200 response threw me off too, sigh).

  24. Gautam says:

Even I had to hard refresh. I thought it was still down, but only when I read this article did I try doing a hard refresh.
    This article is completely correct!

  25. asefeffeffffffffffff says:

    It’s really disappointing to read blogs like this from someone who obviously doesn’t have a clue. The server is responding properly and server health has ZERO to do with the availability of content in a CMS. The server is working fine, responding with code 200. Getting the server health and the content health confused is a huge noob red flag.

  26. jrochkind says:

I’m sorry, you are mistaken that HTTP response status codes are only about ‘server health’. Which is a Huge Noob Red Flag, whatever that means. (Also, pretty obviously the server was not ‘healthy’!) But I’m sorry you were disappointed, I recommend spending less time on the internet.

  27. jtest says:

    ahhh definitely a pet peeve of mine and so many frameworks appear to be doing the same. Custom error pages are great and all, but give me the correct status code!!!! makes automated testing the security of websites a total nuisance.

  28. Pingback: Planning to Go Down, HTTP Edition | tedious ramblings

  29. JV says:

    Only now, after reading this blog I thought of – hey, am I still seeing the cached downtime message? CTRL + F5 and HN is UP :)

  30. Sonakshi says:

This post is proof that even big and established websites can also be broken or not follow the rules — I guess except Google.
