HackerNews down, unwisely returning http 200 for outage message

So HackerNews is currently down. It happens, we’ll probably find out why when it returns.

For a while yesterday, all HN URLs were returning error messages from CloudFlare, apparently their CDN. But today, all HN URLs are returning an apparently intentional outage message, “Sorry for the downtime. We hope to be back soon.”

But they are returning an HTTP 200 “OK” status code with this message. From all URLs.

This seems like a big mistake. You are telling any interested software (such as, say, Google), “All is well, and this is the proper content for the URL you requested.” For every single URL. Google might index this content. Google might also decide that, since there are a bazillion URLs at your hostname that all have the same content, relevancy/pagerank decisions should be made based on that, probably harming your visibility. Why would a search engine give good visibility to a big website with a million URLs, all of which say “Sorry for the downtime. We hope to be back soon”?

I don’t know if Google in particular really does this; perhaps Google is smart enough to deal with improper 200’s for error pages, somehow using heuristics to guess that it’s really a temporary error that should be ignored. Not sure how that would work, but Google is often clever, I dunno.

But it’s still a bad idea. Don’t return 200’s for error pages. Use an appropriate response code for temporary outages. Google itself seems to suggest using 503 Service Unavailable, which makes a lot of sense. If you can’t do this for some reason, perhaps you could use a 307 Temporary Redirect to redirect to an outage message: you’re saying it’s a ‘temporary’ redirect, which shouldn’t be considered long-term content by indexers and such. (A 301 Moved Permanently, or a 404 Not Found, seems just as bad as a 200.)
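
For what it’s worth, here is a minimal sketch of the 503 approach using Python’s standard-library http.server. It is not what HN or CloudFlare actually run; the names MaintenanceHandler and make_server, the port, and the one-hour Retry-After value are all my own assumptions. It answers every URL with the same outage text, but with a 503 status, a Retry-After hint, and Cache-Control: no-store so that nobody holds on to the message.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

OUTAGE_HTML = (
    b"<html><body><p>Sorry for the downtime. "
    b"We hope to be back soon.</p></body></html>"
)
RETRY_AFTER_SECONDS = 3600  # assumed value; match it to the expected outage


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: every path gets the outage page with a 503."""

    def _send_outage(self, include_body):
        self.send_response(503)  # Service Unavailable: temporary, retry later
        self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
        self.send_header("Cache-Control", "no-store")  # do not cache the outage page
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(OUTAGE_HTML)))
        self.end_headers()
        if include_body:
            self.wfile.write(OUTAGE_HTML)

    def do_GET(self):
        self._send_outage(include_body=True)

    def do_HEAD(self):
        # Same status and headers as GET, but no body.
        self._send_outage(include_body=False)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8080):
    """Build (but do not start) the maintenance server."""
    return HTTPServer((host, port), MaintenanceHandler)
```

Calling make_server().serve_forever() and then running curl -i against http://127.0.0.1:8080/ should show the 503 status line along with the same human-readable message, so people still get the apology while software agents get an honest status.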

In HackerNews’s case, it may actually be CloudFlare returning the 200, through a misconfiguration or a poorly thought-out CloudFlare feature. Either way, it seems like a bad idea.

Use HTTP response codes responsibly, and software agents consuming your web page will be happier! And there are some software agents (like Google) that you really want to keep happy.

Actually, now that I look at those headers — those cache control headers seem unwise too. Am I wrong, or is the response telling agents they can cache the “Sorry for the downtime” message for 10 years? That doesn’t seem wise either, does it?

$ curl -i https://news.ycombinator.com
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Mon, 06 Jan 2014 15:22:04 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=[omitted]; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.ycombinator.com; HttpOnly
Last-Modified: Mon, 06 Jan 2014 13:14:48 GMT
Vary: Accept-Encoding
Expires: Thu, 04 Jan 2024 13:14:48 GMT
Cache-Control: max-age=315352364
Cache-Control: public
CF-RAY: [omitted]

  <link rel="stylesheet" type="text/css" href="/news.css">
  <link rel="shortcut icon" href="/favicon.ico">
  <title>Hacker News</title>
    <table border="0" cellpadding="0" cellspacing="0" width="85%" bgcolor="#f6f6ef">
        <td bgcolor="#ff6600">
          <table border="0" cellpadding="0" cellspacing="0" width="100%" style="padding:2px">
              <td style="width:18px;padding-right:4px">
                <a href="http://ycombinator.com">
                  <img src="/y18.gif" width="18" height="18" style="border:1px #ffffff solid;" />
              <td style="line-height:12pt; height:10px;">
                <span class="pagetop"> <b><a href="/news">Hacker News</a></b>
      <tr style="height:10px"></tr>
          Sorry for the downtime. We hope to be back soon.
          <img src="s.gif" height="10" width="0" />
          <table width="100%" cellspacing="0" cellpadding="1">
              <td bgcolor="#ff6600"></td>
          <br />
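
To check the “10 years” reading, here is a small sketch that does the arithmetic on the header values copied from the response above. The variable names are mine; the only library call is the standard email.utils.parsedate_to_datetime, which parses HTTP-style dates.

```python
from email.utils import parsedate_to_datetime

# Header values copied from the curl output above.
date = parsedate_to_datetime("Mon, 06 Jan 2014 15:22:04 GMT")
last_modified = parsedate_to_datetime("Mon, 06 Jan 2014 13:14:48 GMT")
expires = parsedate_to_datetime("Thu, 04 Jan 2024 13:14:48 GMT")
max_age = 315352364  # seconds, from "Cache-Control: max-age=..."

SECONDS_PER_DAY = 86400

# Freshness lifetime implied by Expires, measured from the Date header.
lifetime_from_expires = int((expires - date).total_seconds())

# How far Expires sits past Last-Modified, in whole days.
expires_offset_days = (expires - last_modified).days

# max-age expressed in 365-day years.
max_age_years = max_age / (365 * SECONDS_PER_DAY)

print(lifetime_from_expires == max_age)  # True
print(expires_offset_days)               # 3650
print(round(max_age_years, 2))           # 10.0
```

So the two headers agree with each other: Expires is exactly 3650 days after Last-Modified, and max-age is the time remaining until then. Any cache that honors them is allowed to keep the outage page for roughly a decade.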

31 thoughts on “HackerNews down, unwisely returning http 200 for outage message”

  1. the -I (capital I) flag for curl will return just the status code and the headers, which makes the output much easier to handle when that’s all you care about

  2. If I had to guess (which I do, because I have no first hand knowledge) I’d say that CloudFlare, if it sees a non 2xx reply from its upstream, will not replicate that out to its edge servers. Therefore, you’d end up with some kind of situation where all the users are still banging on the upstream servers with comments and logins and posts etc, but the upstreams would be unable to deal with them, leading to a less-than-optimal user-experience. The 200 from the CDN edge is non-optimal, but acceptable, as is the caching change, as the next time that the requests are made from the server they’ll be proper. All interested parties, at some point in the future, will re-load the page and see the “fixed” version of everything.

  3. So, when HN finally comes back up — is everyone going to have to do a hard-refresh in their browsers to actually see it up, instead of just seeing the cached outage message?

  4. When the site is back up you can simply flush the cached pages in Cloudflare’s management console.

    BTW perhaps they’re returning 200 so that they can display the problem to the site users. A 500 error would not allow that since the browser takes over.

  5. Those cache headers aren’t just seen by cloudflare, they are being sent along with the outage message to actual user-agents. If a browser or intermediary cache has a cached copy of the outage message, and thinks the cached copy is good for 10 years — there’s nothing you can do in a cloudflare management console to change that, is there?

    I am not sure it’s true that all/most browsers won’t let you have custom content for a 504, but if it is, then I still say a 307 Temporary Redirect to an outage page would be preferable, as far as telling software agents accurate information.

  6. Returning 200 is the correct way of doing it.

    Server is online and responding to your request as author intended, even if it’s to show a message that they are down (which is not a ‘server/application’ level error).

    Arguing that is pretty silly.

  7. I was not expecting so much attention for this post, I guess it’s cause people are bored with HackerNews being down, more than that they are interested in the finer points of HTTP headers. And I guess with so much attention, it’s gonna attract some haters.

    But, well, you are mistaken on several things.

  8. Sure you weren’t expecting traffic. Just that “HACKER NEWS IS DOWN ZOMG!!!!” :-). But, say all of the hackernews pages get indexed by google as wrong and they drop from google altogether. It’s a site of links to other sites that changes perpetually. Does it really matter? I have never found a hacker news post in a google search result, ever. I GO to hackernews to see what is posted. My peers that read hacker news do the same, they go there. Where search indexing is completely irrelevant. I’d love to hear if it actually affects them long term. But I doubt it will. I keep refreshing.

  9. yep, I think that’s so, others have posted that HackerNews doesn’t generally care about getting indexed by google anyway.

    I wasn’t trying to post an HN “gotcha”, so much as use it as an opportunity to discuss using proper HTTP statuses in order to make your site intelligible to software agents — an issue dear to my heart. Google is just usually (not always, just usually) the software agent website producers care about the most — but if you use proper HTTP status codes, you make your site generally more intelligible to all sorts of software agents, and the more people that do this, the more robust and powerful a web we have — even if it doesn’t necessarily immediately bring you more traffic or money or something, it’s just good web hygiene, sort of web public health, heh.

    Someone else on proggit brought up the example of http://www.downforeveryoneorjustme.com/, another thing that will report “It’s up” if you return 200, but would report ‘down’ if it returned a proper error code. Again, most web site producers probably don’t really care, and this particular service is not super exciting, but it’s an example of how good http status codes and software agent intelligibility make many ‘higher order’ services possible, and make the web better for everyone.

  10. Absolutely. Responses should be consistent. If a site is down, we should be able to know on all levels; if there is an API somewhere in error, we should know why. Not an empty response. Definitely not an “everything is great” response. This is a good opportunity to discuss that. And link bait intention or not, this is an excellent example to get a discussion started.

  11. wow “Freedom Dumlao (@APIguy)” and others citing him. You should _really_ read the article you post yourself. Because you’re talking crap.

  12. So what? You’re making a big deal out of something which is not relevant.

    Get a life.

  13. You are right, this is so outrageous! HN should be ashamed in public forever for this instead of just having just privately messaged them about that!

  14. It’s giving a 200 because it’s up just not the correct cache ur looking for dork face fix ur face

  15. Even i had to hard refresh. I thought it was still down, but only when i read this article, i tried doing a hard refresh.
    This article is completely correct!

  16. It’s really disappointing to read blogs like this from someone who obviously doesn’t have a clue. The server is responding properly and server health has ZERO to do with the availability of content in a CMS. The server is working fine, responding with code 200. Getting the server health and the content health confused is a huge noob red flag.

  17. I’m sorry, you are mistaken that HTTP response status codes are only about ‘server health’. Which is a Huge Noob Red Flag, whatever that means. (Also, pretty obviously the server was not ‘healthy’!) But I’m sorry you were disappointed, I recommend spending less time on the internet.

  18. ahhh definitely a pet peeve of mine and so many frameworks appear to be doing the same. Custom error pages are great and all, but give me the correct status code!!!! makes automated testing the security of websites a total nuisance.

  19. Only now, after reading this blog I thought of – hey, am I still seeing the cached downtime message? CTRL + F5 and HN is UP :)

  20. This post is proof that even big and established websites can be broken or not follow the rules, I guess except Google.
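
The downforeveryoneorjustme example that came up in the comments can be sketched in a few lines. This is only my guess at how such a checker might judge a site by status code alone, not how that service actually works; the function names classify_status and check and the verdict labels are made up for illustration.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code to a coarse verdict (labels are illustrative)."""
    if 200 <= status < 400:
        return "up"
    if status in (502, 503, 504):
        return "down (temporary)"
    if 500 <= status < 600:
        return "down"
    return "client error"


def check(url, timeout=10):
    """Fetch url and judge it by status code alone, ignoring the body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still available.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "down (unreachable)"
```

A checker like this has no way to tell an outage page served with a 200 from a healthy site, which is the whole point: it would have reported “up” during the outage, while a 503 would have been classified as a temporary failure.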
