apache conf for Rails asset pipeline

The Rails asset pipeline produces ‘fingerprinted’ (aka ‘digested’, but I find ‘fingerprinted’ the better term) names of assets (JS, CSS, images, etc), such that the name will change if the content changes: for instance “application-76142bb0898d0922b3f1518cd20a19e6.css” instead of just “application.css”.  This is intended to make the asset URLs ‘infinitely cacheable’ by browsers.

And the asset pipeline produces gzipped versions of all assets, that can theoretically be served up to browsers that will accept them (any modern user-agent) to save bandwidth.

But neither of these two things will happen without some specific configuration in apache (or nginx, or other front-end web server that’s serving your static compiled assets). I suspect most Rails users aren’t doing the specific configuration. (And in the case of the gzip’d stuff, it’s unclear exactly how to do so in apache).

Far-future expires for fingerprinted assets

Even back pre-asset pipeline (pre Rails 3.1), Rails generated similar fingerprints in query strings, meant for the same purpose (that technique didn’t work as well as asset pipeline/sprockets fingerprinting does, but was intended to do the same thing).  But back in those days, there was no rails documentation pointing out you actually had to get your web server to serve far-future expires headers for these fingerprinted assets.

Fortunately, the current Rails Asset Pipeline Guide does mention this, and provides sample configuration for both apache and nginx (Section 4.1.1).

However, the configuration that the Guide recommends will have the server send far-future expires headers for everything on your web server at “/assets”. (“Infinite” or “far-future” caching means “one year” in practice: the HTTP/1.1 spec, RFC 2616, says servers should not send Expires dates more than one year in the future.)

But current Rails produces non-fingerprinted assets in there too, at URLs that represent the current release’s version.  If the server sends far-future expires for these, then browsers will cache them forever, and keep using old cached versions even if new releases have new versions. That’s bad.  In “typical” Rails use, nothing is referencing these un-fingerprinted asset URLs. But:

  • Maybe you have an old app that began as Rails2 (or Rails1), that still references un-fingerprinted names in CSS (background images?) or JS.  Yeah, that’s a mistake, but do you want the mistake to result in browsers caching old wrong versions forever (if you apply the Rails Guide’s suggestion)? Or do you want the mistake to result in browsers not caching the asset but checking with a conditional GET every time? That’s not as performant as it could be, but at least you get the right behavior. I know I want the second one.
  • I have some cases where I intentionally send people to the un-fingerprinted URL. For ‘integration’ purposes, where a link is embedded in some ‘third party’ software or web page intentionally pointing at “the latest version”. (A JS widget, a re-usable image, re-usable styles in CSS, etc).  (People have been making sounds about removing generation of un-fingerprinted versions in Rails4 altogether, which is going to be a huge pain for me since I’m relying on them, but that’s another story).

So I make this tweak to the apache configuration suggested by the Rails asset guide, to have the far-future expires headers apply only to URLs that are actually fingerprinted. There’s no great iron-clad way to do this, so I set the regex in the config to apply to files that “look like” fingerprinted URLs: they have a hyphen followed by (at least) 32 hex digits in them.  This is unlikely to get any ‘false positives’, although if sprockets/Rails changes its fingerprinting method (fewer than 32 digits, or a different separator?), it might stop sending far-future expires headers.

Again, I’d rather have a failure mode that stops caching things it could have cached (performance hit but functionality intact), than one that causes things to be cached forever that shouldn’t have been (functionality broken, and nothing you can do about it except get all affected users to manually delete their browser caches, which is not feasible). So this is something it’s really important to avoid.

# Rails fingerprinted assets, make them cached forever.
# Try to match only if the asset actually has a fingerprint in it.
# Needs mod_headers (Header) and mod_expires (Expires*) enabled.
<LocationMatch "^/assets/.*-[0-9a-f]{32}.*$">
  Header unset ETag
  FileETag None
  # RFC says only cache for 1 year
  ExpiresActive On
  ExpiresDefault "access plus 1 year"
</LocationMatch>
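
One caveat about the un-fingerprinted URLs: when a response carries no explicit caching headers at all, browsers are allowed to guess a freshness lifetime from Last-Modified, so they may not actually revalidate on every request. If the conditional-GET-every-time behavior matters, it could be spelled out explicitly. The following is only a sketch, not from the Rails Guide and not tested against the setup described here; it assumes mod_headers is enabled and relies on Apache's PCRE regex support for the negative lookahead.

```apache
# Sketch, not from the Rails Guide: anything under /assets whose path does
# NOT contain a "-<32 hex digits>" fingerprint gets "Cache-Control: no-cache",
# so browsers may store it but must revalidate (conditional GET) before reuse.
# Assumes mod_headers is enabled.
<LocationMatch "^/assets/(?!.*-[0-9a-f]{32}).*$">
  Header set Cache-Control "no-cache"
</LocationMatch>
```

Because this pattern is the exact complement of the fingerprint pattern above, a given asset URL should match one block or the other, never both.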

Gzipped versions

Now, the asset pipeline also produces a “.gz” version of all assets.

The idea here is that the web server can serve this gzipped version, to a user agent that advertises it’s willing to accept it, to make the asset transfers smaller and faster. Apparently in general, this results in end-user performance improvements (some of the various ‘web optimization’ tools definitely recommend “consider gzipping your CSS/JS”).

The problem again is that the asset pipeline is producing them, but they just won’t be used. The rails app generates URLs to your assets that don’t include the “.gz”, the browser asks for the assets without the “.gz”, the web server serves the ordinary non-gzipped assets, and those gzipped ones are never used at all. (In fact, getting Rails to generate URLs using “.gz” is not the right solution: it would be hard to get browsers to reliably do the right thing with those, at least without some web server configuration to fix the content-type headers. A browser doesn’t want application/gzip when it thinks it’s getting a stylesheet.)

The Rails Asset Pipeline Guide gives a sample config for nginx, but for Apache, sadly just says “A robust configuration for Apache is possible but tricky; please Google around.”  Googling around actually isn’t too much help (if there was a no-need-to-think answer on Google, someone would have added it to the Guide already probably!). 

The guide used to include a suggested apache config; you can see the commit where it was removed in favor of a variation of the present warning.  This lets you see what the recommended config used to be, and also why fxn thought it wasn’t good enough:

“A robust Apache configuration for this feature seems to be tricky, one that takes into account Accept-Encoding, sets Vary, ensures Content-Type is right, etc.”

If you do google around, this stack overflow answer seems to be the best. It’s a variation of the advice that used to be in the guide, and it seems to take account of Accept-Encoding, and to ensure the right Content-Type.  I’m not sure if it ‘sets Vary’ (I only vaguely understand what Vary headers do, and don’t understand the problem related to them here), let alone if it does “etc.”, so I can’t say for sure if it solves all problems that exist, or all that fxn was aware of. I don’t feel personally that comfortable using it myself.

So I don’t have my server serving gzipped versions of assets.

But then, the asset pipeline is still creating those gzipped versions, no doubt part of the pain of the horribly-slow-asset-precompile that we’ve all felt — and yet they never get used at all. Anyone know if there’s a simple way to get asset pipeline compilation to stop producing them?

Another option: Apache already has its own solution to serving gzipped versions of static files, mod_deflate. Yeah, the apache mod_deflate will take time to create the gzipped version on first access, rather than the rails asset pipeline precompile that theoretically does it “out of band”.  I am not certain if Apache will re-gzip the file on every request, or is smart enough to cache the gzipped version for at least the life of the server instance. But at least the apache mod_deflate solution already has coded into it solutions for “takes into account Accept-Encoding, sets Vary, ensures Content-Type is right, etc.”. You would think anyway?
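
For reference, a minimal sketch of what the mod_deflate route might look like. It is untested against the setup described here and assumes mod_deflate is loaded (plus mod_filter on Apache 2.4, which is where AddOutputFilterByType lives in that version). The list of MIME types is a guess at what a server might label CSS and JS as; adjust it to whatever yours actually sends.

```apache
# Sketch only: compress CSS and JS under /assets on the fly with mod_deflate,
# instead of serving the precompiled .gz files.
<IfModule mod_deflate.c>
  <Location /assets>
    # Only compress responses whose Content-Type is one of these.
    AddOutputFilterByType DEFLATE text/css application/javascript application/x-javascript text/javascript
  </Location>
</IfModule>
```

mod_deflate works as an output filter: it checks the request's Accept-Encoding, leaves Content-Type alone, adds Content-Encoding: gzip, and sends Vary: Accept-Encoding. On its own it compresses each response as it is sent rather than keeping a compressed copy, so the CPU cost recurs per request unless some cache sits in front of it.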

If anyone has a config that uses apache mod_deflate for sending gzipped rails assets (instead of using the pre-zipped rails asset pipeline output, for which there’s no documented robust solution), and can confirm that it takes care of everything you care about, and whether the performance difference really matters, it would be nice to share with the rest of us! And if you were doing that, you’d still want to get the asset pipeline to stop creating the pre-computed gzipped versions that aren’t getting used.

Update 21Nov: Apache serve-gzipped config that seems to work

I started with sbutler’s suggestion from stackoverflow, but modified it somewhat:

  • To explicitly set the Vary header that fxn’s commit in the Rails guide pointed to as necessary.
  • To in general be more conservative about what it applies to, only applying to .css.gz and .js.gz files (not anything that has a .gz equivalent), and only applying in the /assets subdir. When I’m not completely sure if things will work, I want to err on the side of conservatism in what it applies to.

# Let apache serve the pre-compiled .gz version of static assets,
# if available, and the user-agent can handle it. Set all headers
# correctly when doing so.
# SOMEWHAT EXPERIMENTAL. If you think it's causing problems,
# just remove the following three LocationMatch blocks.
<LocationMatch "^/assets/.*\.(css|js)$">
  RewriteEngine on

  # Make sure the browser supports gzip encoding before we send it,
  # and that we have a precompiled .gz version.
  RewriteCond %{HTTP:Accept-Encoding} \b(x-)?gzip\b
  RewriteCond %{REQUEST_FILENAME}.gz -s
  RewriteRule ^(.+)$ $1.gz
</LocationMatch>

# Make sure Content-Type is set for 'real' type, not gzip,
# and Content-Encoding is there to tell browser it needs to
# unzip to get real type.
# Make sure Vary header is set; while apache docs suggest it
# ought to be set automatically by our RewriteCond that uses an HTTP
# header, that does not seem to work reliably.
<LocationMatch "^/assets/.*\.css\.gz$">
  ForceType text/css
  Header set Content-Encoding gzip
  Header add Vary Accept-Encoding
</LocationMatch>

<LocationMatch "^/assets/.*\.js\.gz$">
  ForceType application/javascript
  Header set Content-Encoding gzip
  Header add Vary Accept-Encoding
</LocationMatch>

Seems to be working well. Looked in the Chrome ‘network’ tab to make sure all response headers were as expected, including the Vary header.

And yeah, this definitely speeds up page load a lot for my very large asset files (a 300K minified-but-un-gzipped application.js! jquery-ui, grr.)

Is it possible there’s some edge case that will go wrong? Sure. There’s confusing things here. I wish apache (or passenger?) would just do this for me, done by people smarter than me about HTTP.  But it seems to be working.
