Rails Asset Pipeline: Serve assets easily and cheaply from CloudFront

Rails 4 comes packed with all kinds of great features; among my favorites is the improved asset pipeline. In this post, I'll cover a neat way of using the asset pipeline with Amazon's CloudFront to quickly, cheaply, and easily serve assets for your Rails application. (No S3 required!)

But first, some basics.

What is the asset pipeline? Why should I use it?

The asset pipeline manages how your assets (images, fonts, stylesheets, JavaScript files, etc.) are served to your site's visitors, optimizing for performance and cost.

It performs several key functions:

  • Translates asset routes: When referencing assets from your application (e.g. image_path('photo.jpg')), it translates these short names into fully qualified URLs, appropriate for the environment you're running the application in (development/test/production).
  • Compiles assets: Converts CoffeeScript, Sass, and other JS/CSS source files into their compiled, optimized forms.
  • Optimizes caching: Tags files with digests based on their content, to bust stale caches and ensure the right assets are served & used.

These features implement best practices out-of-the-box, allowing developers to take advantage of the speed & cost benefits of CDNs, while avoiding the usual headaches that caching can bring.

Behind the scenes: how it works

When a Rails application using the asset pipeline is deployed to a production environment, assets are precompiled using rake assets:precompile.

For each asset it finds in the app/assets folder it:

  1. Compiles it, if it is a JavaScript file or stylesheet (e.g. Sass). During this process, it typically combines smaller referenced files into a single JS or CSS file.
  2. Computes the file's digest (looks like an MD5 hash) based on its content. This unique hash effectively acts as a cache key. If its content has not changed, its hash will not change either.
  3. Copies the compiled file to the public/assets folder, appending the digest to the end of the filename. (e.g. photo-a6c9ef.jpg)
  4. Adds the asset to the public/assets/manifest.json file, which maps the asset's logical path to its compiled path in the public/assets folder. (e.g. "photo.jpg":"photo-a6c9ef.jpg")
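
To make steps 2-4 concrete, here is a minimal sketch in plain Ruby. It is not how Sprockets itself is implemented; the helper names digest_name and build_manifest are hypothetical, and the asset contents are made-up strings standing in for real files.

```ruby
require "digest"
require "json"

# Hypothetical helper: builds a digest filename such as "photo-<md5>.jpg".
def digest_name(logical_path, content)
  digest = Digest::MD5.hexdigest(content)
  ext    = File.extname(logical_path)   # e.g. ".jpg"
  base   = logical_path.chomp(ext)      # e.g. "photo"
  "#{base}-#{digest}#{ext}"
end

# Hypothetical helper: maps each logical path to its digest filename.
def build_manifest(assets)
  assets.each_with_object({}) do |(logical_path, content), manifest|
    manifest[logical_path] = digest_name(logical_path, content)
  end
end

# Made-up asset contents, standing in for files under app/assets.
assets   = { "photo.jpg" => "raw image bytes", "app.css" => "body { margin: 0 }" }
manifest = build_manifest(assets)
puts JSON.pretty_generate(manifest)
```

Because the digest is derived only from the content, the same file always gets the same name, and any change to the content produces a new one.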

After the assets are precompiled and the server is started, the asset pipeline goes to work.

[Figure: Asset pipeline serving strategy]

In the diagram above, when a visitor requests an action that uses the asset pipeline via a helper like image_path, the pipeline will check the manifest file. It will attempt to resolve the logical path (photo.jpg in this case) to a digest version of that file.

If the asset pipeline finds a mapping, it will serve the digest asset back. If the asset pipeline does not, it will attempt to find & serve a non-digest asset that matches the logical path.

Since Rails applications running in development mode do not use precompiled assets by default, they skip this lookup and search app/assets & public/assets for the file instead.
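
The production lookup described above boils down to a hash lookup with a fallback. A rough sketch, assuming a manifest that has already been loaded into a Hash (resolve_asset is a hypothetical name, not a Rails method):

```ruby
# Hypothetical manifest, as produced by the precompile step.
MANIFEST = { "photo.jpg" => "photo-a6c9ef.jpg" }.freeze

# Resolve a logical path to the path handed to the browser.
# Falls back to the logical (non-digest) name when no mapping exists.
def resolve_asset(logical_path, manifest = MANIFEST)
  "/assets/#{manifest.fetch(logical_path, logical_path)}"
end

puts resolve_asset("photo.jpg")  # mapped: resolves to the digest path
puts resolve_asset("logo.png")   # unmapped: falls back to the logical path
```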

Using CloudFront as an Asset Host

If your site uses lots of static assets or rich media, serving content from your webserver can be both slow and costly. In its basic form, the asset pipeline doesn't do much to address this, as it still serves assets off the local file system.

It does, however, include asset host functionality, which allows us to specify an external host for all of our assets. Whenever a view references an asset, Rails will generate a URL that points at a CDN of our choosing rather than at the application itself. Just set config.action_controller.asset_host = "assets.yoursite.com" in your environment config file.
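
As a simplified illustration of what the asset host setting changes, the sketch below combines a manifest lookup with a host prefix. The asset_url method here is a hypothetical stand-in, not the Rails helper of the same name, and the fixed "http" scheme is an assumption; Rails picks the scheme based on the request.

```ruby
# Mirrors config.action_controller.asset_host from the environment config.
ASSET_HOST = "assets.yoursite.com"

# Hypothetical manifest, as produced by the precompile step.
MANIFEST = { "photo.jpg" => "photo-a6c9ef.jpg" }.freeze

# Simplified stand-in for the URL the view helpers generate.
def asset_url(logical_path, scheme: "http")
  compiled = MANIFEST.fetch(logical_path, logical_path)
  "#{scheme}://#{ASSET_HOST}/assets/#{compiled}"
end

puts asset_url("photo.jpg")
```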

Using Amazon's Route53 for DNS, Amazon's CloudFront as a CDN, and a few tricks, we can create a CDN that uses our Rails application as its origin, creating a quasi-loop of sorts. This 'loop' acts as a seamless asset-caching layer, which lets us leverage CloudFront's caches for speed and reduced cost without ever having to manually upload our assets to another data store (like S3).

Here's how it looks:

[Figure: Asset pipeline serving strategy with CloudFront as the asset host]

Starting from home#index, when a user requests a page:

  1. The Rails application begins to process the view for home#index. Assuming there is a reference to image_path('photo.jpg') somewhere in the view, the asset pipeline will kick in, and translate 'photo.jpg' to an asset-host URL. Utilizing both the manifest and the asset host setting, it will resolve photo.jpg to assets.yoursite.com/assets/photo-a6c9ef.jpg
  2. Using the DNS provider, the browser then resolves assets.yoursite.com to the CloudFront distribution domain name. (Which can be accomplished by creating a CNAME record in Route53.)
  3. The CloudFront distribution checks its internal cache for the file. When it cannot find the file, it requests the asset file from the Origin (which is set to yoursite.com.)
  4. The Rails server receives the request for /assets/photo-a6c9ef.jpg from CloudFront. The asset pipeline will not send this particular URI back to the CDN (which would form an infinite loop; yikes!). Rather, because /assets/photo-a6c9ef.jpg is not a logical path that exists within the manifest file, Rails will check for the file in the public/assets folder instead, find it, and serve public/assets/photo-a6c9ef.jpg back to the CDN.
  5. CloudFront caches the asset file (/assets/photo-a6c9ef.jpg) it receives from the Rails application, and returns a copy of the asset to the visitor who originally requested it, completing the process.

When another user visits home#index in the future, steps #1-3 will occur, except CloudFront will already have a copy of the requested file and serve it back immediately, skipping steps #4-5.
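
The cache-miss and cache-hit behavior can be modeled with a toy simulation. Both classes below are hypothetical stand-ins (Origin for the Rails server, Cdn for a CloudFront distribution) and ignore everything a real CDN does beyond caching by path.

```ruby
# Hypothetical stand-in for the Rails server: serves files out of public/assets.
class Origin
  attr_reader :hits

  def initialize(files)
    @files = files
    @hits  = 0
  end

  def fetch(path)
    @hits += 1
    @files.fetch(path)
  end
end

# Hypothetical stand-in for a CloudFront distribution: caches by request path.
class Cdn
  def initialize(origin)
    @origin = origin
    @cache  = {}
  end

  def get(path)
    # The origin is only contacted on a cache miss (steps 3-5).
    @cache[path] ||= @origin.fetch(path)
  end
end

origin = Origin.new({ "/assets/photo-a6c9ef.jpg" => "image bytes" })
cdn    = Cdn.new(origin)

2.times { cdn.get("/assets/photo-a6c9ef.jpg") }
puts origin.hits  # number of requests that actually reached the origin
```

However many visitors request the same digest path, the origin only has to serve it the first time.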

Watch out!

As robust as this setup is, it is not without some pitfalls.

CloudFront is clingy

When a file is cached by CloudFront, it has a tendency to cling to it tooth and nail, so you might find yourself with stale caches if you aren't careful with assigning unique cache keys. This is where the asset pipeline digests come into play; by effectively renaming a file every time its content changes, it prevents CloudFront from serving an old, cached copy of that file. Always ensure all files cached by CloudFront have a digest in their file name/url.

If a file is accidentally cached without a digest, you can use CloudFront's invalidations to instruct it to delete its cached copy of a file. Keep in mind, though, that invalidations can take about 10 minutes to complete, and only the first 1,000 invalidation paths per month are free of charge.

Firefox is picky

You might find that after implementing this setup, custom fonts served through CloudFront render in Chrome, but do not render in Firefox. This is because Firefox enforces the same-origin policy for web fonts, which can only be relaxed through Cross-Origin Resource Sharing (CORS). When assets are served from another subdomain, the CDN (assets.yoursite.com) must send response headers that grant the application (yoursite.com) permission to use them.

On a typical request between CloudFront and the web server, CloudFront will set the following in its request headers:

Origin: http://yoursite.com

The web server that hosts the Rails application (typically Apache, or nginx) must set response headers that exactly match the Origin from the request header:

Access-Control-Allow-Origin: http://yoursite.com
Access-Control-Allow-Credentials: true

Firefox will reject the font if the Access-Control-Allow-Origin value omits the 'http://' or any subdomain (including 'www'), lists multiple domains separated by spaces or commas, or if the header appears more than once. The only exception is that it may specify '*' to allow all origins. However, this allows anyone to use the CDN as a free font repository (at cost to you), so be warned.

Access-Control-Allow-Origin: *
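
The matching rule can be sketched as a small Ruby function: echo the request's Origin back as a single value, but only when it is on an allow-list. The ALLOWED_ORIGIN pattern and the cors_headers name are assumptions for illustration, not part of Rails or any web server.

```ruby
# Hypothetical allow-list: the origins the application is served from.
ALLOWED_ORIGIN = %r{\Ahttps?://(www\.)?yoursite\.com\z}

# Returns the CORS response headers for a request's Origin header value.
# An allowed origin is echoed back exactly, as a single value.
def cors_headers(origin)
  return {} unless origin && origin.match?(ALLOWED_ORIGIN)

  { "Access-Control-Allow-Origin" => origin }
end

p cors_headers("http://yoursite.com")
p cors_headers("http://example.org")
```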

For Apache 2.4, the following can be set in the VirtualHost:

SetEnvIf Origin "^https?://(www\.)?yoursite\.com$" org=$0
Header set Access-Control-Allow-Origin %{org}e env=org
Header set Access-Control-Allow-Credentials "true"

...then restart the web server with apachectl restart. To verify it's working, you can use an application (like hurl.it) to send a GET request to the web server for one of the assets, with the Origin request header set to http://yoursite.com.
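
If you would rather check from a script, the same request can be made with Ruby's standard Net::HTTP library. This is only a sketch: asset_request and cors_allowed? are hypothetical helper names, and the font filename in the usage comment is made up.

```ruby
require "net/http"
require "uri"

# Builds a GET request for an asset with the Origin header set.
def asset_request(url, origin)
  request = Net::HTTP::Get.new(URI(url))
  request["Origin"] = origin
  request
end

# True when the Access-Control-Allow-Origin value grants access to
# exactly this origin, or to every origin via "*".
def cors_allowed?(allow_origin_header, origin)
  ["*", origin].include?(allow_origin_header)
end

# Usage (performs a real HTTP request, so it is left commented out;
# the font filename is a made-up example):
#
#   origin   = "http://yoursite.com"
#   uri      = URI("http://yoursite.com/assets/font-a6c9ef.woff")
#   response = Net::HTTP.start(uri.host, uri.port) do |http|
#     http.request(asset_request(uri.to_s, origin))
#   end
#   puts cors_allowed?(response["Access-Control-Allow-Origin"], origin)
```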

Finally, CloudFront does cache response headers (e.g. 'Access-Control-Allow-Origin'). If the responses from the web server look good, but CloudFront cached the files without the appropriate headers, all the affected assets may need to be invalidated.

Rails 4

The asset pipeline underwent a few important changes between Rails 3 and 4. Notably, during the precompile step, Rails 3 copies both a digest and non-digest version of a file to the public/assets folder.

In Rails 4, it does not create non-digest files for assets by default. This means assets that do not use the appropriate asset pipeline helpers may show up in development and test environments, but will magically disappear in production. Be sure all assets utilize their appropriate helpers!
