Microcaching with Nginx for WordPress

This is by no means an authoritative guide on the subject, only a chronicle of what I learned setting this up for myself. This caching method is only appropriate for users on a VM or dedicated server with SSH access and moderate to advanced command-line skills.

Purpose

I’ve found the Batcache plugin for WordPress offers excellent performance when Memcached or APC are available for the object cache. For nearly all scenarios, simply installing Batcache with the default values will be sufficient to provide a fast and scalable website.

However, while Batcache can store keys and values in Memcached or APC (or Redis or whatever you use for object caching in WordPress), the basic code for setting and retrieving keys from the cache is written in PHP. This means every time a page is served from Batcache, PHP and a portion of WordPress must be loaded. PHP is not the fastest program out there, and just loading it creates a non-trivial load on a server. If you have a small server, this limits the amount of concurrent traffic you can serve, as you will eventually run out of resources to load PHP processes. This is not an issue for the vast majority of users out there. However, for those running WordPress at high scale with popular content (especially viral content that will generate very spiky traffic surges), or those just trying to squeeze maximum performance out of a cheap VM, finding a way to serve full pages without loading PHP can net some pretty fantastic results.

At this point, somebody usually asks, “why not use Varnish?” Varnish will indeed do full page caching without loading any PHP. But Varnish will not do SSL termination, and if you want to use SPDY (which I plan to at some point), you’ll need to have the entire site served over SSL. Since Nginx does SSL termination and has built-in caching capabilities, it made the most sense to try to figure out how to make it work.

The Concept

Batcache will handle caching with no problems at low, medium, and high scale. Only at very high scale will Batcache run into the limits of PHP. Therefore, if we can “rate-limit” the number of requests reaching Batcache, we’ll avoid running into these limits at extremely high scale. This is where the term “microcaching” comes from: we will use Nginx to serve full page caches only for an extremely short amount of time (5 seconds, for example), essentially rate-limiting the requests reaching WordPress and Batcache for any single page to roughly 1 request every 5 seconds. If a page receiving 100 requests per second is cached this way, only about 1 of every 500 requests reaches PHP. If a page is experiencing a large traffic surge, this can make all the difference.

I considered using Nginx for the full page cache and dropping Batcache altogether, but the difficulty with this is you need to intelligently purge the cache when posts are added or updated. The plugin repo has a number of plugins that purport to do just this, but most of what I found was designed to work with Nginx in front of Apache (not my preferred setup) or using the fastcgi_cache_purge Nginx module, which isn’t included in some common Nginx repos. I was looking for something that would work without compiling from source and could be dropped in with minimal extra configuration. This solution allows us to forget about cache purges and invalidation in the Nginx cache (cache values expire too quickly to be a bother) and take advantage of the solid logic provided by Batcache for the main page cache.

The Most Important Thing

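The Nginx fastcgi cache respects the expiration values set by the X-Accel-Expires, Expires, and Cache-Control headers, and these take priority over the cache time configured in Nginx itself. These headers are set by WordPress and/or Batcache, so left alone they will override the short microcache lifetime or prevent the page from being cached at all.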

There are a number of ways to deal with this, but the easiest thing to do is just tell the Nginx cache to ignore these headers. Add this to /etc/nginx/nginx.conf:

fastcgi_ignore_headers Cache-Control Expires;

Nginx Setup

First we define where the cache files will be saved (use a tmpfs folder for even greater performance) with this setting in the http block in the nginx.conf file:

fastcgi_cache_path /etc/nginx/cache levels=1:2 keys_zone=thelastcicada:100m inactive=10m max_size=100m;

If the /etc/nginx/cache directory doesn’t exist, create it. The keys_zone name (thelastcicada in this example) will be used later to reference this collection of settings. In keys_zone, the second parameter is the size of the “shared memory zone” used for the cache. This zone holds the cache keys and metadata in RAM, not the cached pages themselves; according to the Nginx documentation, one megabyte can store about 8,000 keys, so 100m is far more than a small site needs. The “inactive” parameter is the time before garbage collection removes a cache item that has not been accessed, in this case 10 minutes. The max_size is the limit for the on-disk cache files. When the cache grows past this limit, the cache manager removes the least recently used items. For further explanation of what these settings mean, check out this Digital Ocean tutorial and the Nginx documentation.
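For the tmpfs option mentioned above, one possible sketch is to point the cache at a directory under /dev/shm, which is a RAM-backed tmpfs mount on most Linux distributions. The /dev/shm/nginx-cache path is my assumption for illustration, not part of the setup described here; everything else is unchanged.

```nginx
# http block of nginx.conf. This line would replace the /etc/nginx/cache
# version rather than sit next to it, since a keys_zone name is defined once.
# ASSUMPTION: /dev/shm is a tmpfs mount on this system and
# /dev/shm/nginx-cache is a hypothetical directory name.
fastcgi_cache_path /dev/shm/nginx-cache levels=1:2 keys_zone=thelastcicada:100m inactive=10m max_size=100m;
```

With this variant the cached pages count against RAM, so max_size effectively caps how much memory the cache can take. A tmpfs is emptied on reboot, so it is worth checking that the directory is present again before Nginx starts.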

The cache key in Nginx is an MD5 hash based on this setting, which we add just under the fastcgi_cache_path:

fastcgi_cache_key "$scheme://$host$request_method$request_uri";

Update: fastcgi_cache_key has been altered to include $request_method based on a tip from Innoscale. Adding this field prevents a weird issue whereby Nginx will occasionally cache a redirect rather than the page contents.
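Putting the http-level pieces together, the relevant part of nginx.conf might look roughly like the sketch below. The three fastcgi directives are the ones from this post; the include line and its path are an assumption about how the rest of the file is organised.

```nginx
http {
    # ... existing http-level settings stay as they are ...

    # Let fastcgi_cache_valid decide the lifetime instead of the
    # headers sent by WordPress/Batcache
    fastcgi_ignore_headers Cache-Control Expires;

    # Where cached pages live, plus the zone name that sites refer to
    fastcgi_cache_path /etc/nginx/cache levels=1:2 keys_zone=thelastcicada:100m inactive=10m max_size=100m;

    # What makes two requests "the same page" for caching purposes
    fastcgi_cache_key "$scheme://$host$request_method$request_uri";

    # ASSUMPTION: per-site server blocks are pulled in from this directory
    include /etc/nginx/conf.d/*.conf;
}
```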

Next we move to the server block to define what pages should not be cached. The logic for which pages not to cache is very similar to the logic used by Batcache and all WordPress caching solutions. These lines can be added either in the php location block or in the root of the server block:

#Cache everything by default
set $no_cache 0;

#Don't cache logged in users or commenters
if ( $http_cookie ~* "comment_author_|wordpress_(?!test_cookie)|wp-postpass_" ) {
        set $no_cache 1;
}

#Don't cache the following URLs
if ( $request_uri ~* "/(wp-admin/|wp-login\.php)" ) {
        set $no_cache 1;
}
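As an alternative to the if blocks, the same rules can be expressed with map, which is evaluated lazily and avoids if inside location blocks. This is only a sketch of an equivalent setup, not what this site runs; the variable names $skip_for_cookie and $skip_for_uri are hypothetical. It relies on fastcgi_cache_bypass and fastcgi_no_cache accepting several variables and acting if any of them is non-empty and not "0".

```nginx
http {
    # map is only allowed at the http level

    # 1 when the visitor is logged in, has commented,
    # or has unlocked a password-protected post
    map $http_cookie $skip_for_cookie {
        default 0;
        "~*comment_author_|wordpress_(?!test_cookie)|wp-postpass_" 1;
    }

    # 1 for the admin area and the login page
    map $request_uri $skip_for_uri {
        default 0;
        "~*/(wp-admin/|wp-login\.php)" 1;
    }

    server {
        # ... rest of the server block ...

        location ~ \.php$ {
            # ... fastcgi_pass and the other cache settings ...

            # Skip the cache if either variable is set to 1
            fastcgi_cache_bypass $skip_for_cookie $skip_for_uri;
            fastcgi_no_cache $skip_for_cookie $skip_for_uri;
        }
    }
}
```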

In the php location block, insert the fastcgi cache settings for this particular site:

#matches keys_zone in fastcgi_cache_path
fastcgi_cache thelastcicada;

#don't serve cached copies to the requests flagged earlier
fastcgi_cache_bypass $no_cache;

#don't cache pages defined earlier
fastcgi_no_cache $no_cache;

#defines the default cache time
fastcgi_cache_valid any 10s;

#caps the temp file used when a response doesn't fit in the fastcgi memory buffers
fastcgi_max_temp_file_size 2M;

#Serve stale cache items while an entry is being refreshed or the backend fails;
#the lock lets only one request per cache key through to PHP at a time
fastcgi_cache_use_stale updating error timeout invalid_header http_500;
fastcgi_cache_lock on;
fastcgi_cache_lock_timeout 10s;
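For orientation, here is roughly how these settings sit inside a complete php location block. The cache directives are the ones listed above; the include, SCRIPT_FILENAME, and fastcgi_pass lines are assumptions about a typical PHP-FPM setup, and the socket path in particular will differ between systems.

```nginx
location ~ \.php$ {
    # ASSUMPTION: stock fastcgi_params file and a PHP-FPM unix socket
    # at this (hypothetical) path
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/php/php-fpm.sock;

    # Microcache settings from this post
    fastcgi_cache thelastcicada;
    fastcgi_cache_bypass $no_cache;
    fastcgi_no_cache $no_cache;
    fastcgi_cache_valid any 10s;
    fastcgi_max_temp_file_size 2M;
    fastcgi_cache_use_stale updating error timeout invalid_header http_500;
    fastcgi_cache_lock on;
    fastcgi_cache_lock_timeout 10s;
}
```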

One last setting to add anywhere in the server block:

add_header X-Cache $upstream_cache_status;

This will add a header that can be checked in the browser and will indicate whether the cache was hit or missed.

Restart Nginx, reload your website a few times, and look for “X-Cache: HIT” in the response headers (I use Chrome’s inspector to see my site’s headers). At the time of writing this post, this site uses Nginx microcaching, and you should see the X-Cache header indicating when the Nginx cache is active.
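Checking headers in the browser only shows one request at a time. To see how the cache behaves under real traffic, the same $upstream_cache_status variable can be written to an access log. This is a sketch rather than part of my configuration; the format name microcache and the log file path are hypothetical.

```nginx
# http block: a log format that records the cache result for each request.
# "microcache" is a hypothetical format name.
log_format microcache '$remote_addr [$time_local] "$request" $status $upstream_cache_status';

# server block: log this site's requests in that format.
# ASSUMPTION: /var/log/nginx/ exists and is writable by Nginx.
access_log /var/log/nginx/microcache.log microcache;
```

The last field will be one of HIT, MISS, BYPASS, EXPIRED, STALE, UPDATING, or REVALIDATED, and shows as "-" for requests that never went through the fastcgi cache, such as static files.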

My entire nginx.conf and server-specific conf file are available here for reference. Note that I’m using a WordPress multisite install with subfolders, which is a little different than the single-site setup.

Resources

How to Setup FastCGI Caching with Nginx on your VPS
How I built “Have Baby. Need Stuff!”
Nginx HttpFastcgiModule Documentation