StatCache: Optimized CMS Website Performance

Sep 13, 2014

As you probably know, website performance can make a big difference to a website owner's bottom line. Luckily, if your site is on the MODX Revolution content management system, you have a myriad of tools available to help your site meet heavy traffic demands.

Enter statcache

I've often posted about various caching strategies in MODX, and this post is yet another one :) Statcache is a MODX plugin written by chief architect Jason Coward that writes the static html representation of a MODX-generated webpage to a file, in a dedicated cache partition.

To make the magic happen, configure your server to respond to a request by first trying to serve the statically cached file, and only if that fails, to pass the request on to MODX. It's easier than it sounds. The statcache documentation contains examples of how to set it up on both nginx and Apache web servers.
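To give a feel for the idea, here is a rough nginx fragment, offered as a sketch rather than the plugin's documented configuration. The `statcache` directory name, the assumption that cache files mirror the request URI, and the `.html` friendly-URL scheme are all assumptions of mine; the statcache documentation has the authoritative rules for your setup.

```nginx
# Sketch only, placed inside an existing server { } block.
# Assumed: static copies live in a "statcache" directory under the web root
# and mirror the request URI; PHP (fastcgi) handling is configured elsewhere.

# Pages with .html friendly URLs: serve the static copy if one exists,
# otherwise hand the request to MODX.
location ~ \.html$ {
    try_files /statcache$uri @modx;
}

# Everything else: real files and directories first, then MODX.
location / {
    try_files $uri $uri/ @modx;
}

# Fallback: pass the request to MODX's front controller.
# nginx appends the original query string to the rewritten URL.
location @modx {
    rewrite ^/(.*)$ /index.php?q=$1 last;
}
```

The `try_files` directive checks each path in order and only falls through to the last argument when none of the files exist, which is exactly the "static first, MODX second" behaviour described above.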

Why is it so effective?

No matter how optimized your MODX site is, it's nearly impossible for PHP to process a response faster than a web server can deliver a static file. Nginx can serve literally thousands of HTML files per second without breaking a sweat – there's just no comparison.

So, statcache leverages what a web server does best in order to deliver lightning-fast performance. Here are the results of a load test I ran recently using blitz.io:

[Figure 1: blitz.io load test results before using the statcache MODX plugin]

[Figure 2: blitz.io load test results after enabling the statcache MODX plugin]

As you can see, the statcache-enabled performance is an order of magnitude better than without statcache.

Note: getting uncached response times down to 1 second is the goal, so that you're not relying on caching alone to deliver performance. These numbers are to illustrate the potential performance gains of using statcache, not to benchmark "good" performance. The worse your site performs without statcache, the more impressive the improvement will be with statcache, so it's tempting (but ill-advised) to use it as a crutch.

The devil in the details

If you've had enough coffee today, you may have noticed something about this caching mechanism: because the server circumvents MODX entirely when serving cached content, dynamic content won't be dynamic anymore. It will be as static as the statically cached files.

Fortunately, statcache is very configurable and flexible – just like MODX itself – and has some interesting tools to deal with this problem...

Statcache Properties and Events

As of version 1.4.x, the statcache plugin supports several more MODX system events than previous versions:

  • OnDocFormSave – if you enable this event and set the plugin property "regenerate_on_save" to "Yes", then every time a resource is edited and saved in the MODX Manager, its static cache file will be rewritten with the updated output.
  • OnSiteRefresh – with this event enabled and the "regenerate" property set to "Yes", all the resources that meet the conditions of the plugin properties will be rewritten whenever someone clicks the "Clear Cache" menu item in the MODX Manager. (There are several properties that you can customize to define what is cacheable by statcache, and what isn't – see Figure 3 below.)
  • OnDocUnPublished/OnResourceDelete – when these events are enabled, the plugin will attempt to delete the static cache file for a resource when it is unpublished or deleted by someone in the MODX Manager.

Figure 3 shows the default property set – there are plenty of options to make statcache behave the way you want.

[Figure 3: screenshot of the statcache plugin properties]

This newly expanded set of properties and supported events allows a much tighter integration between website content and the statically cached files. For example, let's explore a couple of cases where the properties related to the OnSiteRefresh event can be customized for scalability:

  1. Sites with tens of thousands of resources – Imagine a PHP script that makes 100k cURL requests in rapid succession to the server it's running on. This is what happens when someone hits the "Clear Cache" button on a site with that many resources – IF all of those resources are configured to be regenerated OnSiteRefresh. Luckily, this can be disabled in those edge cases.
  2. Sites with a sustained level of very high traffic, and/or huge traffic spikes, such as the top 100,000 sites on the Alexa traffic ranking – if you remove all of the statcache files at once, MODX will get the full brunt of the website's traffic all at the same instant. This isn't any worse than if someone hits the "Clear Cache" button on a site without statcache – unless you have the "regenerate" property turned on. In that case, the plugin will make cURL requests to every cacheable resource in rapid succession, adding to the already high load.

In these cases, simply disable the OnSiteRefresh event for the statcache plugin, because even though the problems above are uncommon, they kick up a lot of fuss when they do occur. The other events help to ensure that when a resource is edited, deleted, or unpublished, its counterpart cache file is likewise updated. The result is a very persistent, "proxy-like" cache, sitting in front of your MODX site, yet still integrated with content management actions.

With this configuration, the only way to clear the statcache partition completely would be to manually delete it on the filesystem.

More Options for Dynamic Content

You may have noticed that all the cache-refreshing magic happens on manual Manager actions. What if someone updates a resource, and another statically cached resource "references", or is "dependent" upon, the updated content? The cache file of this dependent resource is not refreshed.

This is a fundamental issue when it comes to caching. The plugin can't possibly be aware of an arbitrary number of other resources that need their cache files refreshed, along with the one being edited. The standard MODX cache handler clears the entire cache when any resource is edited, and this is why the statcache plugin was developed to fire OnSiteRefresh – to give editors the same option.

So, what do we do with this use case – a site with dynamic content but scalability requirements that preclude the use of the OnSiteRefresh action? Here are some tactics you can employ to deal with it...

AJAX

Good ol' Asynchronous JavaScript can help us here. Oftentimes, it's only a few elements on the page that are "dependent" on other, frequently updated data sources.

In these cases, the HTML can be statcached and delivered to the browser, and then the dynamic content inserted with JS. If the dynamic content is generated by MODX, you can set up a resource specifically to serve it. Other caching mechanisms, like getCache, can help keep those AJAX responses performant, and yet fresh as a daisy ;)

StatCacheOnDemand

Another potential use case: a custom snippet creates resources from an XML feed, and is executed via cron job. Other resources are "dependent" on this new content – for example, a parent resource that lists child resources, which are programmatically added to the database every few hours. You want to clear the parent resource's statcache file when new resources are created.

In this case, you could do something like this "StatCacheOnDemand" snippet, and call it from within your XML import snippet, like:

    $modx->runSnippet('StatCacheOnDemand', array('resources' => '234,567'));

Some notes about this snippet:

  1. The &resources property accepts a comma-separated string of resource IDs. This is especially important to remember when using runSnippet, as you don't want to pass it an array of IDs by accident.
  2. For each resource ID provided, the MODX cache is cleared and a cURL request is made for the resource to trigger the default cache file generation. If you feel like it's a bit heavy to clear the entire MODX cache, at least your statcache partition is still intact, so the impact on MODX's processing demands should be low.
  3. The statcache plugin must be installed and properly configured for this to work.
  4. If you run this Snippet on many, many resources at once, a performance boost can be had by passing the array of $ids to getIterator as criteria, and looping through the returned object. That said, this snippet should never be called on a front-end request, but only when some editing or database operation is performed, so arguably performance isn't a key factor for the snippet itself.
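To make notes 1 and 2 a bit more concrete, here is a minimal sketch in plain PHP. It is not the StatCacheOnDemand snippet itself: the function names `parseResourceIds` and `warmUrl` are hypothetical, made up for illustration. The first turns a comma-separated &resources value into a clean list of integer IDs (and tolerates an array passed by accident); the second makes the kind of cURL request that would prompt a page to be regenerated.

```php
<?php
// Hypothetical helpers for illustration only; not part of statcache or MODX.

/**
 * Normalize a &resources value into a list of unique, positive integer IDs.
 */
function parseResourceIds($resources): array
{
    // Callers sometimes pass an array by mistake; flatten it back to a string.
    if (is_array($resources)) {
        $resources = implode(',', $resources);
    }
    $ids = array_map('intval', array_map('trim', explode(',', (string) $resources)));
    // Drop anything that did not parse to a positive integer.
    $ids = array_filter($ids, function ($id) {
        return $id > 0;
    });
    // De-duplicate and reindex from zero.
    return array_values(array_unique($ids));
}

/**
 * Request a URL with cURL and return the HTTP status code (0 if the request failed).
 * Requires the PHP cURL extension.
 */
function warmUrl(string $url, int $timeoutSeconds = 10): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true, // capture the body instead of echoing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects to the final page
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ));
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

Inside a MODX snippet, the full URL for each ID could be built with `$modx->makeUrl($id, '', '', 'full')` and handed to `warmUrl()`. Looping over the parsed IDs one at a time keeps the load on the server predictable.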

Yet More Options

There are other proxy cache solutions that can deliver similar performance, like Varnish, although it's not as easy to integrate with MODX. For that reason alone, statcache is my first choice. There's even a way to set it up in a load-balanced environment, although that gets a bit hairy. Regardless of how you use it, statcache is an extremely powerful tool that is highly customizable – and all you need in order to wield it is MODX :)