Chapter 5. Invalidating the Cache

In this chapter I’ll highlight several cache invalidation strategies. These strategies allow you to remove certain items from cache even though their time-to-live hasn’t expired yet.

In the world of caching, there’s only one thing worse than a low hit rate, and that’s caching for too long. That statement sounds quite weird, right? Here I am trying to convince you to cache everything, all the time, yet I’m saying that caching for too long is the worst thing to do. Allow me to explain.

Caching for Too Long

Throughout this book, I’ve always kept the best interests of the website owner, developer, and sysadmin in mind. The reality is that the site, API, and application are primarily services that the end user consumes. In the end, it’s all about the end user.

  • Why do we want to make the site fast? For the user!

  • Why do we want to keep the site available? For the user!

  • Why do we cache? So that the user has a good experience!

Caching data for too long would mess with the integrity of the data, giving the user a bad experience when up-to-date output is important. This is especially the case for news websites.

We already talked about the use of Cache-Control and Expires headers. It’s important to estimate the right time-to-live and set the right values for these headers. The more accurate the time-to-live, the better the balance between fresh content and a healthy hit rate.

Unfortunately, in many cases the data will be out-of-date even before the object expires. Lowering the time-to-live to compensate could jeopardize the health and responsiveness of your backend. Talk about being stuck between a rock and a hard place!

Worry not—Varnish has your back! Varnish offers various mechanisms to evict objects from the cache based on certain criteria. By accessing these eviction mechanisms from your code, you can actively invalidate objects, even if they haven’t expired. That way your breaking news will be correctly displayed on the front page of your website, even if the object still has two hours to live according to the time-to-live.

The Varnish documentation has a page dedicated to cache invalidation. Have a look if you’re interested.

Purging

Purging is the easiest way to invalidate the cache. In the following example, you can see that in VCL you can perform a return (purge) from within the vcl_recv subroutine. This will explicitly evict the object from cache. The object will be identified by the criteria set in vcl_hash, so by default that is the hostname and the URL. Memory is freed up immediately and cache variations are also evicted.

acl purge {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    # allow PURGE from localhost and 192.168.55...

    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return(synth(403,"Not allowed."));
        }
        return (purge);
    }
}

There’s a bit more housekeeping involved when you want to do it right: the preceding example protects you from unauthorized invalidations by enforcing an ACL. Only purges from localhost or from the 192.168.55.0/24 subnet are allowed.

And then there’s the PURGE request method that is checked. By requesting the resource with PURGE instead of GET, you’re basically telling Varnish that this HTTP request is not a regular data retrieval request, but a purging request.

Important

You probably remember “When Does Varnish Completely Bypass the Cache?”. In that section, I mentioned that only certain request methods will be considered valid by Varnish. PURGE is not one of them. That’s why it’s important to do the PURGE check before the request method validation happens. Otherwise, Varnish goes into pipe mode and sends the request to the backend, which returns either a valid HTTP 200 status code or, if your web server doesn’t allow PURGE, an HTTP 405 error.

You can implement a purge call anywhere in your code and you’ll typically use an HTTP client that is supported by your programming language or framework. In many cases, that client will be cURL-based. Here’s a purging example using the cURL binary:

curl -XPURGE http://example.com/some/page

This example uses the -X parameter in cURL to set the request method. As expected, we’re setting it to PURGE and setting the URL to http://example.com/some/page. That’s the resource we’re removing from cache.
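
From application code, the same request might look like the following Python sketch. It only uses the standard library. The function names build_purge_request and purge are made up for this illustration, and the sketch assumes that your VCL handles the PURGE method as shown earlier and that the client's IP address is allowed by the purge ACL.

```python
import urllib.error
import urllib.request


def build_purge_request(url):
    """Create an HTTP request that uses the PURGE method instead of GET."""
    return urllib.request.Request(url, method="PURGE")


def purge(url, timeout=5.0):
    """Send the PURGE request and return the HTTP status code.

    urllib raises HTTPError for 4xx and 5xx responses (for example the 403
    returned when the ACL check fails), so that case is mapped back to a
    plain status code.
    """
    request = build_purge_request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

Calling purge("http://example.com/some/page") would then be the equivalent of the curl command above.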

Banning

Purging is easy: it uses the object’s hash, it evicts just that one object, and it can be executed with a simple return(purge).

But when you have a large number of purges to perform or you’re not exactly sure which resources are stale, exact URL invalidations might feel restrictive to you. A pattern-based invalidation mechanism would solve that problem, and banning does just that.

Banning shouldn’t be new to you; in “Ban”, we talked about the ban function that executes these bans.

Basically, bans use a regular expression match to mark objects that should be removed from cache. These marked objects are put on the so-called ban list. Banning does not remove items from cache immediately and hence does not free up any memory directly.

Bans are checked when an object is hit: Varnish tests the object against the bans on the ban list and evicts it if one of them matches. There’s also a background thread, the so-called ban lurker, that processes bans whose expressions only use variables of the obj object.

Note

The obj object only stores the response headers, response body, and metadata. It has no request information, and neither does the ban lurker. That’s why the ban lurker thread can only evict objects for bans whose expressions rely solely on obj variables and need no request context.

All other bans are evaluated at request time, on the next hit, and aren’t processed in the background.
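
To make the request-time behavior more tangible, here is a toy model in Python. It is not how Varnish is implemented internally: the ToyCache class, its logical clock, and the URL-only regular expressions are all assumptions made for the sake of illustration. It only shows the principle that adding a ban frees nothing, and that an object is evicted when it is looked up and turns out to be older than a matching ban.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CachedObject:
    url: str
    body: str
    stored_at: int  # logical clock value when the object entered the cache


@dataclass
class ToyCache:
    clock: int = 0
    objects: dict = field(default_factory=dict)
    bans: list = field(default_factory=list)  # (created_at, compiled regex)

    def _tick(self):
        self.clock += 1
        return self.clock

    def store(self, url, body):
        self.objects[url] = CachedObject(url, body, self._tick())

    def ban(self, url_pattern):
        # Adding a ban only extends the ban list; no object is removed yet.
        self.bans.append((self._tick(), re.compile(url_pattern)))

    def lookup(self, url):
        obj = self.objects.get(url)
        if obj is None:
            return None  # regular cache miss
        for created_at, pattern in self.bans:
            # Only bans that are newer than the object can invalidate it.
            if created_at > obj.stored_at and pattern.search(obj.url):
                del self.objects[url]  # evicted at request time
                return None
        return obj.body
```

In this model, an object that is stored after the ban was added is served normally, because the ban is older than the object.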

Here’s a basic BAN example. It does exactly the same thing as the PURGE example, but adds URL pattern-matching capabilities:

acl ban {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ ban) {
            return(synth(403, "Not allowed."));
        }
        ban("req.http.host == " + req.http.host +
            " && req.url ~ " + req.url);
        return(synth(200, "Ban added"));
    }
}

Warning

When you accumulate lots of bans based on req object variables for resources that are not frequently accessed, Varnish might run into CPU performance problems.

Bans are kept on the ban list until all objects in cache have been checked against them. If the banned objects do not get a new hit, the bans that target them remain on the list. The longer the list, the more CPU time is required to check it upon every hit.

That’s why it’s advised to use lurker-friendly bans.

Lurker-Friendly Bans

The ban lurker is in charge of asynchronously checking and cleaning up the ban list. I mentioned that the ban lurker has a limited scope to invalidate objects because of its lack of request information: the ban lurker only knows the obj context.

But if we copy request information from the req object, we can actually write lurker-friendly bans. Have a look at the following VCL snippet:

acl ban {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_backend_response {
  set beresp.http.x-host = bereq.http.host;
  set beresp.http.x-url = bereq.url;
}

sub vcl_deliver {
  unset resp.http.x-host;
  unset resp.http.x-url;
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ ban) {
            return(synth(403, "Not allowed."));
        }
        ban("obj.http.x-host == " + req.http.host +
            " && obj.http.x-url ~ " + req.url);
        return(synth(200, "Ban added"));
    }
}

The trick is to add the host and the URL of the request as a response header when the object is stored in cache. By doing this, the missing request context is actually there. I know—it’s trickery, but it does the job.

set beresp.http.x-host = bereq.http.host; will set a custom x-host header containing the host of the request, and set beresp.http.x-url = bereq.url; will set the URL as a custom x-url response header.

At this point, the invalidation will not just happen at request time on the next hit, but also asynchronously by the ban lurker. The ban lurker will have the necessary request information to process bans on the ban list that contain a URL match.

Before the response is delivered, we remove these custom headers again; they are for internal purposes only, and the user has no business using them. That is done in vcl_deliver.
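
The rule of thumb from this section can be written down as a small Python check. This is a rough sketch rather than a real ban parser: the is_lurker_friendly function is hypothetical, and it assumes that conditions are joined by && and that each condition starts with the variable it tests.

```python
def is_lurker_friendly(ban_expression):
    """Return True if every condition in the ban tests an obj.* variable."""
    for condition in ban_expression.split("&&"):
        tokens = condition.split()
        # An empty condition or a req.* variable needs request context,
        # which the ban lurker does not have.
        if not tokens or not tokens[0].startswith("obj."):
            return False
    return True
```

The ban from the snippet above passes this check, whereas the req.http.host and req.url ban from the previous section does not.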

The ban lurker doesn’t remove items from the ban list immediately; there are three parameters that influence its behavior:

ban_lurker_age

The ban lurker ignores bans until they are at least this old. The default value is 60 seconds.

ban_lurker_batch

The number of objects the ban lurker examines before it sleeps for the duration set in ban_lurker_sleep. The default value is 1000.

ban_lurker_sleep

The number of seconds the ban lurker sleeps in between runs. The default value is 0.010 seconds.

If you write lurker-friendly bans and your ban list is still long, you might want to take a look at these parameters and tune them accordingly.

More Flexibility

Let’s have one more ban example that puts it all together and gives you even more flexibility:

acl ban {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_backend_response {
  set beresp.http.x-host = bereq.http.host;
  set beresp.http.x-url = bereq.url;
}

sub vcl_deliver {
  unset resp.http.x-host;
  unset resp.http.x-url;
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ ban) {
            return(synth(403, "Not allowed."));
        }
        if(req.http.x-ban-regex) {
            ban("obj.http.x-host == " + req.http.host +
                " && obj.http.x-url ~ " + req.http.x-ban-regex);
        } else {
            ban("obj.http.x-host == " + req.http.host +
                " && obj.http.x-url == " + req.url);
        }
        return(synth(200, "Ban added"));
    }
}

This example combines the benefits of the previous ban examples. It gives you the flexibility to choose between an exact URL match or a regular expression match. If you set the x-ban-regex request header when banning, the value will be used to match the URL pattern. If the header is not set, the URL itself (and nothing more) is banned. And this, of course, is a lurker-friendly ban.

Here’s an example using the cURL binary:

curl -XBAN http://example.com/ -H"x-ban-regex: ^/product/[0-9]+/details"

In this example, we’re banning all product details pages based on the ^/product/[0-9]+/details regular expression. If you only want to ban a single product detail page, the curl call could look like this:

curl -XBAN http://example.com/product/121/details

Note

The name of the request method we’re using to ban or purge doesn’t really matter. As long as you can identify an invalidation request, you’re fine. We’re just calling it BAN or PURGE. Choose a request method name of your liking—just make sure it doesn’t clash with another method you use in your backend application.
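
The two curl calls from this section can also be issued from application code. Here is a minimal Python sketch using the standard library. The build_ban_request function is a name I made up, and the sketch assumes the VCL from this section, which reads the x-ban-regex request header.

```python
import urllib.request


def build_ban_request(url, regex=None):
    """Create a BAN request for the VCL shown in this section.

    With a regex, the x-ban-regex header is added and the URL only supplies
    the host. Without a regex, the URL itself is banned as an exact match.
    """
    headers = {}
    if regex is not None:
        headers["x-ban-regex"] = regex
    return urllib.request.Request(url, headers=headers, method="BAN")
```

Passing the resulting request to urllib.request.urlopen sends it to Varnish. Keep in mind that urlopen raises an HTTPError when Varnish answers with the 403 from the ACL check.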

Viewing the Ban List

If you’re interested in seeing the current state of the ban list, you can issue a ban.list command on the varnishadm administration program. Just execute the following command on your Varnish server:

varnishadm ban.list

And this could be the output:

Present bans:
0xb75096d0 1318329475.377475    10      obj.http.x-host == example.com && obj.http.x-url ~ ^/product/[0-9]+/details
0xb7509610 1318329470.785875    20C     obj.http.x-host == example.com && obj.http.x-url ~ ^/category

Wondering what each field means? Here we go:

  • The first field contains the unique identifier of the ban.

  • The second field is the timestamp.

  • The third field is a count of the objects in cache that have reached this point in the ban list. A C may be attached to this count, which marks the ban as completed; this usually happens for duplicate bans.

  • The fourth field is the ban expression itself.
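
If you want to process the ban list in a script, the four fields can be split with a few lines of Python. This is a sketch that assumes the layout of the sample output shown here, with one ban per line; other Varnish versions may print the list differently. The BanEntry and parse_ban_line names are hypothetical.

```python
import re
from typing import NamedTuple


class BanEntry(NamedTuple):
    ban_id: str        # first field: unique identifier
    timestamp: float   # second field: timestamp of the ban
    object_count: int  # third field: the counter
    flags: str         # optional letters attached to the counter, such as "C"
    expression: str    # fourth field: the ban expression


def parse_ban_line(line):
    """Split a single ban.list line into its fields."""
    ban_id, timestamp, counter, expression = line.split(None, 3)
    match = re.fullmatch(r"(\d+)(\D*)", counter)
    if match is None:
        raise ValueError(f"unexpected counter field: {counter!r}")
    return BanEntry(ban_id, float(timestamp), int(match.group(1)),
                    match.group(2), expression.strip())
```

The "Present bans:" header line is not a ban and has to be skipped before calling parse_ban_line.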

Banning from the Command Line

The previous two sections approached the execution of cache invalidation from an HTTP perspective, meaning that your regular requests and your purge/ban requests all pass through the same channel. I mentioned the upside: it’s very easy to code in VCL and just as easy to implement in your backend.

There are also some downsides:

  • You have to write additional VCL code, which adds complexity.

  • There is no uniform way of implementing banning and purging in Varnish; your application will depend on the invalidation implementation in VCL.

  • Although the ACLs provide a level of security, there is no isolation from a networking perspective.

Luckily, Varnish offers an admin console that allows you to issue ban statements. This can be done locally or remotely through the varnishadm program. In “CLI address binding”, I mentioned how you can configure your Varnish to accept remote connections on the admin interface.

varnishadm is just a client that connects to the Varnish CLI socket. You can also make a TCP connection to this socket and issue ban commands directly from within your code. Have a look at the documentation page on the Varnish Command Line Interface to learn more about the commands, remote connections, and the authentication protocol.

Here’s an example of our product detail invalidation, but this time using the CLI:

varnishadm> ban obj.http.x-host == example.com && obj.http.x-url ~ ^/product/
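
If your code runs on a machine where varnishadm is installed and allowed to reach the CLI socket, a simple option is to call the binary instead of speaking the CLI protocol yourself. The following Python sketch does that. The ban_argv and run_ban functions are names I made up, and the sketch assumes the lurker-friendly x-host and x-url headers from earlier in this chapter.

```python
import subprocess


def ban_argv(host, url_regex):
    """Build the varnishadm argument list for a lurker-friendly ban.

    Every token is a separate argument and no shell is involved, so && and ~
    need no shell escaping. Values that contain spaces or quotes would still
    need CLI-level quoting, which this sketch does not handle.
    """
    return ["varnishadm", "ban",
            "obj.http.x-host", "==", host,
            "&&", "obj.http.x-url", "~", url_regex]


def run_ban(host, url_regex):
    """Run the ban; raises CalledProcessError if varnishadm reports a failure."""
    completed = subprocess.run(ban_argv(host, url_regex), check=True,
                               capture_output=True, text=True)
    return completed.stdout
```

Calling run_ban("example.com", "^/product/") issues the same ban as the CLI example above.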