Now that we have set up Varnish, it’s time to use it. In Chapter 2 we talked about the configuration settings, so by now you should have the correct networking settings that allow you to receive HTTP requests either directly on port 80 or through another proxy or load balancer.
Out of the box, Varnish can already do a lot for you. There is a default behavior that is expressed by the built-in VCL, and there is a set of rules that Varnish follows. If your backend application complies with these rules, you’ll have a pretty decent hit rate.
Varnish uses a lot of HTTP best practices to decide what gets cached, how it gets cached, and how long it gets cached. As a web developer, I strongly advise that you apply these best practices in the day-to-day development of your backend applications. This empowers you and helps you avoid having to rely on custom Varnish configurations that suit your application. It keeps the caching logic portable.
Unlike many other proxies, Varnish is an HTTP accelerator. That means Varnish does HTTP and HTTP only. So it makes sense to know HTTP and how it behaves.
There are five ways in which Varnish respects HTTP best practices:
Idempotence
State
Expiration
Conditional requests
Cache variations
Let’s have a look at each of these and explore how Varnish deals with them.
Varnish will only cache resources that are requested through an idempotent HTTP verb: a verb that does not change the state of the resource. To put it simply, Varnish will only cache requests using the following methods:
GET
HEAD
And that makes perfect sense: if you issue a request using POST or PUT, the method itself implies that a change will happen. In that respect, caching wouldn’t make sense because you would be caching stale data right from the get-go.
So if Varnish sees a request coming in through, let’s say, POST, it will pass the request to the backend and will not cache the returned response.
For the sake of completeness, these are the HTTP verbs/methods that Varnish can handle:
GET (can be cached)
HEAD (can be cached)
PUT (cannot be cached)
POST (cannot be cached)
TRACE (cannot be cached)
OPTIONS (cannot be cached)
DELETE (cannot be cached)
All other HTTP methods are considered non-RFC 2616-compliant and will completely bypass the cache.
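To make this behavior concrete, here’s a minimal sketch (in Python, purely illustrative; this models the rules above, not Varnish’s actual implementation) of how request methods map to a caching decision:

```python
# Illustrative model of how Varnish treats request methods.
# "lookup" means the response may be served from or stored in cache,
# "pass" means fetch from the backend without caching,
# "pipe" means hand the connection to the backend untouched.

CACHEABLE_METHODS = {"GET", "HEAD"}
KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}

def dispatch(method: str) -> str:
    if method in CACHEABLE_METHODS:
        return "lookup"
    if method in KNOWN_METHODS:
        return "pass"
    return "pipe"  # non-RFC 2616 methods bypass the cache entirely

print(dispatch("GET"))    # lookup
print(dispatch("POST"))   # pass
print(dispatch("PATCH"))  # pipe
```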
Now that you know about idempotence and how HTTP request methods shouldn’t change the state of the resource, let’s look at other mechanisms in HTTP that can control state. I’m not talking about global state, but more specifically about user-specific data. There are two ways to keep track of state for users:
Authorization headers
Cookies
Whenever Varnish sees one of these, it will pass the request off to the backend and not cache the response. This happens because the presence of an Authorization header or a cookie implies that the data will differ for each user performing that request.
If you decide to cache the response of a request that contains an Authorization header or cookie, you would be serving a response tailored to the first user who requested it. Other users will see it, too, and the response could potentially contain sensitive or irrelevant information.
But let’s face it: cookies are our main instrument for keeping track of state, and websites that do not use cookies are hard to come by. Unfortunately, the internet uses too many cookies, and often for the wrong reasons.
We use cookies to establish sessions in our application. We can also use cookies to keep track of language, region, and other preferences. And then there are the tracking cookies that are used by third parties to “spy” on us.
In terms of HTTP, cookies appear both in the request and the response process. It is the backend that sets cookies by issuing one or more Set-Cookie response headers. The client receives that response and stores the cookies in its local cookie store.
As you can see in the example below, a cookie is a name-value pair. The server sets each cookie with a separate Set-Cookie header.
Set-Cookie: language=en
Set-Cookie: country=us
When a client has stored cookies for a domain, it will use a Cookie request header to send the cookies back to the server upon every subsequent request. Multiple cookies are delimited by a semicolon. The cookies are also sent for requests that do not require a specific state (e.g., static files).
Cookie: language=en; country=us
This two-step process is how cookies are set and announced. Just remember the difference between Cookie and Set-Cookie. The first is a request header; the second is a response header.
I urge web developers to not overuse cookies. Do not initiate a session that triggers a Set-Cookie just because you can. Only set sessions and cookies when you really need to. I know it’s tempting, but consider the impact.
As mentioned, Varnish doesn’t like to cache cookies. Whenever it sees a request with a Cookie header, the request will be passed to the backend and the response will not be cached.
When a request does not contain a cookie but the response includes a Set-Cookie header, Varnish will not store the result in cache.
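The state rules above boil down to two checks. Here’s an illustrative Python sketch (the function names are invented for this example; they do not correspond to Varnish internals):

```python
# Illustrative sketch: how state headers affect caching decisions.

def request_is_cacheable(headers: dict) -> bool:
    # A Cookie or Authorization header implies user-specific state,
    # so the request is passed to the backend.
    return "Cookie" not in headers and "Authorization" not in headers

def response_is_storable(headers: dict) -> bool:
    # A Set-Cookie header means the backend is initiating state,
    # so the response is not stored in cache.
    return "Set-Cookie" not in headers

print(request_is_cacheable({"Host": "localhost"}))            # True
print(request_is_cacheable({"Cookie": "language=en"}))        # False
print(response_is_storable({"Set-Cookie": "sessionid=abc"}))  # False
```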
HTTP has a set of mechanisms in place to decide when a cached object should be removed from cache. Objects cannot live in cache forever: you might run out of cache storage (memory or disk space) and Varnish will have to evict items using an LRU strategy to clear space. Or you might run into a situation where the data you are serving is stale and the object needs to be synchronized with a new response from the backend.
Expiration is all about setting a time-to-live. HTTP has two different kinds of response headers that it uses to indicate that:
Expires: an absolute timestamp that represents the expiration time.
Cache-control: the number of seconds an item can live in cache before becoming stale.
Varnish gives you a heads-up regarding the age of a cached object. The Age header is returned upon every response. The value of this Age header corresponds to the amount of time the object has been in cache. The remaining time-to-live is the cache lifetime minus the age value.
For that reason, I advise you not to set an Age header yourself, as it will mess with the TTL of your objects.
The Expires header is a pretty straightforward one: you just set the date and time at which an object should be considered stale. This is a response header that is sent by the backend.
Here’s an example of such a header:
Expires: Sat, 09 Sep 2017 14:30:00 GMT
Do not overlook the fact that the time of an Expires header is based on Greenwich Mean Time. If you are located in another time zone, please express the time accordingly.
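If you generate Expires headers in your own application, let a library function produce the HTTP date format in GMT rather than assembling it by hand. A quick self-contained illustration in Python (the fixed timestamp is just an example value):

```python
# Expires values must be expressed in GMT, in the HTTP date format.
# email.utils.formatdate produces exactly that when usegmt=True.
import calendar
from email.utils import formatdate

# One hour from a fixed moment in time (2017-09-09 13:30:00 GMT):
expires_at = calendar.timegm((2017, 9, 9, 13, 30, 0)) + 3600
print("Expires: " + formatdate(expires_at, usegmt=True))
# Expires: Sat, 09 Sep 2017 14:30:00 GMT
```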
The Cache-control header defines the time-to-live in a relative way: instead of stating the time of expiration, Cache-control states the number of seconds until the object expires. In a lot of cases, this is a more intuitive approach: you can say that an object should only be cached for an hour by assigning 3,600 seconds as the time-to-live.
This HTTP header has more features than the Expires header: you can set the time-to-live for both clients and proxies. This allows you to define distinct behavior depending on the kind of system that processes the header; you can also decide whether to cache at all and whether to revalidate with the backend.
Cache-control: public, max-age=3600, s-maxage=86400
The preceding example uses three important keywords to define the time-to-live and the ability to cache:
public
Indicates that both browsers and shared caches are allowed to cache the content.
max-age
The time-to-live in seconds that must be respected by the browser.
s-maxage
The time-to-live in seconds that must be respected by the proxy.
It’s also important to know that Varnish only respects a subset of the Cache-control syntax. It will only respect the keywords that are relevant to its role as a reverse caching proxy:
Cache-control headers sent by the browser are ignored
The time-to-live from an s-maxage statement is prioritized over a max-age statement
The must-revalidate and proxy-revalidate statements are ignored
When a Cache-control response header contains the terms private, no-cache, or no-store, the response is not cached
Although Varnish respects the public and private keywords, it doesn’t consider itself a shared cache and exempts itself from some of these rules. Varnish is more like a surrogate web server because it is under full control of the web server and does the webmaster’s bidding.
Varnish respects both Expires and Cache-control headers. In the Varnish Configuration Language, you can also decide what the time-to-live should be regardless of caching headers. And if there’s no time-to-live at all, Varnish will fall back to its hardcoded default of 120 seconds.
Here’s the list of priorities that Varnish applies when choosing a time-to-live:
If beresp.ttl is set in the VCL, use that value as the time-to-live.
Look for an s-maxage statement in the Cache-control header.
Look for a max-age statement in the Cache-control header.
Look for an Expires header.
Cache for 120 seconds under all other circumstances.
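The priority list can be expressed as a small function. The sketch below is illustrative Python, not Varnish code: real header parsing is more involved, and beresp_ttl stands in for a TTL set in VCL.

```python
# Illustrative sketch of Varnish's TTL priority list.
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DEFAULT_TTL = 120  # Varnish's hardcoded fallback, in seconds

def choose_ttl(beresp_ttl, cache_control, expires, now):
    if beresp_ttl is not None:                      # 1. set in VCL
        return beresp_ttl
    if cache_control:
        m = re.search(r"s-maxage=(\d+)", cache_control)
        if m:                                       # 2. s-maxage
            return int(m.group(1))
        m = re.search(r"max-age=(\d+)", cache_control)
        if m:                                       # 3. max-age
            return int(m.group(1))
    if expires:                                     # 4. Expires header
        return (parsedate_to_datetime(expires) - now).total_seconds()
    return DEFAULT_TTL                              # 5. default

now = datetime(2017, 9, 9, 13, 30, tzinfo=timezone.utc)
print(choose_ttl(None, "public, max-age=3600, s-maxage=86400", None, now))  # 86400
print(choose_ttl(None, None, "Sat, 09 Sep 2017 14:30:00 GMT", now))         # 3600.0
print(choose_ttl(None, None, None, now))                                    # 120
```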
Expiration is a valuable mechanism for updating the cache. It’s based on the concept of checking the freshness of an object at set intervals. These intervals are defined by the time-to-live and are processed by Varnish. The end user doesn’t really have a say in this.
After expiration, a new backend request is made, and both the headers and the payload are transmitted and stored in cache again. This can be a very resource-intensive matter and a waste of bandwidth, especially if the requested data has not changed in that period of time.
Luckily, HTTP offers a way to solve this issue. Besides relying on a time-to-live, HTTP allows you to keep track of the validity of a resource. There are two separate mechanisms for that:
The Etag response header
The Last-Modified response header
Most web browsers support conditional requests based on the Etag and Last-Modified headers, but Varnish supports this as well when it communicates with the backend.
An Etag is an HTTP response header that is either set by the web server or your application. It contains a unique value that corresponds to the state of the resource.
A common strategy is to create a unique hash for that resource. That hash could be an md5 or a sha hash based on the URL and the internal modification date of the resource. It could be anything as long as it’s unique.
HTTP/1.1 200 OK
Host: localhost
Etag: 7c9d70604c6061da9bb9377d3f00eb27
Content-type: text/html; charset=UTF-8

Hello world output
As soon as a browser sees this Etag, it stores the value. Upon the next request, the value of the Etag will be sent back to the server in an If-None-Match request header.
GET /if_none_match.php HTTP/1.1
Host: localhost
User-Agent: curl/7.48.0
If-None-Match: 7c9d70604c6061da9bb9377d3f00eb27
The server receives this If-None-Match header and checks if the value differs from the Etag it’s about to send.
If the Etag value is equal to the If-None-Match value, the web server or your application can return an HTTP/1.1 304 Not Modified response header to indicate that the value hasn’t changed.
HTTP/1.0 304 Not Modified
Host: localhost
Etag: 7c9d70604c6061da9bb9377d3f00eb27
When you send a 304 status code, you don’t send any payload, which can dramatically reduce the amount of bytes sent over the wire. The browser receives the 304 and knows that it can still output the old data.
If the If-None-Match value doesn’t match the Etag, the web server or your application will return the full payload, accompanied by the HTTP/1.1 200 OK response header and, of course, the new Etag.
This is an excellent way to conserve resources. Whereas the primary goal is to reduce bandwidth, it will also help you to reduce the consumption of memory, CPU cycles, and disk I/O if you implement it the right way.
Here’s an implementation example. It’s just some dummy script that, besides proving my point, serves no real purpose. It’s written in PHP because PHP is my language of choice. The implementation is definitely not restricted to PHP. You can implement this in any server-side language you like.
<?php
$etag = md5(__FILE__ . filemtime(__FILE__));
header('Etag: ' . $etag);
if (isset($_SERVER['HTTP_IF_NONE_MATCH']) &&
    $_SERVER['HTTP_IF_NONE_MATCH'] == $etag) {
    header('HTTP/1.0 304 Not Modified');
    exit;
}
sleep(5);
?>
<h1>Etag example</h1>
<?php
echo date("Y-m-d H:i:s") . '<br />';
ETags aren’t the only way to do conditional requests; there’s also an alternative technique based on the Last-Modified response header. The client will then use the If-Modified-Since request header to validate the freshness of the resource.
The approach is similar:
Let your web server or application return a Last-Modified response header
The client stores this value and uses it as an If-Modified-Since request header upon the next request
The web server or application matches this If-Modified-Since value to the modification date of the resource
Either an HTTP/1.1 304 Not Modified or an HTTP/1.1 200 OK is returned
The benefits are the same: reduce the bytes over the wire and load on the server by avoiding the full rendering of output.
The timestamps are based on the GMT time zone. Please make sure you convert your timestamps to this time zone to avoid weird behavior.
The starting point in the following example is the web server (or the application) returning a Last-Modified response header:
HTTP/1.1 200 OK
Host: localhost
Last-Modified: Fri, 22 Jul 2016 10:11:16 GMT
Content-type: text/html; charset=UTF-8

Hello world output
The browser stores the Last-Modified value and uses it as an If-Modified-Since header in the next request:
GET /if_last_modified.php HTTP/1.1
Host: localhost
User-Agent: curl/7.48.0
If-Modified-Since: Fri, 22 Jul 2016 10:11:16 GMT
The resource wasn’t modified, a 304 is returned, and the Last-Modified value remains the same:
HTTP/1.0 304 Not Modified
Host: localhost
Last-Modified: Fri, 22 Jul 2016 10:11:16 GMT
The browser does yet another conditional request:
GET /if_last_modified.php HTTP/1.1
Host: localhost
User-Agent: curl/7.48.0
If-Modified-Since: Fri, 22 Jul 2016 10:11:16 GMT
The resource was modified in the meantime, and a full 200 is returned, including the payload and a new Last-Modified header:
HTTP/1.1 200 OK
Host: localhost
Last-Modified: Fri, 22 Jul 2016 11:00:23 GMT
Content-type: text/html; charset=UTF-8

Some other hello world output
Time for another implementation example for conditional requests, this time based on the Last-Modified header. Again, it’s dummy code, written in PHP:
<?php
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime(__FILE__)) . ' GMT');
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= filemtime(__FILE__)) {
    header('HTTP/1.0 304 Not Modified');
    exit;
}
sleep(5);
?>
<h1>Last-Modified example</h1>
<?php
echo date("Y-m-d H:i:s") . '<br />';
Just like in the previous implementation example, we fake the delay caused by heavy load and use a sleep statement to make the application seem slower than it really is.
When Varnish spots an If-Modified-Since or If-None-Match header in the request, it keeps track of the Last-Modified timestamp and/or the Etag. Regardless of whether or not Varnish has the object in cache, a 304 status code will be returned if the Last-Modified or the Etag header matches.
From a client point of view, Varnish reduces the amount of bytes over the wire by returning the 304.
On the other hand, Varnish also supports conditional requests when it comes to backend communication: when an object is considered stale, Varnish will send If-Modified-Since and If-None-Match headers to the backend if the previous response from the backend contained either a Last-Modified timestamp or an Etag.
When the backend returns a 304 status code, Varnish will not receive the body of that response and will assume the content hasn’t changed. As a consequence, the stale data will have been revalidated and will no longer be stale. The Age response header will be reset to zero and the object will live in cache in accordance with the time-to-live that was set by the web server or the application.
Typically, stale data is revalidated by Varnish, but there is a VCL variable that allows you to manipulate that behavior: the beresp.keep variable decides how long stale objects will be returned while performing a conditional request. It’s basically an amount of time that is added to the time-to-live. This allows Varnish to perform the conditional requests asynchronously without the client noticing any delays. The beresp.keep variable works independently from the beresp.grace variable.
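As a hedged preview of Chapter 4, such a policy could look like the following VCL fragment (the one-day value is an arbitrary example, not a recommendation):

```vcl
sub vcl_backend_response {
    # Keep stale objects around for an extra day so they can be
    # revalidated with conditional backend requests.
    set beresp.keep = 1d;
}
```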
Both beresp.keep and beresp.grace, as well as many other VCL objects and variables, will be discussed in Chapter 4.
In general, an HTTP resource is public and has the same value for every consumer of the resource. If data is user-specific, it will, in theory, not be cacheable. However, there are exceptions to this rule and HTTP has a mechanism for this.
HTTP uses the Vary header to perform cache variations. The Vary header is a response header that is sent by the backend. The value of this header contains the name of a request header that should be used to vary on.
The value of the Vary header can only contain a valid request header that was set by the client. You can use the value of custom X- HTTP headers as a cache variation, but then you need to make sure that they are set by the client.
A very common example is language detection based on the Accept-Language request header. Your browser will send this header upon every request. It contains a set of languages or locales that your browser supports. Your application can then use the value of this header to determine the language of the output. If the desired language is not exposed in the URL or through a cookie, the only way to know is by using the Accept-Language header.
If no Vary header is set, the cache (either the browser cache or any intermediary cache) has no way to identify the difference and stores the object based on the first request. If that first request was made in Dutch, all other users will get output in Dutch—regardless of the browser language—for the duration of the cache lifetime.
That is a genuine problem, so in this case, the application returns a Vary header containing Accept-Language as its value. Here’s an example:
The browser language is set to Dutch:
GET / HTTP/1.1
Host: localhost
Accept-Language: nl
The application sets a Vary header that instructs the cache to keep a separate version of the cached object based on the Accept-Language value of the request.
HTTP/1.1 200 OK
Host: localhost
Vary: Accept-Language

Hallo, deze pagina is in het Nederlands geschreven.
The cache knows there is a Dutch version of this resource and will store it separately, but it will still link it to the cached object of the main resource. When the next request is sent from a browser that only supports English, the cached object containing Dutch output will not be served. A new backend request will be made and the output will be stored separately.
Be careful when you perform cache variations based on request headers that can contain many different values. The User-Agent and the Cookie headers are perfect examples.
In many cases, you don’t have full control over the cookie value. Tracking cookies set by third-party services can add unique values per user to the cookie. This could result in too many variations, and the hit rate would plummet.
The same applies to the User-Agent: almost every device has its own User-Agent. When using this as a cache variation, the hit rate could drop quite rapidly.
Varnish respects the Vary header and adds variations to the cache on top of the standard identifiers. The typical identifiers for a cached object are the hostname (or the IP if no hostname was set) and the URL.
When Varnish notices a cache variation, it will create a cache object for that version. Cache variations can expire separately, but when the main object is invalidated, the variations are gone, too.
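Conceptually, a variation just adds the value of each varied request header to the object’s identifier. Here’s an illustrative Python sketch (this models the idea; Varnish actually stores variations alongside the object rather than folding them into the hash):

```python
# Illustrative sketch: how a cache could key objects, including
# Vary-based variations. The base identifier is the host (or IP)
# plus the URL; a Vary header adds each listed request header's value.
import hashlib

def cache_key(host, url, vary=None, request_headers=None):
    parts = [host, url]
    for name in (vary or "").split(","):
        name = name.strip()
        if name:
            parts.append((request_headers or {}).get(name, ""))
    return hashlib.md5("|".join(parts).encode()).hexdigest()

# Same resource, different Accept-Language: two separate objects.
nl = cache_key("localhost", "/", "Accept-Language", {"Accept-Language": "nl"})
en = cache_key("localhost", "/", "Accept-Language", {"Accept-Language": "en"})
print(nl != en)  # True
```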
You have to find a balance between offering enough cache variations and a good hit rate. Choose the right request header to vary on and look for balance.
Now that we know how Varnish deals with HTTP, we can summarize how Varnish behaves right out of the box. Here’s a set of questions we can ask ourselves:
When is a request considered cacheable in Varnish?
When does Varnish completely bypass the cache?
How does Varnish identify an object?
When does Varnish cache an object?
What happens if an object is not stored in cache?
How long does Varnish cache an object?
Sounds mysterious, huh? Let me provide answers and allow me to explain how Varnish respects HTTP best practices.
When Varnish receives a request, it has to decide whether or not the response can be cached or even served from cache. The rules are simple and based on idempotence and state.
A request is cacheable when:
The request method is GET or HEAD
There are no cookies being sent by the client
There is no authorization header being sent
When these criteria are met, Varnish will look the resource up in cache and will decide if a backend request is needed, or if the response can be served from cache.
If a request is not cacheable, the request is passed: a backend connection is made and the response is returned to the client without being served from or stored in cache. An example of this is a POST request.
But all of this happens under the assumption that the request method is a valid one that complies with RFC 2616. Other request methods will not be processed by Varnish and will be piped to the backend.
When Varnish goes into pipe mode, it opens a TCP connection to the backend, transmits the original request, and immediately returns the response. There’s no further processing of the request or response.
Here’s a list of valid request methods according to the built-in VCL:
GET
HEAD
PUT
POST
DELETE
TRACE
OPTIONS
All other request methods will be piped to the backend.
RFC 2616 does not support request methods like PATCH, LINK, or UNLINK. Those were introduced in RFC 2068. If you require support for any of those methods, you’ll need to customize your VCL and include those methods.
“A Real-World VCL File” offers a solution for that.
Once we decide that an object is cacheable, we need a way to identify the object in order to retrieve it from cache. A hash key is composed of several values that serve as a unique identifier.
If the request contains a Host header, the hostname will be added to the hash.
Otherwise, the IP address will be added to the hash.
The URL of the request is added to the hash.
Based on that hash, Varnish will retrieve the object from cache.
If an object is not stored in cache or when it’s considered stale, a backend connection is made. Based on the backend response, Varnish will decide if the returned object will be stored in cache or if the cache is going to be bypassed.
A response will be stored in cache when:
The time-to-live is more than zero.
The response doesn’t contain a Set-Cookie header.
The Cache-control header doesn’t contain the terms no-cache, no-store, or private.
The Vary header doesn’t contain *, meaning vary on all headers.
If after the backend response Varnish decides that an object will not be stored in cache, it puts the object on a “blacklist”—the so-called hit-for-pass cache.
For a duration of 120 seconds, subsequent requests will immediately connect with the backend and serve the response directly, without attempting to store the response in cache.
After 120 seconds, upon the next request, the response can be re-evaluated and a decision can be made whether or not to store the object in cache.
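The hit-for-pass mechanism can be modeled as a small bookkeeping structure (illustrative Python; the names and structure are invented for this example):

```python
# Illustrative sketch of the hit-for-pass mechanism: when a response
# turns out to be uncacheable, remember that decision for 120 seconds
# so subsequent requests go straight to the backend.
HIT_FOR_PASS_TTL = 120

hit_for_pass = {}  # object hash -> expiry time (in seconds)

def mark_uncacheable(key, now):
    hit_for_pass[key] = now + HIT_FOR_PASS_TTL

def should_pass(key, now):
    expiry = hit_for_pass.get(key)
    if expiry is None:
        return False
    if now >= expiry:
        del hit_for_pass[key]  # window over: re-evaluate the next response
        return False
    return True

mark_uncacheable("obj1", now=1000)
print(should_pass("obj1", now=1060))  # True: within the 120s window
print(should_pass("obj1", now=1200))  # False: time to re-evaluate
```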
Once an object is stored in cache, a decision must be made on the time-to-live. I mentioned this before, but there’s a list of priorities that Varnish uses to decide which value it will use as the TTL.
Here’s the prioritized list:
If beresp.ttl is set in the VCL, use that value as the time-to-live.
Look for an s-maxage statement in the Cache-control header.
Look for a max-age statement in the Cache-control header.
Look for an Expires header.
Cache for 120 seconds under all other circumstances.
When the object is stored in the hit-for-pass cache, it is cached for 120 seconds, unless you change the value in VCL.
When you’re up and running and sending your HTTP traffic through Varnish, there will be a certain behavior that will impact the cacheability of your website.
This behavior does not reflect arbitrary rules and policies that were defined by Varnish itself. Varnish respects conventional HTTP best practices that were defined in industry-wide, accepted RFCs.
Even if you don’t add any VCL code, the best practices will make sure that your website is properly cached, assuming that your code respects the best practices as well.
An additional advantage is that the cacheability of your website and the portability of the caching behavior can go beyond the scope of Varnish. You can swap out Varnish for another kind of reverse proxy, or even a CDN.
At this point, you know what a Cache-control header is and how it compares to an Expires header. You have a pretty solid idea of how to leverage those headers to control the cacheability of your pages. By now, you’re no stranger to cache variations and conditional requests.
Finally and most importantly: you can only cache GET or HEAD requests, because they are idempotent. Non-idempotent requests such as POST, PUT, and DELETE cannot be cached.