Although the strategy is to minimize the use of the backend, it is still a crucial part of the equation. Without the backend, we cannot cache any objects, and preferably it’s the backend that decides what gets cached and how long objects are stored there.
In this chapter, we’ll talk about backends and how you can configure access to them using VCL. We’ll also group backends and perform load balancing using directors. Dealing with healthy and unhealthy backends will also be covered in this chapter.
In previous chapters, I mentioned that there are two ways to announce the backend to Varnish:
By adding a backend to your VCL file.
Or by omitting VCL completely and using the -b flag at startup time, as shown in the example below.
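For the second option, this is what the startup command could look like; the hostname, backend port, and listening address are just placeholders:
varnishd -a :80 -b web001.example.com:8080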
In both cases, Varnish selects the backend automatically. But there are situations where you have multiple backends and want to control which request goes to which backend. You can define multiple backends and use req.backend_hint to assign a backend other than the default one.
Here’s an example:
vcl 4.0;
backend public {
.host = "web001.example.com";
}
backend admin {
.host = "web002.example.com";
}
sub vcl_recv {
if(req.url ~ "^/admin(/.*)?") {
set req.backend_hint = admin;
} else {
set req.backend_hint = public;
}
}
You notice that we defined two backends:
A backend that handles the public traffic and that resides on web001.example.com.
A backend that serves the admin panel and that resides on web002.example.com.
By incorporating the req.backend_hint variable in our VCL logic, we can perform content-aware load balancing. Each backend could be tuned to its specific task.
Yes, Varnish has load-balancing capabilities. However, I don’t consider Varnish a true load balancer. To me, HAProxy is a superior open source load balancer, whereas Varnish is a superior HTTP accelerator. If you don’t need some of the more advanced features that HAProxy offers, Varnish will do the job just fine.
In “Backends and Health Probes”, we discussed how health checks can be performed using health probes. Let’s take our previous example and add a health probe:
vcl 4.0;
probe healthcheck {
.url = "/";
.interval = 5s;
.timeout = 1s;
.window = 10;
.threshold = 3;
.initial = 1;
.expected_response = 200;
}
backend public {
.host = "web001.example.com";
.probe = healthcheck;
}
backend admin {
.host = "web002.example.com";
.probe = healthcheck;
}
sub vcl_recv {
if(req.url ~ "^/admin(/.*)?") {
set req.backend_hint = admin;
} else {
set req.backend_hint = public;
}
}
The health checks in this example are based on an HTTP request to the homepage. The check runs every five seconds with a timeout of one second, and we expect an HTTP 200 status code. A backend is only considered healthy if at least three of the last ten health checks succeeded.
You can view the health of your backends by executing the following command:
varnishadm backend.list
This could be the output of that command:
Backend name     Admin    Probe
boot.public      probe    Healthy 10/10
boot.admin       probe    Healthy 10/10
Both backends are listed and their health is automatically checked by a probe (see the Admin column). It is considered healthy because 10 out of 10 checks were successful. As a quick reminder: only three need to succeed to have a healthy backend.
To get more verbose output, use the -p parameter:
varnishadm backend.list -p
This could be the output:
Backend name     Admin    Probe
boot.public      probe    Healthy 10/10
 Current states  good: 10 threshold: 3 window: 10
 Average response time of good probes: 0.575951
 Oldest ================================================== Newest
 4444444444444444444444444444444444444444444444444444444444444444 Good IPv4
 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX Good Xmit
 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR Good Recv
 HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Happy
boot.admin       probe    Healthy 3/10
 Current states  good: 3 threshold: 3 window: 10
 Average response time of good probes: 0.642713
 Oldest ================================================== Newest
 4444444444444444444444444444444444444444444444---------------444 Good IPv4
 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX---------------XXX Good Xmit
 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR---------------RRR Good Recv
 HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH---------------HHH Happy
You see that the admin backend’s last three checks were successful, and those three are enough to be considered healthy. This is a backend that recovered from being “sick.”
If you continuously run this command (the watch command on Linux can help you with that), you can see the evolution of the health checks (Xmit), the responses (Recv), and the happiness of your server.
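For example, a quick way to keep an eye on it (the two-second refresh interval is just an illustration):
watch -n 2 "varnishadm backend.list -p"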
You can also manually override the probe’s decision on the backend health by issuing the following commands:
varnishadm backend.set_health boot.public sick
varnishadm backend.set_health boot.admin healthy
The output is the following:
Backend name     Admin      Probe
boot.public      sick       Healthy 10/10
boot.admin       healthy    Healthy 10/10
As you can see, the Admin column no longer contains the value probe, but sick and healthy. To give the control back to the health probes, simply issue the following commands:
varnishadm backend.set_health boot.public auto
varnishadm backend.set_health boot.admin auto
The std.healthy() function, which is part of vmod_std, can tell you whether or not a backend is healthy. To check the health of the current backend, just use std.healthy(req.backend_hint).
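Here’s a minimal sketch of how that could be used, reusing the public and admin backends from the earlier example: if the admin backend turns out to be unhealthy, the request is sent to the public backend instead.
import std;
sub vcl_recv {
    set req.backend_hint = admin;
    # Fall back to the public backend when the admin backend is unhealthy
    if (!std.healthy(req.backend_hint)) {
        set req.backend_hint = public;
    }
}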
A director is a built-in VMOD that groups several backends and presents them as one. Directors offer different decision-making strategies to decide which request is handled by which backend.
The goal of directors is to reduce backend latency by distributing requests across multiple backends: when the load is spread out, each backend server has less work to do and responds faster. This is basically a horizontal scalability strategy.
Besides pure load balancing, directors also make sure that unhealthy backend nodes are not used; instead, another healthy node is used to handle those requests. This is essentially a high availability strategy.
In order to use directors, you first need to import the directors VMOD and initialize it in vcl_init. Here’s an example:
vcl 4.0;
import directors;
# Reuses the healthcheck probe defined in the earlier example
backend web001 {
.host = "web001.example.com";
.probe = healthcheck;
}
backend web002 {
.host = "web002.example.com";
.probe = healthcheck;
}
sub vcl_init {
new loadbalancing = directors.round_robin();
loadbalancing.add_backend(web001);
loadbalancing.add_backend(web002);
}
sub vcl_recv {
set req.backend_hint = loadbalancing.backend();
}
Let’s take this example step by step:
First we import the directors VMOD.
We declare two backends.
We initialize the director and decide on the load-balancing strategy (in this case, it’s “round robin”).
We assign both backends to the director.
We now have a new backend that we named loadbalancing.
We assign the loadbalancing backend to Varnish by issuing set req.backend_hint = loadbalancing.backend();.
More information about directors can be found on the Varnish documentation page about vmod_directors.
The previous example used the round-robin director, but what does that mean? Round-robin is a basic distribution strategy where every backend takes its turn sequentially. If you have three backends, the sequence is:
Backend 1
Backend 2
Backend 3
Backend 1
Backend 2
Backend 3
…
The load is distributed equally—no surprises whatsoever. In most cases, round-robin is a good algorithm, but there are scenarios where it won’t perform well. Take, for example, a situation where your backend servers don’t have the same server resources. The server with the least amount of memory or CPU will still have to do an equal amount of work.
Here’s an example of a round-robin director declaration that uses three backends:
sub vcl_init {
new loadbalancing = directors.round_robin();
loadbalancing.add_backend(backend1);
loadbalancing.add_backend(backend2);
loadbalancing.add_backend(backend3);
}
After we declare the director, it needs to be assigned in vcl_recv:
sub vcl_recv {
set req.backend_hint = loadbalancing.backend();
}
The random director distributes load over the backends using a weighted random-probability distribution algorithm. By default, the weight for all backends is the same, which means load will be (somewhat) equally distributed. In that respect, the random director has the same effect as the round-robin director, with some slight deviations.
As soon as you start assigning specific weights to the backends, the deviations will increase. This makes sense if you want to “spare” one or more servers, such as in a situation where a server is under-dimensioned or hosts other business-critical applications.
Based on the weights, each backend receives 100 × (weight / (sum(all_added_weights))) percent of the total load.
Here’s an example:
sub vcl_init {
new loadbalancing = directors.random();
loadbalancing.add_backend(backend1,1.0);
loadbalancing.add_backend(backend2,2.0);
}
The preceding example declares a new random director with two backends:
Backend 1 receives about 33% of the load
Backend 2 receives about 67% of the load
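Those percentages follow from the formula: the weights add up to 3.0, so backend1 gets 100 × (1.0 / 3.0) ≈ 33% of the load and backend2 gets 100 × (2.0 / 3.0) ≈ 67%.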
And then you assign the backend:
sub vcl_recv {
set req.backend_hint = loadbalancing.backend();
}
The hash director chooses the backend based on a SHA256 hash of a given value. The value it hashes is passed to the backend() method of the director object.
The hash director is often used to facilitate sticky sessions by hashing either the client IP address or a session cookie. By hashing either of those values, Varnish assures that requests for the same user or session always reach the same backend. This is important for backend servers that store their session data locally.
You could also hash by request URL. Requests for the URL will always be sent to the same backend.
The risk of hashing by client IP or request URL is that the load will not be equally distributed, such as in the following cases:
The client IP address could be the IP of a proxy server used by multiple users, causing heavy load on one specific backend.
One URL could be far more popular than another, causing heavy load on one specific backend.
This is how you declare a hash director:
sub vcl_init {
new loadbalancing = directors.hash();
loadbalancing.add_backend(backend1,1.0);
loadbalancing.add_backend(backend2,1.0);
}
This example assigns two backends with the same weight, which is the recommended value. If you change the weights, one server will get more requests than the other, but requests for the same user, URL, or session will still consistently be sent to the same backend.
With hash directors, the magic is in the assignment, not the declaration. Here’s an example of a hash director that hashes by client IP address:
sub vcl_recv {
set req.backend_hint = loadbalancing.backend(client.ip);
}
As I explained, sticky-IP hashing is risky. Here’s an example in which the session cookie is hashed:
sub vcl_recv {
set req.backend_hint = loadbalancing.backend(regsuball(req.http.Cookie,
"^.*;? ?PHPSESSID=([a-zA-Z0-9]+)( ?|;| ;).*$","\1"));
}
This example uses the PHPSESSID cookie that contains the session ID generated by the session_start() function in PHP.
And here’s a final example, in which we do URL hashing:
sub vcl_recv {
set req.backend_hint = loadbalancing.backend(req.url);
}
The final director I’m going to feature is the fallback director. For this director, the order of backend assignments is very important: the fallback director will try each backend and return the first one that is healthy.
Please ensure that your backends have a health probe attached; otherwise, the fallback director has no way to determine whether or not backends are healthy.
Without a health probe, every backend is presumed healthy: if the first backend is actually down, the fallback director will not switch to the next one, and Varnish will return an HTTP 503 error.
Let’s have a look at an example of the fallback director:
vcl 4.0;
import directors;
probe healthcheck {
.url = "/";
.interval = 2s;
.timeout = 1s;
.window = 3;
.threshold = 2;
.initial = 1;
.expected_response = 200;
}
backend web001 {
.host = "web001.example.com";
.probe = healthcheck;
}
backend web002 {
.host = "web002.example.com";
.probe = healthcheck;
}
sub vcl_init {
new loadbalance = directors.fallback();
loadbalance.add_backend(web001);
loadbalance.add_backend(web002);
}
sub vcl_recv {
set req.backend_hint = loadbalance.backend();
}
In this example, web001 is the preferred backend. The health probe will check the availability of the homepage every two seconds. If two out of three checks succeed, the backend is considered healthy; otherwise, the fallback director will switch to web002.
It is possible to stack directors and to use directors as members of other directors. This is a way to combine load-balancing strategies.
For example, you can have round-robin directors that each have two backends in two data centers. To ensure round-robin load balancing and still have high-availability, you can add both round-robin directors as members of a fallback director.
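Here’s a minimal sketch of what that could look like. The backend names (dc1_web001, dc1_web002, dc2_web001, dc2_web002) are placeholders for the per-data-center backends you would declare yourself:
sub vcl_init {
    # One round-robin director per data center
    new dc1 = directors.round_robin();
    dc1.add_backend(dc1_web001);
    dc1.add_backend(dc1_web002);
    new dc2 = directors.round_robin();
    dc2.add_backend(dc2_web001);
    dc2.add_backend(dc2_web002);
    # Prefer data center 1, fail over to data center 2 when it becomes unhealthy
    new dcfailover = directors.fallback();
    dcfailover.add_backend(dc1.backend());
    dcfailover.add_backend(dc2.backend());
}
sub vcl_recv {
    set req.backend_hint = dcfailover.backend();
}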
Throughout this chapter, we’ve focused on providing a stable backend service so that Varnish can access backend data without the slightest hiccup:
Measuring backend health
Adding health check probes
Offering directors to distribute load
Leveraging directors to use healthy nodes instead of unhealthy ones
But one important question remains unanswered: what do we do if no backends are available?
We can either live with it or compensate for it. I’d rather go for the latter, and that’s where grace mode comes into play.
If we assign a certain amount of grace time, we’re basically telling Varnish that it can serve objects beyond their time-to-live. These objects are considered “stale” and are served as long as there’s no updated object for a duration defined by the grace time.
The built-in VCL hit logic explains it best:
sub vcl_hit {
if (obj.ttl >= 0s) {
// A pure unadultered hit, deliver it
return (deliver);
}
if (obj.ttl + obj.grace > 0s) {
// Object is in grace, deliver it
// Automatically triggers a background fetch
return (deliver);
}
// fetch & deliver once we get the result
return (miss);
}
If the object hasn’t expired (obj.ttl >= 0s), keep on serving the object
If the object has expired, but there is some grace time left (obj.ttl + obj.grace > 0s), keep serving the object but fetch a new version asynchronously
Otherwise, fetch a new version and queue the request
It’s important to get a major misconception out of the way: grace mode only works for objects that are already stored in cache. That’s the only way the revalidation can go unnoticed: as long as the object is in cache, it can be served while the backend call happens asynchronously.
But when the requested object is not stored in cache, the backend request is synchronous and the end user has to wait for the result, even though the backend fetch is done by a separate thread.
Enabling grace mode is quite easy: you just assign a value to beresp.grace in the vcl_backend_response subroutine:
sub vcl_backend_response {
set beresp.grace = 30s;
}
In this example, we’re allowing Varnish to keep serving stale data up to 30 seconds beyond the object’s time-to-live. When that time expires, we revert to synchronous fetching. This means that slow backends will feel slow again, and if the backend is down, an HTTP 503 error will be returned.
You’ve spent a lot of time learning how Varnish works and how to get data in the cache. A lot of attention has gone to the client-side aspect: serving cached data to the end user. But let’s not forget that Varnish relies on a healthy backend to do its work.
At this point, you know how to link Varnish to one or multiple backends. You can configure all kinds of timeouts, and you’re able to check the health of a backend. You should also feel confident circumventing unhealthy backends with directors or by using grace mode.
By using some of the tips and tricks from this chapter, you’ll have more uptime. Regardless of your hit rate, you will still need your backend for content revalidation, so be sure to think about the availability of your backend servers.