Chapter 2. Go, Go, Go and Get Started!

Now that you know what Varnish is all about, you’re probably eager to learn how to install, configure, and use it. This chapter will cover the basic installation procedure on the most commonly supported operating systems and the typical configuration parameters that you can tune to your liking.

Installing Varnish

Varnish is supported on the following operating systems:

  • Linux

  • FreeBSD

  • Solaris

You can get it to work on other UNIX-like systems (OS X, OpenBSD, NetBSD, and Windows with Cygwin), but there’s no official support for those.

Note

In reality, you’ll probably install Varnish on a Linux system, as Linux is the most commonly used operating system for production servers. For development purposes, you might even run it on OS X: some people do their local development on a Mac, and installing Varnish there lets you see how your code behaves when it gets cached by Varnish.

The supported Linux distributions are:

  • Ubuntu

  • Debian

  • Red Hat

  • CentOS

You can easily install Varnish using the package manager of your operating system, but you can also compile Varnish from source.
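If you do want to go down the source route anyway, the general shape of a build looks something like the following sketch. Note that the repository URL, tag layout, and exact build dependencies listed here are assumptions and vary per release, so consult the official documentation first.

# Rough sketch of a source build (a compiler toolchain plus dependencies such as
# automake, libtool, python-docutils, and the PCRE development headers are assumed).
git clone https://github.com/varnishcache/varnish-cache.git
cd varnish-cache
# Check out the release tag that matches the version you want (e.g., a 4.1.x tag).
./autogen.sh
./configure
make
sudo make install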

Installing Varnish Using a Package Manager

Compiling from source is all fun and games, but it takes a lot of time. If you get one of the dependencies wrong or you install the wrong version of a dependency, you’re going to have a bad day. Why bother doing it the hard way (unless you have your reasons) if you can easily install Varnish using the package manager of your operating system?

Here’s a list of package managers you can use according to your operating system:

  • APT on Ubuntu and Debian

  • YUM on Red Hat and CentOS

  • PKG on FreeBSD

Warning

Even though FreeBSD officially supports Varnish, I will skip it for the rest of this book. In reality, few people run Varnish on FreeBSD. That doesn’t mean I don’t respect the project and the operating system, but I’m writing this book for the mainstream and let’s face it: FreeBSD is not so mainstream.

Installing Varnish on Ubuntu and Debian

In simple terms, we can say that the Ubuntu and Debian distributions are related: Ubuntu is a Debian-based operating system, and both distributions use the APT package manager. But even though the installation of Varnish is similar on both distributions, there are subtle differences. That’s why there are separate APT repository channels for Ubuntu and Debian.

Here’s how you install Varnish on Ubuntu, assuming you’re running the Ubuntu 14.04 LTS (Trusty Tahr) version:

apt-get install apt-transport-https
curl https://repo.varnish-cache.org/GPG-key.txt | apt-key add -
echo "deb https://repo.varnish-cache.org/ubuntu/ trusty varnish-4.1" \
     >> /etc/apt/sources.list.d/varnish-cache.list
apt-get update
apt-get install varnish
Note

Packages are also available for other Ubuntu versions. Varnish only supports LTS versions of Ubuntu. Besides Trusty Tahr, you can also install Varnish on Ubuntu 12.04 LTS (Precise Pangolin) and Ubuntu 10.04 LTS (Lucid Lynx). You can do this by replacing the trusty keyword with either precise or lucid in the previous example.
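For example, for Ubuntu 12.04 LTS (Precise Pangolin) the repository line becomes:

echo "deb https://repo.varnish-cache.org/ubuntu/ precise varnish-4.1" \
     >> /etc/apt/sources.list.d/varnish-cache.list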

If you’re running Debian, here’s how you can install Varnish on Debian 8 (Jessie):

apt-get install apt-transport-https
curl https://repo.varnish-cache.org/GPG-key.txt | apt-key add -
echo "deb https://repo.varnish-cache.org/debian/ jessie varnish-4.1"\
    >> /etc/apt/sources.list.d/varnish-cache.list
apt-get update
apt-get install varnish
Note

If you’re running an older version of Debian, there are packages available for Debian 5 (Lenny), Debian 6 (Squeeze), and Debian 7 (Wheezy). Just replace the jessie keyword with either lenny, squeeze, or wheezy in the preceding statements.

Installing Varnish on Red Hat and CentOS

There are three main distributions in the Red Hat family of operating systems:

  • Red Hat Enterprise: the paid enterprise version

  • CentOS: the free version

  • Fedora: the bleeding-edge desktop version

All three of them use the YUM package manager, but we’ll primarily focus on Red Hat and CentOS, which share the same installation procedure.

If you’re on Red Hat or CentOS version 7, here’s how you install Varnish:

yum install epel-release
rpm --nosignature -i https://repo.varnish-cache.org/redhat/varnish-4.1.el7.rpm
yum install varnish

If you’re on Red Hat or CentOS version 6, here’s how you install Varnish:

yum install epel-release
rpm --nosignature -i https://repo.varnish-cache.org/redhat/varnish-4.1.el6.rpm
yum install varnish

Configuring Varnish

Now that you have Varnish installed on your system, it’s time to configure some settings so that you can start using it.

Varnish has a bunch of startup options that allow you to configure the way you interact with it. These options are located in a configuration file and assigned to the varnishd program at startup time. Here are some examples of typical startup options:

  • The address and port on which Varnish processes its incoming HTTP requests

  • The address and port on which the Varnish CLI runs

  • The location of the VCL file that holds the caching policies

  • The location of the file that holds the secret key, used to authenticate with the Varnish CLI

  • The storage backend type and the size of the storage backend

  • Jailing options to secure Varnish

  • The address and port of the backend that Varnish will interact with

Note

You can read more about the Varnish startup options on the official varnishd documentation page.

The Configuration File

The first challenge is to find where the configuration file is located on your system. This depends on the Linux distribution, but also on the service manager your operating system is running.

If your operating system uses the systemd service manager, the Varnish configuration file will be located in a different folder than on SysV-based systems. Systemd is enabled by default on Debian Jessie and CentOS 7, whereas Ubuntu Trusty Tahr still uses SysV.
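If you’re not sure which service manager your system is running, checking what runs as PID 1 is a quick way to find out (the exact output name can vary slightly between distributions):

# Print the name of the process running as PID 1:
# "systemd" means systemd, while "init" typically points to a SysV-style init.
ps -p 1 -o comm=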

If you want to know where the configuration file is located on your operating system (given that you installed Varnish via a package manager), have a look at Table 2-1.

Table 2-1. Location of the Varnish configuration file

                  SysV                      Systemd
  Ubuntu/Debian   /etc/default/varnish      /etc/systemd/system/varnish.service
  Red Hat/CentOS  /etc/sysconfig/varnish    /etc/varnish/varnish.params

Warning

If you use systemd on Ubuntu or Debian, the /etc/systemd/system/varnish.service configuration file will not yet exist. You need to copy it from /lib/systemd/system/.

If you change the content of the configuration file, you need to reload the Varnish service to effectively load these settings. Run the following command to make this happen:

sudo service varnish reload

Some Remarks on Systemd on Ubuntu and Debian

If you’re on Ubuntu or Debian and you’re using the systemd service manager, there are several things you need to keep in mind.

First of all, you need to copy the configuration file to the right folder in order to override the default settings. Here’s how you do that:

sudo cp /lib/systemd/system/varnish.service /etc/systemd/system

If you’re planning to make changes to that file, don’t forget that systemd caches unit files in memory. You need to reload systemd in order to have your changes loaded from the file. Here’s how you do that:

sudo systemctl daemon-reload

That doesn’t mean Varnish will be started with the right startup options, only that systemd knows the most recent settings. You will still need to reload the Varnish service to load the configuration changes, like this:

sudo service varnish reload

Startup Options

By now you already know that the sole purpose of the configuration file is to feed the startup options to the varnishd program. In theory, you don’t need a service manager: you can start Varnish by running varnishd yourself and passing the startup options by hand. Here’s an overview of the available startup options:

usage: varnishd [options]
    -a address[:port][,proto]    # HTTP listen address and port (default: *:80)
                                 #   address: defaults to loopback
                                 #   port: port or service (default: 80)
                                 #   proto: HTTP/1 (default), PROXY
    -b address[:port]            # backend address and port
                                 #   address: hostname or IP
                                 #   port: port or service (default: 80)
    -C                           # print VCL code compiled to C language
    -d                           # debug
    -F                           # Run in foreground
    -f file                      # VCL script
    -h kind[,hashoptions]        # Hash specification
                                 #   -h critbit [default]
                                 #   -h simple_list
                                 #   -h classic
                                 #   -h classic,<buckets>
    -i identity                  # Identity of varnish instance
    -j jail[,jailoptions]        # Jail specification
                                 #   -j unix[,user=<user>][,ccgroup=<group>]
                                 #   -j none
    -l vsl[,vsm]                 # Size of shared memory file
                                 #   vsl: space for VSL records [80m]
                                 #   vsm: space for stats counters [1m]
    -M address:port              # Reverse CLI destination
    -n dir                       # varnishd working directory
    -P file                      # PID file
    -p param=value               # set parameter
    -r param[,param...]          # make parameter read-only
    -S secret-file               # Secret file for CLI authentication
    -s [name=]kind[,options]     # Backend storage specification
                                 #   -s malloc[,<size>]
                                 #   -s file,<dir_or_file>
                                 #   -s file,<dir_or_file>,<size>
                                 #   -s file,<dir_or_file>,<size>,<granularity>
                                 #   -s persistent (experimental)
    -T address:port              # Telnet listen address and port
    -t TTL                       # Default TTL
    -V                           # version
    -W waiter                    # Waiter implementation
                                 #   -W epoll
                                 #   -W poll

The varnishd documentation page has more detailed information about all of the startup options.
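To make this concrete, here’s what a minimal manual invocation could look like, using the default paths from this chapter. This is just a sketch; in practice you’ll let the service manager start Varnish for you.

# Start Varnish in the foreground (-F), listening for HTTP traffic on port 80,
# using the default VCL file, the default secret file, and a 256 MiB memory cache.
sudo varnishd -F \
    -a :80 \
    -f /etc/varnish/default.vcl \
    -S /etc/varnish/secret \
    -s malloc,256m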

Let’s take a look at some of the typical startup options you’ll encounter when setting up Varnish. The examples I use are based on /etc/default/varnish on an Ubuntu system that uses SysV as the service manager.

Common startup options

The list of configurable startup options is quite extensive, but a small set of common ones is enough to get started. The following example covers them:

DAEMON_OPTS="-a :80 \
             -a :81,PROXY \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,3g \
             -j unix,user=www-data"

Network binding

The most essential networking option is the -a option. It defines the address, the port, and the protocol used to connect with Varnish. By default, its value is :6081, which means that Varnish is bound to all available network interfaces on TCP port 6081. In most cases, you’ll immediately switch the value to :80, the conventional HTTP port.

You can also decide which protocol to use. By default, this is HTTP, but you can also set it to PROXY. The PROXY protocol adds a so-called “preamble” to the TCP connection that contains the real IP address of the client. This only works if Varnish sits behind another proxy server that supports the PROXY protocol. The PROXY protocol will be further discussed in “What About TLS/SSL?”.

You can define multiple listening addresses by using multiple -a options. Multiple listening addresses can make sense if you’re combining HTTP and PROXY support, as previously illustrated.
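Once Varnish is bound to port 80 and the service has been reloaded, a quick sanity check could look like this (assuming a backend is configured; the Via and X-Varnish response headers are added by Varnish itself):

# Send a HEAD request to the listening address and inspect the response headers.
# Seeing "Via: 1.1 varnish..." and an "X-Varnish" header confirms the request
# passed through Varnish.
curl -I http://localhost/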

CLI address binding

The second option we will discuss is the -T option. It is used to define the address and port on which the Varnish CLI listens. In “Banning from the Command Line”, we’ll need CLI access to invalidate the cache.

By default, the Varnish CLI is bound to localhost on port 6082. This means the CLI is only locally accessible.

Caution

Be careful when making the CLI remotely accessible because although access to the CLI requires authentication, it still happens over an unencrypted connection.
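As a quick illustration, connecting to the CLI with the varnishadm tool could look like this (a sketch assuming the default CLI address and secret file location):

# Connect to the Varnish CLI on localhost:6082, authenticate using the secret
# file, and ask the running daemon for its status.
sudo varnishadm -T localhost:6082 -S /etc/varnish/secret status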

Security options

The -j option allows you to jail your Varnish instance and run the subprocesses under the specified user. By default, all processes will run using the varnish user.

The jailing option is especially useful if you’re running multiple Varnish instances on a single server. That way, there is better process isolation between the instances.

The -S option is used to define the location of the file that contains the secret key. This secret key is used to authenticate with the Varnish CLI. By default, this file is located at /etc/varnish/secret and automatically contains a random value.

You can choose not to include the -S parameter to allow unauthenticated access to the CLI, but that’s something I would strongly advise against. If you want to change the location of the secret key file, change the value of the -S parameter. If you just want to change the secret key itself, edit /etc/varnish/secret and reload Varnish.
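For example, rotating the secret could look like this (uuidgen is just one way to generate a random value; any sufficiently random string will do):

# Write a new random value to the secret file and reload Varnish so the CLI
# authentication uses the new key.
uuidgen | sudo tee /etc/varnish/secret > /dev/null
sudo service varnish reload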

Storage options

Objects in the cache need to be stored somewhere. That’s where the -s option comes into play. By default, the objects are stored in memory (the malloc storage backend) and the size of the cache is 256 MiB.

Warning

Varnish expresses the size of the cache in kibibytes, mebibytes, gibibytes, and tebibytes. These differ from the traditional kilobytes, megabytes, gigabytes, and terabytes. The “bi” in kibibyte stands for binary, so a kibibyte is 1,024 bytes, whereas a kilobyte is 1,000 bytes. The same logic applies to mebibytes (1,024 × 1,024 bytes), gibibytes (1,024 × 1,024 × 1,024 bytes), and tebibytes (1,024 × 1,024 × 1,024 × 1,024 bytes).

The size of your cache and the storage type depend heavily on the number of objects you’re going to store. If all of your cacheable files fit in memory, you’ll be absolutely fine. Memory is fast and simple, but unfortunately, it is limited in size. If your cache runs out of space, Varnish will apply a so-called Least Recently Used (LRU) strategy to evict items from the cache.

Warning

If you don’t specify the size of the storage and only mention malloc, the size of the cache will be unlimited. That means Varnish could potentially eat all of your server’s memory. If your server runs out of memory, it will use the operating system’s swap space. This basically stores the excess data on disk. This could cause a major slowdown of your entire system if your disks are slow.

Varnish keeps track of when each cached object was last requested. When it has to evict objects due to a lack of available space, it evicts the least recently used objects until it has enough room to store the next requested object.

If you have a dedicated Varnish server, it is advised to allocate about 80% of your available memory to Varnish. That means you’ll have to change the -s startup option.
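For example, on a hypothetical dedicated server with 16 GiB of RAM, 80% comes down to roughly 13 GiB:

# Give roughly 80% of 16 GiB of RAM to the malloc storage backend.
DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,13g"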

Note

File storage is also supported. Although it is slower than pure memory storage, file-backed objects are still buffered in memory by the operating system. In most cases, memory storage will do the trick for you.

VCL file location

The location of the VCL file is set using the -f option. By default it points to /etc/varnish/default.vcl. If you want to switch the location of your VCL file to another file, you can modify this option.

Note

If you do not specify an -f option, you will need to add the -b option to define the backend server that Varnish will use.
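A VCL-less invocation could look like this (127.0.0.1:8080 is a hypothetical backend address):

# Run Varnish without a VCL file and forward all traffic to a single backend.
# Note that -b and -f are mutually exclusive.
sudo varnishd -a :80 -b 127.0.0.1:8080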

Going more advanced

Let’s turn it up a notch and throw some more advanced startup options into the mix. Here’s an example:

DAEMON_OPTS="-a :80 \
             -a :81,PROXY \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,3g \
             -j unix,user=www-data \
             -l 100m,10m \
             -t 60 \
             -p feature=+esi_disable_xml_check \
             -p connect_timeout=5 \
             -p first_byte_timeout=10 \
             -p between_bytes_timeout=2"

Shared log memory storage

Varnish doesn’t just store its objects; there’s also space allocated in memory for logging and statistics. This information is used by utility binaries like varnishlog, varnishtop, and varnishstat.

By default, 1 MiB is allocated to the Varnish Statistics Counters (VSC) and 80 MiB is allocated to the Varnish Shared Memory Logs (VSL).

You can manipulate the size of the VSC and the VSL by changing the value of the -l startup option.

Default time-to-live

Varnish relies on Expires or Cache-Control headers to determine the time-to-live of an object. If no such headers are present and no explicit time-to-live was specified in the VCL file, Varnish falls back to a default time-to-live of 120 seconds. You can modify this default at startup time by setting the -t startup option. The value of this option is expressed in seconds.

Runtime parameters

There are a bunch of runtime parameters that can be tuned. Overriding a runtime parameter is done by setting the -p startup option. Alternatively, if you want these parameters to be read-only, you can use the -r option. Setting parameters to read-only restricts users with Varnish CLI access from overriding them at runtime.

Have a look at the full list of runtime parameters on the varnishd documentation page.

In the preceding example, we’re setting the following runtime parameters:

  • feature=esi_disable_xml_check

  • connect_timeout

  • first_byte_timeout

  • between_bytes_timeout

The first one (feature=esi_disable_xml_check) disables XML checks during Edge Side Includes (ESI) processing. ESI is a technique used by Varnish to assemble a page out of content blocks that come from multiple URLs. Varnish assembles the content using ESI include tags like <esi:include src="http://example.com" />, and each include can have its own time-to-live that is respected by Varnish. ESI allows you to still cache parts of a page that would otherwise be uncacheable (more information on ESI in “Edge Side Includes”). By default, Varnish requires ESI content to look like valid XML, which is not always ideal; this feature flag removes that check.

The second one sets the connect_timeout to five seconds. This means that Varnish will wait up to five seconds when connecting with the backend. If the timeout is exceeded, a backend error is returned. The default value is 3.5 seconds.

The third one sets the first_byte_timeout to 10 seconds. After establishing a connection with the backend, Varnish will wait up to 10 seconds until the first byte comes in from the backend. If that doesn’t happen within 10 seconds, a backend error is returned. The default value is 60 seconds.

The fourth one sets the between_bytes_timeout to two seconds. When data is returned from the backend, Varnish expects a constant byte flow. If Varnish has to wait longer than two seconds between bytes, a backend error is returned. The default value is 60 seconds.
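If you have CLI access, runtime parameters can also be inspected and changed on the fly, without restarting Varnish. Here’s a sketch, assuming the default -T and -S settings; changes made this way don’t survive a restart, and they will fail for parameters that were marked read-only with -r.

# Show the current value and documentation of a runtime parameter.
sudo varnishadm param.show connect_timeout

# Override it at runtime; add -p connect_timeout=5 to the startup options
# to make the change permanent.
sudo varnishadm param.set connect_timeout 5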

What About TLS/SSL?

Transport Layer Security (TLS), often still referred to by the name of its predecessor, Secure Sockets Layer (SSL), is a set of cryptographic protocols used to encrypt data communication over the network. In a web context, TLS and SSL are the “S” in HTTPS. TLS secures the connection by encrypting the communication and establishes a level of trust through the use of certificates.

Over the last couple of years, TLS has become increasingly popular, to the point that unencrypted HTTP traffic will soon no longer be considered normal. Security is still a hot topic in the IT industry, and nearly every brand on the internet wants to show that it is secure and trustworthy by offering HTTPS on its sites. Even Google Search supposedly gives HTTPS websites a better page rank.

Important

The Varnish project itself hasn’t included TLS support in its code base. Does that mean you cannot use Varnish in projects that require TLS? Of course not! If that were the case, Varnish’s days would be numbered in the low digits.

Varnish does not natively include TLS support because encryption is hard and it is not part of the project’s core business. Varnish is all about caching and leaves the crypto to the crypto experts.

The trick with TLS on Varnish is to terminate the secured connection before the traffic reaches Varnish. This means adding a TLS/SSL offloader to your setup that terminates the TLS connection and communicates over HTTP with Varnish.

The downside is that this adds another layer of complexity to your setup and another system that can fail on you. Additionally, it becomes a bit harder for the web server to determine the original client IP address. Under normal circumstances, Varnish should take the value of the X-Forwarded-For HTTP request header sent by the TLS offloader and keep it in the X-Forwarded-For header it sends to the backend. That way, the backend can still retrieve the client’s IP address.

In Varnish 4.1, PROXY protocol support was added. The PROXY protocol is a small protocol that was introduced by HAProxy, the leading open source load-balancing software. This PROXY protocol adds a small preamble to the TCP connection that contains the IP address of the original client. This information is transferred along and can be interpreted by Varnish. Varnish will use this value and automatically add it to the X-Forwarded-For header that it sends to the backend.

I wrote a detailed blog post about this, and it contains more information about both the HAProxy and the Varnish setup.

Additionally, the PROXY protocol implementation in Varnish uses this new origin IP information to set a couple of variables in VCL:

  • It sets the client.ip variable to the IP address that was sent via the PROXY protocol

  • It sets the server.ip variable to the IP address of the server that accepted the initial connection

  • It sets the local.ip variable to the IP address of the Varnish server

  • It sets the remote.ip variable to the IP address of the machine that sits in front of Varnish

HAProxy is not the only TLS offloader that supports PROXY. Varnish Software released Hitch, a TLS proxy that terminates the TLS connection and communicates over HTTP with Varnish. Whereas HAProxy is primarily a load balancer that offers TLS offloading, Hitch only does TLS offloading. The HAProxy team also wrote a blog post about the subject that lists a set of PROXY-protocol-ready projects. Depending on your use case and whether you need load balancing in your setup, you can choose either HAProxy or a dedicated TLS proxy.

Varnish Plus, the advanced version of Varnish developed by Varnish Software, offers TLS/SSL support on both the server and the client side. The TLS/SSL proxy in Varnish Plus is tightly integrated with Varnish and helps improve website security without relying on third-party solutions.

Conclusion

Don’t let all these settings scare you—they’re just proof that Varnish is an incredibly flexible tool with lots of options and settings that can be tuned.

If you’re a sysadmin, I hope I have inspired you to try tuning some of these settings. If you’re not, just remember that Varnish can easily be installed with the package manager of your Linux distribution and hardly requires any tuning to be up and running.

At the bare minimum, have a look at the setting in “Network binding” if you want Varnish to process HTTP traffic on port 80.