Chapter 9. What Does This Mean for Your Business?

By this point, you should have extracted enough information from this book to set up Varnish and configure it appropriately. And if all goes well, your application will have been tuned a bit as well.

But why should you do this? From a technical perspective, it’s an easy decision: to make your platform better, faster, and stronger. But from a business perspective, this can be a tough sell. If you’re a developer or sysadmin and you’re looking to convince your manager to use Varnish, this chapter will give you the ammo you need. If you’re a technical decision-maker interested in using Varnish, this chapter is your source of inspiration.

This chapter offers some success stories and a bit of practical advice on how Varnish can fit into your stack—and into your organization.

To CDN or Not to CDN

A content delivery network (CDN) is nothing more than a set of reverse caching proxies that are hosted in various locations. A CDN is the perfect example of caching on the edge.

Companies often play the CDN card because it’s offered as a service and they get support. And in many cases, they go for a CDN purely for caching, not for geo distribution. The downside is the price, but because they’re in crisis mode, money is not an issue.

At Combell and at Sentia, the brands I work for, we managed to convince some of our clients to stop paying for a CDN and just use Varnish instead. Because of the power of VCL, we were able to achieve a much better hit rate, and we had more insight thanks to tools like varnishlog. And hosting Varnish cost them only a fraction of what they would have paid for a CDN.

I’m not saying CDNs are bad. Heck no! Varnish can be a good alternative to expensive CDNs, but it all depends on the context of your project. Please consider it, though. Some of our bigger clients do need a CDN, either because they have a global presence that requires multiple points of presence, or because the sheer number of connections would saturate some of their network devices. And some CDNs are great at averting DDoS attacks; unless you have huge networking resources, those are tough to fend off.

Note

Although I’m positioning Varnish as a competitor of CDNs, many CDNs actually use Varnish for their reverse caching proxies. Varnish Software even has a product called Varnish Extend that allows you to set up your own private or hybrid CDN.

In fact, this is an easy way to extend the reach of your current CDN: Varnish Extend allows you to add servers to your current CDN in regions where there is no point of presence, basically creating a hybrid setup. Because it’s just software, you can install Varnish Extend on servers of your choice in locations of your choice.

VCL Is Cheaper

When Varnish is implemented, it’s often in times of crisis: the servers are under heavy load and the website is slow or unavailable. There isn’t much time to waste. Refactoring code or optimizing database queries is not an option.

Desperate times make even the most conservative manager take a second look at technology that was previously not desired. Where open source software was often stigmatized as amateurish, it now becomes a genuine option. Any lifeline will be accepted at this point. And although I’ve been preaching application-level HTTP best practices throughout the book, I’m happy to admit that VCL is cheaper in these cases. Even a couple of lines of quick and dirty VCL code can be a lifesaver.
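
To give you an idea of what such a lifesaver can look like, here is a minimal sketch, assuming a site whose pages may safely be cached for a couple of minutes and whose cookies only matter under /admin; the backend address, the /admin path, and the two-minute TTL are assumptions for the sake of the example, not recommendations:

    vcl 4.0;

    backend default {
        .host = "127.0.0.1";
        .port = "8080";
    }

    sub vcl_recv {
        # Quick and dirty: strip cookies everywhere except the admin area,
        # so anonymous traffic becomes cacheable.
        if (req.url !~ "^/admin") {
            unset req.http.Cookie;
        }
    }

    sub vcl_backend_response {
        # Ignore the application's lack of caching headers and cache
        # everything outside the admin area for two minutes.
        if (bereq.url !~ "^/admin") {
            unset beresp.http.Set-Cookie;
            set beresp.ttl = 2m;
        }
    }

Crude as it is, a snippet like this turns all anonymous traffic into cache hits, which is often exactly what a struggling origin needs to survive the day.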

It’s under these circumstances that decision-makers start to appreciate Varnish. Point out to your manager the simple syntax of VCL as illustrated in Chapter 4. Show your manager some common VCL scenarios as illustrated in Chapter 7. Compare the limited cost and effort to the alternative and you’ll see that a little bit of VCL goes a long way and is tough to beat.

If you make a solid case, you can bet your life that your manager’s next project will have Varnish in the web stack. Organizations that prior to a crisis were unaware of headers like Cache-Control will have a pretty solid idea of how they can leverage HTTP in future projects.
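
Just to illustrate what that knowledge buys them: a response carrying a header along these lines is cached by a default Varnish installation without a single line of custom VCL, as long as there are no cookies in play; the 300-second value is only an example:

    HTTP/1.1 200 OK
    Content-Type: text/html; charset=UTF-8
    Cache-Control: public, s-maxage=300

Varnish uses s-maxage (or, failing that, max-age) to determine the time to live, so the application stays in control of how long its pages remain cached.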

Varnish as a Building Block

We went from not having Varnish to having Varnish when chaos arose. I call that an improvement. But the real lesson we need to learn here is: why didn’t we have a reverse caching proxy in the first place? As mentioned in “Caching Is Not a Trick”: don’t recompute if the data hasn’t changed. Why not treat caching as a strategy?

Many open source projects, such as WordPress, Drupal, and Magento, offer Varnish support. These Varnish plugins know how to invalidate the cache, come with a VCL file, and in some cases offer support for block caching.
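
To give you an idea of what the VCL file shipped with such a plugin typically contains, here is a minimal sketch of cache invalidation through an HTTP PURGE request; the PURGE method and the ACL are a widely used convention rather than an official standard, and the addresses are placeholders:

    vcl 4.0;

    backend default {
        .host = "127.0.0.1";
        .port = "8080";
    }

    # Only the application itself is allowed to invalidate content.
    acl purgers {
        "127.0.0.1";
        "::1";
    }

    sub vcl_recv {
        if (req.method == "PURGE") {
            if (client.ip !~ purgers) {
                return (synth(405, "Purging not allowed"));
            }
            # Drop the cached object for this URL and host.
            return (purge);
        }
    }

The plugin then fires a PURGE request at Varnish for every URL it knows has changed, so stale content disappears without waiting for the TTL to expire.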

Reverse caching proxies are becoming a commodity. Agreed, if you run dedicated Varnish servers, it drives up costs. But the money you save by using Varnish outweighs these expenses. And if you want to keep costs down, you can just install Varnish on the web server itself; there’s nothing wrong with that.

I work in the hosting industry and notice that salespeople and solution architects are very quick on the trigger when it comes to suggesting Varnish. In most cases, it’s a no-brainer. Sometimes we have a look at the application and either advise some refactoring or just write some VCL to get a decent hit rate.

There are development frameworks like Symfony that come with a built-in reverse caching proxy that obeys the same basic rules as Varnish. It’s an interesting point of view: Varnish is just an implementation. It doesn’t always matter what kind of technology you use, as long as it respects conventions and does the job.

On local development machines, Varnish isn’t always available. Having a built-in reverse caching proxy is definitely a convenience. And when it behaves the same way as Varnish does, using Varnish on test/staging/production isn’t a high-risk operation.

Pushing companies towards a “caching state of mind” will result in faster response times. See “Why Does Web Performance Matter?”.

The Original Customer Case

Varnish wasn’t originally a pure open source project; it was a custom piece of software built for VG.no, the biggest news website in Norway. They got Poul-Henning Kamp on board to build the thing and Linpro to port it to Linux, provide the tooling, and build the website. Varnish Software eventually spun off from Linpro and is now the commercial entity of the project.

The reverse caching proxy that was built for VG.no proved to be a real success and the code was stable and clean enough to be open sourced.

Without VG.no, there would be no Varnish and there wouldn’t be any other customer cases.

Varnish Plus

There are some mission-critical aspects of online businesses that Varnish doesn’t support out of the box. These require lots of custom VCL code, a bunch of VMODs, and some infrastructure trickery.

Varnish Plus is a commercial product by Varnish Software that adds business-oriented and mission-critical features on top of Varnish.

These are some of the features that Varnish Plus offers as a service:

  • The Massive Storage Engine, for caching data sets that are too big to fit in memory

  • Varnish High Availability, which replicates the cache between Varnish servers

  • The Varnish Administration Console, for managing a cluster of Varnish servers

  • Varnish Custom Statistics, for real-time analytics on your traffic

  • SSL/TLS support

  • Varnish Extend, for building your own private or hybrid CDN

You’ll agree that these are features that are tailored around the needs of businesses. You can build all these features yourself using VCL and VMODs, but it would take you a lot more time. Varnish Plus offers these advanced features on top of Varnish, with support and an intuitive management interface.

Companies Using Varnish Today

There are thousands of companies using Varnish: small, big, and anything in between. In this section, I’ll highlight a couple of interesting customer cases where it’s not just about the company, but about what they do with Varnish.

NU.nl: Investing Early Pays Off

By investing in caching and Varnish early on, you’ll save money in the long run. By having tight control over your caches and the know-how to invalidate when required, you’ll avoid spending money on extra servers for horizontal scalability purposes.

I love the NU.nl case study by Ibuildings that explains how an aggressive caching strategy allowed the company to cope with massive traffic to the NU.nl news site when an airliner crashed near Schiphol Airport in 2009. Pictures from bystanders were shared on the website and traditional media outlets couldn’t keep up with the pace.

The NU.nl website was the global center of attention that day with more than 20 million page views. And the site did not collapse because Varnish was able to serve all content from cache. It only needed two Varnish servers to handle that kind of load.

Although this is an old case that dates back to 2009, I still like to tell the story. The Ibuildings folks made a presentation about it and named it “Surviving a plane crash.” Talk about saving money on infrastructure cost!

SFR: Build Your Own CDN

SFR is a French telecommunications company that provides voice, video, data, and internet telecommunications and professional services to consumers and businesses. SFR has 21 million customers and provides 5 million households with high-speed internet access.

SFR’s main website is one of the top-10 most-visited sites in France. The costs of handling traffic were high enough that SFR decided to look into cost-reduction initiatives and ways to bring costs and efforts in-house.

SFR decided to build its own CDN using Varnish. The company started out with regular Varnish, but quickly got in touch with Varnish Software and ended up using Varnish Plus. SFR has so many objects to cache that regular Varnish storage options did not meet its requirements. The Massive Storage Engine ended up solving that problem.

SFR also added some VMODs that allowed bandwidth throttling and session identification for users who consumed video that was served from the CDN.

Varnish at Wikipedia

Even Wikipedia uses Varnish. Emanuele Rocca, one of Wikipedia’s operations engineers, gave a presentation at Varnishcon 2016 about how Wikipedia uses Varnish. The slides and video footage are available online.

Wikipedia chose Varnish because Varnish is open source and has a proven track record. The Wikimedia Foundation itself is a very strong advocate of open source software and has deep roots in the free culture and free software movements.

Basically, Wikipedia has a multitiered Varnish stack to handle on average 100,000 incoming requests per second. It uses Varnish for its own multiregion CDN, but also for page caching. Because of Wikipedia’s values, it wants full autonomy and doesn’t want to depend on external companies. It also has some technical requirements that require full control over the caching technology. An as-a-service model doesn’t really work for Wikipedia.

Varnish is the ideal tool for this usage for the following reasons:

  • Wikipedia has full control over caching and purging policies

  • VCL allows it to write many custom optimizations

  • Varnish allows it to have custom analytics

Wikipedia’s multitier Varnish setup consists of two varnishd instances that run on separate ports:

  • A first tier that stores objects in memory

  • A second tier that stores objects on disk

Incoming requests are routed to the “best” data center based on GeoDNS. Load balancers send traffic to a caching node based on the client IP. Because TLS termination happens on the caching nodes and because of TLS session persistence, it makes sense to send requests from the same client IP to the same node.

The first Varnish tier serves popular content from memory. If an object is not in cache, the request is sent to the second tier that stores its cache on disk. This is still a lot faster than serving it directly from the web server. Requests are routed to the second Varnish tier based on the URL. This allows you to shard data across multiple servers without having duplicate objects.
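
This is not Wikipedia’s actual configuration, but a minimal sketch of what that URL-based routing from the first tier to the second tier could look like, assuming Varnish 5’s shard director from vmod_directors; the hostnames and port are made up:

    vcl 4.0;

    import directors;

    # Second-tier Varnish servers that keep their cache on disk.
    backend tier2_a { .host = "cache-disk-1.example.internal"; .port = "6081"; }
    backend tier2_b { .host = "cache-disk-2.example.internal"; .port = "6081"; }

    sub vcl_init {
        # Shard the URL space across the second tier, so every object
        # is stored on exactly one disk-backed node.
        new tier2 = directors.shard();
        tier2.add_backend(tier2_a);
        tier2.add_backend(tier2_b);
        tier2.reconfigure();
    }

    sub vcl_recv {
        # On a first-tier miss, the fetch goes to the node that owns this URL.
        set req.backend_hint = tier2.backend(by=URL);
    }

The first-tier instance itself would typically run with plain memory storage (-s malloc), while the second-tier instances use one of the disk-backed storage options.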

Because of object persistence in the second tier, Wikipedia can easily cope with restarts without the risk of losing cached data.

Combell: Varnish on Shared Hosting

At Combell, we even offer Varnish on our shared hosting environment. We do this for entirely different reasons than what the previous cases would suggest.

The NU.nl, SFR, and Wikipedia cases had an underlying theme of massive traffic and how Varnish allows these companies to cope with that. At Combell, we offer Varnish on our shared hosting not for scalability purposes, but to get the raw performance.

Let’s face it: shared hosting is not exactly built to handle lots of traffic. But a lot of small websites can get really slow because of poor technical design or poorly executed code. Adding Varnish to the stack speeds up these slow requests; the increased scalability is merely a bonus.

Still, thanks to caching and the design of our platform, we run websites on this shared environment that easily handle 50,000 visitors per day. When some of these websites have a good hit rate, Varnish allows them to serve many more visitors without the slightest delay.

Conclusion

Varnish is a serious project—a serious piece of technology for serious organizations. Because of its simplicity and flexibility, it’s even a good fit for smaller companies and organizations that have performance rather than scalability issues. Varnish has an impressive track record and is used by more than 2.5 million websites, some of which are among the most popular in the world.

If you’re looking to integrate Varnish into your projects but you still encounter some resistance from colleagues or management, show them these customer cases. If it’s good enough for Wikipedia, it’s probably good enough for your website.

And if all of this is still way too complicated and technical for the person in charge, let them watch this video explaining Varnish Cache, then talk about success stories, and finally create a proof of concept with a bit of VCL.

If your management wants more formal guarantees and extra services, point them in the direction of Varnish Software.

I like Varnish. The industry likes Varnish. Hopefully, by now, you like it, too!