Squid: The Definitive Guide

Chapter 3. Compiling and Installing

Squid is designed to be portable and should compile on all major Unix systems, including Linux, BSD/OS, FreeBSD, NetBSD, OpenBSD, Solaris, HP-UX, OSF/DUNIX/TRU-64, Mac OS X, IRIX, and AIX. Squid also runs on Microsoft Windows. Please see Appendix E for instructions on compiling and running Squid on Windows.

Compiling Squid is relatively straightforward. If you’ve installed more than a few open source packages, you’re probably already familiar with the procedure. You first use a program called ./configure to probe your system and then a program called make to do the actual compiling.

Before getting to that step, however, let’s talk about tuning your system in preparation for Squid. Your operating system may have default resource limits that are too low for Squid to run correctly. Most importantly, you need to worry about the number of available file descriptors.

Before You Start

If you’ve been using Unix for a while, chances are that you’ve already compiled a number of other software packages. If so, you can probably quickly scan this chapter. The procedure for compiling and installing Squid is similar to many other software distributions.

To compile Squid, you need an ANSI C compiler. Don’t be too alarmed by the “ANSI” part. Chances are that if you already have a C compiler, it is compliant with the ANSI specification. The GNU C compiler (gcc) is an excellent choice and widely available. Most operating systems come with a C compiler as a part of the standard installation. The common exceptions are Solaris and HP-UX. If you’re using one of those operating systems, you might not have a compiler installed.

Ideally you should compile Squid on the same system on which it will run. Part of the installation process probes your system for certain parameters, such as the number of available file descriptors. However, if your system doesn’t have a C compiler, you may be able to compile Squid elsewhere and then copy the binaries back. If the operating systems are different, Squid may encounter some problems. Also, Squid may become confused if the two systems have different kernel configurations.

In addition to a C compiler, you’ll also need Perl and awk. awk is a standard program on all Unix systems, so you shouldn’t need to worry about it. Perl is quite common, but it may not be installed on your system by default. You may need the gzip program to uncompress the source distribution file.

Solaris users, make sure that /usr/ccs/bin is in your PATH, even if you’re using gcc. To compile Squid, you may need the make and ar programs found in that directory.

Unpacking the Source

After downloading the source distribution, you need to unpack it somewhere. The particular location doesn’t really matter. You can unpack Squid in your home directory or anywhere; you’ll need about 20 MB of free disk space. Personally, I like to use /tmp. Use the tar command to extract the source directory:

% cd /tmp
% tar xzvf /some/where/squid-2.5.STABLE4-src.tar.gz
squid-2.5.STABLE4/
squid-2.5.STABLE4/CONTRIBUTORS
squid-2.5.STABLE4/COPYING
squid-2.5.STABLE4/COPYRIGHT
squid-2.5.STABLE4/CREDITS
squid-2.5.STABLE4/ChangeLog
squid-2.5.STABLE4/INSTALL
squid-2.5.STABLE4/QUICKSTART
squid-2.5.STABLE4/README
...

Some tar programs don’t have the z option, which automatically uncompresses gzip files. In that case, you’ll need to use this command:

% gzip -dc /some/where/squid-2.5.STABLE4-src.tar.gz | tar xvf -

Once the source code has been unpacked, the next step is usually to configure the source tree. However, if this is the first time you’re compiling Squid, you should make sure certain kernel resource limits are high enough; to find out how, read on.

Pretuning Your Kernel

Squid requires a fair amount of kernel resources under moderate and high loads. In particular, you may need to configure your system with a higher-than-normal number of file descriptors and mbuf clusters. The file-descriptor limit can be especially annoying. You’re better off increasing the limit before compiling Squid.

At this point, you might be tempted to get the precompiled binaries to avoid the hassle of building a new kernel.^[1] Unfortunately, you need to make a new kernel, regardless. Squid and the kernel exchange information through data structures that must not exceed the set file-descriptor limits. Squid checks these limits at runtime and uses the safest (smallest) value. Thus, even if a precompiled binary was built with a higher file-descriptor limit than the kernel allows, the kernel value takes precedence.

To change some settings, you must build and install a new kernel. This procedure varies among different operating systems. Consult Unix System Administration Handbook (Prentice Hall) or your operating-system documentation if necessary. If you’re using Linux, you probably don’t need to recompile your kernel.

File Descriptors

File descriptors are simply integers that identify each file and socket that a process has opened. The first opened file is 0, the second is 1, and so on. Unix operating systems usually impose a limit on the number of file descriptors that each process can open. Furthermore, Unix also normally has a systemwide limit.

Because of the way Squid works, the file-descriptor limits may adversely affect performance. When Squid uses up all the available file descriptors, it is unable to accept new connections from users. In other words, running out of file descriptors causes denial of service. Squid can’t accept new requests until some of the current requests complete, and the corresponding files and sockets are closed. Squid issues a warning when it detects a file-descriptor shortage.

You can save yourself some trouble by making sure the file descriptor limits are appropriate before running ./configure. In most cases, 1024 file descriptors will be sufficient. Very busy caches may require 4096 or more. When configuring file descriptor limits, I recommend setting the systemwide limit to twice the per-process limit.

You can usually discover your system’s file-descriptor limit from your Unix shell. All C shells and similar have the built-in limit command. Newer Bourne shells and similar have a command called ulimit. To find your file-descriptor limits, try running these commands:

csh% limit descriptors unlimited
csh% limit descriptors
descriptors    4096

or:

sh$ ulimit -n unlimited
sh$ ulimit -n
4096

On FreeBSD, you can also use the sysctl command:

% sysctl -a | grep maxfiles
kern.maxfiles: 8192
kern.maxfilesperproc: 4096

If you can’t figure out the file-descriptor limit, Squid’s ./configure script can do it for you. When you run ./configure, as described in Section 3.4, watch for output like this near the end:

checking Maximum number of file descriptors we can open... 4096

If limit, ulimit, or ./configure reports a value less than 1024, you should invest the time to increase the limit before compiling Squid. Otherwise, Squid’s performance will be poor under a moderate load.
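The 1024 threshold is easy to check from a script. The following is a minimal sketch, assuming a Bourne-style shell whose ulimit accepts the -n flag; the function name fd_limit_ok and its optional second argument are made up for illustration and aren’t part of Squid.

```shell
#!/bin/sh
# fd_limit_ok LIMIT [MINIMUM]
# Succeeds (exit status 0) when LIMIT is "unlimited" or at least MINIMUM.
# MINIMUM defaults to 1024, the value suggested in this chapter.
fd_limit_ok() {
    limit=$1
    minimum=${2:-1024}
    case $limit in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 2 ;;   # empty or not a number: treat as an error
    esac
    [ "$limit" -ge "$minimum" ]
}

# Check the limit reported by the current shell.
current=$(ulimit -n)
if fd_limit_ok "$current"; then
    echo "file-descriptor limit looks adequate: $current"
else
    echo "file-descriptor limit is too low: $current" >&2
fi
```

For a very busy cache you could pass a higher minimum, for example fd_limit_ok "$current" 4096.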

Increasing the file descriptor limit varies from system to system. The following sections offer some tips to help get you started.

FreeBSD, NetBSD, OpenBSD

Edit your kernel configuration file, and add a line like this:

options       MAXFILES=8192

On OpenBSD, use option instead of options. Then, configure, compile, and install the new kernel. Reboot your system so the change takes effect.

Linux

Configuring file descriptors on Linux is a little complicated. You must edit one of the system include files, and execute some shell commands before compiling and running Squid. Start off by editing the file /usr/include/bits/types.h. Change the value for __FD_SETSIZE as follows:

#define __FD_SETSIZE    8192

Next, increase the kernel file descriptor limit with this command:

# echo 8192 > /proc/sys/fs/file-max

Finally, increase the process file-descriptor limit in the same shell in which you will configure and compile Squid:

sh# ulimit -Hn 8192

This command must be executed as root and only works from the bash shell. There is no need to reboot on Linux.

Tip

With this technique, you must execute the echo and ulimit commands each time your system boots, or at least before starting Squid. If you use an rc.d script to start Squid (see Section 5.6.2), that is a good place to stick these commands.

Solaris

Add this line to your /etc/system file:

set rlim_fd_max = 4096

Then, reboot the system for the change to take effect.

Mbuf Clusters

The BSD-based networking code uses a data structure known as an mbuf (see W.R.Stevens’ book, TCP/IP Illustrated, Vol 2). Mbufs are typically small (e.g., 128 octets) chunks of memory. The data for larger network packets are stored in mbuf clusters. The kernel may enforce an upper limit on the total number of mbuf clusters available in the system. You can find this limit with the netstat command:

% netstat -m
196/6368/32768 mbufs in use (current/peak/max):
        146 mbufs allocated to data
        50 mbufs allocated to packet headers
103/6182/8192 mbuf clusters in use (current/peak/max)
13956 Kbytes allocated to network (56% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

In this example, there are 8,192 mbuf clusters available, but there are never more than 6,182 used at once. When the system runs out of mbuf clusters, I/O routines such as read( ) and write( ) return the “No buffer space available” error message.
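To see how close the peak came to the limit, you can pull the three numbers out of the cluster line. This is only a sketch: it assumes the current/peak/max layout shown in the sample output (other systems and releases print netstat -m differently), and the function name mbuf_cluster_peak_pct is hypothetical.

```shell
#!/bin/sh
# mbuf_cluster_peak_pct: read `netstat -m` output on standard input and print
# the peak mbuf-cluster usage as a whole-number percentage of the maximum.
# Assumes a line of the form "current/peak/max mbuf clusters in use ...".
mbuf_cluster_peak_pct() {
    awk '/mbuf clusters in use/ {
        split($1, n, "/")            # n[1]=current, n[2]=peak, n[3]=max
        if (n[3] > 0) printf "%d\n", n[2] * 100 / n[3]
    }'
}

# Example using the cluster line from the sample output; prints 75.
echo "103/6182/8192 mbuf clusters in use (current/peak/max)" | mbuf_cluster_peak_pct
```

On a system that prints this format, the equivalent live check would be netstat -m | mbuf_cluster_peak_pct.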

NetBSD and OpenBSD don’t display mbuf usage in netstat -m output. Instead, they report “WARNING: mclpool limit reached” via syslog.

To increase the number of mbuf clusters, you need to add an option to your kernel configuration file:

options         NMBCLUSTERS=16384

Ephemeral Port Range

Ephemeral ports are the local port numbers the TCP/IP stack assigns to outgoing connections. In other words, when Squid makes a connection to an origin server, the kernel assigns a port number to the local socket. These local port numbers fall within a certain range. On FreeBSD, for example, the default ephemeral port range is 1024-5000.

A shortage of ephemeral ports may adversely affect performance for very busy proxies (i.e., hundreds of requests per second). This is because some TCP connections enter a TIME_WAIT state when they are closed. An ephemeral port number can’t be reused while the connection is in the TIME_WAIT state.

You can see how many connections are in this state with the netstat command:

% netstat -n | grep TIME_WAIT
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4       0      0  192.43.244.42.19583    212.67.202.80.80       TIME_WAIT
tcp4       0      0  192.43.244.42.19597    202.158.66.190.80      TIME_WAIT
tcp4       0      0  192.43.244.42.19600    207.99.19.230.80       TIME_WAIT
tcp4       0      0  192.43.244.42.19601    216.131.72.121.80      TIME_WAIT
tcp4       0      0  192.43.244.42.19602    209.61.183.115.80      TIME_WAIT
tcp4       0      0  192.43.244.42.3128     128.109.131.47.25666   TIME_WAIT
tcp4       0      0  192.43.244.42.3128     128.109.131.47.25795   TIME_WAIT
tcp4       0      0  192.43.244.42.3128     128.182.72.190.1488    TIME_WAIT
tcp4       0      0  192.43.244.42.3128     128.182.72.190.2194    TIME_WAIT

Note that this example has both client- and server-side connections. Client-side connections have 3128 as the local port number; server-side connections have 80 as the remote (foreign) port number. The ephemeral port numbers appear under the Local Address heading. In this example, they are in the 19,000s.
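When the list is long, a count is more useful than the raw lines. Here is a rough sketch that tallies TIME_WAIT entries by side. It assumes BSD-style netstat -n output, in which the port is the last dot-separated piece of the Local Address column, and a proxy listening on port 3128; the function name count_time_wait is hypothetical.

```shell
#!/bin/sh
# count_time_wait [PROXY_PORT]
# Reads BSD-style `netstat -n` output on standard input and prints how many
# TIME_WAIT connections have PROXY_PORT as the local port (client-side) and
# how many do not (server-side and anything else). PROXY_PORT defaults to 3128.
count_time_wait() {
    awk -v port="${1:-3128}" '
        $NF == "TIME_WAIT" {
            n = split($4, part, ".")   # $4 is the local address; port is last
            if (part[n] == port) client++; else other++
        }
        END { printf "client-side %d, other %d\n", client, other }
    '
}

# Two lines taken from the sample output; prints "client-side 1, other 1".
count_time_wait <<'EOF'
tcp4       0      0  192.43.244.42.19583    212.67.202.80.80       TIME_WAIT
tcp4       0      0  192.43.244.42.3128     128.109.131.47.25666   TIME_WAIT
EOF
```

On a live system you would feed it the real output instead: netstat -n | count_time_wait.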

Unless you see thousands of ephemeral ports in the TIME_WAIT state, you probably don’t need to increase the range. On FreeBSD, you can increase the range with this command:

# sysctl -w net.inet.ip.portrange.last=30000

On OpenBSD, the command is almost the same, but the sysctl variable has a different name:

# sysctl -w net.inet.ip.portlast=49151

On NetBSD, things work a little differently. The default range is 49,152-65,535. To increase the range, change the lower limit:

# sysctl -w net.inet.ip.anonportmin=10000

On Linux, simply write a pair of numbers to the following special file:

# echo "1024 40000" > /proc/sys/net/ipv4/ip_local_port_range

Don’t forget to add these commands to your system startup scripts so that they take effect each time your machine reboots.
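A little arithmetic shows why the size of the range matters. The sketch below takes the simplification above at face value (each outgoing connection ties up one ephemeral port for the whole TIME_WAIT period) and assumes that period lasts 60 seconds, which is only an illustrative figure; the real duration depends on your operating system. The function name max_conn_rate is hypothetical.

```shell
#!/bin/sh
# max_conn_rate FIRST LAST [TIME_WAIT_SECONDS]
# Prints a rough ceiling on new outgoing connections per second, assuming
# every port in FIRST..LAST is unavailable for TIME_WAIT_SECONDS after use.
# TIME_WAIT_SECONDS defaults to 60, an assumed value.
max_conn_rate() {
    first=$1
    last=$2
    tw=${3:-60}
    ports=$((last - first + 1))
    echo $((ports / tw))
}

max_conn_rate 1024 5000     # FreeBSD default range from the text; prints 66
max_conn_rate 1024 30000    # after raising the upper limit to 30000; prints 482
```

Under these assumptions, the default range tops out well below the hundreds of requests per second mentioned earlier, while the larger range leaves comfortable headroom.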

The configure Script

Like many other Unix software packages, Squid uses a ./configure script to learn about an operating system before compiling. The ./configure script is generated by the popular GNU autoconf program. When the script runs, it probes the system in various ways to find out about libraries, functions, types, parameters, and features that may or may not be present. One of the first things that ./configure does is look for a working C compiler. If the compiler can’t be found or fails to compile a simple test program, the ./configure script can’t proceed.

The ./configure script has a number of different options. The most important is the installation prefix. Before running ./configure, you need to decide where Squid should live. The installation prefix determines the default locations for the Squid logs, binaries, and configuration files. You can change the location for those files after installing, but it’s easier if you decide now.

The default installation prefix is /usr/local/squid. Squid puts files in seven different subdirectories under the prefix:

% ls -l /usr/local/squid
total 5
drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 bin
drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 etc
drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 libexec
drwxr-x---  3 wessels  wheel  512 Apr 28 20:43 man
drwxr-x---  2 wessels  wheel  512 Apr 28 20:42 sbin
drwxr-x---  4 wessels  wheel  512 Apr 28 20:42 share
drwxr-x---  4 wessels  wheel  512 Apr 28 20:43 var

Squid uses the bin, etc, libexec, man, sbin, and share directories for a few, relatively small files (or other directories) that don’t change very often. The files under the var directory, however, are a different story. This is where you’ll find Squid’s log files, which may grow quite large (tens or hundreds of megabytes). var is also the default location for the actual disk cache. You may want to put var on a different partition with plenty of space. One easy way to do this is with the --localstatedir option:

% ./configure --localstatedir=/bigdisk/var

You don’t need to worry too much about pathnames when configuring Squid. You can always change the pathnames later, in the squid.conf file.

configure Options

The ./configure script has a number of different options that all start with --. You can see the full list of options by typing ./configure --help. Some of these options are common to all configure scripts, and some are unique to Squid. Here are the standard options that you might find useful:

--prefix=PREFIX: This sets the installation prefix directory, as described earlier. The installation prefix is the default directory for all executables, logs, and configuration files. Throughout this book, $prefix refers to your choice for the installation prefix.
--localstatedir=DIR: This option allows you to change the location for the var directory. The default is $prefix/var, but you might want to change it so that Squid’s disk cache and log files are stored elsewhere.
--sysconfdir=DIR: This option allows you to change the location for the etc directory. The default is $prefix/etc. If you like to use /usr as the installation prefix, you might want to set --sysconfdir to /etc.
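The three standard options can be combined on one command line. The sketch below only prints the resulting command rather than running it, so you can check the paths first; the helper name configure_command is hypothetical, and the directories are the examples used in this chapter.

```shell
#!/bin/sh
# configure_command PREFIX SYSCONFDIR LOCALSTATEDIR
# Prints (does not run) a ./configure command line that uses the three
# standard options described above.
configure_command() {
    echo "./configure --prefix=$1 --sysconfdir=$2 --localstatedir=$3"
}

# Install under /usr, keep configuration in /etc, and put logs and the
# disk cache on a large partition.
configure_command /usr /etc /bigdisk/var
```

Once the printed command looks right, run it from the top-level source directory.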

Here are the Squid-specific ./configure options:

--enable-dlmalloc[=LIB]

On some systems, the built-in memory allocation (malloc) functions have poor performance characteristics when used with Squid. Using the --enable-dlmalloc option builds and links with the dlmalloc package included in the Squid source code. If you already have dlmalloc built on your system, you can specify the library’s pathname as the =LIB argument. See http://g.oswego.edu/dl/html/malloc.html for more information on dlmalloc.

--enable-gnuregex

Squid uses regular expressions for pattern matching in access control lists and other configuration directives. The GNU regular expression library comes with the Squid source code; it can be used on operating systems that don’t have built-in regular expression functions. The ./configure script probes your system for a regular expression library and enables the use of GNU regex if necessary. If, for some reason, you want to force the usage of GNU regex, you can add this option to the ./configure command.

--enable-carp

The Cache Array Routing Protocol (CARP) is useful for forwarding cache misses to an array, or cluster, of parent caches. There’s more about CARP in Section 10.9.

--enable-async-io[=N_THREADS]

Async I/O refers to one of Squid’s techniques for improved storage performance. The aufs storage module uses a number of thread processes to perform disk I/O operations. This code works only on Linux and Solaris systems. The =N_THREADS argument changes the number of thread processes Squid uses. aufs and Async I/O are discussed in Section 8.4.

Note that the --enable-async-io option is a shortcut that turns on three other ./configure options. It is equivalent to specifying:

--with-aufs-threads=N_THREADS
--with-pthreads
--enable-storeio=ufs,aufs

--with-pthreads

The --with-pthreads option causes the compilation procedure to link with your system’s Pthreads library. The aufs storage module is the only part of Squid that uses threads. Normally, you don’t specify this option on the ./configure command line because it’s enabled automatically when you use --enable-async-io.

--enable-storeio=LIST

Squid supports a number of different storage modules. With this option, you tell ./configure which modules to compile. The ufs, aufs, diskd, coss, and null modules are supported in Squid-2.5. You can also get a list by looking at the directories under src/fs.

LIST is a comma-separated list of module names. For example:

% ./configure --enable-storeio=aufs,diskd,ufs

The ufs module is the default and least likely to cause problems. Unfortunately, it also has limited performance characteristics. The other modules may not necessarily compile on your particular operating system. For a complete description of Squid’s storage modules, see Chapter 8.

--with-aufs-threads=N_THREADS

Specifies the number of threads to use for the aufs storage scheme (see Section 8.4). By default, Squid automatically calculates how many threads to use, based on the number of cache directories.

--enable-heap-replacement

This option has been deprecated but remains for backward compatibility. You should always use the --enable-removal-policies option instead.

--enable-removal-policies=LIST

Removal policies are the algorithms Squid uses to eject cached objects when making room for new ones. Squid-2.5 supports three removal policies: least recently used (LRU), greedy dual size (GDS), and least frequently used (LFU).

However, for some reason, the ./configure options blur the distinction between a particular replacement policy and the underlying data structures required to implement it. LRU, which is the default, is implemented with a doubly linked list. The GDS and LFU implementations use a data structure known as a heap.

To use the GDS or LFU policies, you specify:

% ./configure --enable-removal-policies=heap

You then select between GDS and LFU in the Squid configuration file. If you want to retain the option of using LRU, specify:

% ./configure --enable-removal-policies=heap,lru

There’s more about replacement policies in Section 7.5.

--enable-icmp

As you’ll see in Section 10.5, Squid can make round-trip time measurements with ICMP messages, much like the ping program. You can use this option to enable these features.

--enable-delay-pools

Delay pools are Squid’s technique for traffic shaping or bandwidth limiting. The pools consist of groups of client IP addresses. When requests from these clients are cache misses, their responses may be artificially delayed. See more about delay pools in Appendix C.

--enable-useragent-log

This option enables logging of the HTTP User-Agent header from client requests. See more about this in Section 13.5.

--enable-referer-log

This option enables logging of the HTTP Referer header from client requests. See more about this in Section 13.4.

--disable-wccp

The Web Cache Coordination Protocol (WCCP) is Cisco’s once-proprietary protocol for intercepting and distributing HTTP requests to one or more caches. WCCP is enabled by default, but you can use this option to prevent compilation of the WCCP code if you like.

--enable-snmp

The Simple Network Management Protocol (SNMP) is a popular way to monitor network devices and servers. This option causes the build procedure to compile all of the SNMP-related code, including a cut-down version of the CMU SNMP library.

--enable-cachemgr-hostname[=hostname]

cachemgr is a CGI program you can use to administratively query Squid. By default, cachemgr’s hostname field is blank, but you can create a default value with this option. For example:

% ./configure --enable-cachemgr-hostname=mycache.myorg.net

--enable-arp-acl

Squid supports ARP, or Ethernet address, access control lists on some operating systems. The code to implement ARP ACLs uses nonstandard function interfaces, so it is disabled by default. If you run Squid on Linux or Solaris, you may be able to use this feature.

--enable-htcp

HTCP is the Hypertext Caching Protocol—an intercache protocol similar to ICP. See Section 10.8 for more information.

--enable-ssl

Use this option to give Squid the ability to terminate SSL/TLS connections. Note this only works for accelerated requests in surrogate mode. See Section 15.2.2 for more information.

--with-openssl[=DIR]

This option exists so that you can tell the compiler where to find the OpenSSL libraries and header files, if necessary. If they aren’t in the default location, enter the parent directory after this option. For example:

% ./configure --enable-ssl --with-openssl=/opt/foo/openssl

Given this example, your compiler looks for the OpenSSL header files in /opt/foo/openssl/include, and for libraries in /opt/foo/openssl/lib.

--enable-cache-digests

Cache Digests are another alternative to ICP, but with significantly different characteristics. See Section 10.7.

--enable-err-languages="lang1 lang2 ..."

Squid supports customizable error messages and comes with error messages in many different languages. This option determines the languages that are copied to the installation directory ($prefix/share/errors). If you don’t use this option, all available languages are installed. To see which languages are available, look at a directory listing of the errors directory in the source distribution. Here’s how to enable more than one language:

% ./configure --enable-err-languages="Dutch German French" ...

--enable-default-err-language=lang

This option sets the default value for the error_directory directive. For example, if you want to use Dutch error messages, you can use this ./configure option:

% ./configure --enable-default-err-language=Dutch

You can also set the error_directory directive in squid.conf, as described in Appendix A. English is the default error language if you omit this option.

--with-coss-membuf-size=N

The Cyclic Object Storage System (coss) is an experimental storage scheme for Squid. This option sets the memory buffer size for coss cache directories. Note that in order to use coss, you must specify it as a storage type in the --enable-storeio option.

The argument is given in bytes. The default is 1,048,576 bytes or 1 MB. You can specify a 2-MB buffer like this:

% ./configure --with-coss-membuf-size=2097152
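Since the argument is in bytes, it can be convenient to let the shell do the multiplication. This is a small sketch; the helper name mb_to_bytes is hypothetical, and the example prints the command line rather than running it.

```shell
#!/bin/sh
# mb_to_bytes N: print N megabytes expressed in bytes (1 MB = 1,048,576 bytes).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print a command line for a 2-MB coss buffer. coss must also be listed
# in --enable-storeio, as noted above.
echo "./configure --enable-storeio=ufs,coss --with-coss-membuf-size=$(mb_to_bytes 2)"
```

The printed value for 2 MB is 2097152, matching the example above.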

--enable-poll

Unix provides two similar functions that scan open file descriptors for I/O events: select( ) and poll( ). The ./configure script usually does a very good job of figuring out when to use poll( ) over select( ). Use this option if you want to override the ./configure script and force it to use poll( ).

--disable-poll

Similarly, Unix gurus may want to force ./configure to not use poll( ).

--disable-http-violations

By default, Squid can be configured to violate the HTTP protocol specifications. You can use this option to completely remove the code that would violate HTTP.

--enable-ipf-transparent

In Chapter 9, I’ll describe how to configure Squid for interception caching. Some operating systems use the IP Filter package to assist with the interception. In these cases you should use this ./configure option. If you enable this option and get compiler errors on the src/client_side.c file, chances are that the IP Filter package isn’t actually (or correctly) installed on your system.

--enable-pf-transparent

You may need this option to use HTTP interception on systems that use the PF packet filter. PF is the standard packet filter for OpenBSD and may have been ported to other systems as well. If you enable this option and get compiler errors on the src/client_side.c file, chances are that PF isn’t actually installed on your system.

--enable-linux-netfilter

Netfilter is the name of the Linux packet filter for the 2.4 kernel series. Enable this option if you want to use HTTP interception with Linux 2.4 or later.

--disable-ident-lookups

ident is a simple protocol that allows a server to find the username associated with a client’s particular TCP connection. If you use this option, the compiler completely excludes the code that performs such lookups. Even if you leave the code enabled at compile time, Squid doesn’t make ident lookups unless you configure them in squid.conf.

--disable-internal-dns

The Squid source code includes two different DNS resolution implementations, called internal and external. Internal lookups are the default, but some people prefer the external technique. This option disables the internal functionality and reverts to the older method.

Internal lookups use Squid’s own implementation of the DNS protocol. That is, Squid generates raw DNS queries and sends them to a resolver. It retransmits queries that time out, and you can specify any number of resolvers. One of the benefits to this implementation is that Squid gets accurate TTLs for DNS replies.

External lookups use the C library’s gethostbyname( ) and gethostbyaddr( ) functions. Since these routines block the process until the answer comes back, they must be called from external, helper processes. Squid uses a pool of external processes to make queries in parallel. The primary drawback to external DNS resolution is that you need more helper processes as Squid’s load increases. Another annoyance is that the C library functions don’t convey TTLs with the answers, in which case Squid uses a constant value supplied by the positive_dns_ttl directive.

--enable-truncate

The truncate( ) system call is an alternative to using unlink( ). While unlink( ) removes a cache file altogether, truncate( ) sets the file size to zero. This frees the disk space associated with the file but leaves the directory entry in place. This option exists because some people believed (or hoped) that truncate( ) would produce better performance than unlink( ). However, benchmarks have shown little or no real difference.
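You can observe the difference between the two operations from the shell. The sketch below is only an analogy: redirecting nothing into a file empties it in place, much as truncating to zero length would, while rm removes the directory entry as unlink( ) does. The function names are hypothetical, and the script works on a scratch file created with mktemp, not on a real cache directory.

```shell
#!/bin/sh
# truncate_in_place FILE: empty FILE but keep its directory entry,
# roughly what truncating the file to zero length does.
truncate_in_place() {
    : > "$1"
}

# remove_entry FILE: remove the directory entry, as unlink() does.
remove_entry() {
    rm -f "$1"
}

demo_dir=$(mktemp -d) || exit 1
object=$demo_dir/object
echo "cached data" > "$object"

truncate_in_place "$object"
if [ -e "$object" ] && [ ! -s "$object" ]; then
    echo "after truncate: file still listed, size is zero"
fi

remove_entry "$object"
if [ ! -e "$object" ]; then
    echo "after unlink: file is gone"
fi

rmdir "$demo_dir"
```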

--disable-hostname-checks

By default, Squid requires that URL hostnames conform to the somewhat archaic specifications in RFC 1034:

The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen.

Here, “letter” means the ASCII characters A through Z. Since internationalized domain names are becoming increasingly popular, you may want to use this option to remove the restriction.
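To make the quoted rule concrete, here is a minimal sketch of a checker that applies it to each dot-separated label of a hostname. It illustrates the rule as quoted, not Squid’s actual implementation, and the function names valid_label and valid_hostname are hypothetical.

```shell
#!/bin/sh
# valid_label LABEL
# Succeeds when LABEL starts with a letter, ends with a letter or digit, and
# has only letters, digits, and hyphens in between (ASCII only).
valid_label() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$'
}

# valid_hostname HOST
# Succeeds when HOST is non-empty and every dot-separated label is valid.
valid_hostname() {
    rest=$1
    while [ -n "$rest" ]; do
        label=${rest%%.*}                 # text before the first dot
        valid_label "$label" || return 1
        case $rest in
            *.*) rest=${rest#*.} ;;       # drop the label just checked
            *)   rest= ;;
        esac
    done
    [ -n "$1" ]
}

for host in www.squid-cache.org my_host.example.com; do
    if valid_hostname "$host"; then
        echo "$host: ok"
    else
        echo "$host: rejected"
    fi
done
```

The first name passes and the second is rejected because of the underscore. A label that begins with a digit is rejected as well, which shows how restrictive the original rule is.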

--enable-underscores

This option controls Squid’s behavior regarding underscore characters in hostnames. General consensus is that hostnames must not include underscore characters, although some people disagree. Squid, by default, generates an error message for requests that have an underscore in a URL hostname. You can use this option to make Squid treat them as valid. However, your DNS resolver may also enforce the no-underscore requirement and fail to resolve such hostnames.

--enable-auth[=LIST]

This option controls which HTTP authentication schemes to support in the Squid binary. You can select any combination of the following schemes: basic, digest, and ntlm. If you omit the option, Squid supports only basic authentication. If you give the --enable-auth option without any arguments, the build process adds support for all schemes. Otherwise, you can give a comma-separated list of schemes to support:

% ./configure --enable-auth=digest,ntlm

I talk more about authentication in Chapters 6 and 12.

--enable-auth-helpers=LIST

This old option is now deprecated, but still remains for backward compatibility. You should use --enable-basic-auth-helpers=LIST instead.

--enable-basic-auth-helpers=LIST

With this option, you can build one or more of the HTTP Basic authentication helper programs found in helpers/basic_auth. See Section 12.2 for their names and descriptions.

--enable-ntlm-auth-helpers=LIST

With this option, you can build one or more of the HTTP NTLM authentication helper programs found in helpers/ntlm_auth. See Section 12.4 for their names and descriptions.

--enable-ntlm-fail-open

When you enable this option, Squid’s NTLM authentication module defaults to allow access in the event of an error or problem.

--enable-digest-auth-modules=LIST

With this option, you can build one or more of the HTTP Digest authentication helper programs found in helpers/digest_auth. See Section 12.3 for their names and descriptions.

--enable-external-acl-helpers=LIST

With this option, you can build one or more of the external ACL helper programs that I discuss in Section 12.5. For example:

% ./configure --enable-external-acl-helpers=ip_user,ldap_group

--disable-unlinkd

Unlinkd is another one of Squid’s external helper processes. Its sole job is to execute the unlink( ) (or truncate( )) system call on cache files. Squid realizes a significant performance gain by implementing file deletion in an external process. Use this option to disable the external unlink daemon feature.

--enable-stacktrace

Some operating systems support automatic generation of stack trace data in the event of a program crash. When you enable this feature and Squid crashes, the stack trace information is written to the cache.log file. This information is often helpful to developers in tracking down programming bugs.

--enable-x-accelerator-vary

This advanced feature may be used when Squid is configured as a surrogate. It instructs Squid to look for X-Accelerator-Vary headers in responses from backend origin servers. See Section 15.5.

Running configure

Now we’re ready to run the ./configure script. Go to the top-level source directory and type ./configure, followed by any of the options mentioned previously. For example:

% cd squid-2.5.STABLE4
% ./configure --enable-icmp --enable-htcp

./configure’s job is to probe your operating system and find out which things are available, and which are not. One of the first things it does is make sure your C compiler is working. If ./configure detects a problem with your C compiler, the script exits with this error message:

configure: error: installation or configuration problem: C compiler
cannot create executables.

Most likely, you’ll never see that message. If you do, it means either your system doesn’t have a C compiler at all or that the compiler isn’t installed correctly. Look at the config.log file for hints as to the exact problem. If your system has more than one C compiler, you can tell ./configure which to use by setting the CC environment variable before running ./configure:

% setenv CC /usr/local/bin/gcc
% ./configure ...

After ./configure checks out the compiler, it looks for a long list of header files, libraries, and functions. Normally you won’t have to worry about this part. In some cases, ./configure pauses to get your attention about something that may be a problem (such as not enough file descriptors). It may also stop if you specify incompatible or unreasonable command-line options. If something does go wrong, check the config.log output. ./configure’s final task is to create Makefiles and other files based on the things it learned about your system. At this point, you’re ready to begin compiling.
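
When ./configure stops and you need to dig through config.log, a small helper such as the one below can narrow the search. This is only a sketch: show_config_problems is a hypothetical name, and the search words are a guess at what is usually interesting. Many matches are harmless, because ./configure deliberately runs test programs that are expected to fail on some systems.

```shell
# show_config_problems [LOGFILE]
# Print the lines of a configure log that mention an error or failure,
# prefixed with line numbers so you can find the surrounding context.
# LOGFILE defaults to config.log in the current directory.
show_config_problems() {
    log=${1:-config.log}
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # Show at most the last 20 matches; the real problem is usually
    # near the end of the log.
    grep -n -i -E 'error|cannot|failed' "$log" | tail -n 20
}
```

Run it from the top-level source directory as show_config_problems config.log, then open the log at the reported line numbers.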

make

Once ./configure has done its job, you can simply type make to begin compiling the source code:

% make

Normally, this part goes smoothly. You’ll see a lot of lines that look like this:^[2]

source='cbdata.c' object='cbdata.o' libtool=no  depfile='.deps/cbdata.Po'
tmpdepfile='.deps/cbdata.TPo'  depmode=gcc /bin/sh ../cfgaux/depcomp  gcc -DHAVE_
CONFIG_H -DDEFAULT_CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\" -I. -I. -I../
include -I. -I. -I../include -I../include     -g -O2 -Wall -c `test -f cbdata.c ||
echo './'`cbdata.c
source='client_db.c' object='client_db.o' libtool=no  depfile='.deps/client_db.Po'
tmpdepfile='.deps/client_db.TPo'  depmode=gcc /bin/sh ../cfgaux/depcomp  gcc -DHAVE_
CONFIG_H -DDEFAULT_CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\" -I. -I. -I../
include -I. -I. -I../include -I../include     -g -O2 -Wall -c `test -f client_db.c ||
echo './'`client_db.c
source='client_side.c' object='client_side.o' libtool=no  depfile='.deps/client_side.Po'
tmpdepfile='.deps/client_side.TPo'  depmode=gcc /bin/sh ../cfgaux/depcomp  gcc -
DHAVE_CONFIG_H -DDEFAULT_CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\" -I. -I. -I../
include -I. -I. -I../include -I../include     -g -O2 -Wall -c `test -f client_side.c ||
echo './'`client_side.c
source='comm.c' object='comm.o' libtool=no  depfile='.deps/comm.Po' tmpdepfile='.
deps/comm.TPo'  depmode=gcc /bin/sh ../cfgaux/depcomp  gcc -DHAVE_CONFIG_H -DDEFAULT_
CONFIG_FILE=\"/usr/local/squid/etc/squid.conf\" -I. -I. -I../include -I. -I. -I../
include -I../include     -g -O2 -Wall -c `test -f comm.c || echo './'`comm.c

You may see some compiler warnings. In most cases, it is safe to ignore these. If you see a lot of them or something that looks really serious, report it to the developers as described in Section 16.5.

If the compilation gets all the way to the end without any errors, you can move to the next section, which describes how to install the programs you just built.
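
Because the make output scrolls by quickly, it can help to save it to a file and check the exit status afterward. The following is a minimal sketch, not part of the Squid distribution; run_logged and the make.log filename are hypothetical names chosen for illustration:

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Run COMMAND with stdout and stderr saved to LOGFILE. If the command
# fails, show the tail of the log and return the command's exit status.
run_logged() {
    logfile=$1
    shift
    "$@" > "$logfile" 2>&1
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with status $status; last lines of $logfile:" >&2
        tail -n 20 "$logfile" >&2
    fi
    return "$status"
}

# Typical use from the top of the Squid source tree:
#   run_logged make.log make
```

A saved log is also handy if you end up reporting a compilation problem, since you can quote the exact error rather than retyping it.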

To verify that compilation was successful, you can run make again. You should see this output:^[3]

% make
Making all in lib...
Making all in scripts...
Making all in src...
Making all in fs...
Making all in repl...
'squid' is up to date.
'client' is up to date.
'unlinkd' is up to date.
'cachemgr.cgi' is up to date.
Making all in icons...
Making all in errors...
Making all in auth_modules...

The compilation step may fail for a number of reasons, including:

Source code bugs: Usually the Squid source code is thoroughly debugged. However, you may encounter some bugs or problems that prevent Squid from compiling. You’re more likely to find these sorts of bugs in the newer development versions. Report these to the developers.
Compiler installation problems: An improperly installed C compiler probably won’t be able to compile Squid or any other moderately sized software package. Usually, compilers come pre-installed with the operating system, so you don’t have to worry about that. However, if you attempt to upgrade your compiler after installing the operating system, you might make a mistake. Never copy a compiler installation from one machine to another, unless you are absolutely sure about what you are doing. I feel it is always better to install the compiler on each machine separately.

Always make sure that your compiler’s header files are synchronized with the library files. The header files normally reside in /usr/include, while libraries are found in /usr/lib. Linux’s popular RPM system makes it possible to upgrade one, but not the other. If the libraries are based on different header files, Squid may not compile.

If you want to upgrade the compiler on one of the open-source BSD variants, be sure to run make world from the /usr/src directory, rather than from the /usr/src/lib or /usr/src/include directories.

Here are some common compilation problems and error messages:

Solaris: make[1]: *** [libmiscutil.a] Error 255

This means that ./configure didn’t find the ar program. Make sure /usr/ccs/bin is listed in your PATH environment variable. If you don’t have the Sun compiler installed, you’ll need the GNU binutils (http://www.gnu.org/directory/binutils.html).
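
In a Bourne-style shell, adding the directory to your PATH might look like the sketch below. It appends /usr/ccs/bin only if it isn’t already listed, then reports whether ar can be found:

```shell
# Append /usr/ccs/bin to PATH unless it is already listed.
case ":$PATH:" in
    *:/usr/ccs/bin:*) ;;
    *) PATH=$PATH:/usr/ccs/bin; export PATH ;;
esac

# Report where ar is found, if anywhere.
if command -v ar >/dev/null 2>&1; then
    echo "ar found at: $(command -v ar)"
else
    echo "ar still not found; you may need GNU binutils"
fi
```

After changing PATH, run ./configure again so that it has a chance to find ar.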

Linux: storage size of 'rl' isn't known

This happens when the header and library files don’t match, as described earlier. Be sure to upgrade both packages at the same time.

Digital Unix: Don't know how to make EXTRA_libmiscutil_a_SOURCES. Stop.

Digital Unix’s make program isn’t tolerant of the Makefile produced by the automake package. For example, lib/Makefile.in contains these lines:

noinst_LIBRARIES = \
        @LIBDLMALLOC@ \
        libmiscutil.a \
        libntlmauth.a \
        @LIBREGEX@

After substitution, when lib/Makefile is created, it looks like this:

noinst_LIBRARIES = \
         \
        libmiscutil.a \
        libntlmauth.a \
        <TAB>

As shown above, the last line contains an (invisible) TAB character, which confuses make. You can get past this problem by installing and using GNU make, or by manually editing lib/Makefile (and any others exhibiting this problem) to make it look like this:

noinst_LIBRARIES = \
         \
        libmiscutil.a \
        libntlmauth.a
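
Since the offending TAB is invisible, it can be hard to tell which Makefiles are affected. One way to look for them, sketched here under the assumption that your grep understands POSIX character classes, is to search for lines that contain nothing but whitespace. The function name find_ws_only_lines is hypothetical:

```shell
# find_ws_only_lines FILE
# Print the line numbers of lines that contain only spaces or tabs.
# In a Makefile, such a line right after a trailing backslash is the
# kind of line that confuses Digital Unix's make.
find_ws_only_lines() {
    grep -n '^[[:space:]][[:space:]]*$' "$1" | cut -d: -f1
}
```

For example, find_ws_only_lines lib/Makefile prints the numbers of the suspect lines, which you can then fix by hand as shown above. Remember that the backslash on the preceding line has to go as well.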

If you have problems compiling Squid, check the FAQ first. You may also want to search the Squid web site (use the search box on the home page). Finally, if you’re still stuck, send email to the squid-users@squid-cache.org list.

make install

After compiling, you need to install the programs into their permanent directories. This might require superuser privileges to put files in the installation directories. If so, become root first:

% su
Password:
# make install

If you enable Squid’s ICMP measurement features with the --enable-icmp option, you must install the pinger program. The pinger program must be installed with superuser privileges because only root is allowed to send and receive ICMP messages. The following command installs pinger with the appropriate permissions:

# make install-pinger

After installing Squid, you should see the following directories and files listed under the installation prefix directory (/usr/local/squid by default):

sbin: The sbin directory contains programs normally started by root.
sbin/squid: This is the main Squid program.
bin: The bin directory contains programs for all users.
bin/RunCache: RunCache is a shell script you can use to start Squid. If Squid dies, this script automatically starts it again, unless it detects frequent restarts. The RunCache script is a relic from the time when Squid was not a daemon process. With the current versions, RunCache is less useful because Squid automatically restarts itself when you don’t use the -N option.
bin/RunAccel: The RunAccel script is nearly identical to RunCache, except that it adds a command-line argument that tells Squid where to listen for HTTP requests.
bin/squidclient: squidclient is a simple HTTP client you can use to test Squid. It also has some special features for making management requests to a running Squid process.
libexec: The libexec directory traditionally contains helper programs. These are commands that you wouldn’t normally run yourself. Rather, these programs are normally started by other programs.
libexec/unlinkd: unlinkd is a helper program that removes files from the cache directories. As you’ll see later, file deletion can be a significant bottleneck. By implementing the delete operation in an external process, Squid achieves some performance gain.
libexec/cachemgr.cgi: cachemgr.cgi is a CGI interface to Squid’s management functions. To use it, you’ll probably need to copy this program to your HTTP server’s cgi-bin directory. You’ll see more about this in Section 14.2.
libexec/diskd (optional): You get this only if you specify --enable-storeio=diskd.
libexec/pinger (optional): You get this only if you specify --enable-icmp.
etc: The etc directory contains Squid’s configuration files.
etc/squid.conf: This is the primary configuration file for Squid. Initially, this file contains a lot of comments to explain what each option does. After you understand the configuration directives, it’s a good idea to remove the comments to make the configuration file smaller and easier to read. Note that the installation procedure doesn’t overwrite this file if it already exists.
etc/squid.conf.default: This is a copy of the default configuration file from the source distribution. You may find it useful to have a copy of the current default configuration file after upgrading your Squid installation. New configuration directives may be added, and some of the existing directives may have changed.
etc/mime.conf: The mime.conf file tells Squid which MIME types to use for data retrieved from FTP and Gopher servers. The file is a table that correlates filename extensions to MIME types. Normally, you won’t need to edit this file. However, you may need to add entries for special file types used within your organization.
etc/mime.conf.default: This is the default mime.conf file from the source distribution.
share: The share directory normally contains read-only data files used by Squid.
share/mib.txt: This is the SNMP Management Information Base (MIB) file for Squid. Squid doesn’t use this file itself. Rather, your SNMP client software (such as snmpget and Multi-Router Traffic Grapher (MRTG)) needs this file to understand the SNMP objects available from Squid.
share/icons: The share/icons directory contains a number of small icon files Squid uses in FTP and Gopher directory listings. Normally, you won’t need to worry about these files, but you can change them if you want.
share/errors: The share/errors directory contains templates for the error messages Squid shows to users. These files are copied from the source directory when you install Squid. You can edit them if you like. However, the installation procedure overwrites these files every time you run make install. So if you want to have customized error messages, it’s a good idea to put them in a different directory.
var: The var directory contains files that aren’t critical and that change frequently. These are the sort of files you don’t normally back up.
var/logs: The var/logs directory is the default location for Squid’s various log files. It is empty when you first install Squid. Once Squid gets running, you can expect to find files here named access.log, cache.log, and store.log.
var/cache: This is the default cache directory (cache_dir) if you don’t specify one in squid.conf. See Chapter 7 for all the details about cache directories.
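
To confirm that the installation put the main files where the list above says they should be, you could use a quick check like this sketch. The function name check_squid_install is hypothetical, and the file list covers only a few of the non-optional entries described above:

```shell
# check_squid_install [PREFIX]
# Report whether a handful of the files described above exist under
# the installation prefix (/usr/local/squid by default).
# Returns nonzero if any of them is missing.
check_squid_install() {
    prefix=${1:-/usr/local/squid}
    missing=0
    for f in sbin/squid bin/squidclient \
             etc/squid.conf etc/mime.conf share/mib.txt; do
        if [ -e "$prefix/$f" ]; then
            echo "ok       $prefix/$f"
        else
            echo "MISSING  $prefix/$f"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Pass a different prefix as the first argument if you installed Squid somewhere other than the default location.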

Applying a Patch

After you’ve been running Squid for a while, you may find that you need to patch the source code to fix a bug or add an experimental feature. Patches are posted for important bug fixes on the squid-cache.org web site. If you don’t want to wait for the next official release, you can download and apply the patch to your source code. You will then need to recompile Squid.

To apply a patch—also sometimes called a diff—you need a program called patch. Chances are that your operating system already has the patch program. If not, you can download it from the GNU collection (http://www.gnu.org/directory/patch.html). Note that if you’re using anonymous CVS (see Section 2.4), you don’t need to worry about patching files. The CVS system does it for you automatically when you update your tree.

To apply a patch, you need to save the patch file somewhere on your system. Then cd to the Squid source directory and run the command like this:

% cd squid-2.5.STABLE4
% patch < /tmp/patch_file

By default, the patch program tells you what it’s doing as it runs. Usually this output scrolls by very quickly, unless there is a problem. You can safely ignore the warnings that say offset NNN lines. If you don’t want to see all this output, use the -s option to make patch silent.

When patch updates the source files, it can create a backup copy of the original file. For example, if you’re applying a patch to src/http.c, patch names the backup file src/http.c.orig. (Some versions, including recent GNU patch releases, make backups only when you give the -b option or when a hunk doesn’t apply cleanly.) Thus, if you want to undo the patch after applying it, you can simply rename all the .orig files back to their former names. To use this technique successfully, it’s a good idea to remove all .orig files before applying a patch.
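
Renaming the backups by hand gets tedious when a patch touches many files. The sketch below automates it, assuming your patch run actually left .orig backups behind. The function name restore_orig_files is hypothetical:

```shell
# restore_orig_files DIR
# Undo a patch by renaming every FILE.orig under DIR back to FILE,
# overwriting the patched copy.
restore_orig_files() {
    find "$1" -type f -name '*.orig' | while IFS= read -r backup; do
        # ${backup%.orig} strips the trailing .orig from the name.
        mv -f "$backup" "${backup%.orig}"
    done
}

# Before applying a new patch, clear out stale backups so that only
# files touched by that patch have .orig copies:
#   find . -type f -name '*.orig' -exec rm -f {} +
```

Run restore_orig_files . from the top of the Squid source directory. Note that this restores every backup it finds, which is why stale .orig files from earlier patches should be removed first.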

If patch encounters a problem, it stops and prompts you for advice. Common problems are as follows:

Running patch from the wrong directory. To fix this problem, you may need to cd to a different directory or use patch’s -p option.
Patch is already applied. patch can usually tell if the patch file has already been applied. In this case, it asks if you want to unpatch the file.
The patch program doesn’t understand the file you are giving it. Patch files come in three flavors: normal, context, and unified. Old versions of patch may not understand context or unified diff output. Getting the latest version from the GNU FTP site will solve this problem.
Corrupted patch file. If you aren’t careful when downloading and saving the patch file, it may become corrupted. Sometimes people send patch files in email messages, and it is tempting to simply cut-and-paste them into a new window. On some systems, cut-and-paste can change Tab characters into spaces, or incorrectly wrap long lines. Both changes confuse patch. The -l option may be helpful, but it’s best to make sure you copy and save the patch file correctly.

Sometimes patch can’t apply part or all of the diff. In these cases, you’ll see such messages as Hunk 3 of 4 failed. The failed sections are saved to files named .rej. For example, if a failure occurs while processing src/http.c, patch saves that piece of the diff to src/http.c.rej. In some cases, you may be able to fix these by hand, but it’s usually not worth the trouble. If you have a lot of “failed hunks” or .rej files, it’s a good idea to download a whole new copy of the latest source code.
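
To see how many hunks failed across the whole tree, you can search for the reject files. This is a small sketch, and list_rejects is a hypothetical helper name:

```shell
# list_rejects [DIR]
# Print the path of every .rej file under DIR (default: current
# directory). No output means patch left no rejected hunks behind.
list_rejects() {
    find "${1:-.}" -type f -name '*.rej' -print
}
```

If list_rejects . prints more than one or two names, that is a good hint that starting over with fresh source code will be less work than fixing the rejects by hand.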

After you apply a patch, you need to recompile Squid. One of the great things about make is that it only recompiles the files that have changed. But sometimes make doesn’t comprehend all the intricate dependencies, and it doesn’t rebuild enough of the files. To be safe, it’s usually a good idea to recompile everything. The best way to do this is to clean the source tree before recompiling:

% make clean
% make

Running configure Later

Sometimes you may find it necessary to rerun ./configure. For example, if you tune your kernel parameters, you must run ./configure again so it picks up the new settings. As you read this book, you may also find that you want to use features that must be enabled with ./configure options.

To rerun ./configure with the same options, use this command:

% ./config.status --recheck

Another technique is to “touch” the config.status file, which updates its timestamp. This causes make to rerun the ./configure script before compiling the source code:

% touch config.status
% make

To add or remove ./configure options, you need to type in the whole command again. If you can’t remember the previous options, just look at the top of the config.status file. For example:

% head config.status
#! /bin/sh
# Generated automatically by configure.
# Run this file to recreate the current configuration.
# This directory was configured as follows,
# on host foo.life-gone-hazy.com:
#
# ./configure  --enable-storeio=ufs,diskd --enable-carp \
#   --enable-auth-modules=NCSA
# Compiler output produced by configure, useful for debugging
# configure, is in ./config.log if it exists.
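
If you’d rather not pick the command out of the comments by eye, a short awk filter can extract it. This sketch assumes config.status uses the older comment format shown above, where the command appears on a line beginning with "# ./configure" and continues while lines end in a backslash. Other autoconf versions may lay the file out differently. The function name show_configure_args is hypothetical:

```shell
# show_configure_args [FILE]
# Print the ./configure command recorded in the comments at the top of
# config.status, with the leading "# " removed from each line.
show_configure_args() {
    awk '
        /^# \.\/configure/ { found = 1 }
        found {
            sub(/^# */, "")          # drop the comment marker
            print
            if ($0 !~ /\\$/) exit    # stop at the first line without a trailing backslash
        }
    ' "${1:-config.status}"
}
```

For the file shown above, this prints the two lines of the original command, ready to be copied, edited, and run again.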

After rerunning ./configure, you must compile and install Squid again. To be safe, it’s a good idea to run make clean first:

% make clean
% make

Recall that ./configure caches the things it discovers about your system. In some situations, you’ll want to clear this cache and start the compilation process from the very beginning. You can simply remove the config.cache file if you like. Then, the next time ./configure runs, it won’t use the previous values. You can also restore the Squid source tree to its preconfigure state with the following command:

% make distclean

This removes all object files and other files created by the ./configure and make commands.

Exercises

After compiling Squid, remove one or more of the .o files and run make again.
Use the ulimit or limits command to change the file descriptor limit to some small value before compiling Squid. Does ./configure obey or ignore your new limit?
Compile Squid with a high file-descriptor limit, then try to run it on a system with a lower limit. Does Squid use the lower or higher limit?
What happens if you mistype one of the --enable options? What if you specify an invalid storage scheme with the --enable-storeio option?
After compiling Squid, remove src/Makefile and try to compile it again. What’s the easiest way to restore the file?

^[1]Not all operating systems require building a new kernel. Some may be tunable at runtime.

^[2]The make output used to be much prettier, but such is the price we pay for advanced compiling tools such as automake.

^[3]If make recompiles the source every time you run it, and there are no errors, your system clock may be set wrong.
