Squid: The Definitive Guide by Duane Wessels. Published by O'Reilly Media, Inc., 2004.

Chapter 4. Configuration Guide for the Eager

After compiling and installing Squid, your next task is to delve into the configuration file. If you’re new to Squid, you’re likely to find it a bit overwhelming. The most recent version has approximately 200 configuration file directives and 2700 lines of comments. I certainly don’t expect you to read about, and configure, every directive before starting Squid. This chapter can help you get Squid running quickly.

All the squid.conf directives have default values. You might be able to get Squid going without even touching the configuration file. However, I don’t recommend trying that. You’ll be much happier if you read the following sections first.

If you are really turned off by Squid’s configuration file syntax, you might want to try the Webmin graphical user interface. It allows you to configure Squid (and numerous other programs) from your web browser. See http://www.webmin.com and The Book of Webmin by Joe Cooper (No Starch Press) for more information.

The squid.conf Syntax

Squid’s configuration file is relatively straightforward. It is similar in style to the configuration files of many other Unix programs. Each line begins with a configuration directive, followed by some number of values and/or keywords. Squid ignores empty lines and comment lines (beginning with #) when reading the configuration file. Here are some sample configuration lines:

cache_log /squid/var/cache.log

# define the localhost ACL
acl Localhost src 127.0.0.1/32

connect_timeout 2 minutes

log_fqdn on
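
To see only the lines that Squid acts on, you can filter out the comment lines and empty lines yourself. The following is a minimal sketch rather than anything that ships with Squid: the function name active_lines is hypothetical, and it treats only lines that begin with # as comments.

```shell
# Print the lines of a squid.conf-style file that are neither
# comment lines (starting with #) nor empty.
# active_lines is a hypothetical helper name, not a Squid tool.
active_lines() {
    grep -v -e '^#' -e '^[[:space:]]*$' "$1"
}

# Example: active_lines /usr/local/squid/etc/squid.conf
```

Run against the sample above, this would print the cache_log, acl, connect_timeout, and log_fqdn lines and drop the comment.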

Some directives take a single value. For these, repeating the directive with a different value overwrites the previous value. For example, there is only one connect_timeout value. The first line in the following example has no effect because the second line overwrites it:

connect_timeout 2 minutes
connect_timeout 1 hour

On the other hand, some directives are actually lists of values. For these, each occurrence of the directive adds a new value to the list. The extension_methods directive works this way:

extension_methods UNGET
extension_methods UNPUT
extension_methods UNPOST

For these list-based directives, you can also usually put multiple values on the same line:

extension_methods UNGET UNPUT UNPOST
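
The difference between the two kinds of directives can be sketched with a pair of small awk wrappers. These are illustrations only: last_value and all_values are hypothetical names, and the sketch assumes one directive per line with whitespace-separated values.

```shell
# Single-valued directive: print the value from the LAST line that
# sets it, since later lines overwrite earlier ones.
last_value() {
    awk -v d="$1" '
        $1 == d { $1 = ""; sub(/^ +/, ""); v = $0 }
        END     { if (v != "") print v }
    ' "$2"
}

# List-style directive: print every value, one per line, whether the
# values appear on separate lines or share a line.
all_values() {
    awk -v d="$1" '$1 == d { for (i = 2; i <= NF; i++) print $i }' "$2"
}

# Example:
#   last_value connect_timeout squid.conf
#   all_values extension_methods squid.conf
```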

Many of the directives have common types. For instance, connect_timeout takes a time specification: a number followed by a unit of time. Here are some examples:

connect_timeout 3 hours
client_lifetime 4 days
negative_ttl 27 minutes

Similarly, a number of directives refer to the size of a file or chunk of memory. For these, you can write a size specification as a decimal number, followed by bytes, KB, MB, or GB. For example:

minimum_object_size 12 bytes
request_header_max_size 10 KB
maximum_object_size 187 MB
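
If it helps to see what these specifications amount to, the sketch below converts them to plain seconds and bytes. It is an illustration under stated assumptions, not Squid's parser: the function names are hypothetical, only whole numbers and a few common units are handled, and KB, MB, and GB are assumed to be binary multiples (1 KB = 1024 bytes).

```shell
# Convert a time specification such as "2 minutes" to seconds.
time_to_seconds() {
    case "$2" in
        second|seconds) echo "$1" ;;
        minute|minutes) echo $(( $1 * 60 )) ;;
        hour|hours)     echo $(( $1 * 3600 )) ;;
        day|days)       echo $(( $1 * 86400 )) ;;
        *) echo "unknown time unit: $2" >&2; return 1 ;;
    esac
}

# Convert a size specification such as "10 KB" to bytes.
# Assumption: 1 KB = 1024 bytes, 1 MB = 1024 KB, 1 GB = 1024 MB.
size_to_bytes() {
    case "$2" in
        byte|bytes) echo "$1" ;;
        KB) echo $(( $1 * 1024 )) ;;
        MB) echo $(( $1 * 1024 * 1024 )) ;;
        GB) echo $(( $1 * 1024 * 1024 * 1024 )) ;;
        *) echo "unknown size unit: $2" >&2; return 1 ;;
    esac
}

time_to_seconds 27 minutes   # prints 1620
size_to_bytes 10 KB          # prints 10240
```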

Another type worth mentioning is the toggle, which can be either on or off. Many directives use this type. For example:

server_persistent_connections on
strip_query_terms off
prefer_direct on

In general, the configuration file directives may appear in any order. However, the order is important when one directive makes reference to something defined by another. Access controls are a good example. An acl must be defined before it can be used in an http_access rule:

acl Foo src 1.2.3.4
http_access deny Foo
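
A quick way to catch this kind of ordering mistake before starting Squid is to scan the file yourself. The following is a rough sketch, not a replacement for Squid's own parser: check_acl_order is a hypothetical name, and it looks only at acl and http_access lines.

```shell
# Report http_access rules that mention an ACL name before any acl
# line has defined it. Prints one message per problem found.
check_acl_order() {
    awk '
        $1 == "acl" { defined[$2] = 1 }
        $1 == "http_access" {
            for (i = 3; i <= NF; i++) {
                name = $i
                sub(/^!/, "", name)     # a leading ! negates the ACL
                if (!(name in defined))
                    print "line " NR ": ACL " name " used before it is defined"
            }
        }
    ' "$1"
}

# Example: check_acl_order /usr/local/squid/etc/squid.conf
```

No output means every ACL named in an http_access rule was defined on an earlier line.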

Many things in squid.conf are case-sensitive, such as directive names. You can’t write HTTP_port instead of http_port.

The default squid.conf file contains comments describing each directive, as well as the default values. For example:

#  TAG: persistent_request_timeout
#       How long to wait for the next HTTP request on a persistent
#       connection after the previous request completes.
#
#Default:
# persistent_request_timeout 1 minute

Each time you install Squid, the current default configuration file is saved as squid.conf.default in the $prefix/etc directory. Since directives change from time to time, you can refer to this file for the most up-to-date documentation on squid.conf.

The rest of this chapter is about the handful of directives you need to know before running Squid for the very first time.

User IDs

As you probably know, Unix processes and files have user and group ownership attributes. You need to select a user and group for Squid. This user and group combination must have read and write access to most of the Squid-related files and directories.

I highly recommend creating a dedicated squid user and group. This minimizes the chance that someone can exploit Squid to read other files on the system. If more than one person has administrative authority over Squid, you can add them to the squid group.

Unix processes inherit their parent process’ ownership attributes. That is, if you start Squid as user joe, Squid also runs as user joe. If you don’t want Squid to run as joe, you need to change your user ID beforehand. This is typically accomplished with the su command. For example:

joe% su - squid
squid% /usr/local/squid/sbin/squid

Unfortunately, running Squid isn’t always so simple. In some cases, you may need to start Squid as root, depending on your configuration. For example, only root can bind a TCP socket to privileged ports like port 80. If you need to start Squid as root, you must set the cache_effective_user directive. It tells Squid which user to become after performing the tasks that require special privileges. For example:

cache_effective_user squid

The name that you provide must be a valid user (i.e., in the /etc/passwd file). Furthermore, note that this directive is used only when you start Squid as root. Only root has the ability to become another user. If you start Squid as joe, it can’t switch to user squid.

You might be tempted to just run Squid as root without setting cache_effective_user. If you try, you’ll find that Squid refuses to run. This, again, is due to security concerns. If an outsider were somehow able to compromise or exploit Squid, he could gain full access to your system. Although we strive to make Squid secure and bug-free, this requirement provides some extra insurance, just in case.

If you start Squid as root without setting cache_effective_user, Squid uses nobody as the default value. Whatever user ID you choose for Squid, make sure it has read access to the files installed in $prefix/etc, $prefix/libexec, and $prefix/share. The user ID must also have write access to the log files and cache directory.

Squid also has a cache_effective_group directive, but you probably don’t need to set it. By default, Squid uses the cache_effective_user’s default group (from the password file).
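
Before starting Squid, it can be worth confirming that the chosen user really has the access described above. The sketch below is meant to be run as that user. The function name check_squid_access is hypothetical, and the directories you pass for the write check depend on your own layout.

```shell
# Run this as the user Squid will run as. It reports any directory
# that user cannot read (etc, libexec, and share under the prefix)
# or cannot write (each additional argument, such as the log and
# cache directories). Returns nonzero if anything is reported.
check_squid_access() {
    prefix=$1
    shift
    rc=0
    for dir in "$prefix/etc" "$prefix/libexec" "$prefix/share"; do
        [ -r "$dir" ] || { echo "no read access: $dir"; rc=1; }
    done
    for dir in "$@"; do
        [ -w "$dir" ] || { echo "no write access: $dir"; rc=1; }
    done
    return $rc
}

# Example (the cache path is an assumption about your layout):
#   check_squid_access /usr/local/squid \
#       /usr/local/squid/var/logs /usr/local/squid/var/cache
```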

Port Numbers

The http_port directive tells Squid which port number to listen on for HTTP requests. The default is port 3128:

http_port 3128

If you are running Squid as a surrogate (see Chapter 15), you should probably set this to 80.

You can instruct Squid to listen on multiple ports with additional http_port lines. This is often useful if you must support groups of clients that have been configured differently. For example, the browsers from one department may be sending requests to port 3128, while another department uses port 8080. Simply list both port numbers as follows:

http_port 3128
http_port 8080

You can also use the http_port directive to make Squid listen on specific interface addresses. When Squid is used on a firewall, it should have two network interfaces: one internal and one external. You probably don’t want to accept HTTP requests coming from the external side. To make Squid listen on only the internal interface, simply put the IP address in front of the port number:

http_port 192.168.1.1:3128

Log File Pathnames

I’ll discuss all the details of Squid’s log files in Chapter 13. For now the only thing you may need to worry about is where you want Squid to put its log files. The default location is a directory named logs under the installation prefix. For example, if you don’t use the --prefix= option with ./configure, the default log file directory is /usr/local/squid/var/logs.

You need to make sure that log files are stored on a disk partition with enough space. When Squid receives a write error for a log file, it exits and restarts. The primary reason for this behavior is to grab your attention. Squid wants to make sure you don’t miss any important logging information, especially if your system is being abused or attacked.

Squid has three main log files: cache.log, access.log, and store.log. The first of these, cache.log, contains informational and debugging messages. When you start Squid the first few times, you should closely watch this file. If Squid refuses to run, the reason is probably at the end of cache.log. Under normal conditions, this log file doesn’t become large enough to warrant any special attention. Also note that if you start Squid with the -s option, the important cache.log messages are also sent to your syslog daemon. You can change the location for this log file with the cache_log directive:

cache_log /squid/logs/cache.log

The access.log file contains a single line for each client request made to Squid. On average, each line is about 150 bytes. In other words, it takes about 150 MB to log one million client requests. Use the cache_access_log directive to change the location of this log file:

cache_access_log /squid/logs/access.log
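
To size the log partition, you can scale the 150-byte average by your own request volume. The helper below is a back-of-the-envelope sketch with a hypothetical name. The second argument lets you substitute a different average line length if your measurements differ.

```shell
# Estimate access.log growth in bytes: requests times bytes per line.
# The default of 150 bytes per line is the average quoted in the text.
access_log_bytes() {
    echo $(( $1 * ${2:-150} ))
}

access_log_bytes 1000000    # prints 150000000, or about 150 MB
```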

If, for some reason, you don’t want Squid to log client requests, you can specify the log file pathname as /dev/null.

The store.log file is probably not very useful to most cache administrators. It contains a record for each object that enters and leaves the cache. The average record size is typically 175-200 bytes. However, Squid doesn’t create an entry in store.log for cache hits, so it contains fewer records than access.log. Use the cache_store_log directive to change the location:

cache_store_log /squid/logs/store.log

You can easily disable store.log altogether by specifying the location as none:

cache_store_log none

If you’re not careful, Squid’s log files increase in size without limit. Some operating systems enforce a 2-GB file size limit, even if you have plenty of free disk space. Exceeding this limit results in a write error, which then causes Squid to exit. To keep log file sizes reasonable, you should create a cron job that regularly renames and archives the log files. Squid has a built-in feature to make this easy. See Section 13.7 for an explanation of log file rotation.
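
To judge how often the logs need rotating, you can estimate how many days of traffic fit under the file size limit. This is a sketch with a hypothetical function name. It assumes the 150-byte average line from earlier and a limit of 2147483647 bytes, which is what a signed 32-bit file offset allows.

```shell
# Days until a log file reaches a size limit.
#   $1 = requests per day
#   $2 = average bytes per line (default 150)
#   $3 = size limit in bytes (default 2147483647, an assumed
#        signed 32-bit file offset limit)
# Integer division rounds down.
days_until_limit() {
    echo $(( ${3:-2147483647} / ($1 * ${2:-150}) ))
}

days_until_limit 1000000    # prints 14

# A crontab entry along these lines would rotate the logs every day
# at 4:00 a.m. The path matches the default prefix used in this
# chapter; adjust it for your installation.
#   0 4 * * * /usr/local/squid/sbin/squid -k rotate
```

At one million requests per day, access.log would hit the limit in about two weeks, so a daily or weekly rotation leaves a comfortable margin.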

Access Controls

I’ll have a lot to say about access controls in Chapter 6. For now, I’ll cover a few controls so that more enthusiastic readers can quickly start using Squid.

Squid’s default configuration file denies every client request. You must place additional access control rules in squid.conf before anyone can use the proxy. The simplest approach is to define an ACL that corresponds to your users’ IP addresses and an access rule that tells Squid to allow HTTP requests from those addresses. Squid has many different ACL types. The src type matches client IP addresses, and the http_access rules are checked for client HTTP requests. Thus, you need to add only two lines:

acl MyNetwork src 192.168.0.0/16
http_access allow MyNetwork

The tricky part is putting these lines in the right place. The order of http_access lines is very important, whereas acl lines can go anywhere as long as each ACL is defined before the first rule that uses it. You should also be aware that the default configuration file contains some important access controls. You shouldn’t change or disrupt these until you fully comprehend their significance. When you edit squid.conf for the first time, look for this comment:

#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#

Insert your new rules below this comment, and before the http_access deny All line.

For the sake of completeness, here is a suitable initial access control configuration, including the recommended default controls and the example earlier:

acl All src 0/0
acl Manager proto cache_object
acl Localhost src 127.0.0.1/32
acl Safe_ports port 80 21 443 563 70 210 280 488 591 777 1025-65535
acl SSL_ports port 443 563
acl CONNECT method CONNECT
acl MyNetwork src 192.168.0.0/16

http_access allow Manager Localhost
http_access deny Manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow MyNetwork
http_access deny All
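
Because Squid checks the http_access rules from top to bottom and acts on the first one that matches, an allow rule placed after http_access deny All never takes effect. The sketch below checks for that one mistake. Both function names are hypothetical, and the check assumes the ACL name is the first word after allow or deny.

```shell
# Print the line number of the first http_access rule with the given
# action and ACL name, or nothing if there is no such rule.
access_rule_line() {
    awk -v action="$2" -v name="$3" '
        $1 == "http_access" && $2 == action && $3 == name { print NR; exit }
    ' "$1"
}

# Succeed only if "http_access allow <acl>" appears before
# "http_access deny All" in the file.
allowed_before_deny_all() {
    allow=$(access_rule_line "$1" allow "$2")
    deny=$(access_rule_line "$1" deny All)
    [ -n "$allow" ] && [ -n "$deny" ] && [ "$allow" -lt "$deny" ]
}

# Example:
#   allowed_before_deny_all squid.conf MyNetwork || echo "check rule order"
```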

Visible Hostname

Hopefully, you won’t need to worry about the visible_hostname directive. However, you’ll need to set it if Squid can’t figure out the hostname of the machine on which it is running. When this happens, Squid complains and refuses to run:

% squid -Nd1
FATAL: Could not determine fully qualified hostname.  Please set 'visible_hostname'

Squid wants to be sure about its hostname for a number of reasons:

  • The hostname appears in Squid’s error messages. This helps users identify the source of potential problems.

  • The hostname appears in the HTTP Via header of cache misses that Squid forwards. When the request arrives at the origin server, the Via header contains a list of all proxies involved in the transaction. Squid also uses the Via header to detect forwarding loops. I’ll talk about forwarding loops in Chapter 10.

  • Squid uses internal URLs for certain things, such as the icons for FTP directory listings. When Squid generates an HTML page for an FTP directory, it inserts embedded images for little icons that indicate the type of each file in the directory. The icon URLs contain the cache’s hostname so that web browsers request them directly from Squid.

  • Each HTTP reply from Squid includes an X-Cache header. This isn’t an official HTTP header. Rather, it is an extension header that indicates if the response was a cache hit or a cache miss. Since requests and responses may flow through more than one cache, each X-Cache header includes the name of the cache reporting hit or miss. Here’s a sample response that passed through two caches:

    HTTP/1.0 200 OK
    Date: Mon, 29 Sep 2003 22:57:23 GMT
    Content-type: text/html
    Content-length: 733
    X-Cache: HIT from bo2.us.ircache.net
    X-Cache: MISS from bo1.us.ircache.net

Squid tries to figure out the hostname automatically at startup. First it calls the gethostname() function, which usually returns the correct hostname. Next, Squid attempts a DNS lookup on the hostname with gethostbyname(). This function typically returns both IP addresses and the canonical name for the system. If gethostbyname() succeeds, Squid uses the canonical name in error messages, Via headers, etc.

Squid may be unable to determine its fully qualified hostname for a number of reasons, including:

  • The hostname may not be set.

  • The hostname may be missing from the DNS zone or /etc/hosts files.

  • The Squid system’s DNS client configuration may be incorrect or missing. On Unix, you should check the /etc/resolv.conf and /etc/host.conf files.

If you see the fatal message mentioned previously, you need either to fix the hostname and DNS information or explicitly configure the hostname for Squid. In most cases, it is sufficient to ensure the hostname command returns a fully qualified hostname and add an entry to /etc/hosts. If that doesn’t work, just set the visible hostname in squid.conf:

visible_hostname squid.packet-pushers.net
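
You can run a rough pre-flight check on the hostname before starting Squid. The sketch below does not reproduce Squid's own logic, which relies on gethostname() and gethostbyname(). It only tests the two conditions mentioned above: that the name looks fully qualified and that it has an entry in the hosts file. Both function names are hypothetical.

```shell
# Succeed if the name contains a dot with text on both sides,
# a rough test for "looks fully qualified".
looks_fully_qualified() {
    case "$1" in
        ?*.?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Succeed if the name appears as a hostname or alias in a hosts file
# (default /etc/hosts), ignoring anything after a # on each line.
in_hosts_file() {
    awk -v name="$1" '
        { sub(/#.*/, "") }
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

# Example:
#   h=$(hostname)
#   if looks_fully_qualified "$h" && in_hosts_file "$h"; then
#       echo "hostname looks usable"
#   else
#       echo "fix the hostname or set visible_hostname"
#   fi
```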

Administrative Contact Information

You should set the cache_mgr directive as a favor to your users. The value is an email address users can write to in case a problem surfaces. The cache_mgr address appears in Squid’s error messages by default. For example:

cache_mgr squid@web-cache.net

Next Steps

After creating the minimal configuration file, you’re more or less ready to run Squid for the first time. To do that, just follow the instructions in the next chapter.

When you’ve mastered starting and stopping Squid, you can spend some time beefing up the configuration file. You may want to add more sophisticated access controls, which you’ll find documented in Chapter 6. Since I didn’t say anything about the disk cache yet, you should also spend a fair amount of time in Chapter 7 and Chapter 8.

Exercises

  • Parse Squid’s configuration file with squid -k parse and check the process exit status.

  • Intentionally introduce some errors into the configuration file and run squid -k parse again. Notice how Squid reports different errors.

  • Insert comments into the configuration file. Can you start a comment anywhere, even after a valid directive?

  • Why do you think some configuration file errors are fatal, but others are not?