This appendix contains information on setting up various browsers and user-agents to use Squid. Although this topic is covered more extensively in my O’Reilly book Web Caching, I’ll include some brief instructions here.
I have instructions for the following HTTP user-agents: Internet Explorer v6, Konqueror v3, Lynx v2.8, Netscape v7 a.k.a. Mozilla v5, Opera v7, libwww-perl v5, Python’s urllib/urllib2, and Wget v1.8. If you think this is all a huge hassle, consider using HTTP interception, as described in Chapter 9.
Web browsers and other HTTP-based user-agents have methods for explicitly setting a proxy address. For large organizations, this is a real hassle. You may simply have too many desktops to visit one at a time. Additionally, this approach isn’t as flexible as the others. For example, you can’t temporarily stop the flow of requests to the proxy or easily bypass the cache for certain troublesome sites.
Browsers usually give you the option to send HTTPS URLs to a proxy. Squid can handle HTTPS requests, although it can’t cache the responses. Squid simply tunnels the encrypted traffic. Thus, you should configure the browser to proxy HTTPS requests only if your firewall prevents direct connections to secure sites.
To manually configure proxies with Netscape and Mozilla, follow this sequence of menus:
Edit
Preferences
Advanced
Proxies
Manual proxy configuration
Fill in the HTTP Proxy address and Port fields. Enter the same values for FTP Proxy if you like.
To manually configure proxies in Internet Explorer, select the following sequence of menus:
Tools from the main window menu
Internet Options
Connections tab
LAN Settings
Enable Use a proxy server and enter its address in the Address and Port fields
The Advanced button opens a new window in which you can enter different proxy addresses for different protocols (HTTP, FTP, etc.).
You can manually configure proxies in Konqueror by clicking on the following sequence of menus:
Settings
Configure Konqueror
Proxies & Cache
Use Proxy
Fill in the address for HTTP Proxy, and Port. Use the same values for other protocols if you like.
Here’s how to find the proxy configuration screen in Opera browsers:
File
Preferences
Network
Proxy Servers
Enter an IP address (or hostname) and port number for HTTP, FTP, and other protocols as necessary.
The Lynx browser uses a configuration file, typically /usr/local/etc/lynx.cfg. There you’ll find a number of settings for proxies. For example:
http_proxy:http://proxy.example.com:3128/
https_proxy:http://proxy.example.com:3128/
ftp_proxy:http://proxy.example.com:3128/
Lynx also accepts proxy configuration via environment variables, as described in the next section.
Some browsers and other user-agents look for proxy settings in environment variables. Note that the variable names are lowercase, unlike most environment variable names:
csh% setenv http_proxy http://proxy.example.com:3128/
csh% setenv ftp_proxy http://proxy.example.com:3128/

sh$ http_proxy=http://proxy.example.com:3128/
sh$ ftp_proxy=http://proxy.example.com:3128/
sh$ export http_proxy ftp_proxy
I’ve convinced myself that the following products and packages check for these environment variables:
Opera
Lynx
Wget
Python’s urllib and urllib2
libwww-perl
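As a rough illustration of how a program might consume these variables, the following JavaScript (Node.js) sketch splits a proxy URL into a host and port. The helper name proxyFromEnv and the fallback to port 80 are assumptions made for this example; it doesn't show how any of the products listed above actually implement the lookup.

```javascript
// Hypothetical helper: parse a proxy URL taken from an environment-like object.
// Returns null when the variable is unset or empty.
function proxyFromEnv(env, name) {
  const value = env[name];
  if (!value) {
    return null;
  }
  const url = new URL(value);
  return {
    host: url.hostname,
    // url.port is "" when the URL omits the port; assume 80 for an http:// proxy URL.
    port: url.port === "" ? 80 : Number(url.port),
  };
}

// Example with a stand-in environment; a real program would pass process.env.
const exampleEnv = { http_proxy: "http://proxy.example.com:3128/" };
const httpProxy = proxyFromEnv(exampleEnv, "http_proxy");
console.log(httpProxy.host, httpProxy.port);
```

Passing the environment in as an argument, rather than reading process.env inside the function, keeps the helper easy to test.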
Proxy Auto-Configuration is a technique that allows more control over the way user-agents select a proxy. The configuration file is simply a text file containing a JavaScript function. Browsers download the configuration file when they start up and then evaluate the function before each request. The function’s return value determines where the request is sent.
Proxy Auto-Configuration is attractive because it gives the network administrator more control. For example, you can temporarily disable your caching service, implement load balancing, or migrate the service to new systems. Additionally, the function can return a list of proxy addresses, which the browser tries in sequence. If the first is unavailable, it tries the second, and so on.
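To make the load-balancing and failover idea concrete, here is a minimal sketch of a function that spreads requests across two proxies by hashing the hostname, and lists the other proxy as a fallback. The proxy addresses and the simple character-sum hash are assumptions chosen for illustration, not a recommended production scheme.

```javascript
// Hypothetical proxy addresses; substitute your own.
var PROXIES = ["172.16.5.1:3128", "172.16.5.2:3128"];

function FindProxyForURL(url, host) {
  // Sum the character codes of the hostname to get a cheap, stable hash.
  var sum = 0;
  for (var i = 0; i < host.length; i++) {
    sum += host.charCodeAt(i);
  }
  var first = sum % PROXIES.length;
  var second = (first + 1) % PROXIES.length;
  // The browser tries the listed proxies in order.
  return "PROXY " + PROXIES[first] + "; PROXY " + PROXIES[second];
}
```

Hashing on the hostname means a given site always goes to the same proxy first, so its objects aren't cached twice.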
The following browsers support Proxy Auto-Configuration:
Internet Explorer
Opera
Netscape
Konqueror
Mozilla
All these browsers have a place in which you can type in the Proxy Auto-Configuration URL. You’ll find it in the same place as the manual proxy settings, described earlier in Section F.1. Entering that URL on hundreds or thousands of workstations is still a real hassle, which is why a handful of companies came up with WPAD, described in the next section.
Writing a Proxy Auto-Configuration function is relatively
straightforward. The function, named FindProxyForURL, takes two arguments and
returns a list of proxy addresses, separated by semicolons. The word
DIRECT instructs the browser to
forward the request directly to the origin server, rather than to a
proxy. Here is a simple example:
function FindProxyForURL(url, host) {
    if (isPlainHostName(host))
        return "DIRECT";
    if (!isResolvable(host))
        return "DIRECT";
    if (url.substring(0, 5) == "http:")
        return "PROXY 172.16.5.1:3128; DIRECT";
    if (url.substring(0, 4) == "ftp:")
        return "PROXY 172.16.5.1:3128; DIRECT";
    return "DIRECT";
}

The first if statement makes
the browser connect directly to the origin server if the user types a
single-component hostname, such as www. This is
generally a good idea because the browser’s interpretation of the
hostname might be different from the proxy’s. The second if statement ensures that the hostname exists
in the DNS. If not, the user sees an error message from the browser
itself, rather than from Squid. The next two if statements return a proxy address, followed
by DIRECT for HTTP and FTP URLs. If
the proxy doesn’t respond, the browser attempts to make a direct
connection to the origin server.
If you have a firewall in place, the browser probably won’t be able to make a direct connection.
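Because the function is ordinary JavaScript, you can exercise it outside a browser before deploying it. The sketch below repeats the example function and supplies stand-in versions of isPlainHostName and isResolvable. The stubs and the list of "resolvable" names are assumptions for testing only; real browsers provide these helpers and consult the DNS.

```javascript
// Stand-in for the browser helper: true when the name has no dots.
function isPlainHostName(host) {
  return host.indexOf(".") === -1;
}

// Assumption for testing: pretend only these names exist in the DNS.
var KNOWN_HOSTS = ["www.example.com", "ftp.example.com"];
function isResolvable(host) {
  return KNOWN_HOSTS.indexOf(host) !== -1;
}

// The example function from the text.
function FindProxyForURL(url, host) {
  if (isPlainHostName(host))
    return "DIRECT";
  if (!isResolvable(host))
    return "DIRECT";
  if (url.substring(0, 5) == "http:")
    return "PROXY 172.16.5.1:3128; DIRECT";
  if (url.substring(0, 4) == "ftp:")
    return "PROXY 172.16.5.1:3128; DIRECT";
  return "DIRECT";
}

// Try a few representative requests.
console.log(FindProxyForURL("http://www/", "www"));
console.log(FindProxyForURL("http://www.example.com/", "www.example.com"));
console.log(FindProxyForURL("https://www.example.com/", "www.example.com"));
```

Note that the HTTPS URL falls through to DIRECT, because neither prefix test matches it.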
After writing the function, save it somewhere in your web server’s
data directory. Next, you need to configure the server to return a
specific content type for the file. The convention is to give the file a
.pac extension, such as proxy.pac. Then, ensure that the HTTP server
returns the content type application/x-ns-proxy-autoconfig. With
Apache, you can add this line to your server config file:
AddType application/x-ns-proxy-autoconfig .pac
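One way to confirm the server is configured correctly is to fetch the file and inspect its Content-Type header. The following is a minimal sketch, assuming Node.js 18 or later for the built-in fetch function; the function names and the URL in the usage comment are hypothetical.

```javascript
var PAC_TYPE = "application/x-ns-proxy-autoconfig";

// Compare a Content-Type header value against the PAC type,
// ignoring case and any parameters such as "; charset=...".
function isPacContentType(headerValue) {
  if (!headerValue) {
    return false;
  }
  var mediaType = headerValue.split(";")[0].trim().toLowerCase();
  return mediaType === PAC_TYPE;
}

// Fetch a URL and report whether the server labels it as a PAC file.
async function checkPacUrl(url) {
  var response = await fetch(url);
  return isPacContentType(response.headers.get("content-type"));
}

// Usage (hypothetical URL):
// checkPacUrl("http://www.example.com/proxy.pac").then(console.log);
```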
Refer to Section 4.3 of Web Caching
(O’Reilly) for more information on
Proxy Auto-Configuration files, including more complicated FindProxyForURL ideas and examples.
The Web Proxy Auto Discovery (WPAD) protocol is a technique for user-agents to find a nearby caching proxy automatically. The idea is relatively simple. The protocol provides a number of methods for generating a URL that refers to a Proxy Auto-Configuration file. Those methods include DHCP, DNS lookups, and SLP (the Service Location Protocol).
DHCP is the first method the user-agent should try. It sends a query for “option 252” to a local DHCP server. The response is a string: the URL. Here’s how to configure ISC’s DHCP server for WPAD:
option wpad code 252 = text;
option wpad "http://172.16.1.1/proxy.pac";
The second method is SLP. However, its implementation is optional. I do not know if any user-agents actually support WPAD via SLP.
DNS is the last resort. The protocol specification outlines a number of DNS techniques a user-agent might use to find a wpad.dat URL. The most straightforward technique is to perform an address lookup for the hostname wpad in the local domain. For example, if the system’s hostname is orion.example.com, the agent requests the IP address of wpad.example.com. If the lookup is successful, the agent opens a TCP connection to that address on port 80 and requests /wpad.dat.
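The straightforward DNS technique can be sketched as a small function that derives the WPAD URL from the system's hostname. This illustrates only the single lookup described above, not the full set of DNS techniques in the specification, and the function name is hypothetical.

```javascript
// Hypothetical helper: build the WPAD URL for a fully qualified hostname
// by replacing its first label with "wpad".
function wpadUrlForHost(hostname) {
  var dot = hostname.indexOf(".");
  if (dot === -1) {
    return null; // no domain part, so there is no local domain to search
  }
  var domain = hostname.substring(dot + 1);
  // Port 80 is the HTTP default, so it is left out of the URL.
  return "http://wpad." + domain + "/wpad.dat";
}

console.log(wpadUrlForHost("orion.example.com"));
```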
To make this work in Apache, you need to set the content type for the wpad.dat file like this:
AddType application/x-ns-proxy-autoconfig .dat
This may have negative side effects if your server has other files that end with .dat. One trick some people use is to redirect requests for wpad.dat to proxy.pac, with commands like this in httpd.conf:
Redirect /wpad.dat http://wpad.example.com/proxy.pac
Note that you probably won’t be able to set up a separate virtual
host for the wpad name in your domain. This is
because some user-agents set the Host
header to the IP address, rather than the hostname. The following is an
example.
GET /wpad.dat HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Win32)
Host: 206.168.0.13
WPAD is enabled by default in Microsoft Internet Explorer. Konqueror also supports WPAD but disables it by default. You can enable WPAD in Konqueror by visiting the proxy configuration page (described in Section F.1) and selecting Auto Configure Proxy. Although the current stable versions of Netscape (v7.02) and Mozilla (v5.0) don’t implement WPAD, future versions will.
Table F-1 summarizes the various proxy configuration options for the user-agents mentioned in this appendix.