The open source proxy server and accelerator Squid has been around since the mid-1990s, helping speed up Internet traffic at both the user and the server end. Squid plays two main roles, although it can be configured for a wide variety of situations. The first role is to act as a caching proxy server between the user and the Web. When a user requests a Web page (or any other Web content), the request is routed through Squid, which grabs the content, saves it, and sends it back to the user. If someone else then wants the same content, Squid delivers it from the cache, saving time and bandwidth. Squid’s other common role is as a content accelerator, or reverse proxy: it intercepts requests to a server and answers them from a cached version of the page. After a configurable time, the cached copy expires, and Squid will grab a new copy for the next request, then cache that. Wikimedia is one of the many sites using Squid for this purpose.
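To make the fetch, save, and expire cycle described above concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not Squid's implementation: the class name TinyCache, the fetch callback, and the single ttl_seconds lifetime are all hypothetical, and Squid's real freshness decisions are more involved than one fixed timer.

```python
import time
from typing import Callable, Dict, Tuple


class TinyCache:
    """Toy model of a caching proxy (hypothetical, not Squid's code)."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # called only on a miss or after expiry
        self._ttl = ttl_seconds    # assumed: one lifetime for every entry
        self._clock = clock        # injectable so expiry can be exercised
        self._store: Dict[str, Tuple[float, bytes]] = {}  # url -> (expires_at, body)

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return (body, "HIT") from the cache, or fetch and return (body, "MISS")."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[0]:
            return entry[1], "HIT"
        # Not cached, or the cached copy has expired: fetch and store a new one.
        body = self._fetch(url)
        self._store[url] = (now + self._ttl, body)
        return body, "MISS"
```

The first request for a URL is a miss and calls fetch; repeat requests within ttl_seconds are served from the stored copy; the first request after expiry fetches again and re-caches.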
Untangling the many tentacles of this popular proxy server is well worth the effort.
Squid is ubiquitous enough that it is easy to get hold of via your Linux distribution; on Debian or Ubuntu, sudo apt-get install squid will do the trick. As ever with FOSS, you can also choose to download the source, or even the development source if you’re fond of the bleeding edge, and roll your own.
Another advantage of Squid’s ubiquity and longevity is the extensive documentation available, both from the project itself and in the many notes and tutorials online. Most problems you might have, or setups you might want, have already been tried out by someone else and written up, which is incredibly useful for the busy or inexperienced admin.
Configuration by Proxy
In true old-school Unix style, configuration is via the text file at /etc/squid/squid.conf. There are stacks of example configs on the Squid wiki, along with a comprehensive manual. The default setup as installed by your distro’s packaging system should provide the basics for a regular proxy server, although you’ll need to take a look at the security settings.
It’s important to be careful about who is permitted to access the proxy server. The default Squid setup is conservative, allowing access only from localhost; it also defines a default localnet that you can easily adjust and then grant access to. The wiki has more detailed information about potential security pitfalls.
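Squid checks its access rules in order and the first one that matches decides, so the sequence of allow and deny lines matters as much as their content. The following Python sketch models that idea only; it is not Squid's configuration syntax or code, and the rule list, the address ranges, and the function name is_allowed are assumptions chosen for illustration.

```python
import ipaddress
from typing import List, Tuple

# (action, network) pairs checked top to bottom; the first match wins.
# These ranges are illustrative assumptions, not Squid's shipped defaults.
RULES: List[Tuple[str, str]] = [
    ("allow", "127.0.0.0/8"),     # localhost
    ("allow", "192.168.0.0/16"),  # an example "localnet"
    ("deny", "0.0.0.0/0"),        # every other IPv4 client
]


def is_allowed(client_ip: str, rules: List[Tuple[str, str]] = RULES) -> bool:
    """Return True if the first rule matching client_ip is an allow rule."""
    addr = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if addr in ipaddress.ip_network(network):
            return action == "allow"
    # Nothing matched: this sketch refuses, as the cautious choice.
    return False
```

With the rules as written, is_allowed("192.168.1.20") is True. Move the deny line to the top and the same client is refused, which is the kind of ordering mistake worth checking for when a proxy unexpectedly rejects local machines.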
Note that recent versions of Squid (2.7 and 3.0) allow Config Includes. This means the configuration file can be split into multiple files, which makes it easier to locate particular options and maintain complicated configurations, as well as share config snippets between Squid admins. Bear in mind that Squid processes options in the order it encounters them, so it may be important to have your include files processed in a particular order. If, on the other hand, you prefer graphical configuration, there’s a Squid module for Webmin that may make your life a little easier.
Unfortunately, Squid’s configuration isn’t always entirely transparent, and the logs aren’t necessarily helpful in all situations. You can increase the debug level by setting the debug_options parameter in the configuration file to ALL,2 (raise the number, up to 9, for very full logging) and restarting Squid; the extra detail may help. Even so, I had difficulty setting up my browser to use the Squid daemon as a manual proxy.
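One way to separate browser problems from Squid problems is to send a request through the proxy from a short script, using only Python's standard library. This is a sketch under assumptions: it supposes Squid is running on the same machine and listening on port 3128 (its usual default, though your configuration may differ), and the function names are hypothetical.

```python
import urllib.request

# Assumption: Squid listens on its usual default port, 3128, on this machine.
PROXY_URL = "http://127.0.0.1:3128"


def build_proxy_opener(proxy_url: str = PROXY_URL) -> urllib.request.OpenerDirector:
    """Return an opener that sends plain-HTTP requests through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url: str, proxy_url: str = PROXY_URL, timeout: float = 10.0):
    """Fetch url through the proxy; return (status code, Via header or None)."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        # A proxy usually adds a Via header naming itself; None suggests the
        # request did not pass through one, or the header is switched off.
        return response.status, response.headers.get("Via")
```

Calling fetch_via_proxy("http://example.com/") (a placeholder URL) should return a status code if the proxy accepts the request. A connection error points at Squid not listening where you expect, while an HTTP error status from the proxy points at its access rules rather than at the browser.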
Once you do have things running, if you don’t want to set up browsers manually to use the proxy (a nuisance if you have a large number of machines), you can set Squid up as a transparent proxy instead. There are also notes about this on the wiki.
Squid: Software With Many Tentacles
In addition to proxying and server acceleration, Squid has many other abilities and possible setups. The FAQ explains the availability of anonymization, and there are examples available for setting up various sorts of authentication, if you want only authenticated users to be able to access the proxy. Other clever setups include a complicated configuration that (potentially) lets you bypass Squid’s default setting of not caching dynamic content, and so cache YouTube Flash videos.
Squid does have its drawbacks: most notably, the lack of clear error messages for configuration errors; the logs, which aren’t always helpful when fixing problems; and (if you don’t like the command line) the text-only config file. Unfortunately, it also doesn’t quite support HTTP/1.1 just yet, although this is in the process of being changed. Still, Squid is powerful, very configurable, and fast and reliable once set up, which is more than enough to outweigh the minor drawbacks. Squid’s ongoing status in the FOSS world is assured.
Juliet Kemp has been messing around with Linux systems, for financial reward and otherwise, for about a decade. She is also the author of “Linux System Administration Recipes: A Problem-Solution Approach” (Apress, 2009).