The previous Apache-focused tutorial published on ServerWatch discussed the benefits of a proxy server for the network, and how it can speed up access, reduce bandwidth requirements, and perform basic information filtering tasks. This type of proxy is a forward proxy — it forwards requests from a network to the Internet.
We flip the proxy model on its head and discuss when and how to implement a reverse proxy server using Apache 2.0.
However, if the proxy model is flipped on its head, a different type of proxy server is created: a reverse proxy. In this instance, instead of requests from a client being forwarded (and optionally cached) through the proxy to the Internet, requests arriving from the Internet are forwarded through the proxy to one or more Web servers, and the responses are optionally cached, as illustrated in Figure 1.
Interesting, you’re thinking. But what is the benefit of this?
Reverse proxies are useful for much the same reasons as forward proxies: they offer comparable performance and security benefits. The other, less obvious, advantage is that a reverse proxy provides a unified interface to the Web servers behind it.
Reverse Proxy Gateway Operation
One of the problems with supporting a modern Web site is that as the site grows, the level and quantity of information requested and returned also increases. A number of solutions have been developed to resolve this issue. The most obvious is to just build a bigger, more powerful server by adding more CPUs, RAM, disk space, and network interfaces. Ultimately, however, a physical or practical limit is reached that makes it impossible to expand any further.
Other solutions involve simple, or complex, load balancing techniques, clustering tools, or manual (and generally complex) methods of splitting up the site into different areas, and manually redirecting users to different machines to handle the requests and load.
With a reverse proxy, a single machine is inserted to act as a gateway to the real servers in the network. Now, instead of multiple machines directly handling the requests from clients, a single machine is responsible for accepting the requests and forwarding them to the real servers. This means that a single domain continues to appear as a single machine, while still having the flexibility of multiple machines working behind the scenes to honor the actual requests.
The unified interface is, in essence, the same as using a forward proxy for Internet access. However, instead of being a single interface to the Internet, it becomes a single interface into the Web server network.
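As a rough illustration of the gateway arrangement described above, the following is a minimal sketch of an Apache 2.0 configuration using mod_proxy. The internal hostnames and URL prefixes are hypothetical placeholders, and the module paths assume a default installation layout; adjust both for a real network.

```apache
# Load the proxy modules (paths assume a default Apache 2.0 layout).
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# Turn off forward proxying; only the mappings below are served.
ProxyRequests Off

# Map parts of the public URL space to internal servers.
# The hostnames are hypothetical examples, not real machines.
ProxyPass        /images/ http://images.internal.example.com/
ProxyPassReverse /images/ http://images.internal.example.com/

ProxyPass        /app/ http://app.internal.example.com/
ProxyPassReverse /app/ http://app.internal.example.com/
```

ProxyPass maps an incoming URL prefix to a back-end server, while ProxyPassReverse rewrites the Location headers in redirects issued by that server so that clients continue to see only the gateway's address.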
Caching of Static Data
Another problem with most Web sites, even those based on static content, is that the information must be read from disk each time it is supplied to a client. With a bit of work within Apache, we can use mod_cache (and the mod_mem_cache module) to keep some documents in memory.
A reverse proxy can provide an in-memory cache on a single machine, servicing requests from clients on behalf of a number of different real servers, because the proxy server caches only the responses to the requests it handles.
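The in-memory cache described above might be enabled with something like the following sketch, which assumes Apache 2.0 with mod_cache and mod_mem_cache available (both were marked experimental in the 2.0 series). The size limits are illustrative values only, not tuned recommendations.

```apache
# Load the caching modules (paths assume a default Apache 2.0 layout).
LoadModule cache_module modules/mod_cache.so
LoadModule mem_cache_module modules/mod_mem_cache.so

<IfModule mod_cache.c>
    <IfModule mod_mem_cache.c>
        # Cache responses for the whole URL space in memory.
        CacheEnable mem /

        # Illustrative limits: total cache size is given in KBytes,
        # per-object sizes in bytes.
        MCacheSize 4096
        MCacheMaxObjectCount 100
        MCacheMinObjectSize 1
        MCacheMaxObjectSize 2048
    </IfModule>
</IfModule>
```

On a reverse proxy, these directives would sit alongside the proxy mappings, so that cacheable responses fetched from the real servers can be answered from memory on later requests.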