Basic Reverse Proxy Configuration
From a client perspective, a reverse proxy looks just like a standard Web server. It doesn’t require any special configuration to operate (and if it did, it wouldn’t be anywhere near as useful).
The only real requirement is to ensure the forward proxy is switched off, which is done using the ProxyRequests directive:
ProxyRequests Off
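The proxy directives only exist once the proxy modules are loaded. As a minimal sketch, assuming an Apache 2.x build with loadable modules (the module paths below are assumptions and vary by installation), the relevant part of httpd.conf might look like this:

```apache
# Assumed module locations; adjust the paths for your installation.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# Switch off forward proxying. ProxyPass-based reverse proxying
# is not affected by this setting.
ProxyRequests Off
```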
But we do need to tell the reverse proxy where to fetch (and optionally cache) the information that clients request. The proxy maps specific directories within its own hostname to alternative hosts. For example, Figure 3 shows three back-end servers and a front-end reverse proxy identified as www.mcslp.com.
When a user requests www.mcslp.com/marketing, the admin actually wants the content on marketing.mcslp.com to be returned instead. To do this, the admin must edit the Apache httpd.conf file on the reverse proxy (the machine being used as the front end to the Web site) and set the ProxyPass directive for the requested directory to point to the URL of the real data. For example:
ProxyPass /marketing http://marketing.mcslp.com
The above line causes the proxy server to supply data from marketing.mcslp.com whenever an object within /marketing is requested. For example, the content of the URL www.mcslp.com/marketing/index.html would actually come from marketing.mcslp.com/index.html.
ProxyPass generates an internal proxy request for the remote URL and then returns the information, just as a forward proxy does with a proxy request from a client. This is not redirection: the information is loaded to the proxy server from the real host and sent back to the client from the reverse proxy as if the data were from the proxy server.
You can also configure the same effect from within a Location section by simply omitting the directory (because Apache gets the directory context from the Location section):
<Location /marketing>
ProxyPass http://marketing.mcslp.com
</Location>
Proxying all three directories requires something like:
ProxyPass /marketing http://marketing.mcslp.com
ProxyPass /accounts http://finance.mcslp.com
ProxyPass /sales http://sales.mcslp.com
The second argument is a URL, so it can also point to a subdirectory on a remote machine. For example, the directive
ProxyPass /contact http://sales.mcslp.com/contact
would pass requests for www.mcslp.com/contact to the same directory on the sales Web server.
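You can also stop subdirectories of a directory from being passed through by using an exclamation mark (!) as the destination URL. The exclusion must appear before the more general rule, because Apache checks ProxyPass rules in the order they are configured and uses the first one that matches. For example, to reverse proxy /marketing, but not /marketing/contact, you would use: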
ProxyPass /marketing/contact !
ProxyPass /marketing http://marketing.mcslp.com
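The same exclusion can be sketched with Location sections instead of path arguments. This is an illustration under the assumption of Apache 2.4 behavior, where the more specific section follows the general one and overrides it:

```apache
# General rule: proxy everything under /marketing to the back end.
<Location /marketing>
    ProxyPass http://marketing.mcslp.com
</Location>

# More specific section listed afterwards: do not proxy /marketing/contact,
# so it is served by the front-end server itself.
<Location /marketing/contact>
    ProxyPass !
</Location>
```

Note that the ordering here is the reverse of the path-argument form, where the exclusion has to be listed first.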
Proper Reverse Proxy Configuration
The only problem with the ProxyPass directive is that it’s not “clean” reverse proxying. Although the directive will correctly pass data through to the remote host, the HTTP headers (some of which contain the true location of the data) will remain unchanged. So, for example, when accessing www.mcslp.com/marketing/index.html, the client browser will be able to identify the true source of the data as marketing.mcslp.com/index.html just by looking at the HTTP headers returned.
This can cause problems with redirects and links that would ultimately point to the true server, not the proxy server we’re trying to hide behind. Solving this problem requires an additional directive, ProxyPassReverse, which forces the proxy module to rewrite the HTTP header fields Location, Content-Location, and URI so that they carry the address of the proxy server, not the true server.
A true reverse proxy configuration requires two lines:
ProxyPass /marketing http://marketing.mcslp.com
ProxyPassReverse /marketing http://marketing.mcslp.com
The first line triggers the proxy request for the real data; the second handles the rewriting.
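Putting the pieces together for all three back-end servers might look like the following sketch. The VirtualHost wrapper, the port, and the ServerName line are assumptions added for illustration; the directory-to-host mappings are the ones used above.

```apache
# Hypothetical front-end virtual host; wrapper and port are assumptions.
<VirtualHost *:80>
    ServerName www.mcslp.com

    # Reverse proxy only; never act as a forward proxy.
    ProxyRequests Off

    # Each pair maps a local directory to a back end (ProxyPass) and
    # rewrites that back end's redirect headers (ProxyPassReverse).
    ProxyPass        /marketing http://marketing.mcslp.com
    ProxyPassReverse /marketing http://marketing.mcslp.com

    ProxyPass        /accounts  http://finance.mcslp.com
    ProxyPassReverse /accounts  http://finance.mcslp.com

    ProxyPass        /sales     http://sales.mcslp.com
    ProxyPassReverse /sales     http://sales.mcslp.com
</VirtualHost>
```

With a configuration like this, a back-end redirect carrying the header Location: http://marketing.mcslp.com/news.html should reach the client as Location: http://www.mcslp.com/marketing/news.html (news.html is a made-up page name).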
Note that at no point does Apache rewrite the content it sends back, so absolute URLs embedded in pages (for example, a link to http://marketing.mcslp.com/index.html) still point at the back-end server. Luckily, if you are replacing an existing single server with multiple servers behind a reverse proxy, you shouldn’t have to make changes to the site, as the references you already use will continue to be valid in the new setup.