More on Apache server
Disabling Components and Systems
Now that we’ve got a trimmed-back and simplified configuration file, we can start removing the configuration elements for the systems not in use. In particular:
- HostnameLookups adds overhead to each request by requesting DNS lookups on the client: first a reverse lookup to find the name from the IP address, and then a forward lookup to ensure the information is not spoofed. In most cases, you can simply disable this. If you regularly process your logs, use post-processing to determine the information. To disable lookups, include the directive HostnameLookups Off.
- Symbolic link checks add system calls to every request. When FollowSymLinks is not set for a directory, or when SymLinksIfOwnerMatch is set, Apache must check whether a symbolic link is involved in the request, making one call to the lstat() system call for each directory to which the request relates. Unless you need those restrictions, avoid the checks by using Options FollowSymLinks and leaving SymLinksIfOwnerMatch unset.
- Server status and info, although very useful when testing and monitoring your server, create additional overhead for the Web server. Disable them by looking for any SetHandler server-status and SetHandler server-info directives and, if possible, remove the modules from Apache when you configure the application during the build.
- Wildcards and flexible options should generally be avoided if you can be more explicit. For example, with the DirectoryIndex directive, explicitly specify the list of index files, always listing the most likely choice first.
- CGI execution should be confined to one location unless you have good reason for not doing so. Put all CGI files into a single directory and configure it for CGI execution. This prevents Apache from trying to determine whether a request is actually for a CGI component or a static file.
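As a rough illustration, several of the points in the list above might translate into a main configuration fragment like the following. This is a minimal sketch, not a recommended configuration: the index file names and the cgi-bin path are placeholders chosen for the example and should be replaced with values that match your own installation.

```apache
# Skip per-request DNS lookups; resolve names later when processing logs.
HostnameLookups Off

# List index files explicitly, most likely choice first
# (file names here are illustrative).
DirectoryIndex index.html index.htm

# Confine CGI execution to a single directory
# (the filesystem path is a placeholder).
ScriptAlias /cgi-bin/ "/usr/local/apache2/cgi-bin/"
```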
Disable Logs
Writing log information is a time-consuming process. Although Apache keeps the log
files open so that it’s just a case of writing the information, this can take up
valuable time. If storing log information is not required, you can save a few
processor cycles by disabling it. To do this, simply comment out the log lines in
the configuration file.
If you do decide to keep your logs, disable HostnameLookups (see above) and copy the log files to another machine before parsing them
for analysis.
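A minimal sketch of what commenting out the log lines could look like is shown here. The file names and the "common" format nickname are assumptions based on a typical default configuration; yours may differ. The error log is left enabled in this sketch, since it is usually low-volume and useful when diagnosing problems.

```apache
# Access logging disabled by commenting out the log line.
#CustomLog logs/access_log common

# Error logging kept, restricted to warnings and above.
ErrorLog logs/error_log
LogLevel warn
```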
Simplify Directory-level Configurations
The .htaccess files are an incredibly useful way of extending the configurable
parameters of your Apache server without having to edit the main configuration file
each time you want to change something. The problem is that the use of .htaccess
files also slows down the server.
First, it has to look to see if a .htaccess file
exists, then it has to parse and process the elements before finally applying the
configuration to the directory in question. Worse still, Apache must determine
this information not only for the current directory, but also for any parent directories
and it then must make the changes based on the contents of all these files.
If you want maximum performance, you should disable the use of .htaccess
files altogether. Any directory-specific configuration can go in the main
configuration file, where it is parsed once by Apache when the server starts.
To disable .htaccess files, add the directive
AllowOverride None to the relevant <Directory> section.
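For example, a sketch along the following lines turns off .htaccess processing for the whole filesystem and moves a per-directory setting into the main configuration file instead. The downloads path and the Options setting are hypothetical, included only to show where such settings would go.

```apache
# Stop Apache from looking for .htaccess files anywhere.
<Directory "/">
    AllowOverride None
</Directory>

# A setting that might previously have lived in a .htaccess file
# (hypothetical directory and option).
<Directory "/var/www/html/downloads">
    Options -Indexes
</Directory>
```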
MPM Configuration
The Multi-Processing Module (MPM) is what enables a specific platform to handle
multiple concurrent connections. MPMs are platform-specific: implementations are available for Unix, Windows, BeOS, and NetWare. For some
platforms more than one alternative is available. For most users, the
default configuration for a particular environment works fine, especially when
getting the exact parameters correct can be a time-consuming task in and of itself. By
comparison, many of the techniques already described may yield better
performance, but when you want to squeeze the maximum performance out of your server, you
must adjust the configuration.
On most platforms only one MPM is available; under Unix there are two main
options, prefork and worker. The prefork MPM forks
off a number of identical Apache processes, while the worker MPM creates multiple
threads per process. In general, prefork is better on systems with one or two processors where
the operating system is better geared toward time slicing between multiple processes. On a system
with a higher number of CPUs the threading model will probably be more effective.
In nearly all cases, the MaxClients directive is
the most effective setting for increasing server performance, as it controls the
maximum number of simultaneous connections Apache can handle.
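To make the MPM settings concrete, here is a sketch of how the two Unix MPMs are typically configured. The numbers are illustrative starting points, not tuned recommendations, and the right MaxClients value depends on available memory and the size of each process. Note that newer Apache releases name this directive MaxRequestWorkers, while still accepting MaxClients.

```apache
# prefork: one process per connection.
<IfModule prefork.c>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>

# worker: several threads per process.
# MaxClients here is the total number of threads across all processes.
<IfModule worker.c>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
```

Only the block matching the MPM compiled into your server takes effect, so both can safely sit in the same file.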
Optimizing Static Components
If your Web site uses a lot of static components, or if you’ve split the static and
dynamic elements across two or more Web servers, then your main goal should be to improve
the response time for Apache sending back the information that was requested. The
easiest way to do this is to use the mod_cache module.
You can use this with the mod_disk_cache and
mod_mem_cache modules to provide disk-based and memory-based
caches of the static files.
Check out the Apache documentation on the mod_cache module for more information.
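As a starting point, a disk-based cache might be enabled with a fragment like the one below. This is a sketch that assumes the module names used in this article; in later Apache releases mod_disk_cache was renamed mod_cache_disk and mod_mem_cache was removed, so check the names against your version. The cache directory is a placeholder and must be writable by the user Apache runs as.

```apache
<IfModule mod_cache.c>
    <IfModule mod_disk_cache.c>
        # Where cached files are stored (placeholder path).
        CacheRoot /var/cache/apache
        # Cache everything under the document root on disk.
        CacheEnable disk /
        # Shape of the cache directory tree.
        CacheDirLevels 2
        CacheDirLength 1
    </IfModule>
</IfModule>
```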
Optimizing Dynamic Components
Dynamic components are probably the most time-sapping part of any Web server.
Dynamic components, especially if you are using CGI, can add seconds to the response time
just to load and execute a simple application. More efficient options include mod_perl, PHP, and Python, and the Jakarta interface for Java.
The main advantage of the script-based solutions is that they embed the interpreter
into the Apache executable, which removes the initial loading problem with dynamic
scripts. Some will even cache the parsed script so the next time it’s requested it need
only be executed.
Configuration can be complex and getting the exact system correct can be time-consuming.
Some solutions also don’t work quite as one would expect with virtual hosts, and you
will need to change certain scripts to take full advantage of the speed enhancements on
offer.
The improvements, however, can be significant, with as much as 70 percent of the
execution time being knocked off a Perl script simply by using mod_perl in place of CGI. With
even more work, these solutions also allow you to keep persistent connections open to
databases or to cache information between requests. This is great for e-commerce sites and also
for reducing the overhead of reloading information on every request.
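To give a sense of what moving from CGI to mod_perl involves on the configuration side, the following sketch runs existing Perl CGI scripts under the embedded interpreter. It assumes mod_perl 2 with its ModPerl::Registry handler; the URL prefix and the filesystem path are hypothetical. Scripts that rely on global state may still need changes before they behave correctly when kept in memory between requests.

```apache
<IfModule mod_perl.c>
    # Map a URL prefix to the script directory (placeholder path).
    Alias /perl/ /usr/local/apache2/perl/

    <Location /perl>
        # Hand requests to the embedded Perl interpreter, which
        # compiles each script once and reuses it on later requests.
        SetHandler perl-script
        PerlResponseHandler ModPerl::Registry
        PerlOptions +ParseHeaders
        Options +ExecCGI
    </Location>
</IfModule>
```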
Summary
Although Apache is highly configurable and a relatively complex application, it’s
interesting to note that standard installations of Apache actually achieve very high
levels of performance. One area where you can easily and significantly improve performance is parameter tuning. Unfortunately, the components you often have least control over within
Apache (dynamic elements and CGI scripts, for example) are the ones that have the
biggest impact on performance. Monitor a typical Apache server and you’ll see that the
time taken for Apache to answer a connection and send data back is in the range of
milliseconds, but waiting for the source of that data can take seconds.
This is not to say the optimizations we’ve highlighted are pointless, however. During the course of a day these saved milliseconds add up. More significant, though, is that cleaning up
and simplifying your Apache configuration will do more to reduce the administration
overhead than any time you might save when serving information.