Apache Maintenance Basics

May 13, 2004

You've downloaded and configured your Apache server and are ready to move on to the next project. Can it really be left to fend for itself in a darkened room?

Yes. To some degree, anyway. With the exception of configuration testing, once Apache is up, you likely need never think about how the Web server is running.

On the other hand, completely ignoring your Apache installation would be foolhardy.

Contents
Monitoring Apache
Log Monitoring
Log Management
Configuration Management
Security and Passwords
Keeping Apache Up to Date
Other Systems and Extensions
Scheduling Maintenance

Regular checks and maintenance on your Apache installation help identify problems, usually before they become serious, and help you stay up to date with the latest security and performance patches. This article covers the major maintenance tasks that should be undertaken regularly while the Apache system is running.

Monitoring Apache

The first step of regular Apache maintenance is to keep a close eye on what Apache is doing. Monitoring the logs really only tells you about the status of the Web serving — not the status of the Apache server itself at a moment in time. For live monitoring, use mod_status, which provides a summary of the active processes and threads and their current activity.

The following screenshot is an example of a mod_status report on an intranet server.

[Screenshot: a mod_status report on an intranet server]

What you get is a heap of information about the active processes: their current status, what they are doing, and how busy they have been. Just getting a response is a good sign that the server is running; the report itself provides more detailed information. To enable mod_status, add or uncomment the following lines in your server configuration:

LoadModule status_module          libexec/httpd/mod_status.so
<Location /server-status>
         SetHandler server-status
         Order deny,allow
         Deny from all
         Allow from .mcslp.pri
</Location>

The Allow line must include the hosts, domains, or IP addresses for whom you want to provide access to the information.

The display does not need to be open continuously, but if you suspect something is wrong, it is a good starting point.
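For scripted checks, mod_status can also return a plain-text report when "?auto" is appended to the URL. The following is a minimal sketch, not a definitive monitor: the URL is a hypothetical placeholder, and the field names (BusyWorkers, IdleWorkers) are those reported by Apache 2.0 and later, so check the output of your own server first.

```shell
# Hypothetical URL; use the host and path from your own <Location> block.
STATUS_URL="${STATUS_URL:-http://localhost/server-status?auto}"

# Print the value of one "Key: value" field from a ?auto report read on stdin.
status_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Example (needs curl, and a client permitted by the Allow line):
#   curl -s "$STATUS_URL" | status_field BusyWorkers
```

Keeping the parsing in a small function that reads standard input means it can be tried against a saved report before it is pointed at a live server.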

Log Monitoring

Checking logs is the best way to find out what is going on. Apache keeps a separate error log for the server process itself. Checking it often is the best way to find out whether something needs attention, because it makes a faulty or missing module or a bad process easy to catch. Consider this sample fragment:

[Sat May 01 10:00:14 2004] [notice] Apache/2.0.44 (Unix) DAV/2 configured -- resuming normal operations
[Sat May 01 10:00:14 2004] [info] Server built: Mar  7 2003 14:41:06
[Sat May 01 10:00:14 2004] [debug] prefork.c(1039): AcceptMutex: pthread (default: pthread)
[Mon May 03 02:09:48 2004] [notice] child pid 23464 exit signal Segmentation fault (11)
[Mon May 03 02:19:48 2004] [notice] child pid 10932 exit signal Segmentation fault (11)
[Mon May 03 02:29:50 2004] [notice] child pid 23470 exit signal Segmentation fault (11)
[Mon May 03 02:39:52 2004] [notice] child pid 23471 exit signal Segmentation fault (11)
[Mon May 03 02:49:52 2004] [notice] child pid 23465 exit signal Segmentation fault (11)
[Sun May 09 13:20:07 2004] [notice] child pid 21539 exit signal Bus error (10)

All log entries are marked with a particular class in much the same way as entries in the system log are marked under the various Unix variants. Log levels include 'notice', which is for notification information only; 'info', which relates to running or log information; 'debug', which is output when debugging on a module is enabled; and 'warn', which notifies of a serious problem.
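A quick way to see how the entries in a log are spread across these levels is to count them. This is a small sketch that assumes the "[date] [level] message" layout shown in the fragment above; the function name is made up for illustration.

```shell
# Count error-log entries per level, most frequent first.
# Reads the log on stdin; lines that do not start with "[date] [level]" are skipped.
count_levels() {
    sed -n 's/^\[[^]]*] \[\([a-z]*\)].*/\1/p' | sort | uniq -c | sort -rn
}

# Example:
#   count_levels < error_log
```

A sudden rise in one level compared with the previous check is usually a better signal than the absolute numbers.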

Checking Web host logs should also be a regular activity. These highlight problems with missing files, errors in CGI scripts, and users trying to access files and directories that no longer exist. Note, though, that some errors from a site are to be expected, even if everything tests out okay elsewhere.

The important things to look out for are unexpected items, rather than missteps you might repeatedly be making. For example, say you frequently forget to add a 'favicon' to your sites; you would then get many errors from browsers looking for a file that will never exist. Errors in a CGI script or similar item, on the other hand, are ones you would want to know about.

If you are on a Unix system and using the standard error log format, the following command generates a unique list of errors from the most recent 100 lines of a log file:

$ tail -n 100 www-error_log | cut -d']' -f 4- | sed -e "s/, referer.*//g" | sort | uniq

To search the entire file instead, use:

$ cut -d']' -f 4- www-error_log | sed -e "s/, referer.*//g" | sort | uniq

To gauge how significant each error is, add '-c' to the uniq command, which prefixes each error with the number of times it occurred.
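The counting variant can be wrapped in a small function that also sorts the most frequent errors to the top. This is a sketch under the same assumption as the commands above, namely that each error line carries a "[client ...]" field before the message; the function name is illustrative only.

```shell
# Read an error log on stdin and print each distinct message with its count,
# most frequent first. Lines without a "[client ...]" field yield an empty
# message and are dropped.
summarize_errors() {
    cut -d']' -f 4- |
        sed -e 's/, referer.*//' -e 's/^ *//' |
        grep -v '^$' |
        sort | uniq -c | sort -rn
}

# Example:
#   summarize_errors < www-error_log | head -n 20
```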

Log Management

For obvious reasons, it's a good idea to keep logs for a while to track and trace problems. You'll also probably want to keep access logs for a long time to perform the necessary analysis. Error logs can be disposed of every three months — once you've gone through the steps above to check out any errors or potential problems.

The most effective way of doing this in the standard Apache release (without any clever configuration tricks) is to:

  1. Shut down Apache
  2. Rename the error and access logs
  3. Restart Apache
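The three steps above can be sketched as a short script. The paths and log names here are assumptions, not values from any particular installation, and the date suffix is just one possible naming scheme.

```shell
# Hypothetical locations; adjust to your installation.
APACHECTL="${APACHECTL:-/usr/local/apache/bin/apachectl}"
LOG_DIR="${LOG_DIR:-/var/log}"

# Stop Apache, rename the logs with a date suffix, then start Apache again.
# An optional first argument overrides the suffix (default: today, YYYYMMDD).
rotate_logs() {
    stamp="${1:-$(date +%Y%m%d)}"
    "$APACHECTL" stop || return 1
    for log in access_log error_log; do
        if [ -f "$LOG_DIR/$log" ]; then
            mv "$LOG_DIR/$log" "$LOG_DIR/$log.$stamp"
        fi
    done
    "$APACHECTL" start
}

# Example:
#   rotate_logs
```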

This obviously shuts down Apache for a period, which you may not want to do if the server is busy. To get around this, use the piped log system, which outputs log information through an external command that can automatically rotate and archive the information. Apache, in fact, provides the rotatelogs application as part of the standard kit to do this. Rotatelogs accepts the name of the log, and the interval for rotation (in seconds).

To enable rotatelogs, change the configuration file to use the pipe system for each log file:

CustomLog "|/usr/local/apache/bin/rotatelogs /var/log/access_log 86400" common

The number 86400 is the number of seconds in a day. On a busy site, it is preferable to decrease that value so the rotation is performed every six hours (21600), or even every hour (3600). On less busy sites, rotating every week or month would work.

We recommend writing all logs into a custom MySQL database. This makes it easier to extract information from both error and access logs. The fields in the SQL table match those in the output, and an extra field records the name of the Web site.

Configuration Management

Many Web servers rarely have their configuration files modified and updated; others regularly add new configurations, virtual hosts, and other elements. Two ways to ensure the configuration is up to date and working are 1) checking the configuration and 2) tracking configuration changes.

Checking the configuration periodically highlights any problems (including any disparity between the currently running Apache and the current configuration file). Sometimes, it will even highlight changes made to the Apache configuration of which you may not have been aware.

Checking the configuration can be handled with the apachectl command with the configtest argument:

$ apachectl configtest
Processing config directory: /etc/httpd/sites/*.conf
  Processing config file: /etc/httpd/sites/0000_192.168.1.24_80_atuin.mcslp.pri.conf
  Processing config file: /etc/httpd/sites/0001_192.168.1.24_80_cheffy.devel.atuin.mcslp.pri_copy.conf
  Processing config file: /etc/httpd/sites/0001_any_80_cheffy.devel.atuin.mcslp.pri.conf
  Processing config file: /etc/httpd/sites/0002_any_80_webmail.mcslp.pri.conf
  Processing config file: /etc/httpd/sites/virtual_host_global.conf
[Wed May 12 11:07:11 2004] [warn] module mod_WebObjects.c is already added, skipping
Syntax OK

The 'Syntax OK' line at the end is the key piece.
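The same check can guard a restart so that a broken configuration never reaches the running server. This sketch relies on apachectl returning a non-zero exit status when the test fails; applying the change with "apachectl graceful" is an assumption about how you prefer to reload, and the function name is illustrative.

```shell
# Hypothetical default; set APACHECTL to the full path on your system.
APACHECTL="${APACHECTL:-apachectl}"

# Reload Apache only if the configuration passes the syntax check.
safe_restart() {
    if "$APACHECTL" configtest; then
        "$APACHECTL" graceful
    else
        echo "configuration test failed; not restarting" >&2
        return 1
    fi
}

# Example:
#   safe_restart
```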

Configuration management is about keeping a history of the changes you've made to your configuration file. The easiest way to do this is to use RCS or CVS to check in any configuration changes made. They not only track the changes and differences between versions, but also enable you to record a log of the changes made and recover previous versions.

If you're worried you will forget to log the changes, you can run a script each night to automatically check in the latest version of the configuration, along with a suitable description to identify the automatic changes:

cd /export/http/apache2/conf
cvs commit -m "Nightly auto-commit"

Security and Passwords

User life cycle management may not be one of your first thoughts during the lifetime of your Web server, but it's a critical part of keeping the system running and ensuring the security of the environment. It doesn't matter whether you are using standard HTTP authentication, an authentication system mapped to a MySQL or other database, or your own internal system; you need to keep track of the people to whom you have granted access.

The key part of user life cycle management is ensuring that users' IDs and access are granted only while they are actually allowed to access the system. This means regularly checking your user list and HTTP authentication systems to ensure that only those users who should have access to your servers do.

One way to do this is to keep a separate log of the users added to the system and when (both manually and automatically). The moment a user leaves, remove him or her from the authentication system. Periodically, you should also go through Web site access logs to check who has been using the system recently.

There are two reasons for this. First, it highlights any errors. Second, it enables the removal of users from the authentication system if they've been inactive for a reasonable period. On secure sites, such as an intranet, checks should be performed every month; on unsecured sites, checks should be run every three months. Remove anyone who hasn't used the system in that time period. They can always be granted access at a later stage if need be.
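For standard HTTP authentication, finding inactive users can be partly automated by comparing the password file with the access log. The following is a rough sketch that assumes an htpasswd-style file ("user:hash" per line) and an access log in the common format, where the authenticated user is the third field; the function name is made up, and user names containing spaces are not handled.

```shell
# List users in an htpasswd-style file who never appear as the authenticated
# user (third field) in a common-format access log.
# Usage: inactive_users PASSWORD_FILE ACCESS_LOG
inactive_users() {
    passwd_file="$1"
    access_log="$2"
    active="$(mktemp)"
    # Users seen in the log; "-" means the request was not authenticated.
    awk '$3 != "-" { print $3 }' "$access_log" | sort -u > "$active"
    # Print names that are in the password file but not in the active list.
    cut -d: -f1 "$passwd_file" | sort -u | comm -23 - "$active"
    rm -f "$active"
}

# Example:
#   inactive_users /path/to/htpasswd /path/to/access_log
```

Run it against the logs covering the whole review period, not just the current log, or recently rotated activity will be missed.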

Keeping Apache Up to Date

The final component of server maintenance is keeping your Apache installation updated to the latest version. We can't emphasize this enough: never install a version of Apache on a production or live system without first testing it. To test, we keep a copy of the latest site, configuration, and other information on a VMware or VirtualPC machine, install a copy of Apache, and test the effects.

For these machines, as well as developmental and other non-critical machines, we use a directory structure to hold Apache sources during building. We keep a separate directory for each instance of an Apache server — not just each version. For machines running multiple Apache instances, it's vital to have a separate installation directory and source structure for each one.

Here is a sample structure from a main Web server that holds three instances for development, staging, and production Web sites:

production/httpd-2.0.46/
devel/httpd-2.0.46/
devel/httpd-2.0.49/
staging/httpd-2.0.46/
staging/httpd-2.0.49/

There is a separate build directory for each version of Apache within each instance. One of the issues with Apache is that you must configure and then build the system using the correct options — not specifying a dynamic structure or the additional modules you want can cause problems. So, within each instance directory sits a script that holds the configuration command line used to configure the Apache server instance.

For example, the script might contain:

./configure --prefix=/export/http/staging-apache --enable-shared=dav

This works with any version of Apache to configure the sources for the correct instance. If a new version comes out, just re-run the script in the new source directory. Whatever configuration options were active with the current version will then apply to the new one, so there is no need to remember the configuration and command-line options used months ago when the previous edition came out.
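Re-running the per-instance script against a new source tree can itself be wrapped in a helper. This is only a sketch: the source root and the script name (configure-instance.sh) are hypothetical, chosen to match the instance/version layout shown earlier.

```shell
# Hypothetical root of the instance directories (production/, devel/, staging/).
SRC_ROOT="${SRC_ROOT:-/export/http/src}"

# Run an instance's saved configure script inside one of its source trees.
# Usage: configure_instance INSTANCE VERSION
configure_instance() {
    instance="$1"
    version="$2"
    src="$SRC_ROOT/$instance/httpd-$version"
    if [ ! -d "$src" ]; then
        echo "no such source tree: $src" >&2
        return 1
    fi
    # The script lives one level up, in the instance directory.
    ( cd "$src" && sh ../configure-instance.sh )
}

# Example:
#   configure_instance staging 2.0.49
```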

Other Systems and Extensions

It's unlikely Apache is your entire Web serving platform — there are probably additional modules, languages, scripting environments, and other components to maintain and keep up to date (e.g., the latest versions of Perl, Perl modules, PHP, and MySQL). Keeping these up to date is not entirely a full-time job, but they should be checked every one to three months to see what needs updating.

Some of this can be quite easy. For example, if you are using the CPAN module within Perl, you can update all of the installed modules on your system using the command:

/usr/local/bin/perl -MCPAN -e 'CPAN::Shell->install(CPAN::Shell->r)'

This forces CPAN to produce a list of all of the outdated modules and install them. Other systems must then be handled manually.

Scheduling Maintenance

Some items in this article should be checked weekly, some monthly, and some annually. Each enterprise must determine which schedule is right for each item based on the environment, how busy the server is, and how well-used the system is. Most likely, the treatment that sites and virtual hosts receive will vary.

Just don't ignore maintenance in the hope that it will go away; it won't. A few simple steps could save you hours, and even days, in the long run.