Apache 2.2.0: Should I Stay or Should I Go?
Apache 2.2.0 is a major release of the Apache httpd server and includes a number of critical changes. Many of these changes are improvements to existing modules, but there are also a number of new modules and improvements in some aspects of the operational functionality. This article covers some of the specific elements that have changed (with examples and alternative configurations) and discusses when to upgrade to the new version and when to wait for a future revision.
With Apache powering more than 70 percent of Web sites, many enterprises are now caught between upgrading and sticking with the status quo. We take stock of what is new in this latest version and look at when it makes sense to move up.
New Features and Changes
The new 2.2.0 version is not just an updated release of an existing tree; much of the code is new or has been heavily improved and extended to provide additional functionality, or to extend or simplify existing features.
The configuration file in Apache has always had a love-hate relationship with its users. Some like the monolithic, all-configuration-in-the-same-file approach. Others prefer to split their file up and use the Include directive to pull in specific configuration information. Although splitting does not change how the server behaves, multiple files are easier to understand and can be more convenient, as they enable you to put the configuration for individual virtual hosts into individual files.
The default configuration file shipped with the Apache distribution used the monolithic style and often contained many directives that users would never use, would not understand, or would simply never modify. Some Linux distributions (e.g., Gentoo) already divide the configuration file by default. Splitting is now a standard feature of the standard distribution.
The main httpd.conf file remains. In addition, the configuration file optionally includes standard configuration files for the following elements:
- Server-pool management (MPM configuration)
- Multilanguage error messages
- Fancy directory listings
- Language settings
- User home directories
- Real-time info on requests/configuration (/server-info and /server-status)
- Virtual hosts configuration
- Access to the Apache manual
- Distributed authoring and versioning (WebDAV)
- Miscellaneous default settings
- SSL configuration
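As a rough illustration of how these optional files are pulled in, the fragment below sketches the relevant part of httpd.conf. It assumes the conf/extra/ layout of a source install with paths relative to ServerRoot; packaged distributions may place the files elsewhere, so treat the paths as assumptions to check against your own installation.

```apache
# httpd.conf (excerpt): optional pieces are enabled by uncommenting
# the matching Include line. Paths assume the stock conf/extra/ layout.

# Server-pool management (MPM configuration)
Include conf/extra/httpd-mpm.conf

# Fancy directory listings
#Include conf/extra/httpd-autoindex.conf

# Real-time info on requests/configuration
#Include conf/extra/httpd-info.conf

# Virtual hosts configuration
Include conf/extra/httpd-vhosts.conf

# SSL configuration
#Include conf/extra/httpd-ssl.conf

# Hypothetical site-specific layout: one file per virtual host.
# The conf/vhosts/ directory is an illustration, not part of the distribution.
#Include conf/vhosts/*.conf
```

Leaving an Include line commented out is equivalent to omitting that block of directives from a monolithic file, so the two styles can be mixed freely.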
File splitting is not compulsory, and you should be able to use existing single- or multi-file configurations without problems. However, going forward, consider splitting up the file along the guidelines used in the default configuration.
Authorization/Authentication Modules
Although authorization and authentication themselves haven't changed, the modules that provide them have been rebuilt, and in some cases renamed, to make it easier to load the precise components desired. A new module has also been added that provides authentication and authorization through LDAP (mod_authnz_ldap).
Standard authorization modules have been changed to provide some consistency between the module names and the type of authorization they provide. For example, the original mod_auth module has been split into mod_auth_basic (now specifically for HTTP Basic authentication) and mod_authn_file (which provides the back-end interface to authentication through files). The module prefix now identifies the module's role in the authentication/authorization process. Hence:
- mod_auth_* indicates modules that implement an HTTP authentication mechanism (e.g., mod_auth_basic and mod_auth_digest).
- mod_authn_* indicates modules that implement back-end authentication (e.g., mod_authn_file and mod_authn_dbm).
- mod_authz_* indicates modules that implement authorization (e.g., mod_authz_dbm and mod_authz_host).
- mod_authnz_* indicates modules that implement both authentication and authorization (including the new mod_authnz_ldap module).
The result is a much more intuitive suite of modules that can finely control the authentication and authorization support included in your configuration. It should also make it easier for custom authentication and authorization modules to be developed since you can more easily integrate with other existing components.
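To make the split concrete, here is a minimal sketch of Basic authentication backed by a flat password file under the new module names. The directory path, realm name, and password file location are hypothetical; the example assumes the three modules were built as shared objects.

```apache
# Front end: the HTTP Basic mechanism (mod_auth_*)
LoadModule auth_basic_module modules/mod_auth_basic.so
# Back end: look users up in a text file (mod_authn_*)
LoadModule authn_file_module modules/mod_authn_file.so
# Authorization: decide which authenticated users may enter (mod_authz_*)
LoadModule authz_user_module modules/mod_authz_user.so

# Hypothetical protected directory
<Directory "/usr/local/apache2/htdocs/private">
    AuthType Basic
    AuthName "Restricted Area"
    # Select the back-end provider explicitly
    AuthBasicProvider file
    # Hypothetical password file created with htpasswd
    AuthUserFile /usr/local/apache2/conf/passwords
    Require valid-user
</Directory>
```

Swapping the back end, for example to a DBM file, would in principle mean loading mod_authn_dbm and changing the provider, while the mod_auth_basic front end stays the same.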
A new balancing module, mod_proxy_balancer, has been added that provides load-balancing services for the main mod_proxy proxy module. The load balancer enables requests to be shared among workers via two methods: request counting and weighted traffic counting. Request counting simply counts requests and distributes them across workers until each has served an equal number.
Weighted traffic counting works on the same basis as simple request counting, but you can weight individual workers so that certain workers handle more traffic than others. Balancing is by bytes rather than by requests, so you could, for example, configure one worker to process twice as many bytes as the other workers, even though this may come from fewer actual requests.
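A minimal sketch of a traffic-weighted balancer follows. The cluster name, back-end addresses, and URL prefix are hypothetical, and the example assumes mod_proxy, mod_proxy_http, and mod_proxy_balancer are loaded.

```apache
# Hypothetical pool of two back-end workers.
# loadfactor=2 asks the balancer to give the first worker roughly
# twice the share of the second.
<Proxy balancer://mycluster>
    BalancerMember http://192.168.1.50:80 loadfactor=2
    BalancerMember http://192.168.1.51:80 loadfactor=1
</Proxy>

# Balance by bytes transferred; lbmethod=byrequests would
# balance by request count instead.
ProxyPass /app balancer://mycluster lbmethod=bytraffic
```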
The new proxy balancer also includes an additional status display, similar to the /server-status and /server-info systems for monitoring server status and configuration.
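The status display is exposed through a handler. A sketch of enabling it, restricted to the local machine, might look like this; the /balancer-manager URL is a conventional choice rather than a requirement, and the access rule is an assumption about what a cautious setup would want.

```apache
# Map a URL to the balancer status handler and
# allow access only from the server itself.
<Location /balancer-manager>
    SetHandler balancer-manager
    Order Deny,Allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```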
The caching modules (mod_cache, mod_disk_cache, and mod_mem_cache) were previously considered experimental, although many organizations used them without any issues. They are now considered to be production quality. There is also a new program, htcacheclean, that cleans up the on-disk store of cached documents used by mod_disk_cache. It can be run ad hoc or as a daemon and can also provide statistics on cache directory sizes.
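For illustration, a minimal disk cache setup could look like the sketch below. The cache directory is hypothetical, and the directory level and length values are only example settings.

```apache
LoadModule cache_module modules/mod_cache.so
LoadModule disk_cache_module modules/mod_disk_cache.so

<IfModule mod_disk_cache.c>
    # Hypothetical cache location; must be writable by the server user
    CacheRoot /var/cache/apache2
    # Cache everything under / using the disk storage manager
    CacheEnable disk /
    # Shape of the on-disk directory tree (example values)
    CacheDirLevels 2
    CacheDirLength 1
</IfModule>

# The cache could then be pruned from outside the server, for example:
#   htcacheclean -d30 -n -p /var/cache/apache2 -l 100M
# (-d daemonizes with a 30-minute interval, -n runs at low priority,
#  -p names the cache root, -l sets the size limit)
```

Because mod_disk_cache does not limit the total size of the cache on its own, running htcacheclean alongside it is the usual way to keep disk usage bounded.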
The filtering system has been extended through the new mod_filter module, which allows filters to be executed based on conditional criteria. This changes the old model under which documents were merely filtered unconditionally according to the configuration of the AddOutputFilter directive or the minor flexibility offered by AddOutputFilterByType.
Now, instead of adding specific filters to specific file types, we create a proper filter chain, and output is processed by each filter in the chain. This requires a declaration of the available filter types and, if necessary, the source requirements (file type) and the filters to apply.
To expand on the example given in the standard documentation, this changes the old-style filter method for server-side includes (SSI) from:
AddOutputFilter INCLUDES .shtml
to:
FilterDeclare SSI
FilterProvider SSI INCLUDES resp=Content-Type $text/html
FilterChain SSI
The filter chain declaration enables us to add filters at specific points in the chain and even specify that a specific filter is removed based on a specific condition. For example, you may want to add SSI to all output, unless the output is a CGI. You could achieve this by adding the SSI filter to the chain but removing it when the request is for a CGI script.
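One way the CGI case might be expressed is sketched below. It relies on the `-` prefix of FilterChain, which removes a named filter from the chain, and it assumes that CGI scripts live under a /cgi-bin URL; that scoping is an illustration, not the only way to state the condition.

```apache
# Declare the SSI filter and apply it to HTML responses server-wide
FilterDeclare SSI
FilterProvider SSI INCLUDES resp=Content-Type $text/html
FilterChain SSI

# Assumed CGI location: take SSI back out of the chain here
<Location /cgi-bin>
    FilterChain -SSI
</Location>
```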
Database support in Apache modules used to require each module to build its own wrapper code to gain access to the database. For example, if you wanted to add SQL-based authentication through MySQL or PostgreSQL, then the module had to provide its own interface to the SQL database. Programming and performance issues made this a less than ideal solution.
Apache now includes the mod_dbd module, which provides database connections through a standard interface. The module uses the apr_dbd interface, which also means database connections can work within a threaded environment by providing a pool of available connections. This should help increase the flexibility of the database environment and improve the performance of modules requiring database connectivity.
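As a sketch of how the shared pool might be used for the SQL-based authentication case mentioned above, the fragment below pairs mod_dbd with mod_authn_dbd. The driver choice, connection string, credentials, table and column names, and protected path are all hypothetical, and the pool sizes are example values only.

```apache
LoadModule dbd_module modules/mod_dbd.so
LoadModule authn_dbd_module modules/mod_authn_dbd.so

# Shared connection pool (hypothetical PostgreSQL database and credentials)
DBDriver pgsql
DBDParams "host=localhost dbname=apacheauth user=apache password=secret"
DBDMin 1
DBDKeep 2
DBDMax 10
DBDExptime 300

# Hypothetical protected area that authenticates against the pool
<Directory "/usr/local/apache2/htdocs/members">
    AuthType Basic
    AuthName "Members"
    AuthBasicProvider dbd
    # Hypothetical table "authn" with columns "user" and "password"
    AuthDBDUserPWQuery "SELECT password FROM authn WHERE user = %s"
    Require valid-user
</Directory>
```

The authentication module never opens its own connection here; it borrows one from the pool that mod_dbd maintains, which is the point of the new framework.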
Note that this is not a solution for database access in dynamic Web sites, although in the future it may be made available to them through module-based interfaces such as mod_perl and mod_php.
Module Development Changes
There are some back-end changes to the interface for developing certain features when building custom modules for use with Apache. The release notes provide more detailed information, but basic changes include:
- Connection error logging makes it easier to log connection-related error messages.
- Test configuration hooks provide test results when performing a configuration test.
- Stack size can be modified for thread-based MPMs.
- Protocol handling has moved to output filters. In line with the changes to the filter system, filters can now delegate responsibility for common protocol management, such as setting the correct response headers, to mod_filter.
- A monitor hook enables modules to request the regular or scheduled execution of jobs automatically.
- The regular expression interface has changed and the Perl Compatible Regular Expression (PCRE) library has been updated to v5.0. Both the header files supporting the regexp functions and the functions have changed.
- The new DBD framework makes it easier to interface with SQL databases, but modules that use their own custom database layer should be changed to use it.
Any modules that use these features should be updated accordingly.