Using .htaccess Files with Apache


July 19, 2000


Copyright © 2000 by Ken Coar. All rights reserved. Limited rights granted to Internet.Com.

One of the most common needs Webmasters have is to cause the Web server to handle all the documents in a particular directory, or tree of directories, in the same way -- such as requiring a password before granting access to any file in the directory, or allowing (or disallowing) directory listings. However, this need often extends to more than just the Webmaster; consider students on a departmental Web server at a university, or individual customers of an ISP, or clients of a Web-hosting company. This article describes how the Webmaster can extend permission to tailor Apache's behaviour to users, allowing them to have some control over how it handles their own sub-areas of its total Web-space.

This article shows how you can use per-directory configuration files, called .htaccess files, to customise Apache behaviour -- or allow your users to do so for their own documents.

Per-Directory Settings

Apache's configuration system addresses the need to group documents by directory in a straightforward manner. To apply controls to a particular directory tree, for instance, you can use the <Directory> container directive in the server's configuration files:

    <Directory "C:/Program Files/Apache Group/Apache/htdocs">
        AllowOverride None
        Options None
    </Directory>
  

This has the advantage of keeping control in the Webmaster's hands; there's no need to worry about any of the server's users being able to change the settings, since the server configuration files are generally not modifiable by anyone except the admin. Unfortunately, it has two disadvantages: Apache must be restarted any time the config file is changed, and it can become truly burdensome to add all the <Directory> containers that might be needed for all the users who have special requirements.

An alternative method for supplying the desired granularity of Apache configuration -- down to the directory level -- is to use special partial config files in each directory with special requirements.

So What's an .htaccess File?

An .htaccess file is simply a text file containing Apache directives. Those directives apply to the documents in the directory where the .htaccess file is located, and to all subdirectories under it as well. Other .htaccess files in subdirectories may change or nullify the effects of those in parent directories; see the section on merging for more information.

Because .htaccess files are plain text, you can use whatever text editor you like to create or change them.
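
As a minimal sketch of what such a file might contain, here is a hypothetical .htaccess for a directory of downloadable files. The directives are real, but the choice of settings is only an illustration, and it assumes the server configuration permits them to be overridden for this directory:

    # Show an automatically generated listing when the directory
    # has no index document
    Options +Indexes
    # Label files ending in .log as plain text
    AddType text/plain .log

Lines beginning with '#' are comments, just as in the server-wide config files.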

These files are called '.htaccess files' because that's what they're typically named. This naming scheme has its roots in the NCSA Web server and the Unix file system; files whose names begin with a dot are often considered to be 'hidden' and aren't displayed in a normal directory listing. The NCSA developers chose the name '.htaccess' so that a control file in a directory would have a fairly reasonable name ('ht' for 'hypertext') and not clutter up directory listings. Plus, there's a long history of Unix utilities storing their preferences information in such 'hidden' files.

The name '.htaccess' isn't universally acceptable, though. Sometimes it can be quite difficult to persuade a system to let you create or edit a file with such a name. For this reason, you can change the name that Apache will use when looking for these per-directory config files by using the AccessFileName directive in your server's httpd.conf file. For instance,


  
    AccessFileName ht.acl
  

will cause Apache to look for files named ht.acl instead of .htaccess. They'll be treated the same way, though, and they're still called '.htaccess files' for convenience.
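
If you are moving from one name to the other, it may help to accept both for a while. As a sketch, assuming Apache 1.3 or later (where AccessFileName accepts more than one name), the following httpd.conf line asks Apache to look for either file in each directory:

    # Accept the new name and the traditional one
    AccessFileName ht.acl .htaccess

Bear in mind that every extra name is one more file for the server to look for in each directory it checks.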

Locating and Merging .htaccess Files

When Apache determines that a requested resource actually represents a file on the disk, it starts a process called the 'directory walk.' This involves checking through its internal list of <Directory> containers to find those that apply, and possibly searching the directories on the filesystem for .htaccess files.

Each time the directory walk finds a new set of directives that apply to the request, they are merged with the settings already accumulated. The result is a collection of settings that apply to the final document, culled from all of its ancestor directories and the server's config files.
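
To make the merging concrete, consider two hypothetical .htaccess files, one in a directory and one in a subdirectory beneath it. The directory names are borrowed from the example later in this section, and the sketch assumes that the Options category may be overridden in both places:

    # Contents of htdocs/foo/.htaccess:
    # allow directory listings here and in everything below
    Options +Indexes

    # Contents of htdocs/foo/bar/.htaccess:
    # take listings away again for this subtree
    Options -Indexes

A request for a directory under foo gets a listing unless it falls under foo/bar, where the second file's setting is merged on top of the first and turns listings back off.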

When searching for .htaccess files, Apache starts at the top of the filesystem. (On Windows, that usually means 'C:\'; otherwise, the root directory '/'.) It then walks down the directories to the one containing the final document, processing and merging any .htaccess files it finds that the config files say should be processed. (See the section on overrides for more information on how the server determines whether an .htaccess file should be processed or not.)

This can be an intensive process. Consider a request for <URI:http://your.host.com/foo/bar/gritch/x.html> which resolves to the file

    C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\x.html
  

Unless instructed otherwise, Apache is going to look for each of the following .htaccess files, and process any it finds:

  1. C:\.htaccess
  2. C:\Program Files\.htaccess
  3. C:\Program Files\Apache Group\.htaccess
  4. C:\Program Files\Apache Group\Apache\.htaccess
  5. C:\Program Files\Apache Group\Apache\htdocs\.htaccess
  6. C:\Program Files\Apache Group\Apache\htdocs\foo\.htaccess
  7. C:\Program Files\Apache Group\Apache\htdocs\foo\bar\.htaccess
  8. C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\.htaccess

That's a lot of work just to return a single file! And the server will repeat this process each and every time the file is requested. See the overrides section for a way to reduce this overhead with the AllowOverride None directive.
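
As a sketch of that technique, the following httpd.conf fragment turns off .htaccess processing everywhere and then re-enables it only for the document tree. The choice of AuthConfig as the one permitted category is just an assumption for the example:

    # Don't look for .htaccess files anywhere by default
    <Directory />
        AllowOverride None
    </Directory>

    # Start looking again at the top of the document tree
    <Directory "C:/Program Files/Apache Group/Apache/htdocs">
        AllowOverride AuthConfig
    </Directory>

With these settings, Apache should skip the first four files in the list above and only look for the ones numbered 5 through 8.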

Because .htaccess files are evaluated for each request, you don't need to reload the Apache server whenever you make a change. This makes them particularly well suited for environments with multiple groups or individuals sharing a single Web server system; if the Webmaster allows, they can exercise control over their own areas without nagging the Webmaster to reload Apache with each change. Also, if there's a syntax error in an .htaccess file, it only affects a portion of the server's Web space, rather than keeping the server from running at all (which is what would happen if the error was in the server-wide config files).

Directives that Work in .htaccess Files

Not all directives will work in .htaccess files; for example, it makes no sense to allow a ServerName directive to appear in one, since the server is already running and knows its name -- and cannot change it -- by the time a request would cause the .htaccess file to be read. Other directives aren't allowed because they deal with features that are server-wide, or perhaps are too sensitive.

However, most directives are allowed in .htaccess files. If you're not sure, take a look at the directive's documentation. Figure 1 is a sample extracted from the Apache documentation. Where the text says 'Context' you can see that .htaccess is listed; that means this directive can be used in the per-directory config files.

The SetEnvIf Directive

Syntax: SetEnvIf attribute regex envar[=value] [...]
Default: none
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_setenvif
Compatibility: Apache 1.3 and above; the Request_Protocol keyword and environment-variable matching are only available with 1.3.7 and later; use in .htaccess files only supported with 1.3.13 and later

Figure 1: Directive Documentation

Note, however, that there's more information on the Compatibility line; it says that this directive can only be used in .htaccess files if you're running Apache version 1.3.13 or later.

If you try to include a directive in an .htaccess file that isn't permitted there, any requests for documents under that directory will result in a '500 Server Error' error page and a message in the server's error log.
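
For illustration, here is a hypothetical .htaccess file that would trigger this behaviour. It uses the ServerName directive mentioned earlier, which is only valid in the server-wide config files; the host name is made up:

    # Not permitted in a per-directory file: requests for anything
    # under this directory should fail until the line is removed
    ServerName www.example.com

Removing or commenting out the offending line is enough to fix things, since the file is read again on the next request.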

If your .htaccess file contains directives that aren't covered by the current set of override categories, they won't cause an error -- the server will just ignore them. So your file can contain directives in any -- or all -- of the categories, and only those in the categories listed in the AllowOverride list will be processed. All of the others will be checked for syntax, but otherwise not interpreted.

Overrides: Limiting Which Directives Will Be Processed

Apache directives fall into seven different categories, and all can appear in the server-wide config files. Only five of the categories can be used in .htaccess files, though, and in order for Apache to accept a directive in a per-directory file, the settings for the directory must permit the directive's category to be overridden.

The five categories of directives are:

AuthConfig
This category is intended to be used to control directives that have to do with Web page security, such as the AuthName, Satisfy, and Require directives. This is the most common category to allow to be overridden, as it allows users to protect their own documents.
FileInfo
Directives that control how files are processed are in this category.
Indexes
Directives that affect file listings should be in this category. It includes IndexOptions, AddDescription, and DirectoryIndex, for example.
Limit
This category is similar to the AuthConfig one in that the directives it covers are typically related to security. However, they usually involve involuntary controls, such as controlling access by IP address. Directives in this category include Order, Allow, and Deny.
Options
The Options category is intended for directives that support miscellaneous options, such as ContentDigest, XBitHack, and Options itself.
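
The categories above can be sketched with a hypothetical .htaccess file that uses one or two directives from each. The realm name, password file location, and network address are assumptions made up for the example, and you would rarely want all of these in a single file:

    # AuthConfig: ask for a username and password
    AuthType Basic
    AuthName "Members Only"
    AuthUserFile "C:/Program Files/Apache Group/Apache/passwd/users"
    Require valid-user

    # FileInfo: label files ending in .log as plain text
    AddType text/plain .log

    # Indexes: name the document to serve for a directory request
    DirectoryIndex index.html

    # Limit: only accept requests from one network
    Order deny,allow
    Deny from all
    Allow from 192.168.1

    # Options: turn off automatic directory listings
    Options -Indexes

Each group depends on its category being permitted for the directory in question.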

A special directive, which is usable only in the server-wide configuration files, dictates which categories may be overridden in any particular directory tree. The AllowOverride directive accepts two special keywords in addition to the category names listed above:

All
This is a shorthand way of listing all of the categories; the two statements below are equivalent:

    
    AllowOverride AuthConfig FileInfo Indexes Limit Options
    AllowOverride All
    
None
This keyword totally disables the processing of .htaccess files for the specified directory and its descendants (unless another AllowOverride directive for a subdirectory is defined in the server config files). 'Disabled' means that Apache won't even look for .htaccess files, much less process them. This can result in a performance savings, and is why the default httpd.conf file includes such a directive for the top-level system directory. .htaccess processing is disabled for all directories by default by that directive, and is only selectively enabled for those trees where it makes sense.

As shown above, the AllowOverride directive takes a whitespace-separated list of category names as its argument.
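
As a sketch of a more selective grant, the following container lets users password-protect their own documents and adjust their directory listings, and nothing else. The path is an assumption; substitute wherever your users' Web directories actually live:

    # Hypothetical per-user Web directories
    <Directory "/home/*/public_html">
        AllowOverride AuthConfig Indexes
    </Directory>

Since AllowOverride is only valid in the server-wide config files, a change like this takes effect the next time Apache is restarted.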

Be Aware of What You're Granting

By allowing the use of .htaccess files in user (or customer or client) directories, you're essentially extending a bit of your Webmaster privileges to anyone who can edit those files. So if you choose to do this, you should consider occasionally performing an audit to make sure the files are appropriately protected -- and, if you're really ambitious, that they contain only settings of which you approve.

Because of the very coarse granularity of the possible override categories, it's quite possible that by granting a user the ability to override one set of directives you're inadvertently delegating more power than you anticipate. For instance, you might want to include an "AllowOverride FileInfo" directive for user directories so that individuals can use the AddType directive to label documents with MIME types that aren't in the server-wide list -- but were you aware when you did this that you were also giving them access to the Alias, Header, Action, and Rewrite* directives as well? Directives are associated with override categories on a per-module basis, so tracking down what's permitted by allowing a particular category of override can be a tedious process.
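
To illustrate, here is a hypothetical .htaccess file that a user could write in a directory covered by "AllowOverride FileInfo". The first directive is the one you meant to allow; the second comes along for the ride, assuming mod_headers is part of your server. The file extension and header name are made up:

    # What you intended to permit
    AddType text/plain .log

    # What the same override category also permits
    Header set X-Example "anything the user likes"

Neither line is dangerous in itself; the point is that the second was never part of what you set out to grant.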

The ultimate answer to what directives are in which categories is the source code. If you really want to know, examine the source for the following strings:

String         Corresponding AllowOverride Keyword
OR_AUTHCFG     AllowOverride AuthConfig
OR_FILEINFO    AllowOverride FileInfo
OR_INDEXES     AllowOverride Indexes
OR_LIMIT       AllowOverride Limit
OR_OPTIONS     AllowOverride Options

(See the previous section for a description of what the different override categories mean.)

As you can see, with the exception of the AuthConfig/AUTHCFG keywords, the source keywords are identical to the directive keywords. This is convenient!

Putting It All Together

Before enabling .htaccess files, consider the advantages and disadvantages. On servers I run myself, with no users, I tend to use .htaccess files for testing and debugging, and when I have a configuration I like, I move the directives into a <Directory> container in the httpd.conf file and delete the .htaccess file. For this reason, I have overrides enabled just about everywhere. This allows me to balance the convenience of .htaccess files against their performance impact.
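
As a sketch of that last step, suppose a hypothetical 'downloads' directory under the document root had an .htaccess file containing an Options line and a DirectoryIndex line. Moved into httpd.conf, the same settings might look like this:

    # Formerly the contents of htdocs/downloads/.htaccess
    <Directory "C:/Program Files/Apache Group/Apache/htdocs/downloads">
        Options +Indexes
        DirectoryIndex index.html
    </Directory>

The directives themselves don't change; only their location does, along with the need to restart Apache before they take effect.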

On some of my servers I have some user accounts for people I know and trust, and in those environments I'm more cautious and don't allow all overrides globally. I do tend to allow whatever overrides my friends need for their own directories, though.

And in some cases I have real 'user' accounts, for people I do not know as well -- and on those servers AllowOverride None is the rule. I occasionally allow .htaccess files in their private directories, but I carefully audit the possible effects before granting an override category.

The two main disadvantages to using .htaccess files are the performance impact and the extension of control to others. The first is somewhat manageable through the judicious use of the AllowOverride directive, and the second is a matter of establishing trust -- and performing risk assessment. What mix works best in your environment is something you'll need to determine for yourself.

Troubleshooting

Here are some of the most common problems I've seen people have (or have had myself) with .htaccess files. One thing I should stress first, though: the server error log is your friend. You should always consult the error log when things don't seem to be functioning correctly. If it doesn't say anything about your problem, try boosting the message detail by changing your LogLevel directive to debug. (Or add a LogLevel debug line if you don't have a LogLevel directive already.)
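
In httpd.conf, that change is a single line; remember that the server needs to be restarted before it takes effect:

    # Log as much detail as possible while troubleshooting
    LogLevel debug

Debug logging can be quite verbose, so you will probably want to return to your previous setting once the problem is solved.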

'Internal Server Error' page is displayed when a document is requested
This indicates a problem with your configuration. Check the Apache error log file for a more detailed explanation of what went wrong. You probably have used a directive that isn't allowed in .htaccess files, or have a directive with incorrect syntax.
.htaccess file doesn't seem to change anything
It's possible that the directory is within the scope of an AllowOverride None directive. Try putting a line of gibberish in the .htaccess file and force a reload of the page. If you still get the same page instead of an 'Internal Server Error' display, then this is probably the cause of the problem. Another slight possibility is that the document you're requesting isn't actually controlled by the .htaccess file you're editing; this can sometimes happen if you're accessing a document with a common name, such as index.html. If there's any chance of this, try changing the actual document and requesting it again to make sure you can see the change.
I've added some security directives to my .htaccess file, but I'm not getting challenged for a username and password
The most common cause of this is having the .htaccess directives within the scope of a Satisfy Any directive. Explicitly disable this by adding a Satisfy All to the .htaccess file, and try again.
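
As a sketch of the fix for that last problem, here is a hypothetical .htaccess file with the Satisfy All line added. The realm name and password file location are assumptions for the example:

    AuthType Basic
    AuthName "Private Files"
    AuthUserFile "C:/Program Files/Apache Group/Apache/passwd/users"
    Require valid-user
    # Insist on the password even if an inherited 'Satisfy Any'
    # would have let the request through on its address alone
    Satisfy All

Satisfy is one of the AuthConfig directives, so this only works where that category may be overridden.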

Going Further

Once you've got your Apache Web server up and running, the first hurdle has been surmounted. Now you can move on to exploring its capabilities and features. Here are some pointers to resources for further investigation:

Conclusion

Apache provides two main ways of controlling its behaviour on a per-directory level: <Directory> containers in the server-wide configuration files, and .htaccess files in each directory where they're needed. Each method has its advantages and its disadvantages; you, as the Webmaster, need to balance these against each other to decide what mix of the techniques is best for your environment.

If you do decide to permit the use of .htaccess files, be sure to limit them to appropriate areas and improve your performance by using AllowOverride None elsewhere. This will save unnecessary disk activity.


Got a Topic You Want Covered?

If you have a particular Apache-related topic that you'd like covered in a future article in this column, please let me know; drop me an email at <coar@Apache.Org>. I do read and answer my email, usually within a few hours (although a few days may pass if I'm travelling or my mail volume is 'way up). If I don't respond within what seems to be a reasonable amount of time, feel free to ping me again.

About the Author

Ken Coar is a member of the Apache Group and a director and vice president of the Apache Software Foundation. He is also a core member of the Jikes open-source Java compiler project, a contributor to the PHP project, the author of Apache Server for Dummies, a lead author of Apache Server Unleashed, and is currently working with Ryan Bloom on a book for Addison-Wesley tentatively entitled Apache Module Development in C. He can be reached via email at <coar@apache.org>.