# Using .htaccess Files with Apache

July 19, 2000

One of the most common needs Webmasters have is to cause the Web server to handle all the documents in a particular directory, or tree of directories, in the same way -- such as requiring a password before granting access to any file in the directory, or allowing (or disallowing) directory listings. However, this need often extends to more than just the Webmaster; consider students on a departmental Web server at a university, or individual customers of an ISP, or clients of a Web-hosting company. This article describes how the Webmaster can extend permission to tailor Apache's behaviour to users, allowing them to have some control over how it handles their own sub-areas of its total Web-space.

This article shows how you can use per-directory configuration files, called .htaccess files, to customise Apache behaviour -- or allow your users to do so for their own documents.

### Per-Directory Settings

Apache's configuration system addresses the need to group documents by directory in a straightforward manner. To apply controls to a particular directory tree, for instance, you can use the <Directory> container directive in the server's configuration files:

    <Directory "C:/Program Files/Apache Group/Apache/htdocs">
        AllowOverride None
        Options None
    </Directory>


This has the advantage of keeping control in the Webmaster's hands; there's no need to worry about any of the server's users being able to change the settings, since the server configuration files are generally not modifiable by anyone except the admin. Unfortunately, it has two disadvantages: Apache must be restarted any time the config file is changed, and it can become truly burdensome to add all the <Directory> containers that might be needed for all the users that have special requirements.

An alternative method for supplying the desired granularity of Apache configuration -- down to the directory level -- is to use special partial config files in each directory with special requirements.

## So What's an .htaccess File?

An .htaccess file is simply a text file containing Apache directives. Those directives apply to the documents in the directory where the .htaccess file is located, and to all subdirectories under it as well. Other .htaccess files in subdirectories may change or nullify the effects of those in parent directories; see the section on merging for more information.

Because they are plain text files, you can use whatever text editor you like to create or change .htaccess files.
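As a minimal sketch of what such a file might contain, the two directives below turn off automatic directory listings and name a default document. The filename welcome.html is purely illustrative, and the file only takes effect if the server configuration permits the Options and Indexes override categories for the directory.

```apache
# Hypothetical .htaccess file; welcome.html is an assumed filename.

# Turn off automatic directory listings for this directory and below
Options -Indexes

# Serve this document when the directory itself is requested
DirectoryIndex welcome.html
```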

These files are called '.htaccess files' because that's what they're typically named. This naming scheme has its roots in the NCSA Web server and the Unix file system; files whose names begin with a dot are often considered to be 'hidden' and aren't displayed in a normal directory listing. The NCSA developers chose the name '.htaccess' so that a control file in a directory would have a fairly reasonable name ('ht' for 'hypertext') and not clutter up directory listings. Plus, there's a long history of Unix utilities storing their preferences information in such 'hidden' files.

The name '.htaccess' isn't universally acceptable, though. Sometimes it can be quite difficult to persuade a system to let you create or edit a file with such a name. For this reason, you can change the name that Apache will use when looking for these per-directory config files by using the AccessFileName directive in your server's httpd.conf file. For instance,

      AccessFileName ht.acl

will cause Apache to look for files named ht.acl instead
of .htaccess.  They'll be treated the same way, though,
and they're still called '.htaccess files' for
convenience.
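If you rename the per-directory files, keep in mind that any rule protecting them from being served to clients has to match the new name too. The sketch below assumes the server config already uses a <Files> pattern keyed to names beginning with '.ht', as stock 1.3 configurations commonly do, and adds an equivalent for the ht.acl name used above. Check your own httpd.conf before relying on it.

```apache
# In httpd.conf. A sketch; adjust the patterns to the name you choose.
AccessFileName ht.acl

# Keep clients from fetching files whose names begin with ".ht"
<Files ~ "^\.ht">
    Order allow,deny
    Deny from all
</Files>

# The pattern above does not match "ht.acl", so protect it explicitly
<Files "ht.acl">
    Order allow,deny
    Deny from all
</Files>
```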

## Locating and Merging .htaccess Files

When Apache determines that a requested resource actually represents
a file on the disk, it starts a process called the 'directory walk.'
This involves checking through its internal list of
<Directory> containers to find those that apply,
and possibly searching the directories on the filesystem for
.htaccess files.

Each time the directory walk finds a new set of directives that apply
to the request, they are merged with the settings already
accumulated.  The result is a collection of settings that apply to
the final document, culled from all of its ancestor directories and
the server's config files.

When searching for .htaccess files, Apache starts at the
top of the filesystem.  (On Windows, that usually means 'C:\';
otherwise, the root directory '/'.)  It then walks down the
directories to the one containing the final document, processing and merging
any .htaccess files it finds that the config files say should
be processed.  (See the section on overrides for how to control whether an
.htaccess file should be processed or not.)

This can be an intensive process.  Consider a request for
<URI:http://your.host.com/foo/bar/gritch/x.html>
which resolves to the file

    C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\x.html

Unless instructed otherwise, Apache is going to look for each of the
following .htaccess files, and process any it finds:

    C:\.htaccess
    C:\Program Files\.htaccess
    C:\Program Files\Apache Group\.htaccess
    C:\Program Files\Apache Group\Apache\.htaccess
    C:\Program Files\Apache Group\Apache\htdocs\.htaccess
    C:\Program Files\Apache Group\Apache\htdocs\foo\.htaccess
    C:\Program Files\Apache Group\Apache\htdocs\foo\bar\.htaccess
    C:\Program Files\Apache Group\Apache\htdocs\foo\bar\gritch\.htaccess

That's a lot of work just to return a single file!  And the server will
repeat this process each and every time the file is requested.  See
the overrides section for a way to reduce
this overhead with the AllowOverride None
directive.
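A sketch of a configuration that avoids most of this searching is shown below. It assumes the document root from the example above; with it in place, Apache would look for .htaccess files only in htdocs and its subdirectories, not in the four directories above htdocs. The choice of AuthConfig is just an illustration.

```apache
# In httpd.conf: turn off .htaccess processing everywhere by default
<Directory />
    AllowOverride None
</Directory>

# Re-enable only what is needed under the document root
<Directory "C:/Program Files/Apache Group/Apache/htdocs">
    AllowOverride AuthConfig
</Directory>
```

For the request discussed above, this would cut the search from eight possible .htaccess files to four.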

Because .htaccess files are evaluated for each request,
you don't need to reload the Apache server whenever you make a
change.  This makes them particularly well suited for environments
with multiple groups or individuals sharing a single Web server
system; if the Webmaster allows, they can exercise control over
their own areas without nagging the Webmaster to reload Apache
with each change.  Also, if there's a syntax error in an
.htaccess file, it only affects a portion of
the server's Web space, rather than keeping the server from
running at all (which is what would happen if the error was
in the server-wide config files).

## Directives that Work in .htaccess Files

Not all directives will work in .htaccess files; for example,
it makes no sense to allow a ServerName directive to appear
in one, since the server is already running and knows its name -- and
cannot change it -- by the time a request would cause the
.htaccess file to be read.  Other directives aren't
allowed because they deal with features that are server-wide, or perhaps
are too sensitive.

However, most directives are allowed in .htaccess
files.  If you're not sure, take a look at the directive's documentation.
Figure 1 is a sample extracted from the Apache documentation.  You can
see where the text says 'Context' that .htaccess is
listed; that means this directive can be used in the per-directory
config files.

The SetEnvIf Directive

Syntax: SetEnvIf attribute regex envar[=value] [...]

Default: none

Context: server config, virtual host, directory, .htaccess

Override: FileInfo

Status: Base

Module: mod_setenvif

Compatibility: Apache 1.3 and above; the Request_Protocol keyword and environment-variable matching are only available with 1.3.7 and later; use in .htaccess files only supported with 1.3.13 and later

Figure 1: Directive Documentation

Also note the 'Compatibility'
line; it says that this directive can only be used in
.htaccess files if you're running Apache version 1.3.13 or
later.

If you try to include a directive in an .htaccess
file that isn't permitted there, any requests for documents under
that directory will result in a '500 Server Error'
error page and a message in the server's error log.

If your .htaccess file contains directives that aren't
covered by the current set of override categories, they won't
cause an error -- the server will just ignore them.  So your
file can contain directives in any -- or all -- of the categories,
and only those in the categories listed in the AllowOverride
list will be processed.  All of the others will be checked for
syntax, but otherwise not interpreted.

## Overrides: Limiting Which Directives Will Be Processed

Apache directives fall into seven different categories, and all can
appear in the server-wide config files.  Only five of the categories
can be used in .htaccess files, though, and in order for
Apache to accept a directive in a per-directory file,
the settings for the directory must permit the directive's
category to be overridden.

The five categories of directives are:

AuthConfig
This category is intended to be used to control
directives that have to do with Web page security, such as
the AuthName, Satisfy, and Require
directives.  This is the most common category to allow to be
overridden, as it allows users to protect their own documents.

FileInfo
Directives that control how files are processed are
in this category; AddType is one example.

Indexes
Directives that affect file listings should be in this
category.  It includes IndexOptions,
AddDescription, and DirectoryIndex,
for example.

Limit
This category is similar to the AuthConfig one
in that the directives it covers are typically related to
security.  However, they usually involve involuntary
controls, such as controlling access by IP address.
Directives in this category include Order,
Allow, and Deny.

Options
The Options category is intended for directives
that support miscellaneous options, such as
ContentDigest, XBitHack, and
Options itself.
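To make the categories concrete, here is a sketch of a single .htaccess file with one or two directives from each. The realm name, password file path, file extension, and network address are all invented for illustration; each group would be honoured only if its category is permitted for the directory.

```apache
# Hypothetical .htaccess file; paths, realm, and address are assumptions.

# AuthConfig: require a valid username and password
AuthType Basic
AuthName "Project Files"
AuthUserFile /usr/local/apache/passwd/project.users
Require valid-user

# FileInfo: label .log files as plain text
AddType text/plain .log

# Indexes: name the default document for directory requests
DirectoryIndex index.html

# Limit: admit only clients from one network
Order deny,allow
Deny from all
Allow from 192.168.1

# Options: turn off automatic directory listings
Options -Indexes
```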

A special directive, which is usable only in the server-wide
configuration files, dictates which categories may be overridden
in any particular directory tree.
The AllowOverride directive accepts two special
keywords in addition to the category names listed above:

All
This is a shorthand way of listing all of the
categories; the two statements below are equivalent:

    AllowOverride AuthConfig FileInfo Indexes Limit Options
    AllowOverride All

None
This keyword totally disables the processing of
.htaccess files for the specified directory and
its descendants (unless another AllowOverride
directive for a subdirectory is defined in the server config files).
'Disabled' means that Apache won't even look for
.htaccess files, much less process them.  This
can result in a performance saving, and is why the
default httpd.conf file includes such a
directive for the top-level system directory.
.htaccess processing is disabled for all
directories by default by that directive, and is only
selectively enabled for those trees where it makes
sense.

As shown above, the AllowOverride directive takes a
whitespace-separated list of category names as its argument.
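For instance, a Webmaster who wants users to be able to password-protect their documents and adjust directory listings, but nothing more, might use something like the sketch below. The /home/*/public_html layout is an assumption about where user directories live; substitute whatever your system uses.

```apache
# In httpd.conf. /home/*/public_html is an assumed layout for user directories.
<Directory /home/*/public_html>
    # Users may protect documents and tune directory listings,
    # but directives in the other categories are not permitted
    AllowOverride AuthConfig Indexes
</Directory>
```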

## Be Aware of What You're Granting

By allowing the use of .htaccess files in user (or
customer or client) directories, you're essentially extending a
bit of your Webmaster privileges to anyone who can edit those
files.  So if you choose to do this, you should consider
occasionally performing an audit to make sure the files are
appropriately protected -- and, if you're really ambitious,
that they contain only settings of which you approve.

Because of the very coarse granularity of the possible override
categories, it's quite possible that by granting a user the
ability to override one set of directives you're inadvertently
delegating more power than you anticipate.  For instance,
you might want to include an "AllowOverride FileInfo"
directive for user directories so that individuals can use the
AddType directive to label documents with MIME
types that aren't in the server-wide list -- but were you aware
when you did this that you were also giving them access to the
Redirect, Header, Action, and
Rewrite* directives as well?  Directives are
associated with override categories on a per-module
basis, so tracking down what's permitted by allowing a particular
category of override can be a tedious process.

The ultimate answer to what directives are in which categories is
the source code.  If you really want to know, examine the
source for the following strings:

| String | Corresponding AllowOverride Keyword |
|---|---|
| OR_AUTHCFG | AllowOverride AuthConfig |
| OR_FILEINFO | AllowOverride FileInfo |
| OR_INDEXES | AllowOverride Indexes |
| OR_LIMIT | AllowOverride Limit |
| OR_OPTIONS | AllowOverride Options |

(See the
previous section
for a description of what the different override categories mean.)

As you can see, with the exception of the AuthConfig/AUTHCFG keywords,
the source keywords are identical to the directive keywords.  This
is convenient!

## Putting It All Together

Before enabling .htaccess files, consider the
trade-offs.  On servers where I have
no users, I tend to use .htaccess files for
testing and debugging, and when I have a configuration I
like, I move the directives into a <Directory>
container in the httpd.conf file and delete the
.htaccess file.  For this reason, I have
overrides enabled just about everywhere.  This allows me to balance
the convenience of .htaccess files against
their performance impact.
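The move from testing to a permanent configuration might look like the following sketch. The directory foo under the document root and the directives themselves are only placeholders; the point is that the directives are copied unchanged into a <Directory> container naming the directory that held the .htaccess file.

```apache
# Before: contents of htdocs/foo/.htaccess used while testing
#
#     Options +Indexes
#     IndexOptions FancyIndexing

# After: the same directives in httpd.conf, with the .htaccess file deleted
<Directory "C:/Program Files/Apache Group/Apache/htdocs/foo">
    Options +Indexes
    IndexOptions FancyIndexing
</Directory>
```

Unlike the .htaccess version, the container takes effect only after Apache is restarted.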

On some of my servers I have some user accounts for people
I know and trust, and in those environments I'm more
cautious and don't allow all overrides globally.  I do
tend to allow whatever overrides my friends need for their
own directories, though.

And in some cases I have real 'user' accounts, for people I
do not know as well -- and on those servers
AllowOverride None is the rule.  I
occasionally allow .htaccess files in their
private directories, but I carefully audit the possible
effects before granting an override category.

The two main disadvantages to using .htaccess
files are the performance impact and the extending of control
to others.  The former can be managed through
the judicious use of the AllowOverride
directive, and the latter is a matter of establishing trust --
and performing risk assessment.  What mix works best in your
environment is something you'll need to determine for
yourself.

## Troubleshooting

Here are some of the most common problems I've seen people have
(or have had myself) with .htaccess files.  One thing I
should stress first, though: the server error log is your friend.
You should always consult the error log when things don't seem to
be functioning correctly.  If it doesn't say anything about your
problem, try boosting the message detail by changing your
LogLevel directive to debug.  (Or
add a LogLevel debug line if you don't have
a LogLevel directive already.)
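In the server-wide config file, that amounts to a single line, sketched here. Debug logging is verbose, so you will probably want to restore your previous setting once the problem is found.

```apache
# In httpd.conf; restart Apache after changing it
LogLevel debug
```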

'Internal Server Error' page is displayed when a document is requested
This indicates a problem with your configuration.  Check the
Apache error log file for a more detailed explanation of what
went wrong.  You probably have used a directive that isn't allowed
in .htaccess files, or have a directive with incorrect
syntax.

.htaccess file doesn't seem to change anything
It's possible that the directory is within the scope of an
AllowOverride None directive.  Try putting a line
of gibberish in the .htaccess file and force a reload
of the page.  If you still get the same page instead of an
'Internal Server Error' display, then this is probably the
cause of the problem.  Another slight possibility is that the document
you're requesting isn't actually controlled by the .htaccess
file you're editing; this can sometimes happen if you're accessing
a document with a common name, such as index.html.  If
there's any chance of this, try changing the actual document and
requesting it again to make sure you can see the change and that
this isn't happening.

I've added some security directives to my .htaccess file, but they don't seem to be enforced
The most common cause of this is having the .htaccess
directives within the scope of a Satisfy Any
directive.  Explicitly disable this by adding a
Satisfy All to the .htaccess file,
and try again.
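A sketch of a password-protecting .htaccess file with that line added follows. The realm name and the password file location are assumptions made for the example, and the directory must permit the AuthConfig override category.

```apache
# Hypothetical .htaccess file; realm name and password file path are assumptions.
AuthType Basic
AuthName "Members Only"
AuthUserFile /usr/local/apache/passwd/members.users
Require valid-user

# Demand the password even if a host-based rule would admit the client
Satisfy All
```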

## Going Further

Once you've got your Apache Web server up and running, the first
hurdle has been surmounted.  Now you can move on to exploring its
capabilities and features.  Here are some pointers to resources
for further investigation:

The main Apache Web site, of course:
<URL:http://www.apache.org/>

The documentation for Apache and its modules:
<URL:http://www.apache.org/docs/>

The canonical email response page:
<URL:http://www.apache.org/foundation/preFAQ.html>
(This page is mainly a response to emailed requests for
support, but there are lots of good resources listed on
it.)

## Conclusion

Apache provides two main ways of controlling its behaviour on a
per-directory level: <Directory> containers
in the server-wide configuration files, and .htaccess
files in each directory where they're needed.  Each method has its
advantages and disadvantages; you need to
balance these against each other to decide what mix of the
techniques is best for your environment.

If you do decide to permit the use of .htaccess
files, be sure to limit them to appropriate areas and improve
your performance by using AllowOverride None
elsewhere.  This will save unnecessary disk activity.

## Got a Topic You Want Covered?

If you have a particular Apache-related topic that you'd like covered
in a future article in this column, please let me know; drop me
an email at
<coar@Apache.Org>.
I do read and answer my email, usually within a few hours
(although a few days may pass if I'm travelling or my mail volume is
'way up).  If I don't respond within what seems to be a reasonable
amount of time, feel free to ping me again.

Ken Coar
is a member of the Apache Group and a director and vice
president of the
Apache Software Foundation.
He is also a core member of the
Jikes open-source Java compiler project, a contributor to the
PHP project, the author of
Apache Server for Dummies, a lead author of
Apache Server Unleashed,
and is currently working with Ryan Bloom on a book for Addison-Wesley
tentatively entitled Apache Module Development in C.
He can be reached via email at
<coar@apache.org>.