If you’re like me, you probably loathe updating directory index pages. You
add a new file or folder to your Web site, and then you have to find every page
that should link to it and update each one, not to mention the toil of updating
all of those pages again if the page's name or location changes!
On a daily basis, updating directory index pages is one of the most tiresome tasks there is. But fear not: in this column, Matthew Keller explains how a Perl script can automate this task for you, including keeping those links current when a page's name or location changes.
I solve this problem, quite simply, by creating directory index scripts
using Perl. The largest of these scripts indexes a directory on my private
Web server whose folders contain pages about my projects. My entire Web site
is logically organized (logical to me, anyway) using directories to house and
nest information, and my “projects” page is no different.
From a filesystem standpoint, every directory in my projects directory
contains a different project. Every project directory has an index HTML file.
Every index HTML file has a title. Keeping these rules in mind, it is easy to
write a short Perl script that makes one's life much easier.
Configuring Apache
This script resides in the root of the Projects folder and is called
index.pl. For Apache to treat index.pl as the directory index script, we have
to configure the httpd.conf file to include index.pl as a valid directory
index file. You may choose index.cgi instead of .pl if you want. My
DirectoryIndex statement is shown below. Apache tries these entries one at a
time, from left to right, and serves the first one that exists. You will
probably want to place index.html ahead of index.pl if the majority of your
directory index pages are HTML pages and not these handy scripts:
DirectoryIndex index.pl index.html index.php index.cgi index.htm
Regardless of what you call these scripts, make sure you let Apache know how
to handle them, by using the AddHandler directive in your config file. Below is
an excerpt of mine:
AddHandler cgi-script .pl .cgi
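Before writing the real indexer, it can help to confirm that Apache actually executes index.pl rather than serving its source. The following is a minimal sketch of a placeholder index.pl, not the script this column builds. The helper name cgi_response is hypothetical, and the sketch assumes that CGI execution is enabled for the directory (for example with Options +ExecCGI) and that the file is executable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the smallest valid CGI response: one header line, a blank line,
# then the body. cgi_response is a hypothetical helper for this sketch.
sub cgi_response {
    my ($body) = @_;
    return "Content-type: text/html\n\n" . $body;
}

# If the browser shows this sentence instead of Perl source, the
# DirectoryIndex and AddHandler settings are being applied.
print cgi_response(
    "<html><body><p>index.pl is running as a CGI script.</p></body></html>\n"
);
```

Requesting the projects directory in a browser should then show the one-line message. If the Perl source appears instead, the handler configuration is the first thing to check.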
Thinking About the Problem
Recall the environment I mentioned earlier:
- Every directory in my projects directory contains a different project
- Every project directory has an index.html file
- Every index.html file has a title
Given this organizational structure, our little script has to do only four
things:
- Obtain a list of directories
- For every directory, open the index.html file if it exists
- For every index.html file, extract the title of the page
- For every pilfered title, print it back to the user as a link to the given page
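The last two tasks are mostly string handling, so they can be sketched separately from any directory work. The example below is one possible approach under stated assumptions, not necessarily the code this column arrives at. The function names extract_title and title_link, the sample HTML, and the example.com URL are all hypothetical. It also assumes the title sits in a single, well-formed title element, which a regular expression handles well enough for pages you control.

```perl
use strict;
use warnings;

# Pull the text of the first <title> element out of an HTML string.
# Returns undef when the page has no title.
sub extract_title {
    my ($html) = @_;
    return $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Format one title as a link to its project directory.
# $base_url is assumed to end with a slash.
sub title_link {
    my ($base_url, $dir, $title) = @_;
    return qq{<a href="$base_url$dir/">$title</a>};
}

# Hypothetical sample page and URL, used only to exercise the two helpers.
my $html = "<html><head><title>Weather Station</title></head><body></body></html>";
my $title = extract_title($html);
print title_link("http://www.example.com/projects/", "weather", $title), "\n"
    if defined $title;
```

Returning undef for a page without a title lets the caller decide whether to skip that directory or fall back to the directory name.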
Step 1: Obtain a list of directories
A clumsy but easy way to acquire a list of directories is to place all of the
contents of the root directory we want to index (the projects directory in our
example) into an array:
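# $projects_dir and $projects_url are placeholder variable names
my $projects_dir = "/usr/local/apache/htdocs/projects/";
my $projects_url = "http://mattwork.potsdam.edu/projects/";
opendir(PRJD, $projects_dir) or die "Cannot open $projects_dir: $!";
my @dirs = readdir PRJD;
closedir(PRJD);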
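One reason this approach is clumsy is that readdir returns every entry in the directory, including plain files and the special entries "." and "..". A slightly tidier variant filters the list down to real subdirectories as it is read. The sketch below is an alternative under stated assumptions, not the column's own code. The function name list_subdirs is hypothetical, and it uses a lexical directory handle in place of the bareword PRJD.

```perl
use strict;
use warnings;

# Return the sorted names of the real subdirectories of $root.
# readdir also yields ".", "..", and plain files, so those are dropped.
sub list_subdirs {
    my ($root) = @_;
    opendir(my $dh, $root) or die "Cannot open $root: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' && -d "$root/$_" } readdir $dh;
    closedir $dh;
    my @sorted = sort @names;
    return @sorted;
}

# Example: list the subdirectories of the current working directory.
print "$_\n" for list_subdirs('.');
```

Filtering at this stage means later steps only need to check whether an index.html exists inside each name, without first asking whether the name is a directory at all.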