Web Automation: Dynamic Directory Indexing




If you’re like me, you probably loathe updating directory index pages. You
add a new file or folder to your Web site, and then you have to find the other
pages that should link to it and update them, not to mention the toil of
updating all of those pages if the page name or location changes!

Updating directory index pages is one of the most tiresome routine tasks there is. But fear not: in this column, Matthew Keller explains how a Perl script can automate this task for you, including keeping
those pages up to date when a page’s name or location changes.

I solve this problem, quite simply, by creating directory index scripts
using Perl. The largest example of this kind of script indexes a
directory on my private Web server whose folders contain pages about
my projects. My entire Web site is logically organized (logical to me,
anyway) using directories to house and nest information, and my
“projects” page is no different.

From a filesystem structure standpoint, every directory in my
projects directory contains a different project. Every project
directory has an index HTML file. Every index HTML
file has a title. Keeping these rules in mind, it is easy to write a short Perl
script that makes one’s life much easier.

Configuring Apache

This script resides in the root of the projects folder and is
called index.pl. For Apache to treat index.pl as the directory
index script, we have to configure the httpd.conf file to list
index.pl as a valid directory index file. You may use index.cgi
instead of index.pl if you want. My DirectoryIndex statement is
shown below. Apache checks these entries one at a time, from left to
right, and serves the first one that exists in the directory. You will
probably want to place index.html ahead of index.pl if the majority of
your directory index pages are HTML pages and not these handy scripts:

DirectoryIndex index.pl index.html index.php index.cgi index.htm

Regardless of what you call these scripts, make sure you let Apache know how
to handle them, by using the AddHandler directive in your config file. Below is
an excerpt of mine:

AddHandler cgi-script .pl .cgi
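
Before writing the real index script, it can help to confirm that Apache actually executes a .pl file rather than serving its source. The following is a minimal sketch for that check; the helper name cgi_response is hypothetical and not part of the original script.

```perl
#!/usr/bin/perl
# Minimal CGI sanity check: if Apache runs this file as a script, the
# browser shows the heading; if not, it shows this source text instead.
use strict;
use warnings;

# Build a complete CGI response: one header line, a blank line, then the body.
# (cgi_response is an illustrative helper name.)
sub cgi_response {
    my ($body) = @_;
    return "Content-type: text/html\n\n" . $body;
}

print cgi_response("<html><body><h1>index.pl is running</h1></body></html>\n");
```

If the browser shows the Perl source or a server error instead of the heading, the usual suspects are a script file that is not executable or a directory for which the ExecCGI option is not enabled.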

Thinking About the Problem

Recall the environment I mentioned earlier:

  1. Every directory in my projects directory contains a different project
  2. Every project directory has an index.html file
  3. Every index.html file has a title

Given this organizational structure, our little script has to do only four
things:

  1. Obtain a list of directories
  2. For every directory, open the index.html file if it exists
  3. For every index.html file, extract the title of the page
  4. For every pilfered title, print it back to the user as a link to the given
    page
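
Steps 2 through 4 can be sketched roughly as follows, assuming the list of directory names from step 1 is already in hand. The helper names extract_title and build_links and the directory names "alpha" and "beta" are made up for illustration; only the two paths come from this article. Treat this as one possible shape for the script, not the finished version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text of the first <title> element in an
# HTML file, or nothing if the file cannot be read or has no title.
sub extract_title {
    my ($file) = @_;
    open(my $fh, '<', $file) or return;
    local $/;                       # slurp mode: read the whole file at once
    my $html = <$fh>;
    close($fh);
    return unless defined $html;
    # Case-insensitive match that may span lines; surrounding whitespace is trimmed.
    if ($html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is) {
        return $1;
    }
    return;
}

# Hypothetical helper: turn directory names into HTML list items, skipping
# any directory without a readable, titled index.html.
# Both $basedir and $baseurl are assumed to end with a slash.
sub build_links {
    my ($basedir, $baseurl, @dirs) = @_;
    my @links;
    foreach my $dir (sort @dirs) {
        my $title = extract_title("$basedir$dir/index.html");
        next unless defined $title && length $title;
        push @links, qq{<li><a href="$baseurl$dir/">$title</a></li>};
    }
    return @links;
}

# Example run; directories that do not exist simply produce no link.
my $basedir = "/usr/local/apache/htdocs/projects/";
my $baseurl = "http://mattwork.potsdam.edu/projects/";
print "Content-type: text/html\n\n<ul>\n";
print "$_\n" for build_links($basedir, $baseurl, "alpha", "beta");
print "</ul>\n";
```

The title is inserted as-is because it was already HTML when it was read from the page. Directory names containing spaces or other special characters would need URL-encoding, which this sketch leaves out.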

Step 1: Obtain a list of directories

A clumsy but easy way to acquire a list of directories is to read all of the
contents of the root directory we want to index (the projects directory in
our example) into an array:

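# Filesystem path of the projects directory and its public URL
# (the variable names $basedir and $baseurl are illustrative)
my $basedir = "/usr/local/apache/htdocs/projects/";
my $baseurl = "http://mattwork.potsdam.edu/projects/";
# Read every entry in the directory into @dirs
opendir(PRJD, $basedir) or die "Cannot open $basedir: $!";
my @dirs = readdir PRJD;
closedir(PRJD);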
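
Note that readdir returns every entry in the directory, including ".", "..", and ordinary files, so the array still needs narrowing before it is a true list of project directories. One possible way to do that is sketched below with a lexical directory handle and a grep filter. The helper name list_project_dirs is hypothetical, and skipping hidden entries is an assumption about what you want indexed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names of the subdirectories of
# $basedir, leaving out ".", "..", other dot-entries, and plain files.
# Returns an empty list if the directory cannot be opened.
sub list_project_dirs {
    my ($basedir) = @_;
    opendir(my $dh, $basedir) or return ();
    # Keep an entry only if it does not start with a dot and is a directory.
    my @dirs = grep { !/^\./ && -d "$basedir/$_" } readdir($dh);
    closedir($dh);
    return sort @dirs;
}

# Example run against the projects directory used in this article.
print "$_\n" for list_project_dirs("/usr/local/apache/htdocs/projects/");
```

The -d test needs the full path, not the bare entry name, because readdir returns names relative to the directory being read.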
