Web Automation: Dynamic Directory Indexing
If you're like me, you probably loathe updating directory index pages. You add a new file or folder to your Web site, and then you have to find every other page that should link to it and update each one, not to mention the toil of updating all of those pages again if the page's name or location changes! Updating directory index pages is one of the most tiresome routine tasks there is. But fear not: in this column, Matthew Keller explains how a Perl script can automate the task for you, including the updates needed when a page's name or location changes.
I solve this problem, quite simply, by creating directory index scripts using Perl. The largest of these scripts indexes a directory on my private Web server whose folders contain pages about my projects. My entire Web site is logically organized (logical to me, anyway) using directories to house and nest information, and my "projects" page is no different.
From a filesystem structure standpoint, every directory in my projects directory contains a different project. Every project directory has an index HTML file. Every index file has a title. Keeping these rules in mind, it is easy to write a short Perl script that makes one's life much easier.
This script resides in the root of the Projects folder, and is named index.pl. In order for Apache to consider index.pl the directory index script, we have to configure the httpd.conf file to include index.pl as a valid directory index file. You may choose index.cgi instead of index.pl if you want. My DirectoryIndex statement is shown below. Apache tries these entries one at a time, from left to right, and serves the first one it finds. You will probably want index.html placed ahead of index.pl if the majority of your directory index pages are HTML pages and not these handy scripts.
DirectoryIndex index.pl index.html index.php index.cgi
Regardless of what you call these scripts, make sure you let Apache know how to handle them, by using the AddHandler directive in your config file. Below is an excerpt of mine:
AddHandler cgi-script .pl .cgi
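For orientation, here is a minimal sketch of what any script run through the cgi-script handler has to emit: a Content-type header, a blank line, and then the page body. The subroutine name render_page and the page text are assumptions made for illustration and are not taken from the original index.pl. The sketch also assumes that the file is executable and that CGI execution is permitted for the directory (for example with Options +ExecCGI).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a complete CGI response: header, blank line, then the HTML body.
# render_page is a hypothetical helper name used only for this sketch.
sub render_page {
    my ($title) = @_;
    return "Content-type: text/html\n\n"
         . "<html><head><title>$title</title></head>"
         . "<body><h1>$title</h1></body></html>\n";
}

# When run directly as a script, send the page to the client.
print render_page("Projects") unless caller;
```

If Apache returns the Perl source as plain text instead of running it, the handler mapping or the ExecCGI option is the first thing to check.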
Thinking About the Problem
Recall the environment I mentioned earlier:
- Every directory in my projects directory contains a different project
- Every project directory has an index.html file
- Every index.html file has a title
Given this organizational structure, our little script has to do only four things:
- Obtain a list of directories
- For every directory, open the index.html file if it exists
- For every index.html file, extract the title of the page
- For every pilfered title, print it back to the user as a link to the given page
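The last two items in this list can be sketched in a few lines of Perl. This is a rough illustration under stated assumptions, not the column's actual code: the helper names extract_title and link_for, the example.com URL, and the sample HTML string are all hypothetical, and the regular expression assumes a single, well-formed title element.

```perl
use strict;
use warnings;

# Pull the text between <title> and </title> out of an HTML string.
# Returns undef when no title is present. (Hypothetical helper name.)
sub extract_title {
    my ($html) = @_;
    return $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Format one project directory as an HTML list item linking to its page.
# (Hypothetical helper name; $url_root is assumed to end with a slash.)
sub link_for {
    my ($url_root, $dir, $title) = @_;
    return qq{<li><a href="$url_root$dir/">$title</a></li>\n};
}

# Sample input standing in for the contents of a project's index.html.
my $html  = "<html><head><title> Widget Tracker </title></head><body></body></html>";
my $title = extract_title($html);
print link_for("http://www.example.com/projects/", "widgets", $title)
    if defined $title;
```

A regular expression is enough here because only one tag is needed. Titles containing characters such as & or < would need HTML escaping before being printed, which this sketch leaves out.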
Step 1: Obtain a list of directories
A clumsy but easy way to acquire a list of directories is to place all of the contents of the root directory we want to index (the projects directory, in our example) into an array:
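my $dir_root = "/usr/local/apache/htdocs/projects/";  # filesystem root to index (variable name assumed)
my $url_root = "http://mattwork.potsdam.edu/projects/";  # matching URL root (variable name assumed)
opendir(PRJD, $dir_root) or die "Cannot open $dir_root: $!";  # open the handle that readdir uses
my @dirs = readdir PRJD;  # every entry in the directory, including . and ..
closedir PRJD;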
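Because readdir returns every entry, including plain files and the special . and .. entries, the array usually needs filtering before it is used as a list of projects. The following is one possible sketch, assuming a hypothetical helper named list_subdirs and a lexical directory handle in place of the PRJD bareword handle.

```perl
use strict;
use warnings;

# Return only the real subdirectories of $root, sorted by name.
# list_subdirs is a hypothetical helper; $root is any readable directory path.
sub list_subdirs {
    my ($root) = @_;
    opendir(my $dh, $root) or die "Cannot open $root: $!";
    my @entries = readdir $dh;
    closedir $dh;
    # Skip the . and .. entries and anything that is not a directory.
    my @dirs = sort grep { $_ ne '.' && $_ ne '..' && -d "$root/$_" } @entries;
    return @dirs;
}

# Example: list the subdirectories of the current working directory.
print "$_\n" for list_subdirs(".");
```

The -d test is given the full path because readdir returns bare names, which would otherwise be resolved relative to the script's working directory rather than the directory being indexed.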