Web Automation: Generating Dynamic Tables of Contents
In my last article, I described how to make a Dynamic Directory Index using a Perl CGI. That CGI did four things:
- Obtained a list of directories (only one directory deep)
- For every directory, opened the index.html file if it existed
- For every index.html file, extracted the title of the page
- For every pilfered title, printed it back to the user as a link to the given page
This is nice if you have a hierarchical Web directory structure and want to
strategically place these CGIs in high-traffic or frequently added/updated
areas. But what if you want a table of contents that lists
every index.html page? By slightly modifying our script
from last time, we can turn our one-deep directory index into a
full-blown table of contents generator--and even a search engine,
with a little ingenuity.
Configuring Apache
Apache is pretty much ready to go if you want to implement these features. You might want to change the
AddHandler directive for cgi-script, as I have
demonstrated below, and turn on ExecCGI for whichever directory
houses this script:
AddHandler cgi-script .pl .cgi
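As a rough sketch of how the two settings fit together, the fragment below pairs the AddHandler line with an Options directive inside a Directory block. The path /var/www/html/toc is a made-up placeholder, not part of the setup described here; substitute the directory that actually holds the script.

```apache
# Hypothetical path: replace with the directory that holds the script.
<Directory "/var/www/html/toc">
    # Allow CGI execution in this directory.
    Options +ExecCGI
    # Treat .pl and .cgi files as CGI scripts.
    AddHandler cgi-script .pl .cgi
</Directory>
```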
Thinking About the Problem
As I mentioned before, the four things our last CGI did will continue to be the core of this script. We just need to add in some recursion and spiff up the output formatting a little bit. So here's what this script needs to do (condensed a bit):
- Obtain a list of directories
- For every directory, open the index.html file and extract the title of the page
- Print back the title, with a link to the page
- Dive into the directory, and back to Step 1
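The steps above can be sketched as a single recursive walk. This is only an illustration under stated assumptions, not the script this article builds: the names list_subdirs, page_title, and walk_tree are hypothetical, the title is pulled out with a simple regular expression, and the results are collected into a list of records rather than printed as links.

```perl
use strict;
use warnings;

# Hypothetical helper: immediate subdirectories of $dir (which must end in "/"),
# sorted by name, each returned with a trailing slash.
sub list_subdirs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return;
    my @names = grep { !/^\./ } readdir($dh);   # skip ., .., and hidden entries
    closedir($dh);
    my @subdirs;
    for my $name (sort @names) {
        push @subdirs, "$dir$name/" if -d "$dir$name";
    }
    return @subdirs;
}

# Hypothetical helper: contents of the first <title> element, or undef.
sub page_title {
    my ($file) = @_;
    open(my $fh, '<', $file) or return;
    local $/;                                   # read the whole file at once
    my $html = <$fh>;
    close($fh);
    return unless defined $html;
    return $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Steps 1 to 4: list directories, read each title, record it, then recurse.
sub walk_tree {
    my ($dir, $depth, $entries) = @_;
    for my $child (list_subdirs($dir)) {
        my $title = page_title("${child}index.html");
        push @$entries, { depth => $depth, title => $title, path => $child }
            if defined $title;
        walk_tree($child, $depth + 1, $entries);  # back to Step 1, one level down
    }
    return $entries;
}
```

Calling walk_tree("/some/docroot/", 0, []) would return a reference to the collected records, which a caller could then turn into indented links. Directories without an index.html are still descended into; they simply contribute no record.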
The Functions
There are three parts of this program that fit very nicely into their own functions.
- Given a directory, return a list of subdirectories (one deep)
- Given the path to an HTML page, extract and return its title
- Given any path, judge its "depth"
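The third item is the least self-explanatory of the three. One possible reading, offered purely as an assumption about what "depth" means here, is the number of directory levels a path sits below some starting directory. The hypothetical path_depth function below works on strings alone and counts the path components left over once the base prefix is removed.

```perl
use strict;
use warnings;

# Hypothetical helper: how many levels $path sits below $base.
# Both arguments are plain strings; nothing on disk is touched.
sub path_depth {
    my ($path, $base) = @_;
    $base = '' unless defined $base;
    # Strip the base prefix if the path starts with it.
    $path =~ s/^\Q$base\E//;
    # Count the non-empty components that remain.
    my @parts = grep { length } split m{/}, $path;
    return scalar @parts;
}
```

With this reading, a depth of 0 means the path is the base itself, and each additional directory adds one. Note that a trailing file name would be counted as a level too, so this sketch is meant for directory paths.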
Function 1: Given a directory, return a list of subdirectories (one deep)
The method used in the last article to do this was fairly clumsy and not
scalable. Instead of trying to brute-force the same code into this example,
I've improved this method. The function below, named Get_Dirs,
takes the name of a directory, and returns an array of all of the
subdirectories (one deep):
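sub Get_Dirs {
    my $basedir = shift;
    # Paths are built by concatenation, so make sure the base ends in a slash.
    $basedir .= '/' if length($basedir) && $basedir !~ m{/$};
    # A lexical handle avoids sharing one global directory handle.
    opendir(my $dh, $basedir) or return;
    my @DIRS;
    for my $temp (readdir($dh)) {
        next if $temp =~ /^\./;    # skip ., .., and hidden entries
        if (-d "$basedir$temp") {
            push(@DIRS, "$basedir$temp/");
        }
    }
    closedir($dh);
    return @DIRS;
}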
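Because Get_Dirs builds paths by plain string concatenation, its results depend on how the caller writes the base directory. As a hedged alternative, and not the approach used in this article, the sketch below joins paths with the core File::Spec module and sorts the entries so the output order is predictable. The name get_dirs_portable is hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical variant of Get_Dirs: same idea, but paths are joined with
# File::Spec so a missing or extra trailing slash on $basedir does not matter.
sub get_dirs_portable {
    my ($basedir) = @_;
    opendir(my $dh, $basedir) or return;
    my @entries = readdir($dh);
    closedir($dh);
    my @dirs;
    for my $entry (sort @entries) {
        next if $entry =~ /^\./;                 # skip ., .., and hidden entries
        my $path = File::Spec->catdir($basedir, $entry);
        push @dirs, "$path/" if -d $path;        # keep the trailing-slash convention
    }
    return @dirs;
}
```

Like the original, it returns an empty list when the directory cannot be opened, so a caller can loop over the result without checking for errors first.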
