Web Automation: Generating Dynamic Tables of Contents

In my last article, I described how to make a Dynamic Directory Index using a Perl CGI. That CGI did four things:

  1. Obtained a list of directories (only one directory deep)
  2. For every directory, opened the index.html file if it existed
  3. For every index.html file, extracted the title of the page
  4. For every pilfered title, printed it back to the user as a link to the given page

It's one thing to create a dynamic directory index, as Matthew Keller did in his last column, but what if you want a table of contents that lists every index.html page? Fear not: Keller's latest installment tackles this very issue.

This is nice if you have a hierarchical Web directory structure and want to strategically place these CGIs in high-traffic or frequently updated areas. But what if you want a table of contents that lists every index.html page? By slightly modifying our script from last time, we can turn our one-deep directory index into a full-blown table of contents generator--and even a search engine, with a little ingenuity.

Configuring Apache

Apache is pretty much ready to go if you want to implement these features. You might want to change the AddHandler directive for cgi-script, as I have demonstrated below, and turn on ExecCGI for whichever directory houses this script:

AddHandler cgi-script .pl .cgi

Thinking About the Problem

As I mentioned before, the four things our last CGI did will continue to be the core of this script. We just need to add some recursion and spiff up the output formatting a little bit. So here's what this script needs to do (condensed a bit):

  1. Obtain a list of directories
  2. For every directory, open the index.html file and extract the title of the page
  3. Print back the title, with a link to the page
  4. Dive into the directory, and go back to Step 1
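
The four steps above can be sketched as a single recursive subroutine. This is only a minimal sketch under stated assumptions, not the finished script: the names walk_tree, list_subdirs, and page_title are hypothetical, every directory path is assumed to end in a slash, and the entries are collected into an array rather than printed so that the output format can be decided separately.

```perl
use strict;
use warnings;

# Hypothetical helper: immediate subdirectories of $dir, each with a
# trailing slash. Assumes $dir itself ends in a slash.
sub list_subdirs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return;
    my @names = readdir($dh);
    closedir($dh);
    return map { $dir . $_ . '/' }
           grep { !/^\./ && -d "$dir$_" }
           sort @names;
}

# Hypothetical helper: the text between <title> and </title>, or undef
# if the file cannot be read or has no title.
sub page_title {
    my ($file) = @_;
    open(my $fh, '<', $file) or return;
    local $/;                      # read the whole file at once
    my $html = <$fh>;
    close($fh);
    return unless defined $html
        && $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is;
    return $1;
}

# Steps 1 to 4: list the subdirectories, read each title, record it,
# then recurse into the subdirectory.
sub walk_tree {
    my ($dir, $level, $entries) = @_;
    for my $sub (list_subdirs($dir)) {                   # step 1
        my $title = page_title($sub . 'index.html');     # step 2
        push @$entries, { level => $level, path => $sub, title => $title }
            if defined $title;                            # step 3
        walk_tree($sub, $level + 1, $entries);            # step 4
    }
    return $entries;
}

# Example call (the document root below is an assumption):
#   my $toc = walk_tree('/var/www/html/', 0, []);
#   print '  ' x $_->{level}, "$_->{title} ($_->{path})\n" for @$toc;
```

Passing the level down as an argument keeps the recursion free of global state, and the level can later drive the indentation of each line in the table of contents.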

The Functions

Three parts of this program fit very nicely into their own functions:

  1. Given a directory, return a list of subdirectories (one deep)
  2. Given the path to an HTML page, extract and return its title
  3. Given any path, judge its "depth"
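
As a rough illustration of the third item, one simple way to judge depth is to count the non-empty components of a slash-separated path. This is a sketch under that assumption, not necessarily the definition the finished script uses; path_depth and relative_depth are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical: depth = number of non-empty components in the path.
sub path_depth {
    my ($path) = @_;
    my @parts = grep { length } split m{/}, $path;
    return scalar @parts;
}

# Hypothetical: how many levels $path sits below $base.
sub relative_depth {
    my ($base, $path) = @_;
    return path_depth($path) - path_depth($base);
}

# path_depth('/var/www/html/')                                   is 3
# relative_depth('/var/www/html/', '/var/www/html/docs/perl/')   is 2
```

A relative depth like this is handy for output formatting: repeating an indent string that many times (for example, '  ' x relative_depth($base, $path)) nests each entry under its parent.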

Function 1: Given a directory, return a list of subdirectories (one deep)

The method used in the last article to do this was fairly clumsy and not scalable. Instead of trying to brute-force the same code into this example, I've improved this method. The function below, named Get_Dirs, takes the name of a directory and returns a list of all of its subdirectories (one deep). Note that the directory name must end in a slash, because the function appends each entry name to it directly:

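sub Get_Dirs {
    # $basedir must end in a slash; entry names are appended to it directly
    my $basedir = shift;
    # Return an empty list if the directory cannot be opened
    opendir(my $dh, $basedir) or return;
    my @DIRS;
    for my $entry (readdir($dh)) {
        # Skip ".", "..", and any other dot-prefixed entry
        next if $entry =~ /^\./;
        # Keep only directories, stored with a trailing slash
        if (-d "$basedir$entry") {
            push(@DIRS, "$basedir$entry/");
        }
    }
    closedir($dh);
    return @DIRS;
}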
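
To see what Get_Dirs hands back, here is a small usage sketch. It builds a scratch directory with File::Temp (the directory names docs, images, and .private are made up for the example) and repeats Get_Dirs in condensed form only so that the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Get_Dirs, condensed from the listing above so this snippet is standalone.
sub Get_Dirs {
    my $basedir = shift;
    opendir(my $dh, $basedir) or return;
    my @DIRS = map { $basedir . $_ . '/' }
               grep { !/^\./ && -d "$basedir$_" }
               readdir($dh);
    closedir($dh);
    return @DIRS;
}

# Scratch tree: two visible directories, one hidden directory, one file.
my $root = tempdir(CLEANUP => 1) . '/';
for my $name (qw(docs images .private)) {
    mkdir("$root$name") or die "mkdir failed: $!";
}
open(my $fh, '>', $root . 'index.html') or die "cannot create file: $!";
close($fh);

# readdir makes no promise about ordering, so sort before printing.
my @dirs = Get_Dirs($root);
@dirs = sort @dirs;
print "$_\n" for @dirs;    # the docs/ and images/ paths, each ending in "/"

# Without the trailing slash, entry names are glued onto the last path
# component, so no subdirectory is found.
my @none = Get_Dirs(substr($root, 0, -1));
print scalar(@none), " directories found without the trailing slash\n";
```

Because every returned path already ends in a slash, each one can be passed straight back into Get_Dirs, which is what makes the recursion in this article convenient.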

This article was originally published on Jun 21, 2000
