Web Automation: Dynamic Directory Indexing Page 3

Jul 20, 2010

Don’t let this snippet scare you; it’s actually quite logical once
dissected. Line 1 declares the function Get_Title. Line 2 takes
the parameter we passed to the function (that’s the
$_/index.html from Line 4 in Step 2) and shifts it into a
scalar variable. Line 3 says “unless this is a
file, return the text ‘NO INDEX’”. Line 4 opens the file for reading and
assigns the handle HTML to it. Line 5 begins a while
loop over the open file (each line of the file triggers a new
iteration, and the contents of that line are stored in the special
variable $_). Line 6 says “if this line contains a <title>
and a </title>, place the stuff
in between in the special variable $1 and continue inside the
brackets”. Line 7 is inside the if statement and closes the
HTML file. Line 8 returns the text of the title and exits the function. Line 9
ends the if statement. Line 10 ends the while statement. Line 11 closes the
HTML file if no title has been found. Line 12 returns the word
Untitled in the event that no title has been found. Line 13 ends
the function.
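
The walkthrough above can be sketched roughly as follows. This is a reconstruction from the description, not the original listing: the variable names ($file, $html, $line), the lexical filehandle used in place of the bareword HTML handle, the exact regular expression, and the extra check on open are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical reconstruction of Get_Title based on the line-by-line
# description; names and the exact pattern are assumptions.
sub Get_Title {
    my $file = shift;                        # path passed in by the caller
    return 'NO INDEX' unless -f $file;       # not a plain file: give up
    # Assumption: treat a failed open the same as a missing file.
    open(my $html, '<', $file) or return 'NO INDEX';
    while (my $line = <$html>) {
        # Case-insensitive match; the text between the tags lands in $1.
        if ($line =~ m{<title>(.*?)</title>}i) {
            my $title = $1;
            close($html);
            return $title;
        }
    }
    close($html);                            # reached only when no title matched
    return 'Untitled';
}
```

Like the version described in the article, this sketch only finds a title whose opening and closing tags sit on the same line of the file.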

This function is a bit complex in code, but I like how it demonstrates a lot
of Perl’s power and flexibility. The if statement in Line 6
contains a regular expression that is case-insensitive (note the i
after the last /), so that different capitalizations all appear the same
to the if.
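
To see what the trailing i buys you, here is a small sketch. The sample lines and the pattern are made up for illustration and are not taken from the script.

```perl
use strict;
use warnings;

# Illustrative inputs only; none of these lines come from the article.
my @lines = (
    '<title>Lower</title>',
    '<TITLE>Upper</TITLE>',
    '<Title>Mixed</tITLE>',
);

my @found;
for my $line (@lines) {
    # Without the trailing i, only the first line would match.
    push @found, $1 if $line =~ m{<title>(.*?)</title>}i;
}
print join(', ', @found), "\n";   # prints "Lower, Upper, Mixed"
```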

Step 4: For every pilfered title, print it back to the user as a link to
the given page

I noted back in my description of Line 6 in Step 2 that we needed to add some
code that displays the proper HTML link for the viewer of our index. Before we
get to that, we need to do a little housecleaning. We need to shoehorn in an
HTML header and perhaps some introductory text on the line before the for @dirs on Line 1 of Step 2. At the very least, we need to send the HTTP
content header to the viewer’s browser, and probably should send a little more.
The snippet below is an example of such:

1: print "Content-Type: text/html\n\n";

2: print "Project Index<br /> Page\n";

Please note the two newline characters on Line 1; they are essential, because
the blank line they produce separates the HTTP header from the page content.
Line 2 may be left out for brevity.

So, now we’re back to outputting the correct link information to the
viewer. The code below would replace the comment I made on Line 5 of Step 2:




1: =~ s///i;

2: print "\n";

Line 1 uses a substitution pattern to replace the filesystem name with the
appropriate URL name. Line 2 prints the HTMLized entry we want: the title of
the page showing and the underlying link to that page.
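
A minimal sketch of what those two steps might look like is shown here. Everything in it is a placeholder: the document root, the URL prefix, the helper name Make_Link, and the exact HTML are assumptions, so adjust them to match your own server layout.

```perl
use strict;
use warnings;

# All names and paths here are hypothetical placeholders.
my $docroot = '/var/www/html';   # filesystem root of the site (assumed)
my $baseurl = '';                # URL prefix that replaces it (assumed)

sub Make_Link {
    my ($path, $title) = @_;
    # Swap the filesystem prefix for the URL prefix; \Q...\E keeps any
    # regex metacharacters in the path from being interpreted.
    (my $url = $path) =~ s/^\Q$docroot\E/$baseurl/i;
    return qq{<a href="$url">$title</a><br />\n};
}

print Make_Link('/var/www/html/projects/alpha/', 'Alpha Project');
# prints: <a href="/projects/alpha/">Alpha Project</a><br />
```

Note that the sketch prints the title as-is; a title containing characters such as < or & would need HTML-escaping first.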

Summary Discussion

There's lots of room for improvement with this script. The script I have is
82 lines of code and has all sorts of neat features, some of which I'll mention
in a moment. There is also room for frustrating errors with this script. It is
imperative that you keep track of your trailing /. Append them
where they are needed, and leave them off where they are not. If you're having odd
problems, place a print statement just before you use one of the
$_/ magics and print the value out to make sure it looks like a
directory path or a file path (depending on what you're interested in at the
moment).
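
One way to do that kind of checking is sketched below. The helper With_Slash and the sample paths are hypothetical and not part of the script in this article.

```perl
use strict;
use warnings;

# Hypothetical helper: guarantees exactly one trailing slash.
sub With_Slash {
    my $dir = shift;
    $dir =~ s{/*$}{/};   # collapse zero or more trailing slashes into one
    return $dir;
}

for ('/var/www/html/projects', '/var/www/html/docs/') {
    my $path = With_Slash($_) . 'index.html';
    # Brackets around the value make stray slashes or spaces easy to spot.
    print STDERR "DEBUG: checking [$path]\n";
}
```

Printing to STDERR keeps the debug text out of the page itself; on most web servers it ends up in the error log instead.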

As I mentioned above, there's lots of room for improvement. Here is a list
of some of the features that I have implemented in my various indexing scripts.
They get more complex as you go down.

Matthew Keller

Matthew Keller is a ServerWatch contributor.
