Web Automation: Generating Dynamic Tables of Contents Page 4

Line 1 stuffs the local path of the root web directory into a scalar variable. Line 2 stuffs the URL that will replace that local path into a second scalar. Line 3 calls the Get_Depth function on our initial directory and adds 1 to the result. Adding 1 is important because, logically, we'll never list a directory that sits at the same depth as the htdocs directory (e.g. the logs and conf directories are at that depth, and we won't be going there). Line 4 calls the Get_Dirs function to obtain a list of subdirectories in the root of our webspace.

Lines 5 and 6 send the default HTTP content header to the browser. Line 7 starts the unordered list.

Line 8 starts iterating over the @dirs array using a while loop. Line 9 removes the first item from the @dirs array and stuffs it into a scalar. My rationale for doing this is scalability: by using a while loop instead of a for loop, and removing items with shift, the array never grows unnecessarily large or needlessly consumes resources. If I didn't remove the entries as I used them, the @dirs array would contain every directory in the webspace by the end of the script, which is not necessary. Line 10 calls the Get_Depth function to obtain the depth of the current directory, and stuffs the value into a scalar. Line 11 first calls the Get_Dirs function to obtain a list of subdirectories, and then prepends that list to the beginning of the @dirs array (making them the "next" to be iterated onto).

Line 12 says, "if the current depth is greater than the previous depth, indent." Line 13 catches the occurrences when the current depth is less than the previous depth. Line 14 calculates the difference between the current and previous depths and stores the result in a scalar. Line 15 prints the </ul> tag as many times as necessary according to that difference (note the x operator). Line 16 ends this if. Line 17 assigns the current depth to the previous-depth scalar, ready for the next iteration.
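To make the traversal and indentation logic easier to follow, here is a minimal, self-contained sketch of the same idea. It is not the script from this article: get_dirs and get_depth are hypothetical stand-ins for Get_Dirs and Get_Depth, the %children hash is an invented in-memory tree used in place of a real filesystem, and all variable names are assumptions. The sketch also closes any list levels still open at the end, which the walkthrough does not mention.

```perl
use strict;
use warnings;

# Hypothetical in-memory tree standing in for the real filesystem.
my %children = (
    '/htdocs'          => ['/htdocs/docs', '/htdocs/news'],
    '/htdocs/docs'     => ['/htdocs/docs/api'],
    '/htdocs/docs/api' => [],
    '/htdocs/news'     => [],
);

# Stand-in for Get_Dirs: return the subdirectories of a directory.
sub get_dirs {
    my ($dir) = @_;
    return @{ $children{$dir} || [] };
}

# Stand-in for Get_Depth: count the non-empty path components.
sub get_depth {
    my ($dir) = @_;
    my @parts = grep { length } split m{/}, $dir;
    return scalar @parts;
}

sub build_list {
    my ($root) = @_;
    my $base_depth = get_depth($root) + 1;   # depth of the first listed level
    my $prev_depth = $base_depth;
    my @dirs       = get_dirs($root);
    my $html       = "<ul>\n";

    while (@dirs) {
        my $dir   = shift @dirs;             # consume the entry so the queue stays small
        my $depth = get_depth($dir);
        unshift @dirs, get_dirs($dir);       # children are visited next (depth first)

        if ($depth > $prev_depth) {
            $html .= "<ul>\n";               # went one level deeper: open a nested list
        }
        elsif ($depth < $prev_depth) {
            my $diff = $prev_depth - $depth;
            $html .= "</ul>\n" x $diff;      # came back up: close one list per level
        }
        $prev_depth = $depth;
        $html .= "<li>$dir</li>\n";
    }

    # Close the outer list plus any nested lists still open (an addition in this sketch).
    $html .= "</ul>\n" x ($prev_depth - $base_depth + 1);
    return $html;
}

print build_list('/htdocs');
```

With the sample tree, the api entry ends up in a nested list under docs, and news returns to the outer list. Because unshift puts the children at the front of the queue, each directory's descendants are listed before its siblings.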
Line 18 concatenates the current directory path with index.html, and stores the value in a scalar. Line 19 says, "unless that file exists, skip to the next iteration" (which is the same as saying "if that file does not exist, skip to the next iteration"). This is to prevent directories that don't have an index.html file from making things icky. Line 20 calls the Get_Title function on that file, and assigns the returned value to a scalar. Line 21 contains a substitution regular expression that replaces the base directory part of the path (the one we set in line 1) with the base URL we provided in line 2. Line 22 prints the title of the page, as a link to the page, in a list item. Line 23 ends the while loop. Line 24 prints the final </ul>.
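The per-directory steps described above (build the index.html path, skip directories without one, look up the title, swap the local path for the URL, print the link) could be sketched roughly as follows. This is an illustration under assumptions, not the article's code: get_title is a simplified, hypothetical stand-in for Get_Title, list_item and every variable name are invented here, the base URL is a placeholder, and the demo builds a throwaway directory tree with File::Temp so that it runs anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical stand-in for Get_Title: return the contents of the <title> tag, if any.
sub get_title {
    my ($file) = @_;
    open my $fh, '<', $file or return;
    local $/;                                # slurp the whole file
    my $html = <$fh>;
    close $fh;
    return $html =~ m{<title>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Build one list item for a directory, or return nothing if it has no index.html.
sub list_item {
    my ($dir, $base_dir, $base_url) = @_;
    my $file = "$dir/index.html";
    return unless -e $file;                  # no index page: skip this directory

    my $title = get_title($file);
    $title = $file unless defined $title;    # fall back to the path if no title is found

    # Swap the local base directory for the base URL; \Q...\E quotes any
    # regex metacharacters that happen to appear in the path.
    (my $url = $file) =~ s/^\Q$base_dir\E/$base_url/;

    return qq{<li><a href="$url">$title</a></li>\n};
}

# Demo on a temporary tree: docs has an index.html, empty does not.
my $base_dir = tempdir(CLEANUP => 1);
my $base_url = 'http://www.example.com';     # placeholder URL
make_path("$base_dir/docs", "$base_dir/empty");
open my $page, '>', "$base_dir/docs/index.html" or die "Cannot write: $!";
print {$page} "<html><head><title>Docs</title></head></html>\n";
close $page;

for my $dir ("$base_dir/docs", "$base_dir/empty") {
    my $item = list_item($dir, $base_dir, $base_url);
    next unless defined $item;
    print $item;
}
```

Only the docs directory produces a list item; the empty directory is skipped because it has no index.html. Anchoring the substitution with ^ keeps it from matching the base path anywhere other than the start of the string.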

Summary Discussion

As always, this script is not the end-all of table-of-contents generators, but it is a good place to start. It is short and fairly memory efficient. It scales fairly well to sites with 5000 directories, and perhaps even beyond.

This article was originally published on Jun 21, 2000
