Apache Guide: Logging, Part 4 -- Log-File Analysis

In the first sections of this series, I've talked about what goes into the standard log files, and how you can change the contents of those files.

The problem with log files is that they track an enormous amount of information, and much of it is of little use to the people who pay your salary.

This week, we're looking at how to get meaningful information back out of those log files.

The Challenge

Although there is an enormous amount of information in the log files, most of it, in its raw form, is not much good to the people who pay your salary. They want to know how many people visited your site, what they looked at, how long they stayed, and where they found out about your site. All of that information is (or might be) in your log files.

They also want to know the names, addresses, and shoe sizes of those people, and, hopefully, their credit card numbers. That information is not in there, and you need to know how to explain to your employer that not only is it not in there, but the only way to get it is to ask your visitors explicitly, and be willing to be told 'no.'

What Your Log Files Can Tell You

There is a lot of information available to put in your log files, including the following:

Address of the remote machine
This is almost the same as "who is visiting my web site," but not quite. More specifically, it tells you which machine the request came from, which may be the visitor's own computer or a proxy shared by many visitors. This will be something like buglet.rcbowen.com or proxy01.aol.com.
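
As a rough illustration of pulling the remote address out of each record, here is a minimal Python sketch. It assumes the log is in the Common Log Format; the sample entries, the pattern name CLF_PATTERN, and the function count_hosts are made up for this example and are not part of Apache.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def count_hosts(lines):
    """Count requests per remote host; lines that do not match are skipped."""
    hosts = Counter()
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match:
            hosts[match.group("host")] += 1
    return hosts

# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    'buglet.rcbowen.com - - [18/Sep/2000:09:15:02 -0400] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [18/Sep/2000:09:16:40 -0400] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [18/Sep/2000:09:17:05 -0400] "GET /about.html HTTP/1.0" 200 1410',
]

if __name__ == "__main__":
    for host, hits in count_hosts(SAMPLE_LOG).most_common():
        print(host, hits)
```

Keep in mind that a count per host is not a count per person: every visitor behind the same proxy shows up under the proxy's name.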

Time of visit
When did this person come to my web site? This can tell you something about your visitors. If most of your visits come between the hours of 9 a.m. and 4 p.m., then you're probably getting visits from people at work. If it's mostly 7 p.m. through midnight, people are looking at your site from home.

Single records, of course, give you very little useful information, but across several thousand 'hits', you can start to gather useful statistics.
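
To get from single records to that kind of statistic, one possible approach is to count hits per hour of the day. The sketch below assumes the bracketed timestamp format used by the Common Log Format; the function name hits_by_hour and the sample entries are invented for the example.

```python
import re
from collections import Counter
from datetime import datetime

# The timestamp is the bracketed field, e.g. [18/Sep/2000:09:15:02 -0400]
TIMESTAMP = re.compile(r'\[([^\]]+)\]')

def hits_by_hour(lines):
    """Count hits per hour of day, as logged (the server's time zone)."""
    hours = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        hours[when.hour] += 1
    return hours

# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    'buglet.rcbowen.com - - [18/Sep/2000:09:15:02 -0400] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [18/Sep/2000:09:47:40 -0400] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [18/Sep/2000:21:03:05 -0400] "GET /about.html HTTP/1.0" 200 1410',
]

if __name__ == "__main__":
    for hour, hits in sorted(hits_by_hour(SAMPLE_LOG).items()):
        print(f"{hour:02d}:00  {hits}")
```

The hours are those of the server's clock, not the visitor's, so a site with visitors in many time zones needs some care before drawing conclusions about "work" versus "home" traffic.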

Resource requested
What parts of your site are most popular? Those are the parts that you should expand. Which parts of the site are completely neglected? Perhaps those parts of the site are just really hard to get to. Or, perhaps they are genuinely uninteresting, in which case you should spice them up a little. Of course, some parts of your site, such as your legal statements, are boring and there's nothing you can do about it, but they need to stay on the site for the two or three people that want to see them.
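
A small sketch of how popularity might be tallied: count the path in each request line, then compare the result against a list of pages you know exist. The functions requested_paths and never_requested, the list of known pages, and the sample entries are all hypothetical.

```python
from collections import Counter

def requested_paths(lines):
    """Count how often each path was requested."""
    paths = Counter()
    for line in lines:
        # The request is the first double-quoted field: "METHOD path PROTOCOL"
        parts = line.split('"')
        if len(parts) < 3:
            continue
        request = parts[1].split()
        if len(request) >= 2:
            paths[request[1]] += 1
    return paths

def never_requested(known_paths, counts):
    """Return known paths that do not appear in the log at all."""
    return sorted(p for p in known_paths if counts[p] == 0)

# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    'buglet.rcbowen.com - - [18/Sep/2000:09:15:02 -0400] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [18/Sep/2000:09:16:40 -0400] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [18/Sep/2000:09:17:05 -0400] "GET /about.html HTTP/1.0" 200 1410',
]

if __name__ == "__main__":
    counts = requested_paths(SAMPLE_LOG)
    print(counts.most_common(5))
    print(never_requested(["/index.html", "/about.html", "/legal.html"], counts))
```

The second function is there because a completely neglected page never appears in the log, so the log alone cannot tell you about it.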

What's broken?
And, of course, your logs tell you when things are not working as they should be. Do you have broken links? Do other sites have links to your site that are not correct? Are some of your CGI programs malfunctioning? Is a robot overwhelming your site with thousands of requests per second? (Yes, this has happened to me. In fact, it's the reason that I did not get this article in on time last week!)
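
For broken links, one way to start is to collect every request that was answered with a 404 status, along with the page that referred the visitor. This sketch assumes the Combined Log Format, which adds the referer and user agent to each record; if the referer is not logged, it is reported as "-". The names COMBINED and not_found, the example.com URL, and the sample entries are made up for the illustration.

```python
import re
from collections import Counter

# Combined Log Format: the Common Log Format plus "referer" "user-agent".
# The trailing pair is optional here so plain Common Log Format lines also match.
COMBINED = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "[^"]*")?'
)

def not_found(lines):
    """Count (path, referer) pairs for requests answered with 404."""
    broken = Counter()
    for line in lines:
        match = COMBINED.match(line)
        if not match or match.group("status") != "404":
            continue
        request = match.group("request").split()
        path = request[1] if len(request) >= 2 else match.group("request")
        broken[(path, match.group("referer") or "-")] += 1
    return broken

# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    'buglet.rcbowen.com - - [18/Sep/2000:09:15:02 -0400] "GET /old-page.html HTTP/1.0" 404 208 "http://www.example.com/links.html" "Mozilla/4.0"',
    'proxy01.aol.com - - [18/Sep/2000:09:16:40 -0400] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/4.0"',
    'proxy01.aol.com - - [18/Sep/2000:09:17:05 -0400] "GET /old-page.html HTTP/1.0" 404 208 "http://www.example.com/links.html" "Mozilla/4.0"',
]

if __name__ == "__main__":
    for (path, referer), hits in not_found(SAMPLE_LOG).most_common():
        print(hits, path, "linked from", referer)
```

A referer on your own site points to a broken link you can fix yourself; a referer on someone else's site is a link you will have to ask them to correct.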

What Your Log Files Don't Tell You

This article was originally published on Sep 18, 2000