
Apache Guide: Logging, Part 4 — Log-File Analysis




In the earlier parts of this series, I talked about what goes into
the standard log files, and how you can change the contents of
those files.

The problem with log files is that they track an enormous amount of
information, and not all of it is much good to the people who pay
your salary.

This week, we’re
looking at how to get meaningful information back
out of those log files.

The Challenge

Although there is an enormous amount of information in the log
files, in its raw form it is not much good to the people who pay
your salary. They want to know how many people visited your site,
what they looked at, how long they stayed, and where they found out
about your site. All of that information is (or might be) in your
log files.

They also want to know the names, addresses, and shoe sizes of
those people and, ideally, their credit card numbers. That
information is not in the logs, and you need to be able to explain
to your employer that the only way to get it is to ask your
visitors for it explicitly, and to be willing to be told ‘no.’

What Your Log Files Can Tell You

There is a lot of information
available to put in your log files, including the
following:

Address of the remote machine

This is almost the same as “who is visiting my web site,” but not
quite. More precisely, it tells you which machine the request came
from. This will be something like buglet.rcbowen.com or
proxy01.aol.com.
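
As a rough illustration of pulling that address out of a log, here is a minimal Python sketch. It assumes the log is in the Common Log Format; the function name `count_hosts` and the sample entries are made up for the example, and the host field may be an IP address rather than a name if hostname lookups are turned off.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

def count_hosts(lines):
    """Count hits per remote host, skipping lines that do not parse."""
    hosts = Counter()
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match:
            hosts[match.group("host")] += 1
    return hosts

# Hypothetical sample entries, not taken from a real log.
sample = [
    'buglet.rcbowen.com - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [10/Oct/2000:13:56:02 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'buglet.rcbowen.com - - [10/Oct/2000:13:57:10 -0700] "GET /about.html HTTP/1.0" 200 1045',
]

for host, hits in count_hosts(sample).most_common():
    print(host, hits)
```

A proxy address such as proxy01.aol.com may stand for many different people, so these counts describe machines rather than visitors.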

Time of visit

When did this person
come to my web site? This can tell you something
about your visitors. If most of your visits come
between the hours of 9 a.m. and 4 p.m., then
you’re probably getting visits from people at
work. If it’s mostly 7 p.m. through midnight,
people are looking at your site from home.

Single records, of course, give you very
little useful information, but across several
thousand ‘hits’, you can start to gather useful
statistics.
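
One way to turn thousands of timestamps into that kind of statistic is to count hits for each hour of the day. The sketch below assumes the usual Apache timestamp layout (for example `10/Oct/2000:13:55:36 -0700`); `hits_per_hour` and the sample entries are hypothetical. Note that the hour is the one recorded by the server, in the server's time zone, not the visitor's local time.

```python
import re
from collections import Counter
from datetime import datetime

# The timestamp is the first bracketed field on the line.
TIME_PATTERN = re.compile(r'\[([^\]]+)\]')

def hits_per_hour(lines):
    """Return a Counter mapping hour of day (0-23) to number of hits."""
    hours = Counter()
    for line in lines:
        match = TIME_PATTERN.search(line)
        if not match:
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # skip timestamps that are not in the assumed layout
        hours[stamp.hour] += 1
    return hours

# Hypothetical sample entries, not taken from a real log.
sample = [
    'buglet.rcbowen.com - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [10/Oct/2000:13:56:02 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [10/Oct/2000:21:15:00 -0700] "GET /about.html HTTP/1.0" 200 1045',
]

counts = hits_per_hour(sample)
for hour in sorted(counts):
    print(f"{hour:02d}:00  {counts[hour]}")
```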

Resource requested

What parts of
your site are most popular? Those are the parts
that you should expand. Which parts of the site
are completely neglected? Perhaps those parts of
the site are just really hard to get to. Or,
perhaps they are genuinely uninteresting, in
which case you should spice them up a little. Of
course, some parts of your site, such as your
legal statements, are boring and there’s nothing
you can do about it, but they need to stay on the
site for the two or three people who want to see
them.
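
To see which parts of a site are popular and which are neglected, you can count requests per resource and compare the result against a list of pages you know exist. This is only a sketch: `count_resources`, `find_neglected`, the page list, and the sample entries are all assumptions made for the example, and query strings are dropped so that `/search?q=apache` counts as `/search`.

```python
import re
from collections import Counter

# Matches the quoted request, e.g. "GET /index.html HTTP/1.0",
# and captures the requested path.
REQUEST_PATTERN = re.compile(r'"(?:[A-Z]+) (\S+)[^"]*"')

def count_resources(lines):
    """Count hits per requested path, ignoring any query string."""
    resources = Counter()
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            resources[path] += 1
    return resources

def find_neglected(resources, known_pages):
    """Return the known pages that received no hits at all."""
    return [page for page in known_pages if resources[page] == 0]

# Hypothetical sample entries and page list.
sample = [
    'buglet.rcbowen.com - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [10/Oct/2000:13:56:02 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [10/Oct/2000:13:57:10 -0700] "GET /about.html HTTP/1.0" 200 1045',
    'buglet.rcbowen.com - - [10/Oct/2000:13:58:41 -0700] "GET /search?q=apache HTTP/1.0" 200 512',
]
known_pages = ["/index.html", "/about.html", "/legal.html"]

resources = count_resources(sample)
print(resources.most_common(3))
print(find_neglected(resources, known_pages))
```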

What’s broken?

And, of course,
your logs tell you when things are not working as
they should be. Do you have broken links? Do
other sites have links to your site that are not
correct? Are some of your CGI programs
malfunctioning? Is a robot overwhelming your site
with thousands of requests per second? (Yes, this
has happened to me. In fact, it’s the reason that
I did not get this article in on time last week!)
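
Several of these questions can be approached by tallying status codes and hits per client. The sketch below again assumes the Common Log Format; `find_problems` and the sample entries are hypothetical. It treats 404 responses as likely broken links, 5xx responses as likely server-side failures (a malfunctioning CGI program, for instance), and reports the busiest client as a first hint of an overeager robot.

```python
import re
from collections import Counter

LINE_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
)

def find_problems(lines):
    """Return (not_found, server_errors, hits_by_host) Counters."""
    not_found = Counter()      # requests answered with 404
    server_errors = Counter()  # requests answered with 5xx
    hits_by_host = Counter()   # total requests per client
    for line in lines:
        match = LINE_PATTERN.match(line)
        if not match:
            continue
        hits_by_host[match.group("host")] += 1
        status = int(match.group("status"))
        if status == 404:
            not_found[match.group("request")] += 1
        elif status >= 500:
            server_errors[match.group("request")] += 1
    return not_found, server_errors, hits_by_host

# Hypothetical sample entries, not taken from a real log.
sample = [
    'buglet.rcbowen.com - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'proxy01.aol.com - - [10/Oct/2000:13:56:02 -0700] "GET /missing.html HTTP/1.0" 404 209',
    'proxy01.aol.com - - [10/Oct/2000:13:56:30 -0700] "GET /missing.html HTTP/1.0" 404 209',
    'crawler.example.com - - [10/Oct/2000:13:57:00 -0700] "GET /cgi-bin/form.cgi HTTP/1.0" 500 -',
    'crawler.example.com - - [10/Oct/2000:13:57:00 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'crawler.example.com - - [10/Oct/2000:13:57:01 -0700] "GET /about.html HTTP/1.0" 200 1045',
]

not_found, server_errors, hits_by_host = find_problems(sample)
print("Broken links:", not_found.most_common(5))
print("Server errors:", server_errors.most_common(5))
print("Busiest clients:", hits_by_host.most_common(5))
```

Finding which other sites link to a broken address would also need the referring page, which this sketch does not look at because the Common Log Format does not record it.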

What Your Log Files Don’t Tell You
