Apache Guide: Logging with Apache--Understanding Your access_log Page 2
The second slot, alas, is blank, and almost always will be. That's what that ''-''
is: a place-holder for the second piece of information. That is the
location where you're supposed to get the identity of the visitor.
That's not just their login name, but their email address, or other
unique identifier. This information is supposed to be returned by
identd, or directly by the browser. And in the old days, back when
Netscape 0.9 was the dominant browser, you would usually have email
addresses in this spot. However, it did not take long for unsavory
marketing types to think that it would be a good idea to collect those
email addresses and send them unsolicited email (also known as spam).
So, before very long, this feature was removed from just about every
browser on the market. You will almost never find information in this
slot.
The third piece of information is also blank. The information that would appear there is the username with which the visitor authenticated. This will appear, of course, only when you have required authentication for a particular resource. So for the majority of entries in your log file, for most sites, this will be blank.
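As a minimal sketch of how these first slots can be pulled apart in a script, the Python below splits off the first three space-separated fields and turns the ''-'' place-holder into None. The names Visitor and parse_visitor are made up for illustration, and the sample entry is hypothetical: its address and year are assumptions, not values taken from a real log.

```python
from typing import NamedTuple, Optional


class Visitor(NamedTuple):
    host: str
    identity: Optional[str]  # second slot: identd or browser-supplied identity
    username: Optional[str]  # third slot: authenticated username


def _blank_to_none(value: str) -> Optional[str]:
    # A lone "-" is the place-holder for "no information".
    return None if value == "-" else value


def parse_visitor(line: str) -> Visitor:
    """Split off the first three space-separated fields of a log entry."""
    host, identity, username = line.split(" ", 3)[:3]
    return Visitor(host, _blank_to_none(identity), _blank_to_none(username))


# Hypothetical entry: the address and the year are illustrative assumptions.
entry = '192.0.2.1 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200'
print(parse_visitor(entry))
```

For a resource that requires authentication, the third field would come back as the username rather than None.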
Next we have the time when the request was made. This information
is enclosed in square brackets, and is in what is called ''common
log format'', or ''standard English format.'' So the request in the above
example was made at 14:47:37 on Saturday, August 19. The -0400 at
the end of the field means that the server is in the time zone
4 hours behind UTC. This tells you
two things. One, that I tend to leave my column until the last
minute, and two, that I appear to have the wrong time-zone set
on my server. I'll have to make a note to take care of that ...
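If you want to work with this field in a script, a short sketch along the following lines may help. It parses a bracketed timestamp with Python's standard datetime module and converts it to UTC. The sample value is an assumption built from the time described above (in particular, the year 2000 and the -0400 offset are assumed), and parse_clf_time is a hypothetical helper name. Note that the %b month abbreviation depends on the locale, so this assumes an English or C locale.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical timestamp field; the year and the offset are assumptions.
raw = "[19/Aug/2000:14:47:37 -0400]"


def parse_clf_time(field: str) -> datetime:
    """Parse a bracketed log timestamp into a timezone-aware datetime."""
    # %z reads the trailing offset, so the result knows its own time zone.
    return datetime.strptime(field.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")


stamp = parse_clf_time(raw)
print(stamp.utcoffset())                 # the server's offset from UTC
print(stamp.astimezone(timezone.utc))    # the same instant expressed in UTC
```

Converting everything to UTC like this is one way to compare entries from servers whose time zones are set differently (or incorrectly).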
The next piece of information is probably the most useful piece
of information in the record. It tells what request was actually made
of the server. This is typically in the format
METHOD RESOURCE PROTOCOL.
In the example above, the METHOD is
GET. The other most common methods are POST and
HEAD. There are a number of other valid methods, but those
three are what you will see most of the time.
RESOURCE is the actual
document, or URL, that was requested from the server. In this example,
the client requested ''/'', which is the root, or front page,
of the server. In most configurations, this corresponds to the
index.html in the
DocumentRoot directory, but could
be something else, depending on your server configuration.
PROTOCOL is usually going to be
HTTP, followed by a version
number. The version number will be either 1.0 or
1.1, with most
of the records being
1.0. As you probably know from other articles,
HTTP is the protocol that makes the web work. HTTP/1.0 was the earlier
version of this protocol, and 1.1 was the more recent version. However,
most web clients still speak version 1.0.
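The METHOD RESOURCE PROTOCOL layout can be split apart with very little code. The following is only a sketch: Request and parse_request are hypothetical names, and it assumes a well-formed request with exactly three space-separated parts. Real logs sometimes contain malformed requests, which this version rejects rather than guesses at.

```python
from typing import NamedTuple


class Request(NamedTuple):
    method: str
    resource: str
    protocol: str


def parse_request(field: str) -> Request:
    """Split a request field such as 'GET / HTTP/1.0' into its three parts."""
    parts = field.split()
    if len(parts) != 3:
        # Malformed or truncated requests do show up in real logs.
        raise ValueError(f"unexpected request field: {field!r}")
    return Request(*parts)


request = parse_request("GET / HTTP/1.0")
print(request.method, request.resource, request.protocol)
```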
The sixth piece of information is a status code. This tells you whether
the request was successful, or encountered some problem. Most of the time,
this will be 200, which means that the transfer was successful, and everything
went well. Hopefully. I'm not going to give the whole list of the status
codes, and what they mean. You need to look in the documentation for that.
But, in general, a status code that starts with 2 was successful. Starting
with a 3 means that the request was redirected somewhere else for some
reason. Starting with a 4 means that the user did something wrong, and
starting with a 5 means that the server did something wrong.
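The rule of thumb above depends only on the first digit of the code, so it is easy to sketch in Python. The function name status_class and the label strings are illustrative choices, not anything Apache itself defines.

```python
def status_class(code: int) -> str:
    """Return a rough description of a status code based on its first digit."""
    classes = {
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 leaves just the leading digit of a 3-digit code.
    return classes.get(code // 100, "other")


for code in (200, 302, 404, 500):
    print(code, status_class(code))
```

Grouping entries this way is a quick first pass when scanning a log for problems; the documentation remains the place to look up what an individual code means.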