Apache Guide: Logging with Apache--Understanding Your access_log Page 2

The second slot, alas, is blank, and almost always will be. That's what that ''-'' is: a placeholder for the second piece of information. This is where you're supposed to get the identity of the visitor: not just their login name, but their email address or some other unique identifier. This information is supposed to be returned by identd, or directly by the browser. In the old days, back when Netscape 0.9 was the dominant browser, you would usually have email addresses in this spot. However, it did not take long for unsavory marketing types to decide that it would be a good idea to collect those email addresses and send them unsolicited email (also known as spam). So, before very long, this feature was removed from just about every browser on the market. You will almost never find information in this field.

The third piece of information is also blank. The information that would appear there is the username with which the visitor authenticated. This will appear, of course, only when you have required authentication for a particular resource. So for the majority of entries in your log file, for most sites, this will be blank.

Next we have the time when the request was made. This information is enclosed in square brackets, and is in what is called ''common log format'', or ''standard English format.'' So the request in the above example was made at 14:47:37 on Saturday, August 19. The -0400 on the end of the field means that the server is in a time zone 4 hours behind UTC. This tells you two things. One, that I tend to leave my column until the last minute, and two, that I appear to have the wrong time zone set on my server. I'll have to make a note to take care of that ...
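If you want to work with this field in a script, here is a minimal Python sketch of one way to read it. The sample string is an assumption reconstructed from the date, time, and offset described above, the function name parse_log_time is made up for illustration, and the month abbreviation is assumed to be in English (the %b directive depends on the locale).

```python
from datetime import datetime, timezone

def parse_log_time(field):
    """Turn a bracketed access_log time field into a timezone-aware datetime."""
    # Strip the square brackets, then parse day/month/year:time and the UTC offset.
    return datetime.strptime(field.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")

# Assumed sample value, built from the details given in the text.
stamp = parse_log_time("[19/Aug/2000:14:47:37 -0400]")

# The offset is negative, meaning the server clock is behind UTC.
offset_hours = stamp.utcoffset().total_seconds() / 3600

# The same moment expressed in UTC.
utc_stamp = stamp.astimezone(timezone.utc)

print(offset_hours)
print(utc_stamp.isoformat())
```

Converting every entry to UTC like this is handy when you compare logs from servers in different time zones, or from a server whose time zone is set wrong.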

The next piece of information is probably the most useful piece of information in the record. It tells what request was actually made of the server. This is typically in the format METHOD RESOURCE PROTOCOL.

In the example above, the METHOD is GET. The other most common methods will be POST and HEAD. There are a number of other valid methods, but those three are what you will see most of the time.

The RESOURCE is the actual document, or URL, that was requested from the server. In this example, the client requested ''/'', which is the root, or front page, of the server. In most configurations, this corresponds to the file index.html in the DocumentRoot directory, but could be something else, depending on your server configuration.

The PROTOCOL is usually going to be HTTP, followed by a version number. The version number will be either 1.0 or 1.1, with most of the records being 1.0. As you probably know from other articles, HTTP is the protocol that makes the web work. HTTP/1.0 is the earlier version of this protocol, and 1.1 is the more recent version. However, most web clients still speak version 1.0.
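The METHOD RESOURCE PROTOCOL layout is easy to pull apart in a script. The following is only a sketch: the names Request and split_request are hypothetical, the protocol version in the sample string is assumed, and a request that does not have exactly three parts is simply reported as None rather than guessed at.

```python
from collections import namedtuple

# Hypothetical container for the three parts of the request field.
Request = namedtuple("Request", ["method", "resource", "protocol"])

def split_request(request):
    """Split a request field into METHOD, RESOURCE, and PROTOCOL."""
    parts = request.split()
    if len(parts) != 3:
        # Malformed or empty requests do not follow the usual layout.
        return None
    return Request(*parts)

# Assumed sample: a GET for the front page, as in the example discussed above.
req = split_request("GET / HTTP/1.0")
print(req.method, req.resource, req.protocol)
```

Returning None for anything unexpected is a deliberate choice here, since a real log will contain the occasional garbage request.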

The sixth piece of information is a status code. This tells you whether the request was successful or encountered some problem. Most of the time, this is 200, which means that the transfer was successful, and everything went well. Hopefully. I'm not going to give the whole list of the status codes and what they mean. You need to look in the documentation for that. But, in general, a status code that starts with 2 means that the request was successful. Starting with a 3 means that the request was redirected somewhere else for some reason. Starting with a 4 means that the client did something wrong, and starting with a 5 means that the server did something wrong.
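The rule of thumb above, where the first digit tells you the general outcome, can be sketched in a few lines of Python. The function name status_class and the label strings are made up for illustration; anything outside the four groups described here is lumped together as ''other''.

```python
def status_class(code):
    """Map a status code to the general category given by its first digit."""
    classes = {
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 leaves just the leading digit of a 3-digit code.
    return classes.get(int(code) // 100, "other")

for code in ("200", "302", "404", "500"):
    print(code, status_class(code))
```

Because the code is read from a text file, the sketch accepts either a string or a number.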

This article was originally published on Aug 21, 2000
