
Apache Guide: Logging with Apache--Understanding Your access_log

By Rich Bowen
Posted Aug 21, 2000


Apache comes with built-in mechanisms for logging activity on your server. In this series of articles, I'll talk about the standard way that Apache writes log files, and some of the tricks for getting more useful information and statistics out of your server.

Apache keeps extensive track of your server usage via logfiles. In this article, Rich Bowen discusses logfiles and how you can get more useful information from them.

This week we'll talk about the information that appears in your transfer log, and what it all means.

The standard log files

If you did a default installation of Apache, the server writes two log files when it runs: access_log (access.log on Windows) and error_log (error.log on Windows). With a default installation, these files live in /usr/local/apache/logs. On Windows, the logs are in the logs subdirectory of wherever you installed Apache. Packaged distributions of Apache put the log files in various other places, so you may have to poke around to find them, or check the configuration file for the configured location.

access_log

access_log is, as the name suggests, the log of all accesses to your server. Typical entries in this file look like:

        216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654

This line has room for seven pieces of information. Two of them are blank in this example (each shown as a lone -), but the space for them is still there.
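To make the seven slots concrete, here is a minimal Python sketch that splits the example entry apart. The field names (host, ident, user, time, request, status, size) are labels I have assumed for illustration, and the pattern only covers lines shaped like the one above; treat it as a rough sketch, not a complete log parser.

```python
import re

# One named group per slot in the entry. The group names are
# illustrative labels, not anything Apache itself defines.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_access_line(line):
    """Return a dict of the seven fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    # A lone "-" marks a slot that is blank for this request.
    return {key: (None if value == "-" else value)
            for key, value in match.groupdict().items()}

sample = '216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654'
entry = parse_access_line(sample)
print(entry["host"], entry["request"], entry["status"], entry["size"])
```

The two blank slots come back as None, which makes them easy to tell apart from real values later on.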

The first piece of information is the address of the remote host, that is, who is looking at your web site. In the example above, the host visiting my web site is 216.35.116.91, which is, incidentally, the IP address of the machine called si3001.inktomi.com. (I figured that out by looking up the address in DNS, with the nslookup utility.) inktomi.com is a company that makes web searching software. (I looked at their web site.) Since this same IP address requested the file robots.txt just a few seconds earlier, I suspect that this is a web searching spider that was indexing my web site. I'll talk about spiders in another column. So, from that first piece of information and a glance back in the log file, I've already found out quite a bit about my visitors.

By default, this address is just the IP address of the remote host. You can tell Apache to look up the host name for every address, and put those host names in the log instead of the IP addresses. This is probably not a good idea, since a DNS lookup takes time and slows down the handling of requests, and so slows down your entire server. There are also various tools that will go through your log after the fact and resolve all the IP addresses to host names, so there's no real advantage to doing the lookups at request time anyway.
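As a rough illustration of resolving addresses after the fact, the following Python sketch rewrites log lines so that the leading IP address becomes a host name. The function names and the lookup parameter are hypothetical, and it assumes every line begins with the remote address followed by a space. Each address is looked up only once and then cached, since the same visitor usually appears on many lines.

```python
import socket

def reverse_lookup(ip):
    """Return the host name for ip, or ip itself if no name is found."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # No reverse DNS entry (or DNS unavailable): keep the address.
        return ip

def resolve_log_lines(lines, lookup=reverse_lookup):
    """Yield each log line with its leading IP address replaced by a host name."""
    cache = {}
    for line in lines:
        ip, sep, rest = line.partition(" ")
        if not ip:
            # Blank or oddly indented line: pass it through untouched.
            yield line
            continue
        if ip not in cache:
            cache[ip] = lookup(ip)
        yield cache[ip] + sep + rest
```

Run over an open log file, for example `for line in resolve_log_lines(open("access_log")): ...`, it performs the slow DNS work offline, where it cannot hold up the server.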

But, if you want to, you can tell Apache to do these lookups with the directive:


        HostNameLookups on

Setting HostNameLookups to double, rather than on, will cause Apache to follow the reverse lookup with a forward lookup on the name that it finds, to verify that the name points back to the IP address that you started with. The value is set to off by default.
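The idea behind the double setting can be sketched in a few lines of Python. This is an illustration of the check, not Apache's actual code: the function name and its reverse and forward parameters are hypothetical, and the standard library calls used here only handle IPv4 addresses.

```python
import socket

def double_lookup(ip, reverse=socket.gethostbyaddr,
                  forward=socket.gethostbyname_ex):
    """Return the host name for ip if a forward lookup confirms it, else None."""
    try:
        # Step 1: reverse lookup, address -> name.
        name = reverse(ip)[0]
        # Step 2: forward lookup, name -> list of addresses.
        addresses = forward(name)[2]
    except OSError:
        return None
    # Trust the name only if it maps back to the address we started with.
    return name if ip in addresses else None
```

Calling `double_lookup("216.35.116.91")` would return a name only when both lookups agree. A name that fails the second step may have been set by whoever controls the reverse DNS for that address, so it should not be trusted.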


