Apache Guide: Logging with Apache–Understanding Your access_log




Apache comes with built-in mechanisms for logging activity on
your server. In this series of articles, I’ll talk about
the standard way that Apache writes log files, and some of
the tricks for getting more useful information and statistics
out of your server.

Apache keeps extensive track of your server usage via logfiles. In this article, Rich Bowen discusses logfiles and how you can get more useful information from them.

This week we’ll talk about the information that appears in
your transfer log, better known as the access_log, and what it all means.

The standard log files

If you have done a default installation of Apache, the server
writes two log files when it runs. These files
are called access_log (access.log on Windows) and
error_log (error.log on Windows). With a default installation, they are
found in /usr/local/apache/logs. On Windows, the logs are in
the logs subdirectory of wherever you installed Apache.
Packaged versions of Apache often put the log files
somewhere else, so you may have to poke around to find them,
or check the configuration file for the configured location.
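If poking around does not turn the logs up, the configuration file can be searched for the directives that name them. The sketch below is only an illustration: it assumes the logs are configured with the CustomLog, TransferLog, or ErrorLog directives, the helper name find_log_paths and the sample configuration text are made up for this example, and it does not follow Include lines or expand relative paths.

```python
import re

# Apache directives that name a log destination.
LOG_DIRECTIVES = ("CustomLog", "TransferLog", "ErrorLog")

# Directive name at the start of the line, then a quoted or bare target.
LOG_LINE = re.compile(
    r'^\s*(%s)\s+("[^"]*"|\S+)' % "|".join(LOG_DIRECTIVES),
    re.IGNORECASE,  # Apache directive names are case-insensitive
)


def find_log_paths(config_text):
    """Return (directive, target) pairs for each log directive found.

    Hypothetical helper: commented-out lines are skipped because the
    directive must be the first thing on the line.
    """
    found = []
    for line in config_text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            found.append((match.group(1), match.group(2).strip('"')))
    return found


# Made-up excerpt standing in for a real httpd.conf.
sample = """\
# ErrorLog /tmp/old_error_log
ErrorLog logs/error_log
CustomLog "logs/access_log" common
"""

for directive, target in find_log_paths(sample):
    print(directive, "->", target)
```

A relative target such as logs/error_log is interpreted by Apache relative to the ServerRoot, so the full path still depends on that setting.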

access_log

access_log is, as the name suggests, the log of all
accesses to your server. Typical entries in this file look like:

        216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654

This line has space for 7 pieces of information, although two of them
are blank in this example and appear as a dash (-).
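To make the seven slots concrete, here is a rough Python sketch that splits the example entry apart. The field names (host, ident, user, and so on) are labels chosen for this illustration, not names Apache itself prints, and the pattern assumes the default log format shown above with no escaped quotes inside the request.

```python
import re

# One named group per slot in the example entry above.
# The group names are illustrative labels, not Apache terminology.
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_access_line(line):
    """Return a dict of the seven fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


example = ('216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] '
           '"GET / HTTP/1.0" 200 654')

entry = parse_access_line(example)
for name, value in entry.items():
    print(f"{name:8}{value}")
```

The two dashes in the example land in the second and third slots, which is why the entry still counts as having seven fields.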

The first piece of information is the address of the remote host.
That is, who is looking at your web site. In the example above,
the host visiting my web site is 216.35.116.91, which is,
incidentally, the IP address of the machine called
si3001.inktomi.com. (I figured that out by looking up the
address in DNS, with the nslookup utility.) inktomi.com is
a company that makes web searching software. (I looked at their
web site.) Since this same IP address requested the file
robots.txt just a few seconds earlier, I suspect that this
is a web searching spider that was indexing my web site. I’ll
talk about spiders in another column. So, just based on that
first piece of information, and a glance back in the log file,
I’ve already found out quite a bit about this visitor.

By default, this address is just the IP address of the remote
host. You can tell Apache to look up the host name for each address and
put that name in the log instead of the IP address. This is
probably not a good idea, since each request may then have to wait for a
DNS lookup, which slows down your entire server. Tools such as logresolve,
which is included with Apache, will go through your log after the fact
and resolve all the IP addresses to host names, so there’s no real
advantage to doing this anyway.
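The after-the-fact approach is simple enough to sketch in a few lines of Python. This is not how any particular tool is implemented; the function name resolve_hosts is made up, and the sketch assumes each log line starts with the remote address followed by a space. Results are cached so that each address is looked up only once, and an address that cannot be resolved is left as it was.

```python
import socket


def resolve_hosts(lines, lookup=socket.gethostbyaddr):
    """Yield log lines with the leading IP address replaced by a host name.

    `lookup` must behave like socket.gethostbyaddr: return a tuple whose
    first item is the host name, or raise OSError when the lookup fails.
    """
    cache = {}
    for line in lines:
        address, sep, rest = line.partition(" ")
        if address not in cache:
            try:
                cache[address] = lookup(address)[0]
            except OSError:
                cache[address] = address  # keep the IP if it has no name
        yield cache[address] + sep + rest


# Offline demonstration with a stand-in lookup table, so that nothing
# here touches the network. The name is the one quoted in this article.
def demo_lookup(address):
    known = {"216.35.116.91": "si3001.inktomi.com"}
    if address not in known:
        raise OSError("no name for " + address)
    return (known[address], [], [address])


log = ['216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654']
for resolved in resolve_hosts(log, lookup=demo_lookup):
    print(resolved)
```

Leaving out the lookup argument makes the function use real DNS, which is the slow part that is better done offline than on every request.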

But, if you want to, you can tell Apache to do these lookups with
the directive:

        HostnameLookups On

Setting HostnameLookups to Double, rather than On, makes Apache
follow the reverse lookup with a forward lookup on the name that it finds,
to verify that the name points back to the IP address that you started with.
The value is set to Off by default.
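The double check can be sketched in Python to show the idea. This is an illustration of the technique rather than Apache's own code: the function name double_lookup is invented here, and the default resolvers handle IPv4 addresses only.

```python
import socket


def double_lookup(address,
                  reverse=socket.gethostbyaddr,
                  forward=socket.gethostbyname_ex):
    """Return the host name for address if it maps back to address, else None.

    `reverse` turns an address into (name, aliases, addresses);
    `forward` turns a name into (name, aliases, addresses).
    Both are expected to raise OSError when a lookup fails.
    """
    try:
        name = reverse(address)[0]      # address -> name
        addresses = forward(name)[2]    # name -> list of addresses
    except OSError:
        return None
    return name if address in addresses else None
```

Calling double_lookup("216.35.116.91") would perform two real DNS queries. A name is returned only when the second query lists the original address, which guards against a reverse record that claims a name its owner does not control.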
