Log Analysis Basics Page 2

By Martin Brown
Posted Jun 10, 2004


Log Contents

The first step in analyzing your log files is picking out the real data from the log, and to do that you must understand the format. Text logs normally arrange information in defined fields, separated either by a single-character delimiter such as a space or a colon, or laid out in fixed-width columns. Individual fields may also be delimited or formatted according to their content. The block below is an example from an Apache Web server:

192.168.1.59 - - [11/Feb/2004:12:21:57 +0000] "GET / HTTP/1.1" 200 11669
192.168.1.59 - - [11/Feb/2004:12:21:59 +0000] "GET /mcslp.css HTTP/1.1" 200 4828
192.168.1.59 - - [11/Feb/2004:12:21:59 +0000] "GET /weather/images/3.gif HTTP/1.1" 200 566
192.168.1.58 - - [11/Feb/2004:12:22:21 +0000] "GET /mail/index.cgi?m=v&mbox=com-mcslp-lbt&id=2532 
  HTTP/1.1" 200 20656
192.168.1.58 - - [11/Feb/2004:12:22:22 +0000] "GET /mcslp.css HTTP/1.1" 304 0

This example mixes two kinds of delimiter: spaces separate the fields, while square brackets enclose the date/time and double quotes enclose the request, which contains the URL. Here's another example, this time from syslog:

May 16 18:14:30 twinsol sm-mta[22012]: [ID 801593 mail.info] i4GHEQxG022012: 
  from=<lwmeditors-bounces@shetland.sys-con.com>, size=20913, class=-30, nrcpts=1,
  msgid=<200405161600.i4GG06xf025868@shetland.sys-con.com>, 
  proto=ESMTP, daemon=MTA, relay=postfix@plunder.dreamhost.com [66.33.213.13]
May 16 18:14:30 twinsol sm-mta[22017]: [ID 801593 mail.info] i4GHEQxG022012: 
  to=<com-mcslp-lbt@gendarme.mcslp.com>, delay=00:00:01, 
  xdelay=00:00:00, mailer=cyrusv2, pri=194913, relay=localhost, dsn=2.0.0, stat=Sent

Being able to read and understand these logs helps focus your approach and provides a basis to analyze the data.
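
As a rough illustration of reading the format, the Apache sample above can be split into named fields with a regular expression. This is a minimal sketch in Python, not a complete parser: the pattern and the `parse_clf_line` helper are assumptions written for the sample lines shown here, and logs configured with a different format would need a different pattern.

```python
import re
from datetime import datetime

# Pattern for the fields visible in the Apache sample: host, two identity
# fields, a bracketed timestamp, a quoted request, a status code and a size.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The quoted request is itself space-delimited: method, URL, protocol.
    parts = fields["request"].split()
    fields["method"], fields["url"], fields["protocol"] = (parts + [None] * 3)[:3]
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

sample = '192.168.1.59 - - [11/Feb/2004:12:21:59 +0000] "GET /mcslp.css HTTP/1.1" 200 4828'
entry = parse_clf_line(sample)
print(entry["host"], entry["url"], entry["status"], entry["size"])
```

The outer fields are matched on spaces, while the brackets and quotes protect the spaces that appear inside the timestamp and the request.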

Most log analysis tools provide a range of information, but what they report most often is basic statistics drawn from the log. For example, from a Web log you can obtain a list of the URLs visited and a count of the number of times each was accessed. This tells you how popular a particular page or area of your site is.
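
A URL count of this kind takes only a few lines. The sketch below, in Python, assumes access-log lines in the Apache format shown earlier and uses four of those sample lines as input; `count_urls` is a hypothetical helper, not part of any particular analysis tool.

```python
from collections import Counter

def count_urls(lines):
    """Count requests per URL; the URL is the second token inside the quoted request."""
    hits = Counter()
    for line in lines:
        try:
            request = line.split('"')[1]
            url = request.split()[1]
        except IndexError:
            continue  # skip lines that do not look like access-log entries
        hits[url] += 1
    return hits

sample_log = [
    '192.168.1.59 - - [11/Feb/2004:12:21:57 +0000] "GET / HTTP/1.1" 200 11669',
    '192.168.1.59 - - [11/Feb/2004:12:21:59 +0000] "GET /mcslp.css HTTP/1.1" 200 4828',
    '192.168.1.59 - - [11/Feb/2004:12:21:59 +0000] "GET /weather/images/3.gif HTTP/1.1" 200 566',
    '192.168.1.58 - - [11/Feb/2004:12:22:22 +0000] "GET /mcslp.css HTTP/1.1" 304 0',
]

url_hits = count_urls(sample_log)
# Print the most frequently requested URLs first.
for url, count in url_hits.most_common():
    print(f"{count:5d}  {url}")
```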

If your logs record more detail, particularly something like the date and time of each entry, you can use that to generate time-based statistics. You can, for example, monitor access to a particular page or area of your site over a period of time, perhaps to determine the most popular times for visiting different pages. Over the longer term, the same information gives you usage statistics for the site, showing how access grows or how different parts of the site gain popularity.
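
To make the time-based idea concrete, here is a small Python sketch that counts requests per hour of the day for one area of a site. It assumes the Apache format shown earlier; `hits_per_hour` is a hypothetical helper, and the last sample line is invented (it does not come from the log above) so that a second hour appears in the result.

```python
from collections import Counter
from datetime import datetime

def hits_per_hour(lines, url_prefix="/"):
    """Count requests per hour of day for URLs that start with url_prefix."""
    per_hour = Counter()
    for line in lines:
        try:
            stamp = line.split("[", 1)[1].split("]", 1)[0]
            url = line.split('"')[1].split()[1]
            when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        except (IndexError, ValueError):
            continue  # skip lines that do not parse as access-log entries
        if url.startswith(url_prefix):
            per_hour[when.hour] += 1
    return per_hour

sample_log = [
    '192.168.1.59 - - [11/Feb/2004:12:21:57 +0000] "GET / HTTP/1.1" 200 11669',
    '192.168.1.59 - - [11/Feb/2004:12:21:59 +0000] "GET /mcslp.css HTTP/1.1" 200 4828',
    '192.168.1.58 - - [11/Feb/2004:12:22:22 +0000] "GET /mcslp.css HTTP/1.1" 304 0',
    # Invented entry, not from the article, to show a second hour.
    '192.168.1.58 - - [11/Feb/2004:13:05:10 +0000] "GET /mcslp.css HTTP/1.1" 304 0',
]

css_by_hour = hits_per_hour(sample_log, "/mcslp.css")
for hour in sorted(css_by_hour):
    print(f"{hour:02d}:00  {css_by_hour[hour]}")
```

Run over weeks or months of logs rather than a handful of lines, the same grouping (by day or month instead of by hour) would show the longer-term growth described above.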

Other logs provide alternative types of information and statistics. For example, I use a log processor on my syslog to generate a list of e-mail messages transferred through the machine, recording the date/time, source, and destination address. I'm not as concerned with actual statistics as I am with extracting the salient information from the log.
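
The author's own log processor is not shown, so the following is only a sketch of how such an extraction might look in Python. It assumes each syslog entry sits on a single line (the sample above is wrapped for display), that the `from=` and `to=` lines of one message share the same queue ID, and that addresses appear in angle brackets as they do in the sample. The patterns and the `mail_transfers` helper are hypothetical.

```python
import re

# Timestamp, host, program[pid], optional "[ID ...]" tag, queue ID, then the details.
ENTRY = re.compile(
    r'^(?P<time>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ [^\s\[]+\[\d+\]: '
    r'(?:\[ID [^\]]*\] )?(?P<qid>\w+): (?P<rest>.*)$'
)
# The details start with either from=<address> or to=<address>.
ADDRESS = re.compile(r'(?P<kind>from|to)=<(?P<addr>[^>]*)>')

def mail_transfers(lines):
    """Group sender and recipients by queue ID, keeping the first timestamp seen."""
    messages = {}
    for line in lines:
        entry = ENTRY.match(line)
        if entry is None:
            continue
        found = ADDRESS.match(entry["rest"])
        if found is None:
            continue
        record = messages.setdefault(
            entry["qid"], {"time": entry["time"], "from": None, "to": []}
        )
        if found["kind"] == "from":
            record["from"] = found["addr"]
        else:
            record["to"].append(found["addr"])
    return messages

# The two syslog entries from the sample above, each joined onto one line.
sample_log = [
    ("May 16 18:14:30 twinsol sm-mta[22012]: [ID 801593 mail.info] i4GHEQxG022012: "
     "from=<lwmeditors-bounces@shetland.sys-con.com>, size=20913, class=-30, nrcpts=1, "
     "msgid=<200405161600.i4GG06xf025868@shetland.sys-con.com>, "
     "proto=ESMTP, daemon=MTA, relay=postfix@plunder.dreamhost.com [66.33.213.13]"),
    ("May 16 18:14:30 twinsol sm-mta[22017]: [ID 801593 mail.info] i4GHEQxG022012: "
     "to=<com-mcslp-lbt@gendarme.mcslp.com>, delay=00:00:01, "
     "xdelay=00:00:00, mailer=cyrusv2, pri=194913, relay=localhost, dsn=2.0.0, stat=Sent"),
]

transfers = mail_transfers(sample_log)
for qid, msg in transfers.items():
    print(msg["time"], msg["from"], "->", ", ".join(msg["to"]))
```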
