By Nelson King
Posted Jun 15, 2005

Monitoring many servers — i.e., keeping an eye on their operation — is one step toward meeting two priorities: troubleshooting and optimizing.

Keeping an eye on your servers requires more than eyeballs. Server monitoring tools oversee server operation, traffic, and usage. We look at what organizations are monitoring and provide a matrix to help determine which products meet your needs.

Hardware may fail, and software may not perform well enough. Servers may fail entirely or, worse, appear to be operating while no longer performing vital functions. The more servers a company manages, the greater the likelihood of problems and the more difficult it becomes to monitor them. Then there's the not-so-small matter of getting bang for the buck. Are the servers performing well enough? Can a group of servers be considered reliable?

These and similar considerations are behind the need for specialized software that falls under the category of server monitoring tools.

Getting the most from server monitoring tools is more than a numbers game of managing as many servers as possible with the fewest people. It's also a matter of what is monitored. In most cases, this means three areas:

  1. Monitoring server operation (the running status)
  2. Monitoring server traffic (both in and out)
  3. Monitoring the results of server use (keeping logs, statistics, and analysis)

Within the three areas, the products that monitor servers also cover (albeit somewhat unevenly) a great deal of functionality, which can be broken down like this:

  • Physical: Monitoring the physical hardware includes keeping an eye on the temperature, power supply, and the functioning of components, such as disk drives. Many of these are critical elements, and failure means a dead server. Software that monitors the hardware can be very vendor-specific; for example, a tool may work on IBM servers but not Dell servers.
  • Server Performance: Monitoring the performance of a server (e.g., CPU usage, available disk space, and memory availability), especially under a variety of conditions, helps with both troubleshooting and optimization.
  • Services: All servers run a number of services (e.g., DNS, POP3, and other TCP/IP services). Many of these are critical to server operation. Again, if they fail, the server fails. Most monitoring software covers a wide range of services.
  • Network: An old and very large area of server monitoring is associated with operating a network. This is often considered a separate category of monitoring software, although such functionality is often built into general-purpose server monitoring tools.
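
As a rough illustration of the Server Performance and Services items above, the following Python sketch checks free disk space on the local machine and tests whether a service port accepts TCP connections. It is not taken from any product mentioned in this article; the function names, the default timeout, and the service names passed in are all assumptions made for the example.

```python
import shutil
import socket


def disk_free_percent(path="/"):
    """Return free space on the volume holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def tcp_service_up(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(host, ports):
    """Map each (hypothetical) service name to whether its port accepts connections."""
    return {name: tcp_service_up(host, port) for name, port in ports.items()}
```

A call such as check_services("mail.example.com", {"pop3": 110}) would report whether the POP3 port answers (the host name is a placeholder). An open port only shows that something is listening; it does not prove the service is doing useful work, which is why commercial tools typically go further than a connection test.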

In addition, many server monitoring tools are designed for a particular type of server (e.g., Web or database servers). At the end of this article, we've provided a ServerWatch Functions Checklist for server monitoring tools. Although the matrix attempts to cover the features that are generally available, it barely scratches the surface of the more-specialized features for monitoring Web servers or networks.

In all, server monitoring software is very diverse, and literally hundreds of products are on the market. Most offer "real-time" monitoring, which displays the current condition of servers, along with historical monitoring, which is the record of server performance over time. Server monitoring tools are also packaged in different ways. They are a standard part of the big server management suites, such as IBM Tivoli or Computer Associates Unicenter. There are also many general server monitoring products, such as GFI Software Network Server Monitor and BMC Software Server Monitoring and Management.
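
The difference between real-time and historical monitoring can be pictured with a small Python sketch. It assumes a single numeric metric (say, CPU usage) sampled with a timestamp; the class name MetricHistory and its methods are hypothetical and do not reflect how any of the products named above store their data.

```python
from collections import deque
from statistics import mean


class MetricHistory:
    """Keep the latest reading plus a bounded window of past readings."""

    def __init__(self, max_samples=1000):
        # Oldest samples are discarded automatically once the window is full.
        self._samples = deque(maxlen=max_samples)

    def record(self, timestamp, value):
        self._samples.append((timestamp, value))

    def current(self):
        """Real-time view: the most recent value, or None if nothing is recorded."""
        return self._samples[-1][1] if self._samples else None

    def average_since(self, start):
        """Historical view: mean of values recorded at or after `start`."""
        values = [v for t, v in self._samples if t >= start]
        return mean(values) if values else None
```

Here current() answers "how is the server doing right now?" while average_since() answers "how has it been doing over a period?", which is the kind of question that supports optimization rather than troubleshooting.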

Specialized products provide features for specific operating systems (Microsoft Windows being an obvious example) and types of servers. To further complicate the choices, server monitoring tools can be purchased and operated by the user, hosted by a third-party company but operated by the user, or fully outsourced (i.e., hosted and operated by a third party). A cursory product search on the Internet will reveal scores of hosted and outsourced approaches.

