Certainly…Possibly…Usually
CPU utilization has been the least troublesome tuning issue, in my experience. HTTP services aren’t CPU-intensive, so the major focus when addressing CPU issues is to make sure that you have a server fast enough to process the number of HTTP operations per second that your service requires. Closely linked to this is the need to run HTTP server programs that are well designed, and to make sure you are maximizing the efficiency of your CGI, database or other gateway programs.
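As a rough illustration of sizing a server against the HTTP operations per second a service requires, the sketch below converts a daily hit count into a peak rate. The function name and the `peak_to_average` multiplier are assumptions made for this example, not measured values; substitute the ratio you observe in your own logs.

```python
SECONDS_PER_DAY = 86_400


def peak_ops_per_second(hits_per_day, peak_to_average=3.0):
    """Estimate the peak HTTP operations per second a server must sustain.

    hits_per_day    -- total requests seen in a day (from your access logs)
    peak_to_average -- hypothetical ratio of busiest-period load to average
                       load; an assumption, measure it on your own service
    """
    average = hits_per_day / SECONDS_PER_DAY
    return average * peak_to_average


if __name__ == "__main__":
    # 864,000 hits per day averages 10 operations per second.
    print(peak_ops_per_second(864_000, peak_to_average=3.0))
```

Comparing this estimate with the rate your server sustains under a load test gives a first indication of whether CPU is likely to be a concern at all.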
One area that is often overlooked is the amount of resources that a search and indexing engine will consume. This is a good argument for baseline measurements which will indicate which processes are responsible for the bulk of resource use.
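One way to take such a baseline is to total CPU and memory use per program name, so that a hungry indexer stands out from the HTTP daemons. The sketch below assumes text in the shape produced by `ps -eo comm,pcpu,pmem` on many Unix systems (the exact columns available vary by platform); the sample data and the function name are made up for illustration.

```python
from collections import defaultdict

# Made-up sample in the shape of `ps -eo comm,pcpu,pmem` output.
SAMPLE = """\
COMMAND         %CPU %MEM
httpd            2.0  1.5
httpd            3.0  1.5
indexer         40.0 12.0
"""


def summarize_by_command(ps_output):
    """Total %CPU and %MEM per command, sorted by CPU use, highest first."""
    totals = defaultdict(lambda: [0.0, 0.0])
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        # Split from the right so command names containing spaces survive.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        command, cpu, mem = parts
        try:
            cpu_value, mem_value = float(cpu), float(mem)
        except ValueError:
            continue  # ignore rows that are not numeric
        totals[command.strip()][0] += cpu_value
        totals[command.strip()][1] += mem_value
    rows = [(name, cpu, mem) for name, (cpu, mem) in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, cpu, mem in summarize_by_command(SAMPLE):
        print(f"{name:<12} cpu={cpu:5.1f}% mem={mem:5.1f}%")
```

Running the same summary at intervals, under light and heavy load, gives you figures to compare against after each tuning change.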
When trying to improve the throughput of CGI programs, consider writing directly to the server’s API if available. Netscape, Microsoft and Apache all have very robust APIs available for their respective platforms that allow you to implement programs that reduce both CPU and memory consumption. The downside, of course, is that the APIs usually require programs written in C or C++ in threaded environments, which increases the time for deployment, but if you are trying to maximize throughput, it is well worth the time spent.
Just Disking Around
Disk I/O requires a close look when optimizing performance. Almost all of the systems I work with are SCSI-based disk subsystems, which I believe give greater flexibility in drive configurations. I tend to favor using a number of small (typically 1 GB or less) hard drives as opposed to one or two large drives. Disk I/O can be a substantial bottleneck on fast systems, and having multiple disks seeking and retrieving information at the same time can be a much more efficient use of resources.
On heavily used servers, I recommend dedicating a single drive just for log files, as I find the log file system to be the most active one under load. Databases call for similar treatment: the file systems that contain database records may need to be spread among several disks to improve database throughput. When the situation permits, I prefer separating database servers from Web servers, as they tend to have different tuning requirements that are not always compatible.
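A quick way to check whether logs, documents and database files really sit on separate file systems is to compare the device IDs that `os.stat` reports. This is a minimal sketch; the function names are my own, and any paths you pass in are whatever your installation uses. Note that a distinct device ID means a distinct file system, which does not by itself prove a distinct physical drive.

```python
import os


def on_same_device(path_a, path_b):
    """Return True if both paths live on the same file system (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def shared_devices(paths):
    """Return groups of labels whose paths share a file system.

    paths -- dict mapping a label (e.g. "logs") to a directory path
    """
    groups = {}
    for label, path in paths.items():
        groups.setdefault(os.stat(path).st_dev, []).append(label)
    # Only groups with more than one member indicate sharing.
    return [sorted(labels) for labels in groups.values() if len(labels) > 1]


if __name__ == "__main__":
    # The current directory stands in for real log and document directories.
    print(shared_devices({"logs": ".", "docs": "."}))
```

An empty list from `shared_devices` means every labeled directory is on its own file system; any group it returns names the areas still competing for the same one.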