Web servers are designed around a certain set of basic goals:
- Accept network connections from browsers.
- Retrieve content from disk.
- Run local CGI programs.
- Transmit data back to clients.
- Be as fast as possible.
When discussing how a Web server works, it is not enough to simply diagram how low-level network packets flow in and out of it; the goals listed above shape the design as well.
Unfortunately, these goals are not totally compatible. For example, a simple Web server could follow the logic below:
- Accept connection
- Generate static or dynamic content and return it to the browser
- Close connection
- Go back to the start and accept the next connection
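The loop above can be sketched in a few lines of Python. This is only an illustration of the one-client-at-a-time logic, not how any particular server is implemented; the names `build_response`, `serve_sequentially`, `generate_content`, and `max_connections` are hypothetical and chosen for this example.

```python
import socket


def build_response(body: bytes) -> bytes:
    """Wrap a body in a minimal HTTP/1.0 response."""
    header = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return header.encode("ascii") + body


def serve_sequentially(listener, generate_content, max_connections=None):
    """Accept, respond, close, repeat: strictly one client at a time.

    `listener` is a bound, listening socket. `generate_content` is a
    hypothetical callable returning the page body as bytes.
    `max_connections` exists only so the sketch can stop; None loops forever.
    """
    served = 0
    while max_connections is None or served < max_connections:
        conn, _addr = listener.accept()        # Accept connection
        with conn:
            conn.recv(4096)                    # Read the request (ignored here)
            body = generate_content()          # Static or dynamic content
            conn.sendall(build_response(body)) # Return it to the browser
        # Leaving the with-block closes the connection
        served += 1                            # Back to the start
    return served
```

While `generate_content()` is running, the loop cannot call `accept()` again, so every other client has to wait.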
This would work just fine for the very simplest of Web sites, but the server would start to encounter problems as soon as clients started hitting the site in large numbers, or if a dynamic page took a long time to generate.
For example, if a CGI program took 30 seconds to generate content (certainly not an ideal situation anywhere, but not completely unheard of), during this time the Web server would be unable to serve any other pages.
So although this model works, it would need to be redesigned to serve more than a few users at a time. Web servers tend to handle this concurrency in one of two ways: multi-threading and multi-processing. In practice, a server can be launched by the inetd daemon on Unix (which starts a new server process for each connection, and so is a form of multi-processing), or it can manage its own threads, its own processes, or a hybrid of the two.
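As a rough illustration of the multi-threaded approach, the sketch below hands each accepted connection to its own thread, so a slow page no longer blocks the accept loop. It is a minimal example under stated assumptions, not a production design: `handle_client`, `serve_threaded`, `generate_content`, and `max_connections` are hypothetical names, and there is no limit on thread count or any error handling.

```python
import socket
import threading


def handle_client(conn, generate_content):
    """Serve one connection; runs in its own thread."""
    with conn:
        conn.recv(4096)                   # Read the request (ignored here)
        body = generate_content()         # May be slow, e.g. a CGI program
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\n" + body)


def serve_threaded(listener, generate_content, max_connections):
    """Accept connections and give each one to a new worker thread.

    `max_connections` bounds the loop so this sketch terminates;
    a real server would keep accepting indefinitely.
    """
    workers = []
    while len(workers) < max_connections:
        conn, _addr = listener.accept()
        worker = threading.Thread(
            target=handle_client, args=(conn, generate_content)
        )
        worker.start()                    # Do not wait; go accept the next client
        workers.append(worker)
    for worker in workers:
        worker.join()                     # Let in-flight responses finish
    return len(workers)
```

A multi-process variant follows the same shape but hands each connection to a child process instead of a thread, which isolates requests from one another at the cost of heavier per-connection overhead.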