Now we will use the server to its full capacity, by keeping all MaxClients clients alive all the time and setting MaxRequestsPerChild high enough that no child will be killed during the benchmark.
  MinSpareServers      50
  MaxSpareServers      50
  StartServers         50
  MaxClients           50
  MaxRequestsPerChild  5000
  NR     NC    RPS     comment
  ------------------------------------------------
  100    10    32.05
  1000   10    33.14
  1000   50    33.17
  1000   100   31.72
  10000  200   31.60
Conclusion: In this scenario there is no overhead involving the parent
server loading new children, all the servers are available, and the
only bottleneck is contention for the CPU.
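In these tables NR is the total number of requests, NC is the number of concurrent clients, and RPS is the resulting rate in requests per second. The figures were produced with the author's benchmarking setup, which is not shown in this section; purely as an illustration, the sketch below shows one way to collect the same kind of NR/NC/RPS numbers with a small forking Perl client. The URL and the request counts are made-up values, and a dedicated tool such as ApacheBench (ab) would give more reliable timings.

  #!/usr/bin/perl -w
  # Illustrative harness only -- not the tool used for the figures above.
  # Forks NC concurrent clients, each fetching its share of NR requests,
  # and reports the overall requests-per-second (RPS) rate.
  use strict;
  use LWP::UserAgent ();
  use Time::HiRes qw(time);

  my $url = 'http://localhost/perl/test.pl';   # hypothetical test script
  my ($nr, $nc) = (1000, 10);                  # NR and NC, as in the tables
  my $per_client = int($nr / $nc);

  my $start = time;
  my @pids;
  for (1 .. $nc) {
      my $pid = fork;
      die "fork failed: $!" unless defined $pid;
      if ($pid == 0) {                         # child process = one client
          my $ua = LWP::UserAgent->new;
          $ua->get($url) for 1 .. $per_client;
          exit 0;
      }
      push @pids, $pid;
  }
  waitpid $_, 0 for @pids;                     # wait for all clients to finish

  my $total   = $per_client * $nc;
  my $elapsed = time - $start;
  printf "NR %d  NC %d  RPS %.2f\n", $total, $nc, $total / $elapsed;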
Now we will change MaxClients and watch the results. Let's reduce MaxClients to 10.
  MinSpareServers      8
  MaxSpareServers      10
  StartServers         10
  MaxClients           10
  MaxRequestsPerChild  5000
  NR    NC   RPS     comment
  ------------------------------------------------
  10    10   23.87   # not a reliable figure
  100   10   32.64
  1000  10   32.82
  1000  50   30.43
  1000  100  25.68
  1000  500  26.95
  2000  500  32.53
Conclusions: Very little difference! Ten servers were able to serve with almost the same throughput as 50 servers. Why? My guess is that contention for the CPU is the reason: it seems that each of the 10 servers served requests about five times faster than when we worked with 50 servers, because with 50 servers each child received its CPU time slice five times less frequently. So a big value for MaxClients doesn't mean that the performance will be better. You have just seen the numbers!
Now we will drastically reduce MaxRequestsPerChild:
  MinSpareServers  8
  MaxSpareServers  16
  StartServers     10
  MaxClients       50
  NR    NC   MRPC  RPS    comment
  ------------------------------------------------
  100   10   10    5.77
  100   10   5     3.32
  1000  50   20    8.92
  1000  50   10    5.47
  1000  50   5     2.83
  1000  100  10    6.51
Conclusions: When we drastically reduce MaxRequestsPerChild (the MRPC column above), the performance approaches that of plain mod_cgi.
Here are the numbers of this run with mod_cgi, for comparison:
  MinSpareServers  8
  MaxSpareServers  16
  StartServers     10
  MaxClients       50
  NR    NC   RPS    comment
  ------------------------------------------------
  100   10   1.12
  1000  50   1.14
  1000  100  1.13
Conclusion: mod_cgi is much slower. 🙂 In the first test, when NR/NC was 100/10, mod_cgi was capable of 1.12 requests per second. In the same circumstances, mod_perl was capable of 32 requests per second, nearly 30 times faster! In the first test each client waited about 90 seconds to be served (100 requests at 1.12 requests per second take roughly 90 seconds to complete). In the second and third tests the wait grew to roughly 900 seconds (1000 requests at about 1.13 requests per second)!
Choosing MaxClients
The MaxClients directive sets the limit on the number of simultaneous requests that can be supported. No more than this number of child server processes will be created. To configure more than 256 clients, you must edit the HARD_SERVER_LIMIT entry in httpd.h and recompile. In our case we want this value to be as small as possible, because in this way we can limit the resources used by the server children. Since we can restrict each child's process size with Apache::SizeLimit or Apache::GTopLimit, the calculation of MaxClients is pretty straightforward:
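For illustration, here is a hedged sketch of that arithmetic: divide the RAM you are willing to dedicate to the web server by the maximum size a single child is allowed to reach. The 400 MB and 10 MB figures below are invented for the example, and the $Apache::SizeLimit::MAX_PROCESS_SIZE setting mentioned in the comment is the usual way to impose such a per-child cap with the mod_perl 1.x Apache::SizeLimit module.

  # Back-of-the-envelope sketch with assumed numbers, for illustration only.
  # A per-child cap of 10 MB could be enforced in startup.pl with:
  #   use Apache::SizeLimit;
  #   $Apache::SizeLimit::MAX_PROCESS_SIZE = 10 * 1024;   # value is in KB
  use strict;

  my $ram_for_httpd  = 400 * 1024;   # KB of RAM dedicated to the web server
  my $max_child_size =  10 * 1024;   # KB allowed per child

  my $max_clients = int($ram_for_httpd / $max_child_size);
  print "MaxClients $max_clients\n"; # prints: MaxClients 40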