Improving mod_perl Driven Site's Performance -- Part VI: Forking and Executing Subprocesses from mod_perl

Forking and Executing Subprocesses from mod_perl

It's desirable to avoid forking under mod_perl, since when you do, you are forking the entire Apache server, lock, stock and barrel. Not only are your Perl code and Perl interpreter duplicated, but so are mod_ssl, mod_rewrite, mod_log, mod_proxy, mod_speling (it's not a typo!) and whatever other modules you have used in your server, along with all the core routines.

This article has information and example code for forking a new process, freeing the parent process, detaching the forked process, avoiding zombie processes, a complete fork example, starting a long-running external program, starting a short-running external program, and executing system() or exec() in the right way.

Modern operating systems come with a very light version of fork() that adds little overhead when called, since it has been optimized to do the absolute minimum of memory page duplication. The copy-on-write technique is what makes this possible. The gist of the technique is as follows: the parent process's memory pages aren't copied to the child's space immediately on fork(); a page is copied only when the child or the parent modifies data in it. Before pages get modified they are marked as dirty, and the child has no choice but to copy the pages that are about to be modified, since they can no longer be shared.
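The visible effect of copy-on-write can be sketched in plain Perl, outside of mod_perl. This is a minimal illustration rather than a measurement: it cannot show the kernel copying pages, only that a write made by the child lands in the child's private copy and leaves the parent's data untouched. The variable names are chosen for the example only.

```perl
use strict;
use warnings;

my @data = (1 .. 5);    # allocated in the parent before fork()

# A pipe lets the child report back what it sees
pipe(my $reader, my $writer) or die "Cannot create pipe: $!\n";
defined(my $pid = fork) or die "Cannot fork: $!\n";

if ($pid == 0) {
    # Child: modifying @data forces the affected pages to be copied
    close $reader;
    push @data, 6;
    print {$writer} scalar(@data), "\n";
    close $writer;
    exit 0;
}

# Parent: read the child's element count, then reap the child
close $writer;
my $child_count = <$reader>;
chomp $child_count;
close $reader;
waitpid($pid, 0);

print "child saw $child_count elements, parent still has ",
      scalar(@data), "\n";
```

Run as a standalone script, the child reports six elements while the parent still holds five. Until the push, both processes were working from the same pages.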

If you need to call a Perl program from your mod_perl code, it's better to try to convert the program into a module and call it as a function, without spawning a special process to do that. Of course, if you cannot do that or the program is not written in Perl, you have to call it via system() or its equivalent, which spawns a new process. If the program is written in C, you may try to write Perl glue code with the help of XS or SWIG, and then the program will be executed as a Perl subroutine.
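As a rough sketch of the difference, the following standalone script does the same small job both ways. The package My::WordCount and its count_words() function are hypothetical stand-ins for a script that has been converted into a module, and the one-liner passed to system() stands in for an external program.

```perl
use strict;
use warnings;

# Hypothetical module: the logic that used to live in a standalone script
package My::WordCount;

sub count_words {
    my ($text) = @_;
    my @words = split ' ', $text;
    return scalar @words;
}

package main;

my $text = "forking is expensive under mod_perl";

# In-process: an ordinary subroutine call, no new process is created
my $n = My::WordCount::count_words($text);
print "in-process count: $n\n";

# Out-of-process: system() spawns a new process for the same work.
# $^X is the path of the running perl binary; the list form of system()
# bypasses the shell.
my $status = system($^X, '-e',
    'my @w = split /\s+/, $ARGV[0]; exit(@w == 5 ? 0 : 1)', $text);
print "external exit status: ", $status >> 8, "\n";
```

The first call costs a subroutine invocation, while the second pays for a fork and an exec on every request.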

Also, by trying to spawn a sub-process, you might be trying to do the "wrong thing". If what you really want is to send information to the browser and then do some post-processing, look into the PerlCleanupHandler directive. The latter allows you to tell the child process to run some code after the request has been processed and the user has received the response. This doesn't release the mod_perl process to serve other requests, but it allows it to send the response to the client faster. If this is the situation and you need to run some cleanup code, you may want to register this code during the request processing via:

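  my $r = shift;                       # the Apache request object
  $r->register_cleanup(\&do_cleanup);  # run do_cleanup() after the response is sent
  # some clean-up code goes inside the braces
  sub do_cleanup { }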

But when a long-running process needs to be spawned, there is not much choice but to use fork(). We cannot just run this long-running process within the Apache process. First, it would keep the Apache process busy instead of letting it do the job it was designed for. Second, if Apache is stopped, the long-running process might be terminated as well, unless it is coded properly to detach from Apache's process group.

In the following sections I'm going to discuss how to properly spawn new processes under mod_perl.

Forking a New Process

This is a typical way to call fork() under mod_perl:

  defined (my $pid = fork) or die "Cannot fork: $!\n";
  if ($pid) {
    # Parent runs this block
  } else {
    # Child runs this block
    # some code comes here
    CORE::exit(0);  # terminate the child so it doesn't fall through
  }
  # possibly more code here usually run by the parent

When using fork(), you should check its return value, since if it returns undef it means that the call was unsuccessful and no process was spawned. This is something that can happen when the system is running too many processes and cannot spawn new ones.
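One way to act on a failed fork() is to retry a few times when the failure looks temporary. The sketch below is a plain-Perl illustration, not mod_perl code: fork_with_retry() is a hypothetical helper, and the retry count and one-second pause are arbitrary assumptions. It treats EAGAIN as the "too many processes right now" case and gives up immediately on any other error.

```perl
use strict;
use warnings;
use Errno qw(EAGAIN);

# Hypothetical helper: retry fork() when the failure looks temporary
sub fork_with_retry {
    my ($max_tries) = @_;
    my $err = '';
    for my $try (1 .. $max_tries) {
        my $pid = fork;
        return $pid if defined $pid;    # 0 in the child, child's PID in the parent
        $err = "$!";                    # keep the message before anything resets $!
        die "Cannot fork: $err\n" unless $! == EAGAIN;
        sleep 1;                        # give the system time to free a process slot
    }
    die "Cannot fork after $max_tries tries: $err\n";
}

my $pid = fork_with_retry(3);
if ($pid) {
    # Parent: wait for the child and report its exit status
    waitpid($pid, 0);
    print "child exited with status ", $? >> 8, "\n";
} else {
    # Child: do some work, then leave (a plain exit is fine outside mod_perl)
    exit 7;
}
```

Whether retrying is appropriate depends on the application. Inside a request handler it may be better to fail fast and report the error than to hold the client while sleeping.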

This article was originally published on Feb 27, 2001