evolt.org: Using Apache to stop bad robots
September 20, 2001
"We're going to use the environment variable features found in Apache to fight our battle, specifically the 'SetEnv' directive. This is a simple alternative to mod_rewrite and almost everything needed is compiled in to the webserver by default. In this example, we're editing the httpd.conf file, but you should be able to use it in an .htaccess file as well."
"... The 'SetEnvIfNoCase' simply sets an enviornment (SetEnv) variable called 'bad_bot' If (SetEnvIf) the 'User-Agent' string contains Wget, EmailSiphon, or EmailWolf, regardless of case (SetEnvIfNoCase). In english, anytime a browser with a name containing 'wget, emailsiphon, or emailwolf' accesses our website, we set a variable called 'bad_bot'. We'd also want to add a line for the User-Agent string of any other Spidert we want to deny."