
evolt.org: Using Apache to stop bad robots



"We're going to use the environment variable features found in Apache to fight our battle, specifically the 'SetEnv' directive. This is a simple alternative to mod_rewrite, and almost everything needed is compiled into the web server by default. In this example, we're editing the httpd.conf file, but you should be able to use it in an .htaccess file as well."

"... The 'SetEnvIfNoCase' directive simply sets an environment variable (SetEnv) called 'bad_bot' if (SetEnvIf) the 'User-Agent' string contains Wget, EmailSiphon, or EmailWolf, regardless of case (SetEnvIfNoCase). In English: any time a browser whose name contains 'Wget', 'EmailSiphon', or 'EmailWolf' accesses our website, we set a variable called 'bad_bot'. We'd also want to add a line for the User-Agent string of any other spider we want to deny."
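The configuration the passage describes might look like the following sketch, using the Apache 1.3/2.0-era access-control syntax (`Order`/`Allow`/`Deny`) that was current when this article was published. The directory path is a placeholder; adjust it to your own document root, or drop the `<Directory>` wrapper entirely if you place the access-control lines in an .htaccess file:

```apache
# Tag requests from known bad robots by matching the User-Agent
# header, case-insensitively. One line per spider to block.
SetEnvIfNoCase User-Agent "Wget" bad_bot
SetEnvIfNoCase User-Agent "EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "EmailWolf" bad_bot

# Placeholder path -- substitute your actual document root.
<Directory "/var/www/html">
    Order Allow,Deny
    Allow from all
    # Refuse any request where the bad_bot variable was set above.
    Deny from env=bad_bot
</Directory>
```

A matching request then receives a 403 Forbidden response instead of the page. Note that this only deters robots honest enough to send their real User-Agent string; a spider can trivially impersonate a normal browser.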

This article was originally published on Sep 20, 2001
