How to block bad bots in WordPress

 

“Bots”, short for “robots”, are computer programs that “surf” multiple websites to perform a variety of automated tasks. Examples include the crawlers used by search engines, such as Googlebot or MSNBot. There are good bots and bad ones. Some bad bots go through your website looking for web forms and email addresses to send you spam. Other bad bots look for security vulnerabilities.

According to Wikipedia, more than half of all web traffic is made up of bots. For the security of your website, it is important to block as many of those bad robots as possible from crawling your WordPress website.

There are two ways to block bad robots from WordPress websites. The first is to use robots.txt, and the other is to modify .htaccess.

1- Block bad bots by editing robots.txt

robots.txt is a text file that is uploaded to the root, or main directory, of a website. It contains instructions for the various bots that tell them how to crawl and index your website. For example, you can disallow spiders from indexing certain files or directories. A sitemap can also be added to robots.txt, which helps spiders index more pages of your website.

This is a basic robots.txt for WordPress:

User-agent: *
Disallow: /wp-admin/
Disallow: /cgi-bin/
Sitemap: http://YOURWEBSITE/sitemap.xml

The example robots.txt above allows all bots to index your website but denies them access to the wp-admin and cgi-bin directories.

How to block bad bots by editing robots.txt

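You can allow or disallow any robot or spider from crawling your website by simply adding two lines of text for each bot or spider to robots.txt. Keep in mind that robots.txt is only a request: well-behaved bots honor it, but a malicious bot can simply ignore it.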

Here is an example of a robots.txt file that blocks three bad spiders:

# robots.txt
# Begin block Bad-Robots from robots.txt
User-agent: asterias
Disallow: /
# -------------------------------------
User-agent: BackDoorBot/1.0
Disallow: /
# -------------------------------------
User-agent: Black Hole
Disallow: /
# -------------------------------------

2- Block bad bots by editing .htaccess

An .htaccess file is a simple ASCII text file named .htaccess. It can be used to perform many tasks, including the following:

  • Customize error pages.
  • Deny access to the website.
  • Password-protect directories.
  • Redirect visitors to another page.
  • … and more.

In order to block bad bots, you need to know how to modify the .htaccess file, which is a hidden file on the server that can be used to control access to your website.
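
As a rough illustration of what such a modification can look like, the sketch below uses Apache's mod_rewrite to answer with 403 Forbidden whenever the User-Agent header contains one of the three bot names from the robots.txt example above. It assumes your host runs Apache with mod_rewrite enabled and allows .htaccess overrides. The bot names are only examples, and a bot that fakes its User-Agent header will not be caught this way.

```apache
# Sketch only: assumes Apache with mod_rewrite enabled.
# In a WordPress .htaccess, place this above the "# BEGIN WordPress" block,
# because WordPress may rewrite anything between its BEGIN/END markers.
<IfModule mod_rewrite.c>
RewriteEngine On
# Match the User-Agent header, case-insensitively, against the example bot names
RewriteCond "%{HTTP_USER_AGENT}" "(asterias|BackDoorBot|Black Hole)" [NC]
# Answer every matching request with 403 Forbidden
RewriteRule "^" "-" [F]
</IfModule>
```

To block more bots, add their names to the list inside the parentheses, separated by the | character.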

The following articles will help you understand bots and how to block unwanted ones:

http://httpd.apache.org/docs/2.4/rewrite/access.html#blocking-of-robots

http://www.thesitewizard.com/apache/block-bots-with-htaccess.shtml

(Warning: make a copy of your .htaccess file before making any changes to it.)

 
