7/15/08 Turn off the security script while you do the WP 2.6 update.
Your WP Security plugin only catches bad bots if you tell it who the bad bots are. This is my current list. Copy and paste it into the banished agent list and hit the update button to add these entries to your own list.
8/28/08
</script>
<SCRIPT>
AnotherBot
botpaidtoclick
Click Bot
cr4nk
curl
DA 5.3
DataCha0s
discobot
EBM-APPLE
EmailSearch
EmailSiphon
FAST ESP Document Retriever
Firefox 2.0
Ginxbot
GrubNG
gvfs
HTTrack
Incutio
Indy Library
Internet Explorer
Internet Ninja
Java
JetBrains
libcurl
libwww-perl
lwp-request
lwp-trivial
Macintosh; I; PPC
Microsoft Data Access
MJ12bot
Mozilla Firefox 5.0
Mozilla/4.0(compatible
Mozilla/4.08
Mozilla/4.61 (Macintosh
Mozilla/5.0(compatible
Mozilla/7.0
Mozilla/8
Mozilla/Firefox
Mp3Bot
MSIE6
NIPGCrawler
PEAR
PECL
PHPot
Provider Protocol Discover
PuxaRapido
PycURL
Security Kol
Site Sniper
Sogou
sun4m
Sunrise
syncrisis
topicblogs
User-Agent
W3CRobot
w:PACBHO60
WebDav
WebRipper
Wget
window.location
Winnie Poh
www.ranks.nl
X12R1
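For readers curious how a list like this might be applied, here is a minimal Python sketch. It assumes each entry is matched as a case-insensitive substring of the visitor's user agent string; this post does not say exactly how the plugin compares them, and the names `BANISHED_AGENTS` and `is_banished` are made up for illustration.

```python
# Hypothetical sketch: the plugin's real matching logic is not described in this post.
# Assumption: an entry matches if it appears anywhere in the user agent, ignoring case.

# A few entries taken from the list above.
BANISHED_AGENTS = ["libwww-perl", "Wget", "HTTrack", "User-Agent", "<SCRIPT>"]


def is_banished(user_agent, banished=BANISHED_AGENTS):
    """Return the first list entry found inside user_agent, or None if none match."""
    ua = user_agent.lower()
    for entry in banished:
        if entry.lower() in ua:
            return entry
    return None
```

If matching really works this way, short entries such as Java or curl would also catch any user agent that merely contains that word, so it is worth checking the logs after adding them.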
More information:
Perishable Press (has a bot block list and other information)
Spiders and bots to block (long list)
Top 10 Spam bots to block
Top web robots comment spammers
Harvester user agents
Spider identification
Notes:
9/8/08
I caught a new scraper today. “FAST ESP Document Retriever”
8/28/08
Interestingly, I had an attack from syncrisis.com, which tried to run its script in the user agent field rather than in the request itself. So I’m adding <SCRIPT>, </script>, window.location, and syncrisis to the user agent field.
8/4/08
I have a user who tells me Mozilla/4.08 is a legit phone browser. You might not want that one on sites likely to be visited by cell phones.
8/1/08
I added lots of bots today. Python-urllib, AnotherBot, Mozilla/9, Mozilla:, PuxaRapido, SiteSucker, newLISP, and yourname were all added for not identifying themselves by URL or email and for not using robots.txt. bot@bot.bot and PHP/5 had no ID and excessive hits. Test was banned for stupidity. Atomic_Email_Hunter, Jakarta, LeechGet, libwww-FM, WWW-Mechanize, and core-project were all banned for attempted badness.
7/23/08 I added the fake browsers Mozilla/8 and Mozilla/Firefox to the list. I also added W3CRobot. It is an open source web crawler that can be used for good or evil. One of them hammered my personal website, so I’m banning it. Do as you choose. I also added topicblogs. They seem to have scraped lots of websites, and all their site says is “coming soon.” There is no way to tell if they are good guys or bad guys, so I put them on the block list.
7/20/08 Lots of bad guys this week: ‘Indy Library’ appears to be an unidentified image grabber; sun4m and EBM-APPLE both tried cross-site scripting attacks; EmailSearch is an email scraper; NIPGCrawler and W3CRobot appear to be scrapers.
7/10/08 Bandwidth is down by a third on my websites, and the number of human visitors is up. So the bad robots are getting filtered. I hadn’t realized how much bandwidth they took up. I found this bot trying cross-site scripting attacks:
Macintosh; I; PPC
7/9/08 Lots of scrapers this week.
DA 5.3
Internet Ninja
7/6/08 Busy, busy little bots: I added 4 new ones to the list
Mp3Bot
gvfs
WebRipper
discobot
7/3/08
AVG is yet again hiding under fake user agents: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1), User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1), and User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)
If you have ‘User-Agent’ on your list, as I do, you will be blocking AVG Toolbar users who prescan websites. This is not AVG’s first run-in with webmasters. Most of the older security programs blocked the last user agent, which is also malformed (there is no space between 5.1; and 1813). I personally am leaving it blocked. All of you should make your own decisions. More information is here: AVG disguises fake traffic as IE6; also see How to beat AVG’s fake traffic spew.
Another concern is that ‘User-Agent’ in the user agent string is used by one of the top all-time forum and blog spam bots. Unblock AVG and you unblock the spam bot. But that is why I left the bots in your hands. Block or unblock them as it suits you.
7/2/08 Winnie isn’t so cute: I caught Winnie Poh trying to hack WP.
Winnie Poh
7/1/08 Fake web browser
Mozilla Firefox 5.0
I’m also seeing several entries that have the bulk of the user agent zeroed out:
Mozilla/5.0 (000000000; 0; 00000 000 00 0 000000; 00000) 000000000000000000 0000000 0000 000000 0000000000000 0000000000000
So far I have not seen bad behavior from this user agent so I am undecided on whether or not to ban it.
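For anyone who would rather watch these than ban them, here is a rough Python sketch for spotting a mostly zeroed-out user agent in a log. The function names and the 0.8 threshold are my own guesses for illustration, not anything the security plugin does.

```python
# Hypothetical helper for flagging (not banning) zeroed-out user agents.
# The 0.8 threshold is an arbitrary assumption; tune it against your own logs.


def zero_fraction(user_agent):
    """Fraction of the letters and digits in user_agent that are the character '0'."""
    alnum = [ch for ch in user_agent if ch.isalnum()]
    if not alnum:
        return 0.0
    return alnum.count("0") / len(alnum)


def looks_zeroed_out(user_agent, threshold=0.8):
    """True if at least `threshold` of the letters and digits are zeros."""
    return zero_fraction(user_agent) >= threshold
```

Punctuation and spaces are ignored, so only the letters and digits count toward the ratio.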
6/25/08
I added in MJ12bot for hammering the site.
MJ12bot
6/25/08 Version 1.7 of the security plugin prevents the webserver from banning itself, so be sure to block this user agent now:
Incutio
6/21/08
mozilla/5.0
Mozilla/4.61 (Macintosh
6/20/08
Mozilla/4.08
lwp-trivial
6/19/08
Blocking WordPress also blocks wp-cron, so don’t use that one if you rely on cron jobs. There is also a website scraper that uses that user agent, so if you are not using cron jobs, block it, but keep an eye on it. I’ll try to find another way to block the scraper that uses it as a user agent. You’ll know by the IP number whether it is you or a scraper being blocked.
You can banish WordPress/2.3, WordPress/2.5, WordPress/4.0, and any other version that is not the WP version you are running.
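The same idea can be written as a rule instead of a list of version strings. The Python sketch below is only an illustration, assuming the user agent contains `WordPress/` followed by a version number; `is_foreign_wordpress` and the `own_version` parameter are hypothetical names, and this is not how the plugin itself works.

```python
import re

# Assumption: WordPress identifies itself as "WordPress/<version>" somewhere in the user agent.
WP_AGENT = re.compile(r"WordPress/(\d+(?:\.\d+)*)")


def is_foreign_wordpress(user_agent, own_version):
    """True if user_agent claims a WordPress version other than own_version.

    Point releases of your own version (e.g. 2.6.1 when own_version is "2.6")
    are treated as yours, so wp-cron keeps working after a minor update.
    """
    match = WP_AGENT.search(user_agent)
    if match is None:
        return False  # Not a WordPress user agent at all.
    version = match.group(1)
    return not (version == own_version or version.startswith(own_version + "."))
```

As noted above, a scraper can still claim your exact version, so the IP number remains the final check.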
Many webmasters ban ‘larbin’ and ‘Jakarta’. I have not yet had trouble with either, so I am not currently banning them.
6/18/08 New Additions: (I am not blocking Firefox or IE; these are fake user agents. I’m still testing this list and will add it to the main list tomorrow if there are no problems.)
Internet Explorer
Firefox 2.0
Mozilla/4.0(compatible
Mozilla/5.0(compatible
WordPress
6/16/08 Many webmasters are having problems with AVG’s out-of-control bot. Should you wish to block it (I am not), add the following to your block list:
Windows NT 5.1;1813
6/13/08 New Additions:
EmailSiphon
Microsoft Data Access
WebDAV
Click Bot
PHPot
lwp-trivial
Did you accidentally trap a Google Bot or Yahoo? I haven’t caught Google yet, but the Yahoo bot is not especially bright and sometimes gets stuck. First verify the IP number against the published list (Robot ip numbers) and be sure you caught the real thing, not a fake. Then just remove its IP number from the IP banished list.
Or just look up the IP number and see whether it belongs to who it claims to be.
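One common way to do that lookup is a reverse DNS check followed by a forward check on the hostname it returns. The Python sketch below shows the idea. The hostname suffixes in `CRAWLER_DOMAINS` are my assumptions and should be checked against each search engine's own documentation, and `verify_crawler` is a made-up name; the resolver functions are parameters only so the logic can be tried without a network.

```python
import socket

# Assumed hostname suffixes for the real crawlers; verify them before relying on this.
CRAWLER_DOMAINS = {
    "Googlebot": (".googlebot.com", ".google.com"),
    "Yahoo! Slurp": (".crawl.yahoo.net",),
}


def verify_crawler(ip, claimed, reverse=socket.gethostbyaddr,
                   forward=socket.gethostbyname_ex):
    """True if ip reverse-resolves into the claimed crawler's domain and that
    hostname resolves back to the same ip (forward-confirmed reverse DNS)."""
    suffixes = CRAWLER_DOMAINS.get(claimed)
    if suffixes is None:
        return False  # Unknown crawler name: nothing to compare against.
    try:
        hostname = reverse(ip)[0]
        if not hostname.lower().endswith(suffixes):
            return False  # Reverse lookup points somewhere else: likely a fake.
        return ip in forward(hostname)[2]
    except OSError:
        return False  # Lookup failed, so the claim could not be confirmed.
```

A visitor that passes both lookups is probably the real robot, and its IP number can come off the banished list; one that fails either step is better left there.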
I have xmlrpc.php listed as Disallowed in my robots.txt file. Both YahooSlurp and the Amazon zermelo ignored that and were flagged because they attempted to crawl that file. I just removed their IPs from the IP list. In the future let’s hope they read the robots.txt file.
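To see what a well-behaved robot should conclude from such a rule, Python's standard `urllib.robotparser` can evaluate it. The robots.txt text below is an assumed example (this post only says xmlrpc.php is Disallowed, and I am assuming the file sits at the site root), and `allowed_by_robots` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; only the xmlrpc.php rule comes from this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /xmlrpc.php
"""


def allowed_by_robots(user_agent, path, robots_txt=ROBOTS_TXT):
    """True if a robot that obeys robots_txt may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

Under these rules any robot that honors robots.txt should skip /xmlrpc.php while remaining free to crawl other pages, so a request for that file suggests the robot did not read or obey the rules.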
3 responses so far ↓
1 tygern8r // Jul 22, 2008 at 5:40 pm
I love your tools! Browsing through my Security Logs has become a daily event now, thanks to your hard work. I do have a request, though. Is there a way to whitelist IPs? Google bot keeps getting caught and so does one or two from MSN. I even managed to ban myself once. Thanks again for the fantastic tools!
2 ljmacphee // Jul 22, 2008 at 8:12 pm
If you are comfortable with php yes, barebones directions are in the comments section here:
Security plugin
If not drop a note or another comment and we’ll figure something out.
3 tygern8r // Jul 24, 2008 at 8:18 pm
I read through all the comments, but didn’t pick up on the directions. Did I miss something obvious? I’m somewhat comfy with php, so if it’s easy for you to give me a pointer, I’d appreciate it. Seems like I could just have it make an ip whitelist table in the database and have it check there first, if ip exists, then exit.