Looking at my site statistics, I have noticed that for the past three months a bot has been scanning my site multiple times a day. I don’t know its purpose, but I know that it is using a fake ISP name and user agent. Companies running a bot for a good purpose, like research or archiving a site, don’t mislead webmasters with fake metadata, so I assumed this was a malicious bot and I obviously wanted it blocked.
The fake ISP that my analytics software, StatCounter, shows for this bot is labelled “Merck & Co.”, a multinational pharmaceutical company; the operating system shows as Windows 8 and the browser is either Chrome or Firefox. At first glance everything looks fine. I only realised it was a bot because my blog has fewer than a hundred visitors a day, and it was odd to see twenty daily visits from Merck & Co. when they are not even computer related. I dug deeper to see what URLs it was visiting, and it appears to scan all tags. I then looked at the IP host name, and that is when it became clear to me that it was a bot with a fake ID: the hostname (reverse lookup of the IP) belonged to amazonaws.com (Amazon Web Services), a cloud service renting cheap servers.
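The hostname check described above can also be done by hand. This is only a minimal sketch, assuming Python 3 is available on your computer; the function names `reverse_hostname` and `looks_like_aws` are my own inventions for illustration, and the hostname in the test is an example of the general shape of AWS host names, not one taken from my logs.

```python
import socket
from typing import Optional


def reverse_hostname(ip: str) -> Optional[str]:
    """Return the reverse-DNS (PTR) hostname for an IP, or None if there is none."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def looks_like_aws(hostname: Optional[str]) -> bool:
    """True if the hostname sits under amazonaws.com (hypothetical helper)."""
    if not hostname:
        return False
    # Normalise case and a possible trailing dot before comparing the suffix.
    return hostname.lower().rstrip(".").endswith(".amazonaws.com")
```

Calling `looks_like_aws(reverse_hostname(ip))` with an address from your own logs tells you whether the visitor's reverse lookup points at Amazon. A bot operator does not control that hostname the way they control the user agent, which is why this check gave the bot away.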
Every single “Merck & Co.” hit scanning my site came from the same US Amazon data centre, and they are using more than one server: they have thousands of IPs, although all in the same range. On a side note, a month ago my Adsense account had a 1000-fold increase in earnings: it earned me $1000 in a single day when it normally earns me $1/day. It was obvious that a bot had been clicking on advertisements, and since this wasn’t me, I am assuming that a black hat hacker was trying to drain a competitor’s Adsense funds.
As soon as I noticed it, I reported the scheme to Google myself to avoid any account suspension; you can report suspicious activity from your Google Adsense panel. The Adsense team never replied. I expected this: Google Adsense is known for never replying to its partners and treating them like garbage who don’t deserve a reply. The only reason I use them here is that, for small traffic sites like mine, they pay a little bit more than others.
There are two ways to block malicious bots from scanning your site. The first is manual, and it works if you have very few IPs to block and you don’t expect them to change. To manually block an IP from visiting your site, download the .htaccess file from your server using an FTP client like FileZilla and edit it on your computer. This is very difficult to do with Notepad in Windows, as Notepad has problems saving files whose name starts with a dot (.). To edit .htaccess use a proper editor for programmers; I recommend Notepad++ (notice the ++ sign at the end), it is the one I use and it is free.
Add the following lines at the end of your .htaccess file, replacing the listed IP address with the one you would like to ban, then upload the file back to your server:
# User IP Banning
<Limit GET POST>
# Evaluate the allow rules first, then let any matching deny rule win
order allow,deny
deny from 18.104.22.168
allow from all
</Limit>
If you want to block more IPs, add more lines that say “deny from” followed by the offending IP, as many as you need. If you want to block a whole range, use the line “deny from 22.214.171.124/24” (notice the /24 at the end: that is not part of the IP, it is the prefix length, and a /24 covers 256 IP addresses). By the way, the example IP I am using is the real bot IP scanning my site.
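If you are unsure how many addresses a /24 (or any other suffix) really covers, Python’s standard `ipaddress` module can work it out. This is only a small sketch; `describe_cidr` is a name I made up, and the addresses are reserved documentation examples, not the bot’s real range.

```python
import ipaddress
from typing import Tuple


def describe_cidr(cidr: str) -> Tuple[int, str, str]:
    """Return (number of addresses, first address, last address) for a CIDR block."""
    # strict=False lets you pass any IP inside the block, not only its first address.
    network = ipaddress.ip_network(cidr, strict=False)
    return network.num_addresses, str(network[0]), str(network[-1])


print(describe_cidr("203.0.113.77/24"))  # (256, '203.0.113.0', '203.0.113.255')
```

Each step down in the suffix doubles the block, so a /23 covers 512 addresses and a /22 covers 1024.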
A second way to block an IP range from hitting your website is to use a WordPress security plugin called Wordfence. This plugin lets you see live traffic, scans your website for malware and monitors changes to core WordPress files. Wordfence also automatically blocks IPs that attempt to log in to the administrator page too many times, which stops brute force attacks. You can use it to block a single IP by adding it manually, or you can block a whole IP range, which is what I did to stop the fake “Merck & Co.” bot.
After installing Wordfence on your blog, go to “Advanced Blocking” and enter the IP range you would like to stop from visiting your site, with the first and last IP separated by a hyphen. You can learn the bot’s IP range by looking at your site analytics software; in this case the ranges I blocked were 126.96.36.199 – 188.8.131.52 and 184.108.40.206 – 220.127.116.11. Those IPs belong to Amazon servers in Woodbridge, USA. If you ever change your mind, click on “Delete blocking pattern” and the IP range will be able to access your site again.
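If you prefer the manual .htaccess method but only know the first and last IP of a range, the range first has to be turned into blocks with a slash suffix. The sketch below shows one way to do that with Python’s standard `ipaddress` module; `deny_lines_for_range` is a hypothetical helper of mine, and the IPs are documentation examples rather than the bot’s real addresses.

```python
import ipaddress
from typing import List


def deny_lines_for_range(first: str, last: str) -> List[str]:
    """Turn a first-last IP range into 'deny from' lines, one per CIDR block."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    # summarize_address_range yields the smallest set of networks covering the range.
    return [
        "deny from {}".format(network)
        for network in ipaddress.summarize_address_range(start, end)
    ]


for line in deny_lines_for_range("203.0.113.0", "203.0.113.130"):
    print(line)
# deny from 203.0.113.0/25
# deny from 203.0.113.128/31
# deny from 203.0.113.130/32
```

A range that does not start and end on neat boundaries needs several lines, as in this example. All of them go between the “Limit” lines of the .htaccess block, alongside the other “deny from” entries.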
Wordfence also gives you the choice of blocking a specific browser or user agent, but that is not useful against a malicious bot like “Merck & Co.”, because its user agent is fake. You can also block a whole country from visiting your site and set up two-step login authentication using your smartphone, but those are paid features. For the small hobbyist webmaster, the free version of Wordfence is enough; even if no bot scans your site, it can protect you in other ways.