Clients frequently ask me about spoofed bot activity and whether it could affect their search visibility at all.
The answer is yes, it can. Quite significantly, actually.
Think about it: if you base crucial decisions on bot activity (and you should) and that activity turns out not to be genuine, those decisions will most likely lead to outcomes far from what you wanted.
And that’s just one of the examples.
So, here’s more information on spoofed bot activity and the negative effect it has on your SEO.
Let’s begin.
Not every bot visiting your site comes from a search engine, although many disguise themselves as one. For example, one site we analyzed saw a surge of bots pretending to come from Bing.
(Screenshot showing a spoof bot activity report)
Some of these bots aim to learn more about your site or to steal your content. Others might be trying to hack the site or cause other damage.
The MaMa Casper worm is an example of the latter. This bot scans sites built on Joomla, a popular CMS, for vulnerabilities and, once it finds one, infects them with malicious code.
Your competitors might be sending bots too, to scrape your prices, for example. Then, there are the notorious spam bots that try to place their links on your site.
All of which suggests one thing: most spoof bots visit your site with some dishonest intent.
And there are many of them.
According to the data by Imperva:
You already know that fake bots could harm your website, infect it with malicious code or even undermine your value proposition by allowing competitors to undercut your prices. They could have a devastating effect on your SEO as well.
Specifically, they can:
So, let’s look at each of those in turn.
As the Imperva data above showed, stealing content is the most common bot activity.
Such crawlers scrape a website’s content and republish it without you even realizing, which leads to duplicate content issues and, potentially, a loss of rankings.
Many hosting plans allocate a bandwidth quota: the amount of data the server can transfer in any given month. Once a site reaches that limit, it can become inaccessible until the quota resets.
In other words, even a single resource, if it consumes too much data, could take a website offline for some time.
Enter the DDoS bot attack. The acronym stands for Distributed Denial of Service, and the strategy aims to make websites inaccessible by overwhelming them with traffic.
And according to the data, these attacks are more common than you might think. For example:
In such an instance, neither users nor crawlers can access your site.
Page load time affects search visibility in many ways, from directly raising (or lowering) rankings to providing the foundation for good usability.
Many spoof bots hit resources in ways that slow your site down, which in turn affects your visibility.
Finally, unknowingly using spoof bot activity to plan your technical SEO improvements can lead to bad decisions about your visibility.
For example, such bots might append variables that don’t exist to your URLs, causing a spike of 404 errors in crawls that suggests problems with JavaScript or classes, for instance. In reality, no such problems exist.
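To see how much spoofed traffic can distort a 404 figure, here is a rough Python sketch. It assumes each request has already been parsed from the logs and flagged as coming from a verified crawler or not. The `not_found_rate` helper, the `verified` field and the sample data are all hypothetical, made up purely for illustration.

```python
def not_found_rate(requests):
    """Share of requests that returned 404, as a fraction between 0 and 1."""
    requests = list(requests)
    if not requests:
        return 0.0
    return sum(1 for r in requests if r["status"] == 404) / len(requests)


# Hypothetical sample: four requests from verified crawlers, four from
# spoofed bots that append made-up variables to real URLs.
sample = [
    {"path": "/products", "status": 200, "verified": True},
    {"path": "/about", "status": 200, "verified": True},
    {"path": "/blog", "status": 200, "verified": True},
    {"path": "/old-page", "status": 404, "verified": True},
    {"path": "/products?cls=1", "status": 404, "verified": False},
    {"path": "/about?js=x", "status": 404, "verified": False},
    {"path": "/blog?v=9", "status": 404, "verified": False},
    {"path": "/blog?v=10", "status": 404, "verified": False},
]

all_rate = not_found_rate(sample)
verified_rate = not_found_rate(r for r in sample if r["verified"])

print(f"404 rate, all bot traffic:    {all_rate:.1%}")       # 62.5%
print(f"404 rate, verified bots only: {verified_rate:.1%}")  # 25.0%
```

In this made-up sample, the spoofed requests more than double the apparent 404 rate, which is exactly the kind of phantom problem you could end up chasing.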
The challenge? Spoof bots can be quite hard to spot.
The simplest way to examine bot activity is through log file analysis. But as these examples from Jeff Starr show, spotting a spoof bot there isn’t that easy.
Consider this request:
At first glance, it looks like a Google user agent, with a seemingly correct referrer and other criteria.
So, what gives it away? Actually, two things:
The other method is to use a log file analyzer that automatically verifies IP addresses and other criteria to identify spoof bots.
For example, here’s the report from Bot Clarity showing spoof activity on a site over the course of a single day:
As you can see, within only 24 hours, three specific fake bots made a total of 1,400 requests to the site. Digging deeper into the data, we can see that they accessed pages returning 2xx status codes but also encountered some redirects.
Finally, analyzing the actual list of bots, we can see where they came from:
All of this is invaluable data for moving your technical and crawl-related strategies forward.
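If you’re curious how IP verification of this kind can work, a common approach is a reverse DNS lookup followed by a forward lookup, a method both Google and Bing describe for their own crawlers. Below is a minimal Python sketch of that idea. It is not how Bot Clarity is implemented; the function names are hypothetical, the hostname suffixes are an assumption you should check against the search engines’ current documentation, and the forward lookup only covers IPv4.

```python
import socket

# Assumed hostname suffixes for genuine crawlers; check the search
# engines' documentation before relying on this list.
CRAWLER_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def reverse_dns(ip):
    """Return the PTR hostname for an IP, or None when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_dns(hostname):
    """Return the set of IPv4 addresses a hostname resolves to."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def is_genuine_crawler(ip, user_agent, reverse=reverse_dns, forward=forward_dns):
    """True only if the IP maps to the crawler's domain and back to the same IP."""
    agent = user_agent.lower()
    for token, suffixes in CRAWLER_DOMAINS.items():
        if token in agent:
            hostname = reverse(ip)
            if not hostname:
                return False
            hostname = hostname.lower().rstrip(".")
            if not hostname.endswith(suffixes):
                return False
            # The forward lookup guards against a faked PTR record.
            return ip in forward(hostname)
    return False  # the user agent does not claim to be a known crawler
```

The `reverse` and `forward` parameters exist so the check can be tried against canned data without touching the network. In real use you would call `is_genuine_crawler(ip, user_agent)` with the IP and user agent taken from a log line.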
To avoid the issues above, monitor for spoof bot activity regularly and forward the offending IP addresses to your network team so they can block those bots from accessing your site.
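As a rough illustration of that monitoring step, the Python sketch below tallies requests per IP for visitors that claim to be a search engine bot but fail verification. It assumes your access log uses the common "combined" format and that you supply your own verification function; `spoofed_ips`, `is_genuine` and the bot tokens are hypothetical names chosen for this example.

```python
import re
from collections import Counter

# Pulls the client IP, status code and user agent out of a
# combined-format access log line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
BOT_TOKENS = ("googlebot", "bingbot")  # assumed list of bots worth checking


def spoofed_ips(lines, is_genuine):
    """Count requests per IP that claim to be a search bot but fail verification.

    `is_genuine(ip, user_agent)` is a caller-supplied check, for example
    one based on reverse and forward DNS lookups.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, agent = match["ip"], match["agent"]
        claims_bot = any(token in agent.lower() for token in BOT_TOKENS)
        if claims_bot and not is_genuine(ip, agent):
            counts[ip] += 1
    return counts.most_common()  # busiest offenders first
```

The returned list of `(ip, request_count)` pairs is the sort of thing you could hand to your network team. Verifying each distinct IP once and caching the result would keep the number of DNS lookups down on large logs.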
To improve your technical SEO decisions, however, assess only how real bots access your site, and base your site audit crawls on their behavior only.
Spoof bots can harm your search visibility in many ways: they can make your site inaccessible to crawlers, slow it down or steal its content. And the only way to minimize their impact is by regularly identifying them and blocking their IPs at the network level.