I am frequently asked by clients to advise on spoofed bot activity and whether it could affect their search visibility at all.

And the answer is: yes, it can. Quite significantly, actually.

Think about it: if you base crucial decisions on bot activity (and you should) and that activity turns out not to be genuine, those decisions will most likely lead to outcomes far from what you wanted.

And that’s just one of the examples.

So, here’s more information on spoofed bot activity and the negative effect it has on your SEO.

Let’s begin.

How Spoof Bots Work

Not every bot visiting your site comes from a search engine, although many will disguise themselves as such. For example, one site we’ve analyzed saw an upsurge of bots pretending to come from Bing.

ex3.png

(Screenshot showing a spoof bot activity report)

Some of these bots aim to learn more about your site or to steal your content. Others might be trying to hack the site or cause other damage.

The MaMa Casper worm is an example of the latter. This bot scans sites built on Joomla, a popular CMS, for vulnerabilities and, once successful, infects them with malicious code.

Your competitors might be sending bots too, to scrape your prices, for example. Then, there are the notorious spam bots that try to place their links on your site.

All of which suggests one thing: most spoof bots visit your site with some dishonest intent.

And there are many of them.

According to the data by Imperva:

  • 3% of sites suffer from spoof bot attacks of some kind.
  • 21% of those bots claim to be Googlebot.
  • The vast majority of such bots post comment spam and also steal website content.

How Spoof Bots Harm Your Visibility

You already know that fake bots can harm your website, infect it with malicious code, or even undermine your value proposition by allowing competitors to undercut your prices. They can have a devastating effect on your SEO as well.

Specifically, they can:

  • Steal and repost your content
  • Overload your site, making it inaccessible
  • Slow your site’s load time
  • Skew your analytics, leading to bad decisions on your part

So, let’s look at each of those in turn.

#1. Stealing content

As the Imperva data above shows, stealing content is among the most common spoof bot activities.

Such crawlers scrape a website’s content and republish it elsewhere without you even realizing it, leading to duplicate content issues and, potentially, a loss of rankings.

#2. Overloading the site

Most servers have an allocated bandwidth: the amount of data they can transfer in any given month. Once a server reaches that limit, the website becomes inaccessible until the quota resets.

In other words, even a single resource, if it consumes too much data, could take a website offline for some time.

Enter the DDoS bot attack. The acronym stands for Distributed Denial of Service, and the attack aims to make websites inaccessible by overwhelming them with traffic.

And according to the data, these attacks are more common than you might think. For example:

  • More than 2,000 DDoS attacks are observed worldwide every day
  • 30% of website downtime incidents are the result of DDoS attacks

In such instances, neither users nor crawlers can access your site.

#3. Slowing the load time

Page load time affects search visibility in many ways, from directly raising (or lowering) rankings to providing the foundation for good usability.

Many spoof bots hit resources in ways that slow down your site, in turn affecting your visibility.

#4. Skewing analytics

Finally, unknowingly using spoof bot activity to plan your technical SEO improvements will inevitably lead to bad decisions about your visibility.

For example, such bots might add variables that don’t exist to your URLs, producing a spike of 404 errors in your crawl data and suggesting problems with JavaScript or classes, for instance. In reality, no such problems exist.

The challenge? Spoof bots can be quite hard to spot.

Identifying Spoof Bots

The simplest way to examine bot activity is through log file analysis. But as these examples from Jeff Starr show, spotting a spoof bot there isn’t that easy.

Consider this request:

ex1.png

At first glance, it looks like a genuine Google user agent, with a seemingly correct referrer and other details.

So, what gives it away? Actually, two things:

  • The request – starting off the crawl from an unusual URL
  • The IP address – hardly one that would be associated with the search engine. In this case, it comes from China.

ex2.png
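The IP check described above can also be scripted. Here is a minimal Python sketch, assuming the reverse-then-forward DNS method that Google and Bing publish for verifying their crawlers: look up the hostname for the IP, check that it belongs to the search engine's domain, then confirm that the hostname resolves back to the same IP. The function names and the `CRAWLER_DOMAINS` table are our own illustration, not part of any tool mentioned in this article, and the domain list is deliberately short.

```python
import socket

# Hostname suffixes that genuine crawlers are expected to resolve to.
# Assumption: values taken from the search engines' public verification
# guidance; extend the table for any other crawlers you care about.
CRAWLER_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def reverse_lookup(ip):
    """Return the reverse-DNS hostname for an IP, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the IPv4 addresses a hostname resolves to (possibly empty)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_genuine_crawler(ip, user_agent, reverse=reverse_lookup, forward=forward_lookup):
    """Check a request that claims to come from a search engine crawler.

    Returns True if the IP passes the DNS round trip, False if the user
    agent names a known crawler but the IP does not check out, and None
    if the user agent does not claim to be a known crawler at all.
    """
    ua = user_agent.lower()
    for token, suffixes in CRAWLER_DOMAINS.items():
        if token in ua:
            hostname = reverse(ip)
            if not hostname or not hostname.lower().endswith(suffixes):
                return False
            # The hostname must resolve back to the very same IP.
            return ip in forward(hostname)
    return None
```

The two lookup helpers are passed in as parameters so the check can be tried out with canned DNS answers. Note that this sketch only handles IPv4 addresses, and that DNS lookups are slow, so in practice you would cache results per IP rather than resolve every log line.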

The other method is to use a log file analyzer that automatically verifies IP addresses and other criteria to identify spoof bots.

For example, here’s the report from Bot Clarity showing spoof activity on a site over the course of a single day:

ex4-2.png

As you can see, within only 24 hours, three specific fake bots made a total of 1,400 requests to the site. Digging deeper into the data, we can see that they accessed pages returning 2xx status codes but also encountered some redirects.

ex5.png

Finally, analyzing the actual list of bots, we can see where they came from:

ex6.png

All of this is invaluable data for moving your technical and crawl-related strategies forward.

What to Do with Spoof Bot Data?

Here’s what you should do to avoid the issues above: monitor for spoof bot activity regularly, and forward the offending IP addresses to your network team so they can block those bots from accessing your site.

To improve your technical SEO decisions, however, assess only how real bots access your site, and base your site audit crawls on their behavior.

ex7-1.png
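If you want to try this split on your own logs, the following Python sketch shows one way to do it. It assumes your server writes the common "combined" access log format and that you supply your own `is_verified(ip, agent)` check (for instance, a DNS-based one). The function name, the regular expression, and the `BOT_TOKENS` list are illustrative assumptions rather than how any particular log analyzer works.

```python
import re
from collections import Counter

# Assumption: the "combined" access log format; adjust for your server.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# User-agent fragments that mark a request as claiming to be a search bot.
BOT_TOKENS = ("googlebot", "bingbot")


def split_bot_traffic(lines, is_verified):
    """Separate requests that claim to be search bots into real and spoofed.

    is_verified(ip, agent) is a caller-supplied check that returns True
    for genuine crawlers. Returns a pair: the request strings made by
    verified bots, and a Counter of request counts per spoofing IP.
    """
    real, spoofed = [], Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, agent = match["ip"], match["agent"]
        if not any(token in agent.lower() for token in BOT_TOKENS):
            continue  # ordinary visitor, not claiming to be a search bot
        if is_verified(ip, agent):
            real.append(match["request"])
        else:
            spoofed[ip] += 1
    return real, spoofed
```

The second return value gives you a per-IP request count to pass on to your network team, for example via `spoofed.most_common(20)`, while the first contains only the requests worth feeding into your technical SEO analysis.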

Closing Thoughts

Spoof bots can harm your search visibility in many ways: they can make your site inaccessible to crawlers, slow it down, or steal its content. And the only way to minimize their impact is to identify them regularly and block their IPs at the network level.