There are many technical reasons for duplicate content, from session IDs and syndicated content to the utilization of printer-friendly pages and URL parameters.
Yet they all boil down to one thing: duplicate URL structures that lead to the same content, which can kill search performance. Here’s how to fight back.
While the ultimate best practice is a site architecture that doesn’t create duplicate content, 301 redirects and rel=canonical tags are two proven solutions to reduce duplicate content and encourage Google and other search engines to crawl your web pages effectively.
Note: For a comprehensive understanding of URL redirects, take advantage of our Technical SEO Guide to Redirects.
What are 301 Redirects?
301 redirects help search engines and users find content that has moved to a new URL. It’s like moving and giving the post office a change of address: it shows that the content of the page has permanently moved somewhere else.
When to Use 301 Redirects:
- When users access your site through multiple URLs.
Let’s say your home page can be reached via http://site.com/home, http://site.com, or http://www.site.com. It’s important to pick one URL as your canonical (preferred) destination, and utilize 301 redirects to route traffic from the other URLs to the preferred URL. You can also use Google Search Console to set your preferred domain.
- You need a seamless transition when migrating your site to a new domain.
301 redirects can effectively redirect results from the old URLs to the new ones.
- You're merging two websites.
Use 301 redirects to link outdated URLs to the correct pages.
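The home page scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the preferred host (www.site.com), the set of known hosts, and the home-page aliases are hypothetical values taken from the example URLs, and redirect_target is a made-up helper name, not part of any server or framework.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: the preferred host and the home-page aliases.
PREFERRED_HOST = "www.site.com"
KNOWN_HOSTS = {"site.com", "www.site.com"}
HOME_ALIASES = {"", "/", "/home"}


def redirect_target(url):
    """Return the URL to 301-redirect to, or None if url is already preferred."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Fold every known host variant into the preferred one.
    if host in KNOWN_HOSTS:
        host = PREFERRED_HOST
    # Treat the home-page aliases as the root path.
    path = "/" if parts.path in HOME_ALIASES else parts.path
    target = urlunsplit((parts.scheme, host, path, parts.query, ""))
    return None if target == url else target


print(redirect_target("http://site.com/home"))
print(redirect_target("http://www.site.com/"))
```

A server using logic like this would answer with status 301 and a Location header set to the returned URL whenever the result is not None.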
Avoid These 301 Redirect Mistakes:
- Using 302 instead of 301.
Developers often implement a temporary 302 redirect instead of a permanent 301 redirect, which can keep link juice from flowing to the new URL. There are important differences between 301 redirects and 302 redirects that you should know.
- Redirecting ALL pages to a single URL in a migration.
For example, if there is a site migration of 500 pages, each page should have its own 301 redirect to the relevant page on the new site. A common mistake is to redirect all 500 pages to a single URL, typically the homepage.
- Missing 301 redirects for EVERY iteration of your domain.
Since http://site.com and http://www.site.com are different URLs, be sure to set up a redirect from every iteration of your brand's domain. Otherwise, if your preferred domain is www.site.com and a visitor types or is directed to site.com, they may get an error saying the site doesn’t exist.
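Two of the mistakes above (302 instead of 301, and funneling many pages to one URL) can be checked before a migration goes live. The sketch below assumes the redirect plan is available as a Python dictionary of old URL to (status code, new URL); that data shape, the audit_redirects name, and the 50% threshold are all assumptions made for illustration.

```python
from collections import Counter


def audit_redirects(redirects, max_share=0.5):
    """Return warnings for a mapping of old URL -> (status code, new URL)."""
    warnings = []
    # Mistake 1: a temporary redirect where a permanent one is intended.
    for old, (status, new) in redirects.items():
        if status != 301:
            warnings.append(f"{old}: uses {status}, expected 301")
    # Mistake 2: a large share of old pages pointing at the same target.
    counts = Counter(new for _, new in redirects.values())
    for target, count in counts.items():
        if len(redirects) > 1 and count / len(redirects) > max_share:
            warnings.append(f"{target}: receives {count} of {len(redirects)} redirects")
    return warnings


plan = {
    "/old/a": (301, "/"),
    "/old/b": (302, "/"),
    "/old/c": (301, "/new/c"),
}
for warning in audit_redirects(plan):
    print(warning)
```

The threshold is arbitrary; the point is only that a target collecting most of the redirects deserves a second look.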
What are Rel=Canonical Tags?
The link rel=canonical tag, often called a “canonical link” (or simply a “canonical tag”), is an HTML element that helps webmasters prevent duplicate content issues. A rel=canonical tag lets search engines know that certain similar URLs are actually one and the same. It does this by specifying the “canonical URL” as the preferred version of the page.
Unlike 301 redirects, which are set up on the server, canonical link elements are inserted into the <head> section of your page's HTML. They lead search engines to the canonical URL, where the original content (that should be ranked) lives.
In other words, use rel=canonical tags to push search engines to your most-complete content (your canonical URL), when you have multiple URLs or multiple pages with similar content for the same topic or item.
Rel=canonical tags pass the same amount of link juice as a 301 redirect, and can be implemented quickly. Of course, it’s vital to determine your canonical URL first.
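For reference, here is roughly what a canonical link element looks like when generated from a chosen canonical URL. The canonical_link helper is a hypothetical name used only for this sketch, and the URL is one of the example URLs from this article.

```python
from html import escape


def canonical_link(url):
    """Build a canonical link element for the given canonical URL."""
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


print(canonical_link("https://www.site.com/paper/white/whitepaper.html"))
```

The resulting element belongs in the HTML of every duplicate page that should point to that canonical URL.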
When to Use Rel=Canonical Tags:
- When your site generates dynamic URLs (temporary URLs created from specific user searches) for the same product or item that already has its own product page.
For example: https://www.site.com/product?category=paper&color=white and https://www.site.com/paper/white/whitepaper.html
- When your blog system automatically creates multiple URLs because you publish one post under multiple categories or sections.
Rel=canonical tags help search engines push the URLs of each category to your canonical URL, for example: http://www.blog.example.com/category1/blog-post-name and http://www.blog.example.com/category2/blog-post-name
- When your server is configured to load the same content for the www and non-www versions of your domain, or over both HTTP and HTTPS.
Rel=canonical tags help search engines push both of these URLs to your canonical URL, for example: http://www.site.com/example and https://www.site.com/example
- When you have a lot of syndicated content. Content syndication lets you drive traffic by sharing your blog on other sites, and publishing other blogs on your site.
Rel=canonical tags can push search results — from those various blogs, web feeds and the like — to your canonical URL.
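When the duplicates differ only by protocol, host, or throwaway query parameters, the canonical URL can often be computed rather than hand-picked. The sketch below is one possible approach, not a rule from any search engine: the parameter names in IGNORED_PARAMS, the preferred scheme and host, and the canonical_url name are all assumptions chosen for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that change the URL but not the content.
IGNORED_PARAMS = {"sessionid", "print", "utm_source"}


def canonical_url(url, scheme="https", host="www.site.com"):
    """Map a URL variant to one canonical form (assumed scheme and host)."""
    parts = urlsplit(url)
    # Keep only the parameters that actually select content.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    return urlunsplit((scheme, host, parts.path or "/", urlencode(kept), ""))


print(canonical_url("http://www.site.com/example?sessionid=abc123"))
```

Cases like the two category URLs for one blog post cannot be derived this way; there the canonical URL still has to be chosen by hand.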
Avoid These Rel=Canonical Mistakes:
- Writing absolute URLs as relative URLs.
An absolute URL uses the full address of the page that you link to. For example:
<a href="http://www.site.com/xyz.html">
A relative URL uses the “relative” path, not the full address. It assumes that the page you link to is on the same site. For example:
<a href="/xyz.html">
If you mistakenly use a relative URL, the search engines might ignore your rel=canonical tag. So, be sure to specify the full absolute URL. Make sure you are familiar with the differences between absolute and relative URLs.
- Setting the canonical URL to the first page of a paginated series.
If you do this, the search engines will only index the first page – skipping the content on subsequent pages. It’s best to specify a URL that has all of the content on a single page.
For example, pages 1, 2 and 3 each point to the single page that holds all of the content:
SINGLE PAGE WITH ALL CONTENT: example.html?page=all
PAGE 1 CONTENT: example.html?page=1
PAGE 2 CONTENT: example.html?page=2
PAGE 3 CONTENT: example.html?page=3
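The relative-URL mistake described above is easy to catch automatically. The following is a minimal sketch using only the Python standard library; CanonicalFinder and check_canonical are hypothetical names, and the check also flags pages that carry zero or several canonical tags, which is an extra assumption about what counts as a problem.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels:
            self.hrefs.append(attr.get("href") or "")


def check_canonical(html):
    """Return a list of problems found with the page's canonical tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    problems = []
    if len(finder.hrefs) != 1:
        problems.append(f"expected 1 canonical tag, found {len(finder.hrefs)}")
    for href in finder.hrefs:
        parts = urlsplit(href)
        # An absolute URL has both a scheme and a host.
        if not (parts.scheme and parts.netloc):
            problems.append(f"canonical is not an absolute URL: {href!r}")
    return problems


print(check_canonical('<head><link rel="canonical" href="/xyz.html"></head>'))
```

Running a check like this over a small set of pages is one way to test templates or plugin defaults before a wider rollout.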
Final Thoughts
It's always wise to double-check your work – especially if you're picking up a template, setting multiple rel=canonical tags to different URLs or using an SEO plugin with default rel=canonical tags (Yoast is a popular plugin option). Use caution when implementing 301 redirects and rel=canonical tags. Used correctly, they can turn duplicate content into seamless search results. Used incorrectly, they can impede search performance and harm your site and its analytics. It’s always good to test on a small set of URLs first to make sure you get the visibility you want, before implementing either of these solutions across your site.