Solving Canonical Problems with WWW

One of the most common problems I see in websites is the same content being available at both the WWW and non-WWW versions of a domain. I’ve encountered this in nearly every website I’ve done an SEO audit for, and I see it every day when browsing the web. However prevalent it is, it is still a problem.

Having the same content available on both the WWW and non-WWW versions of a domain (such as authoritylabs.com and www.authoritylabs.com) is a canonicalization problem. While you and I might realize they are in fact the same page, search engines can treat them as separate pages.

Most of the time, search engines can figure out that they are the same page and only include the canonical URL in their index. SEObook explains the canonical URL as:

The canonical version of any URL is the single most authoritative version indexed by major search engines. Search engines typically use PageRank or a similar measure to determine which version of a URL is the canonical URL.

Regardless, canonicalization problems can result in indexing problems and duplicate content issues. Most importantly, they split the link juice between the two versions as people link to and share both.

What you want to see is a redirection from the WWW to the non-WWW, or vice versa, so that if the wrong version is entered or linked to, the user is automatically taken to the canonical URL. Fortunately, this is relatively easy to set up.
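As a rough illustration of the behavior described above, here is a minimal Python sketch that works out where a request should be sent. The function name `canonical_redirect` and the example URLs are made up for this illustration; on a real site the web server handles this in its configuration, not in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host):
    """Return the URL a 301 should point to, or None if url is already canonical.

    Hypothetical helper, for illustration only.
    """
    parts = urlsplit(url)
    # hostname is lowercased and has any port stripped; also drop a trailing dot.
    host = (parts.hostname or "").rstrip(".")
    if host == canonical_host.lower():
        return None
    # Keep the scheme, path, and query string; swap only the host.
    return urlunsplit((parts.scheme, canonical_host, parts.path or "/", parts.query, ""))


print(canonical_redirect("http://www.yourdomain.com/blog/?page=2", "yourdomain.com"))
# http://yourdomain.com/blog/?page=2
print(canonical_redirect("http://yourdomain.com/blog/", "yourdomain.com"))
# None
```

The point is that only the host changes: the path and query string are carried over, so a link to a deep page on the wrong version still lands on the matching page of the canonical version.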

Google Webmaster Tools

If you’ve verified your site with Google Webmaster Tools, you can set your preferred domain by going to Site Configuration > Settings, and selecting either ‘Display URLs as www.yourdomain.com’ or ‘Display URLs as yourdomain.com’.

This will make sure that Google only indexes your preferred canonical URL. However, it doesn’t fix the problem of splitting your link juice, and it has no effect on other search engines, so you should still set up a redirect using one of the following methods.

 

Redirect Using .htaccess

If your site is hosted on Apache, you can redirect from the WWW to the non-WWW, or vice versa, with a few lines in your .htaccess file.

Redirect WWW to non-WWW:

RewriteEngine On
RewriteCond %{HTTP_HOST} !^(yourdomain\.com)?$
RewriteRule ^(.*)$ http://yourdomain.com/$1 [R=301,L]

Redirect non-WWW to WWW:

RewriteEngine On
RewriteCond %{HTTP_HOST} !^(www\.yourdomain\.com)?$
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=301,L]
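To see which Host header values the negated condition in the WWW to non-WWW rule would catch, here is a small Python sketch that applies the same pattern with Python's `re` module. It only mimics the RewriteCond test; the function name `needs_redirect` is invented for this example, and Apache's regex engine is not identical to Python's, although a pattern this simple should behave the same in both.

```python
import re

# Same pattern as the RewriteCond for the non-WWW canonical host.
CANONICAL = re.compile(r"^(yourdomain\.com)?$")


def needs_redirect(http_host):
    """True when the negated condition (!pattern) would let the RewriteRule fire."""
    return CANONICAL.search(http_host) is None


for host in ["yourdomain.com", "www.yourdomain.com",
             "yourdomain.com:8080", "yourdomain.com.", ""]:
    print(repr(host), needs_redirect(host))
# 'yourdomain.com' False
# 'www.yourdomain.com' True
# 'yourdomain.com:8080' True
# 'yourdomain.com.' True
# '' False
```

Because the rule redirects everything that is not exactly the canonical host, it also catches variants with a port number or a trailing period. The optional group means an empty Host value is left alone, which as far as I know is there to avoid redirecting old HTTP/1.0 clients that send no Host header at all.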

 

Redirect Using cPanel

aka The Lazy Way to Redirect Using .htaccess

If your website is hosted with a provider that uses cPanel, you can even set up your redirects without touching a line of code. This actually adds the redirect rule directly to the .htaccess file, but sometimes I’d rather not get my hands dirty. To do this, log in to your cPanel, and go to Redirects.

Redirect WWW to non-WWW:

Redirect non-WWW to WWW:

 

Redirect Using IIS7

With IIS7, there are actually two ways to do this, and both require the URL Rewrite extension.

The first method involves adding the following as the first rule in the system.webServer section of the web.config file of the site in question.

Redirect WWW to non-WWW:

<rewrite>
   <rules>
      <rule name="www to non www" enabled="true">
         <match url="(.*)" />
         <conditions>
            <add input="{HTTP_HOST}" negate="true" pattern="^yourdomain\.com$" />
         </conditions>
         <action type="Redirect" url="http://yourdomain.com/{R:1}" redirectType="Permanent" />
      </rule>
   </rules>
</rewrite>

Redirect non-WWW to WWW:

<rewrite>
   <rules>
      <rule name="non www to www" enabled="true">
         <match url="(.*)" />
         <conditions>
            <add input="{HTTP_HOST}" negate="true" pattern="^www\.yourdomain\.com$" />
         </conditions>
         <action type="Redirect" url="http://www.yourdomain.com/{R:0}" redirectType="Permanent" />
      </rule>
   </rules>
</rewrite>

The second way is using the user interface of the URL Rewrite module. You can follow the steps outlined on Scott Forsyth’s blog. I suppose you could call that the lazy way to redirect in IIS7.

 

Redirect Using nginx

Nginx is gaining popularity thanks to its lower overhead and higher performance compared with some other servers. For the redirect, add one of the following to the top of your site’s config file.

Redirect WWW to non-WWW:

server {
    listen 80;
    server_name www.yourdomain.com;
    rewrite ^/(.*) http://yourdomain.com/$1 permanent;
}

Redirect non-WWW to WWW:

server {
    listen 80;
    server_name yourdomain.com;
    rewrite ^/(.*) http://www.yourdomain.com/$1 permanent;
}

 

Whether you’re on Apache, IIS, or nginx, these methods only take a few minutes to set up, so you don’t have much of an excuse not to.
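Once a redirect is in place, it is worth confirming that the wrong version really answers with a 301 pointing at the canonical host. The following Python sketch is one way to do that, using only the standard library. The function names and host names are hypothetical, and it assumes plain HTTP on port 80, as in the examples in this article.

```python
import http.client
from urllib.parse import urlsplit


def is_canonical_redirect(status, location, canonical_host):
    """True if the response is a 301 whose Location points at canonical_host."""
    if status != 301 or not location:
        return False
    return urlsplit(location).hostname == canonical_host.lower()


def fetch_status_and_location(host, path="/"):
    """Send a HEAD request over plain HTTP; return (status, Location header)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Checking sample responses without touching the network:
print(is_canonical_redirect(301, "http://yourdomain.com/", "yourdomain.com"))  # True
print(is_canonical_redirect(302, "http://yourdomain.com/", "yourdomain.com"))  # False

# Against a live site (replace the hypothetical hosts with your own):
#   status, location = fetch_status_and_location("www.yourdomain.com")
#   print(is_canonical_redirect(status, location, "yourdomain.com"))
```

A 302 fails the check on purpose: it is a temporary redirect, and all of the rules in this article ask for a permanent one.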

Photo: Fabrizio Sciami/Flickr

About Dawn Smith

Dawn Wentzell is currently working in custom mobile app development as Project Manager, Mobile Technology at SpeakFeel Corporation. She has experience with SEO for both local businesses and national markets, loves to do site audits and hates IIS hosting. You can find her at dawnwentzell.com or on Twitter at @saffyre9.

Comments

  1. Thanks to Brian for the addition of nginx to this.

  2. RewriteEngine On
    RewriteBase /
    RewriteCond %{HTTP_HOST} ^www.example.com [NC]
    RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]

    RewriteBase / is the default. No need to specify it.
    You must escape literal periods in the RewriteCond pattern.
    The code fails to redirect non-canonical non-www URLs with a trailing period and/or port number.

    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^(example\.com)?$
    RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

  3. RewriteEngine On
    RewriteBase /
    RewriteCond %{HTTP_HOST} ^example.com [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

    RewriteBase / is the default. No need to specify it.
    You must escape literal periods in the RewriteCond pattern.
    The code fails to redirect non-canonical www URLs with a trailing period and/or port number.

    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
    RewriteRule (.*) http://www.example.com/$1 [R=301,L]

    Use example.com in blog posts. RFC 2606 reserves example.com example.net and example.org for this very purpose.

  4. Love it. Simple wrap-up of solutions. Bookmarked for future client implementations. Really nice work here.

  5. Hi,
    Great article.
    Have you got any information about how to redirect /index.html to a version without the /index.html?
    My preferred method of making these changes is through cPanel.

    • Tom, that should be pretty easy through cPanel – on the Redirects page, select the 301 redirect and your domain, add index.html to the field after the slash. In the redirects field, enter http:// and your domain – with or without the www, however you want it to end up – and check off “Redirect with or without www.”

      Hope that helps!

  6. For URLs with the index filename mentioned in the path part of the HTTP request, use this rule to strip the filename in a redirect:

    RewriteEngine On
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php[45]?|html?)(\?[^\ ]+)?\ HTTP/
    RewriteRule ^(([^/]+/)*)index\.(php[45]?|html?) http://www.example.com/$1? [R=301,L]

    Normally a request for http://www.example.com/folder/ is internally rewritten to /folder/index.html via the DirectoryIndex mechanism in order to serve the content. However, this rewritten path matches the RewriteRule pattern and if there was just a RewriteRule the rule would redirect again. You do not want that to happen.

    To avoid this, you MUST also test THE_REQUEST in a preceding RewriteCond in order to be sure that the internal pointer is set to /folder/index.html because those things were in the original incoming external HTTP request and not because they have been recently set by a preceding internal rewrite. You MUST test THE_REQUEST otherwise you will end up with an infinite redirect-rewrite loop.

    The index redirect MUST be placed before the non-www/www redirect. Failure to do so invokes an unwanted multiple step redirection chain for index requests for the non-canonical hostname.

    The above code strips parameters too. It can be modified to not do so if you want.

    Finally, the RewriteEngine On directive must appear just ONCE in the .htaccess file, and it must be placed before the very first ruleset.

    Inform if there’s a problem. The above code was typed from memory.

  7. Anthony Baker says:

    Thank you so much for making this so easy. I was looking at other ways of doing this and it was intensely complicated. This was a simple solution that worked beautifully! :-)

  8. I had been searching for a simple, to-the-point article to send to my client to help him understand the issues related to URL canonicalization and how to fix them. I found this article useful and have sent the client a link.
    Thanks for making it easy.

  9. server {
    listen 80;
    server_name yourdomain.com;
    return 301 http://www.yourdomain.com$request_uri;
    }

  10. If you are facing a canonical issue on a single webpage, you can solve it by placing the following code in the Meta section of that targeted URL

  11. I know this is ancient technology, but how would I accomplish this redirect on a FrontPage website?

    • Hi Debbie,

      It shouldn’t matter what technology or platform you use to build your website. Regardless of what type of server you are on, your .htaccess or config files can be edited in a plain text editor on your computer (e.g. Notepad). If you happen to be on IIS, you can also edit the rules using the graphical user interface for the server.

      Keep in mind, these are pretty technical changes and even a small error in any of the redirects could cause errors on your site, so you may be more comfortable having a developer do this for you.