This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of Moz, Inc.
In this article, we’re going to learn how to create the rel canonical URL tag using Google Tag Manager and its variables, and how to insert it into every page of our website so that the correct canonical is generated automatically for each URL.
Why send a canonical from each page to itself?
Javier Lorente gave us a very good explanation/reminder at the 2015 SEO Salad event in Zaragoza (Spain). In short, there may be various factors that cause Google to index unexpected variants of a URL, and this is often beyond our control:
External pages that display our website but use another URL (e.g., Google’s own cache, other search engines and content aggregators, archive.org, etc.). This way, Google will know which one is the original page at all times.
Parameters that are irrelevant to SEO/content such as certain filters and order sequences
By including this “standard” canonical in every URL, we are making it easy for Google to identify the original content.
How do we generate the dynamic value of the canonical URL?
To generate the canonical URL dynamically, we need to force it to always correspond to the “clean” (i.e., absolute, unique, and simplified) URL of each page (taking into account the www, URL query string parameters, anchors, etc.).
Remember that, in summary, the URL variables that can be created in GTM (Google Tag Manager) correspond to the following components:
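As a rough illustration outside GTM, the standard URL API available in browsers and Node.js splits an address into much the same pieces. The function name and the example address below are assumptions made up for this sketch, not something taken from the GTM interface.

```javascript
// Break a URL into components that roughly mirror GTM's URL variable types.
// The address is a hypothetical example.
function urlComponents(href) {
  const u = new URL(href);
  return {
    protocol: u.protocol.replace(/:$/, ''), // scheme without the trailing ":"
    hostname: u.hostname,                   // host, including any "www."
    port: u.port,                           // empty string for the default port
    path: u.pathname,                       // starts at the first "/"
    query: u.search.replace(/^\?/, ''),     // parameters without the "?"
    fragment: u.hash.replace(/^#/, ''),     // anchor without the "#"
  };
}

const parts = urlComponents(
  'https://www.example.com/shoes/?color=red&sort=price#reviews'
);
console.log(parts);
```

Only protocol, hostname, and path are needed for the canonical; the query and fragment are exactly the parts we want to leave out.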
We want to create a unique URL for each page, without queries or anchors. We need a “clean” URL variable, and we can’t use the Page URL built-in variable, for two reasons:
Although the fragment doesn’t form part of the URL by default, the query string parameters do
Potential problems with protocol and hostname, if different options are admitted (e.g., SSL and www)
Therefore, we need to combine Protocol + Host + Path into a single variable.
Now, let’s take a step-by-step look at how to create our Page URL Canonical variable.
1. Create Page Protocol to compile the section of the URL according to whether it’s an http:// or https://
Note: We’re assuming that the entire website will always function under a single protocol. If that’s not the case, we should replace the Page Protocol variable with plain text in the final variable of Step #4. (This lets us force it to always be http or https, without exception.)
2. Create Page Hostname Canonical
We need a variable in which the hostname is always the same, whether or not the address was entered into the browser with the www. Which form we choose depends on the redirect: the domain that the other one redirects to is the one we keep as the canonical.
How do we create the canonical domain?
Option 2.1: Redirect the domain with www. to a domain without www. via 301
Our canonical URL is WITHOUT www. We need to create Page Hostname, but make sure we always remove the www:
Option 2.2: Redirect the domain without www. to a domain with www. via 301
Our canonical URL is WITH www. We need to create Page Hostname without www (like before), and then insert the www in front using a constant variable:
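In plain JavaScript, the two options amount to the following. This is only a sketch of the logic, not GTM configuration, and the function names are hypothetical.

```javascript
// Option 2.1: the canonical hostname never carries "www.".
function stripWww(hostname) {
  return hostname.toLowerCase().replace(/^www\./, '');
}

// Option 2.2: strip first, then always put "www." back in front,
// so both spellings end up identical.
function withWww(hostname) {
  return 'www.' + stripWww(hostname);
}

console.log(stripWww('www.example.com')); // example.com
console.log(withWww('example.com'));      // www.example.com
```

Note that withWww also prefixes other subdomains (blog.example.com becomes www.blog.example.com), so Option 2.2 as sketched here assumes the site lives on a single hostname.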
3. Enable the Page Path built-in variable
Note: Although we have the Page Hostname built-in variable, for this exercise it’s preferable not to use it, as we’re not 100% sure how it will behave in relation to the www (e.g., in this instance, it’s not configurable, unlike when we create it as a GTM custom variable).
4. Create Page URL Canonical
Link the three previous variables to form a constant variable:
{{Page Protocol}}://{{Page Hostname Canonical}}{{Page Path}}
Protocol: returns http / https (without ://), which is why we enter this part by hand
Hostname: we can force removal of the www. or not
Path: included from the slash /. Does not include the query, so it’s perfect. We use the built-in option for Page Path.
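The same concatenation can be sketched in plain JavaScript to see what the final variable should return. The function name is hypothetical, and the canonical hostname is passed in as a plain string on the assumption that it was already decided in Step 2.

```javascript
// Build protocol + "://" + canonical hostname + path, dropping query and fragment.
// `canonicalHost` stands in for the Page Hostname Canonical value.
function pageUrlCanonical(href, canonicalHost) {
  const u = new URL(href);
  const protocol = u.protocol.replace(/:$/, ''); // "http" or "https", no "://"
  return protocol + '://' + canonicalHost + u.pathname;
}

console.log(
  pageUrlCanonical('http://www.example.com/shoes/?sort=price#top', 'example.com')
); // http://example.com/shoes/
```

Any port in the original address is dropped too, which is normally what we want for a public site on the default ports.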
How can we insert the canonical into a page using Tag Manager?
Let’s suppose we’ve already got a canonical URL generated dynamically via GTM: Page URL Canonical.
Now, we need to look at how to insert it into the page using a GTM tag. We should emphasize that this is NOT the “ideal” solution, as it’s always preferable to insert the tag into the <head> of the source code. But, we have confirming evidence from various sources that it DOES work if it’s inserted via GTM. And, as we all know, in most companies, the ideal doesn’t always coincide with the possible!
If we could insert content directly into the <head> via GTM, it would be sufficient to use the following custom HTML tag:
<link rel="canonical" href="{{Page URL Canonical}}" />
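But we know this won’t work, because content inserted by custom HTML tags usually ends up at the end of the <body>, and Google won’t accept or read a <link rel="canonical"> tag there. Instead, we can use JavaScript (wrapped in <script> tags inside the custom HTML tag) to create the element and append it to the <head>:
var c = document.createElement('link'); // new <link> element
c.rel = 'canonical';
c.href = {{Page URL Canonical}}; // GTM replaces this with the variable's value
document.head.appendChild(c); // place it in the <head>, not the <body>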
And then, we can set it to fire on the “All Pages” trigger. Seems almost too easy, doesn’t it?
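If some templates already output a canonical in their source code, a slightly more defensive variant might update the existing element rather than add a second, conflicting one. The sketch below is an assumption about how that could look, not part of the original setup. The function name is hypothetical, and the document is passed in as a parameter only so the logic can be tried outside a browser.

```javascript
// Set the canonical URL, reusing an existing <link rel="canonical"> if present.
function setCanonical(doc, url) {
  var link = doc.querySelector('link[rel="canonical"]');
  if (!link) {
    // No canonical yet: create one and append it to the <head>.
    link = doc.createElement('link');
    link.rel = 'canonical';
    doc.head.appendChild(link);
  }
  link.href = url;
  return link;
}

// Hypothetical usage inside the custom HTML tag's <script> element:
// setCanonical(document, {{Page URL Canonical}});
```

Whether GTM should override a canonical that the template already provides is a judgment call; the point here is only to avoid ending up with two tags.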
How do we check whether our rel canonical is working?
Very simple: Check whether the code is generated correctly on the page.
How do we do that?
By looking at the Elements panel of Chrome DevTools, or by using a browser plugin like Firebug that shows the code generated on the page in the DOM (document object model). We won’t find it in the source code (Ctrl+U).
Here’s how to do this step-by-step:
Open DevTools (F12) and click on the first tab (Elements)
Press Ctrl+F and search for “canonical”
If the URL appears in the correct form at the end of the <head>, that means the tag has been generated correctly via Tag Manager
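The same check can be done from the DevTools Console instead of searching by hand. The helper below is a small sketch with a hypothetical name. It lists every canonical found in the rendered DOM, which also reveals accidental duplicates.

```javascript
// Collect the href of every <link rel="canonical"> in the given document.
function findCanonicals(doc) {
  var links = doc.querySelectorAll('link[rel="canonical"]');
  return Array.prototype.map.call(links, function (link) {
    return link.href;
  });
}

// In the DevTools Console, run:
// findCanonicals(document)
```

If the tag is working, the result should contain exactly one entry, and it should be the clean URL of the page.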
That’s it. Easy-peasy, right?
So, what are your thoughts?
Do you also use Google Tag Manager to improve your SEO? Why don’t you give us some examples of when it’s been useful (or not)?