We are commonly asked for best practice recommendations regarding canonical tags and hreflang tags. Here is a quick summary of how these tags should be implemented.
General rules for canonical tags:
- The canonical can be defined with a link tag (rel="canonical") in the page head, or passed as an HTTP Link header. Defining it in both places is unnecessary, but can help confirm the signal, provided both reference the same URL.
- The canonical URL is an absolute URL for the content displayed. It is static and tied to the content rendered, so it stays the same regardless of how the content is accessed.
- Every text/html URL on the website should present a canonical URL, and that canonical should be consistent regardless of how the content is accessed.
- Query parameters should never appear in the canonical unless they are required to generate the unique page. In that case, only the parameters needed to generate the unique page should be included in the canonical URL.
- Any files (such as PDFs) that have text/html equivalents should send a canonical header referencing the text/html location.
- The canonical URL should be used on all internal links as much as possible, e.g. within sitemaps, navigation, PLPs (product listing pages), etc. We should not have internal links pointing to URLs whose canonical points to a different location.
- If possible, we should redirect the user to the canonical URL if they have accessed the content “incorrectly”. A common exception to these redirects is query parameters, as they are often used for tracking and for subtle changes to the page.
- The canonical should NEVER be built from the incoming HTTP request (e.g. by echoing the requested host, path, or query string). It should be generated by the server using rules or database entries for the specific content being served.
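The canonical rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the Page record, the SITE_ORIGIN value, and the function names are hypothetical stand-ins for whatever rules or database entries your server actually uses. The point is that the canonical is built from stored content data, keeps only the parameters needed to generate the page, and never echoes the incoming request.

```python
from dataclasses import dataclass, field
from html import escape
from urllib.parse import urlencode

# Hypothetical site origin. In practice this comes from configuration,
# never from the Host header of the incoming request.
SITE_ORIGIN = "https://www.example.com"


@dataclass
class Page:
    """Hypothetical content record, e.g. a row loaded from a database."""
    path: str  # stored canonical path for this content
    # Only the query parameters required to generate the unique page.
    required_params: dict = field(default_factory=dict)


def canonical_url(page: Page) -> str:
    """Build the absolute canonical URL from stored data only."""
    url = SITE_ORIGIN + page.path
    if page.required_params:
        # Sort so the parameter order is stable across requests.
        url += "?" + urlencode(sorted(page.required_params.items()))
    return url


def canonical_link_tag(page: Page) -> str:
    """Markup for the page head."""
    href = escape(canonical_url(page), quote=True)
    return f'<link rel="canonical" href="{href}">'


def canonical_link_header(page: Page) -> str:
    """Value for an HTTP Link header, e.g. on a PDF with an HTML equivalent."""
    return f'<{canonical_url(page)}>; rel="canonical"'


# Hypothetical example: page 2 of a listing, where "page" is required
# to generate the unique page and tracking parameters are simply absent.
page = Page(path="/shoes/running", required_params={"page": "2"})
print(canonical_url(page))
print(canonical_link_tag(page))
print(canonical_link_header(page))
```

Because the functions take only the stored record, a request for the same content with tracking parameters or a different host produces the same canonical.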
General rules for hreflang tags:
- The hreflang tag can be defined with a tag on the page, passed as an HTTP header, and/or included in a sitemap. The sitemap is typically used when the number of language variants is “cumbersome”, or when technical challenges prevent inserting the tags on the page. Defining hreflang in all three places is unnecessary, but can help confirm the signal.
- The language specified in hreflang must be in ISO 639-1 format (two-letter codes such as en or fr).
- Regions should not be included in the hreflang tag unless we have multiple versions of the same language. For example, if we have two English pages because we have a .com and a .ca, we would define the .ca English page for Canada (en-CA), and the .com page would not include a region (en) so that it applies to all English speakers outside Canada.
- Any regions specified in hreflang must be in ISO 3166-1 alpha-2 format (two-letter codes such as CA).
- The alternate URL specified in hreflang must be an absolute URL.
- The alternate URL specified should be the canonical URL of the destination page, and should not pass through any redirects.
- All hreflang tags require return confirmation. For example, if you specify a French URL from your English URL, then you must also specify that same English URL from the French URL for the pairing to work. The simplest approach is to include the full set of hreflang tags on every page, rather than adding logic that excludes the current page from the list.
- It is recommended to always specify hreflang="x-default" with your primary English page. The exception is if you have a language selector page (typically not recommended), in which case x-default should point to the selector page instead.
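As a rough sketch of the hreflang rules above, the Python below renders the same full set of alternates for every page in a group, which gives return confirmation automatically. The hreflang_tags function, the validation pattern, and the example.com / example.ca URLs are assumptions for illustration. The pattern only checks the shape of each code (two-letter language, optional two-letter region, in conventional casing), not whether the codes are actually assigned in ISO 639-1 or ISO 3166-1.

```python
from __future__ import annotations

import re
from html import escape

# Shape check only: accepts "en", "fr", "en-CA".
# It does not verify that the codes exist in the ISO lists.
HREFLANG_SHAPE = re.compile(r"[a-z]{2}(-[A-Z]{2})?")


def hreflang_tags(alternates: dict[str, str], x_default: str) -> list[str]:
    """Return link tags for every variant plus x-default.

    `alternates` maps an hreflang value to the absolute canonical URL of
    that variant. The same list is emitted on every page in the group,
    including each page's own entry.
    """
    tags = []
    for code, url in sorted(alternates.items()):
        if not HREFLANG_SHAPE.fullmatch(code):
            raise ValueError(f"unexpected hreflang value: {code!r}")
        if not url.startswith(("https://", "http://")):
            raise ValueError(f"hreflang URL must be absolute: {url!r}")
        href = escape(url, quote=True)
        tags.append(f'<link rel="alternate" hreflang="{code}" href="{href}">')
    default_href = escape(x_default, quote=True)
    tags.append(
        f'<link rel="alternate" hreflang="x-default" href="{default_href}">'
    )
    return tags


# Hypothetical variant group: a generic English page, a Canadian English
# page, and a French page. x-default points at the primary English page.
alternates = {
    "en": "https://www.example.com/pricing",
    "en-CA": "https://www.example.ca/pricing",
    "fr": "https://www.example.com/fr/pricing",
}

for tag in hreflang_tags(alternates, x_default=alternates["en"]):
    print(tag)
```

Emitting one shared list on all three pages avoids per-page exclusion logic, at the cost of each page also listing itself, which is the approach recommended above.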
This should cover most scenarios, but if we’ve missed one, please let us know in the comments!