Using Cloudflare to establish canonical URLs
31 May 2023 • ~1100 words • ~5 minute read
I look after a few websites for actual projects and organisations that require a proper web presence. One such example is Opera dei Lumi (ODL), which is the example I'll be using here.
I'd been aware for a wee while that the wider web (search engines, basically) identified the root domain (operadeilumi.org.uk) and its classical www subdomain version as separate sites, showing every page twice. For example, the about page existed at both https://operadeilumi.org.uk/about and https://www.operadeilumi.org.uk/about. Yes, they would resolve to the same IP address in the end, and the right content was served up regardless of entry point, but this duplication is not ideal.
I occasionally use the free SEO tool on Seobility, which gives very detailed and useful information about how I can improve SEO and other content, as well as reporting on performance (how many media, CSS or JS files are downloaded, server configuration summary, etc). For what it's worth, in my own personal stuff I don't really care a great deal about SEO, but ODL is a registered charity with a mission for public engagement, so it actually does matter in this instance!
The tool has flagged up this issue a number of times, highlighting that the website "uses URLs with www and non-www subdomain" and that this might cause duplication and bad links to the website. At the very top of the report, marked as Very important! and in a big red box, it recommended:
Use 301 redirects to drive traffic to URLS with the same domain and sub domain (www and non-www subdomain).
I looked into this a bit more and discovered that this is a common issue. The solution is to establish a canonical URL,1 as this provides stability and consistency both to users and search engines.2
So, I wanted to make operadeilumi.org.uk (the root domain) canonical, and make it so that any and all requests to www.operadeilumi.org.uk get redirected to operadeilumi.org.uk.
Checking DNS records
The site initially had an A record for www, pointing to the same IP as the A record for the root domain.
At first, I thought I could sort this problem by deleting the A www record and replacing it with a CNAME record.3 CNAME records point to domains, not IP addresses, so I was going from
A operadeilumi.org.uk <IPv4 address>
A www <IPv4 address>
to
A operadeilumi.org.uk <IPv4 address>
CNAME www operadeilumi.org.uk
This made no difference at all: traffic going to www.operadeilumi.org.uk was still getting content, no problem, but it was also still being served via www, when I wanted it to use the root domain. CNAMEs are effectively aliases, so under the hood it basically worked the same as when it had two A records, providing two valid entry points to the site.
DNS ultimately maps domains to IP addresses, but it can't implement redirects, as it has no way of controlling HTTP methods and status codes: a redirect is an HTTP response, so it has to come from something that speaks HTTP. So, trying to fix this using DNS was a bit foolish!
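To make the point concrete, here is a toy model of the two record sets in Python. It is only a sketch: the `resolve` helper and the zone dictionaries are made up for illustration, and the IP is a documentation-range placeholder rather than the site's real address. It shows that both setups hand back the same IP for either hostname, and that nothing in the answer could tell a browser to go elsewhere.

```python
# Toy model of the two DNS setups described above. The IP is a
# documentation-range placeholder, not the site's real address.
PLACEHOLDER_IP = "192.0.2.10"

TWO_A_RECORDS = {
    "operadeilumi.org.uk": ("A", PLACEHOLDER_IP),
    "www.operadeilumi.org.uk": ("A", PLACEHOLDER_IP),
}

A_PLUS_CNAME = {
    "operadeilumi.org.uk": ("A", PLACEHOLDER_IP),
    "www.operadeilumi.org.uk": ("CNAME", "operadeilumi.org.uk"),
}


def resolve(zone, name):
    """Follow CNAME aliases until an A record yields an IP address."""
    seen = set()
    while name not in seen:
        seen.add(name)
        record_type, value = zone[name]
        if record_type == "A":
            return value
        name = value  # CNAME: look up the name it points to instead
    raise ValueError("CNAME loop")


for zone in (TWO_A_RECORDS, A_PLUS_CNAME):
    # Either setup gives the same IP for both hostnames.
    print(resolve(zone, "operadeilumi.org.uk"),
          resolve(zone, "www.operadeilumi.org.uk"))
```

The only thing that ever comes back is an address, so the browser keeps whichever hostname it started with.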
Implementing 301 redirects in Cloudflare
The solution to this was permanent redirects.4 Usually I would want to do something like this at the server level (in nginx or Apache), but on this particular occasion there were a few reasons I wanted to avoid doing it this way.
As this domain's DNS records were already managed by Cloudflare, the next logical step was to make use of their Redirect Rules, which I'd never needed to use before.
I've always found Cloudflare's documentation to be pretty decent on the whole, but in this instance I didn't even need to go to the docs! Cloudflare have a short tutorial on this exact topic.5
When I initially went to the 'Redirect Rules' portion in the dashboard, I was confronted by a few dropdown menus. These can be ignored in favour of their expression editor, so I lifted the example expression from the tutorial. (Copy and paste often tends to be the solution 😜). I did have a wee read through their documentation on expressions as it seems a useful feature, with lots of scope for more complex scenarios.
In this case, the expressions were simple: when inbound requests match the expression (http.request.full_uri contains "www.operadeilumi.org.uk"), then create a response using the following settings:
TYPE: Dynamic
EXPRESSION: concat(
"https://",
"operadeilumi.org.uk",
http.request.uri.path
)
STATUS CODE: 301
The http.request.uri.path field handles the relative path that follows the domain (for example, the about page is located at /about), so any traffic targeting specific pages will still land on the page they were aiming for. One could probably merge the first two segments (https:// and operadeilumi.org.uk) into a single string, but it seemed neater and more sensible to have the expression handle the protocol, domain and URI path explicitly.
After saving that rule, I tried Seobility's redirect checking tool, which looks at redirects in more depth and without the SEO stuff. This checks the following four variations of your domain:
https://www.example.com
https://example.com
http://www.example.com
http://example.com
If they all resolve to the same domain (https://example.com, for example, though the desired target can be specified by the user), then happy days - I've managed to correctly establish a canonical URL.
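The same check can be approximated by hand. The following is a minimal Python sketch using only the standard library; it is not how Seobility's tool works, and the function names (`variants`, `final_url`, `check`) are made up for illustration. It builds the four variations, follows redirects for each one, and reports whether each lands on the chosen target.

```python
from urllib.request import urlopen


def variants(domain):
    """The four scheme/subdomain combinations for a bare domain."""
    return [
        f"{scheme}://{prefix}{domain}"
        for scheme in ("https", "http")
        for prefix in ("www.", "")
    ]


def final_url(url):
    """Follow redirects over the network and return where the request ends up."""
    with urlopen(url, timeout=10) as response:
        return response.geturl()


def check(domain, target, fetch=final_url):
    """Map each variant to True if it ends up at the target URL."""
    return {
        url: fetch(url).rstrip("/") == target.rstrip("/")
        for url in variants(domain)
    }
```

Calling `check("operadeilumi.org.uk", "https://operadeilumi.org.uk")` makes four real requests and returns a dictionary of results. The `fetch` parameter is there so the network call can be swapped for a fake when trying the logic offline.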
This process probably took no more than 5 minutes. After about a month like this, I've noticed search engines now only serve a single result per URI. The privacy-respecting analytics tool used on ODL's site now only lists each page once, instead of once each for www and non-www visitors.
Resources and references
See Google's definition of canonicalization if this is a new concept to you. Quick definition: where potential duplicates exist, there should be an unambiguous winner.
CNAME stands for Canonical Name record. See Cloudflare's explanation for use cases.
HTTP status code 301 indicates a permanent redirect, which is what I want in this case as this is never going to change. Other redirect status codes exist. See MDN documentation for more details.
Cloudflare have duplicated their article (word for word) in Cloudflare Fundamentals and Learning Paths, so take your pick 😀