Cloudflare Automatic Platform Optimizations (APO) strips UTM tag from URL


This is a very specific case, which only shows up when several separate conditions are met at once…
Let’s see if I can share a case @aseure_cf can reproduce himself without sharing any of my private domains.

You need a path, no trailing slash, analytics query parameters, and no custom query parameters.

For brevity I will use just one of the utm_* params; one is enough to reproduce the issue.

These will work just fine:

https://example.com/?gclid=test
https://example.com?gclid=test
https://example.com/path/?gclid=test

https://example.com/?utm_source=test
https://example.com?utm_source=test
https://example.com/path/?utm_source=test

https://example.com/?custom=test
https://example.com?custom=test
https://example.com/path/?custom=test
https://example.com/path?custom=test

These will, too (even though they will be re-ordered):

https://example.com/?gclid=test&custom=test
https://example.com?gclid=test&custom=test
https://example.com/path/?gclid=test&custom=test
https://example.com/path?gclid=test&custom=test

https://example.com/?utm_source=test&custom=test
https://example.com?utm_source=test&custom=test
https://example.com/path/?utm_source=test&custom=test
https://example.com/path?utm_source=test&custom=test

These will not:

https://example.com/path?gclid=test
https://example.com/path?utm_source=test
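
The pattern in the lists above can be summed up as a small predicate. This is only a sketch of the observed behavior, not of how APO works internally; the function name and the ANALYTICS_PREFIXES tuple are assumptions, limited to the parameters tested in this post.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumption: only the analytics parameters tested in this post.
ANALYTICS_PREFIXES = ("utm_", "gclid")

def loses_query_on_redirect(url: str) -> bool:
    """Return True when a URL matches the failing pattern described above."""
    parts = urlsplit(url)
    keys = [k for k, _ in parse_qsl(parts.query, keep_blank_values=True)]
    has_path = parts.path not in ("", "/")           # something other than the home page
    no_trailing_slash = not parts.path.endswith("/")
    # Every parameter is an analytics one, i.e. no custom parameter is present.
    only_analytics = bool(keys) and all(k.startswith(ANALYTICS_PREFIXES) for k in keys)
    return has_path and no_trailing_slash and only_analytics
```

With this, https://example.com/path?gclid=test matches the failing pattern, while adding a trailing slash or a custom parameter makes it pass.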

It was working in my DevTools because the Disable Cache checkbox was checked. That injected a Cache-Control: no-cache header, which hid the issue. Without the header, I’m able to reproduce.

Keeping you updated.


Thanks for taking this head on!
I’ll try to replicate the issue on a smaller website to see if it occurs there too, and I’ll give you an update as well…

Hi again… I activated the APO add-on for a smaller website.
Please check the video: olt_apo_utms_bug.mp4 - Google Drive

What I noticed is that it only loses the query string when I use “?utm_source”. If I use something like “?index”, the parameter is kept, which is the behavior I expected.

So, when accessing https://onlinetips.com/?index=1231 it redirects to https://www.onlinetips.com/?index=1231 and keeps query parameters.

When accessing https://onlinetips.com/?utm_source=1231, it goes to https://www.onlinetips.com and loses the query parameters.

I realised this behaviour above, yes.


I’ve since set up an APO test site and can confirm that the suggested Transform Rule (Rewrite URL) works with APO and can be used to prevent the redirection in the case pointed out by @casey9 and @matteo: a path other than the home page, no trailing slash, and either utm_ or gclid query string parameters.

When:

(http.host eq "apo.example.com" and not ends_with(http.request.uri.path, "/") and http.request.uri.query contains "utm_") or (http.host eq "apo.example.com" and not ends_with(http.request.uri.path, "/") and http.request.uri.query contains "gclid")

Then:

Path > Rewrite to > Dynamic > concat(http.request.uri.path, "/")
Query > Preserve
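
To make the logic of the rule easier to follow, here is a rough Python model of what the expression and the rewrite evaluate to. It is not Cloudflare’s rules engine; the function name and the apo_host parameter are made up for illustration. The condition is written in the factored form (host and no trailing slash and (utm_ or gclid)), which is logically equivalent to the two or-ed branches above.

```python
def rewrite_path(host: str, path: str, query: str,
                 apo_host: str = "apo.example.com") -> str:
    """Model of the Rewrite URL rule: append "/" when the expression matches."""
    matches = (
        host == apo_host
        and not path.endswith("/")                  # not ends_with(path, "/")
        and ("utm_" in query or "gclid" in query)   # query contains "utm_" or "gclid"
    )
    # concat(http.request.uri.path, "/"); the query string itself is preserved.
    return path + "/" if matches else path
```

Requests that already end in a slash, or that carry neither utm_ nor gclid, are left untouched.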

Alternatively, for @mejorainfotech and those whose redirects exist for legacy reasons, such as old paths that are redirected to new paths, you can use Redirect Rules with Preserve Query String checked, instead of letting WordPress handle the redirects at the origin.

If that’s for some reason not desirable (you ran out of the redirect quota allowed for your plan level, or you want your origin server to log those requests, for instance), you can set a different Transform Rule (Modify Response Headers) to add back the query string to the Location header.

When:

(http.host eq "apo.example.com" and http.request.uri.path in {"/hello-world" "/old-path1" "/old-path2"} and http.request.uri.query contains "utm_") or (http.host eq "apo.example.com" and http.request.uri.path in {"/hello-world" "/old-path1" "/old-path2"} and http.request.uri.query contains "gclid")

Then:
Modify Response Header
Set > Dynamic >
concat(http.response.headers["location"][0],"?",http.request.uri.query)
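
As a rough illustration of what this header rule produces, here is the same logic modelled in Python. The function name and parameters are hypothetical; only the paths and the concat come from the rule above. Note that it mirrors the concat literally, so it assumes the Location sent by the origin has no query string of its own.

```python
# Paths taken from the example expression above.
LEGACY_PATHS = {"/hello-world", "/old-path1", "/old-path2"}

def restore_location(host: str, path: str, query: str, location: str,
                     apo_host: str = "apo.example.com") -> str:
    """Model of the Modify Response Header rule for the Location header."""
    matches = (
        host == apo_host
        and path in LEGACY_PATHS
        and ("utm_" in query or "gclid" in query)
    )
    # concat(location, "?", query) when the rule matches, else unchanged.
    return f"{location}?{query}" if matches else location
```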


Hi,
If I am understanding correctly, this is a “band aid” on top of an APO malfunction, no?

Should APO behave as described here, discarding query parameters when there is a redirect of some sort?

By the way, at the beginning of this thread I mentioned that even when there is no redirect, the query parameters have been lost with APO active since the 26th of February… I noticed because the UTM-tagged traffic on my Google Analytics dashboard fell sharply that day.
Also, it appears to happen only when the query parameter contains “utm”. Why?

❯ curl -sI "https://onlinetips.com/?utm_source=facebook"
HTTP/2 301 
date: Mon, 20 Mar 2023 17:12:51 GMT
content-type: text/html; charset=UTF-8
location: https://www.onlinetips.com/
cf-ray: 7aaf9518988f9501-LIS
cache-control: max-age=3600
expires: Mon, 20 Mar 2023 18:12:52 GMT
vary: Accept-Encoding
cf-cache-status: MISS
cf-apo-via: origin,resnok
cf-edge-cache: cache,platform=wordpress
x-redirect-by: WordPress
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=8FLvhgiqPtTaW7AXlDy9Jp6y3FBu3dKc32bP%2F7%2Fpa2pWfcb9iwqxAK9Zhm%2B3EmURrqK6N6qqapZfTW0MYJj%2FjZIXg%2BfB8xmiYiddTfWIU7LU0t9Jk9r3aZ1kxpqvarceHuAo%2BjTKMyfB%2FBSL0g%3D%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
server: cloudflare
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400


❯ curl -sI "https://onlinetips.com/?index=teste"
HTTP/2 301 
date: Mon, 20 Mar 2023 17:13:03 GMT
content-type: text/html; charset=UTF-8
location: https://www.onlinetips.com/?index=teste
cf-ray: 7aaf9563685e4894-LIS
cache-control: max-age=3600
expires: Mon, 20 Mar 2023 18:13:03 GMT
cf-cache-status: BYPASS
cf-apo-via: origin,qs
cf-edge-cache: cache,platform=wordpress
x-redirect-by: WordPress
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=jkN9TgQ%2FiG9HU2zjtR27X3lxI6Y%2FI%2F0wDEx8%2F6BbD5cMMYQhal6pvJYmSNa6%2BXa0CqttpmJggk9sDM78FRh63YTd1VZL2Tcmdtfk59k77iiYF44ANi49RBE%2BQX9JIzBsNv0gfVGfZtnCI0WgvQ%3D%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
server: cloudflare
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400
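
For anyone who wants to compare many URLs this way, here is a small Python sketch that checks whether the Location header in pasted `curl -sI` output still carries the query string that was sent. The helper names are made up for illustration, and the two samples are shortened copies of the responses above.

```python
from urllib.parse import urlsplit

def parse_head(raw: str) -> dict:
    """Turn `curl -sI` output into {lowercased header name: value}."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:   # skip the status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers

def redirect_keeps_query(request_url: str, raw: str) -> bool:
    """True when the Location header has the same query string as the request."""
    sent = urlsplit(request_url).query
    location = parse_head(raw).get("location", "")
    return urlsplit(location).query == sent

# Shortened copies of the two responses pasted above.
SAMPLE_UTM = """HTTP/2 301
location: https://www.onlinetips.com/
cf-cache-status: MISS
cf-apo-via: origin,resnok
"""
SAMPLE_INDEX = """HTTP/2 301
location: https://www.onlinetips.com/?index=teste
cf-cache-status: BYPASS
cf-apo-via: origin,qs
"""
```

Calling redirect_keeps_query with the utm_source request and SAMPLE_UTM flags the lost query string, while the index request with SAMPLE_INDEX passes.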

I set up the Transform Rule recommended by @cbrandt, however it looks like the redirect & removal of URL params is still happening when APO is enabled.

[Screenshot: Transform Rule]


However, changing the rules to trigger when the hostname equals my normal domain (e.g., “example.com”) instead of “apo.example.com” seems to work.

The suggested Redirect / Transform Rules are meant to enable the immediate activation of APO, while Cloudflare engineers look into the issue. Whether it’s a malfunction, I don’t know. I’ve only recently enabled an APO test site and therefore I haven’t witnessed its workings before the date you mentioned.

Just because a certain behavior started recently in your case (and in a few other cases reported in this topic and elsewhere in this community), it doesn’t necessarily mean that it’s an APO issue. Cloudflare proxies over 25 million websites. If about 1/3 of them are WordPress, in line with the reported global share of WP, we would have seen hundreds, if not thousands, of complaints coming to this community. We’ve seen fewer than a dozen.

As an example: a recent update of a plugin on one of my sites caused all URLs to lose the trailing slash, and I only learned about it when Google Search Console started complaining. I had to roll back the plugin to a previous version and wait for a fix from the plugin developers. If that site had been on APO, it would have started losing all utm_ query string parameters, but it wouldn’t have been APO’s fault.

That’s the idea. It should match the domain under which you have APO enabled. I just used apo.example.com as an example, in case you have more than one installation under your domain, but with APO in just one of them.

As a matter of fact, if you have your whole zone under APO, you can simplify the rule and remove the reference to the hostname.

I’m glad it’s working!

Hi,

Sorry, my words were probably too harsh with the “band aid” comment… thank you for all the effort you put into this to try to work around the issue while CF engineers look into it.

If pointed in the right direction, I can try to debug or analyse this in more depth… I left the curl output above: one responds with a MISS, the other with a BYPASS, and I’m trying to understand why.

Haha, not a problem at all. I use this word all the time to refer to temporary fixes. I’m just not certain it will be temporary this time around. It may be that the Cloudflare APO team decides the reported behavior is expected; we’ll have to wait and see.

That’s explained by @michael’s response above and the documentation it points to.


All good!

Thanks, of course, the response above… I checked it and ran a curl to test:

❯ curl -sI "https://onlinetips.com/?ref" | egrep '^HTTP/|cf-|location|content|cache'
HTTP/2 301 
content-type: text/html; charset=UTF-8
location: https://www.onlinetips.com/
cf-ray: 7ab781b439c55d33-LIS
cache-control: max-age=3600
cf-cache-status: MISS
cf-apo-via: origin,resnok
cf-edge-cache: cache,platform=wordpress
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=eaeqpuX8LQJ%2B%2FnMvF8aGAUDnkdyU78hZ5MU%2B%2BC4z1J6RxM5tJMF06U89Y7GSJ2raORj9UceVshB98sO8V47vxyQVLJtoPG5hR%2FVTfosbSDBBE8X6uqJKxoc4YFCENVKJf%2Btgn6BXbZovID6CZQ%3D%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
❯ curl -sI "https://www.onlinetips.com/?ref" | egrep '^HTTP/|cf-|location|content|cache'
HTTP/2 200 
content-type: text/html; charset=UTF-8
cf-ray: 7ab78276781c94f4-LIS
cache-control: max-age=2592000
cf-cache-status: HIT
cf-apo-via: tcache
x-cache-handler: cache-enabler-engine
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=qQFaalA%2B7azqdq%2FPhOc1efLr29LoDR%2BWb4ZGeEZWQCIlbLsCo6a%2Fi9Pb3Wj5gNJdtwJ%2Bchl5VkEIdAEwfi%2B6bLP%2BTrWcg5S7QG%2FAso0nFmZNyduCrQDItpGODxe6WBuonZG7f7ynHagngMzqIKxYOGE%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}

“ref” is one of the cached parameters, and when there is a redirect from non-www to www, it goes from HIT to MISS.

When trying “wordpress”, one of the query parameters that bypass the cache:

❯ curl -sI "https://www.onlinetips.com/?wordpress" | egrep '^HTTP/|cf-|location|content|cache'
HTTP/2 200 
content-type: text/html; charset=UTF-8
cf-ray: 7ab78ac0ce1d489d-LIS
cache-control: max-age=2592000
cf-cache-status: BYPASS
cf-apo-via: origin,qs
cf-edge-cache: cache,platform=wordpress
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=G1KZddj0ISZu24mrMwcp0vN44VPfaKzR9SoP7G%2F8r5AEq0kpKnzzi7MdlmhbwU0efnYybD5TMmcc4BQ28rqKkq2MSaA1aNFn%2B%2B79UiLhk%2FJZpRCDd6SV%2BJW%2FS8%2FAxDY9%2BaxXMxq%2FCOiVnY2YkIM5Xu0%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}

❯ curl -sI "https://onlinetips.com/?wordpress" | egrep '^HTTP/|cf-|location|content|cache'
HTTP/2 301 
content-type: text/html; charset=UTF-8
location: https://www.onlinetips.com/?wordpress
cf-ray: 7ab78af22e0bda7e-LIS
cache-control: max-age=3600
cf-cache-status: BYPASS
cf-apo-via: origin,qs
cf-edge-cache: cache,platform=wordpress
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=uTRtPN56rJSY3hIvuI8NjMSIgG8puyxmJ3EYKqnGZNlEB4NbSHR2Ee8DIsV%2Bo%2BUGAo9Roq2sHsOOfCPI1%2Fve37k0pi5CyPAZIBculElsuu0mLiJj2oewQFwolWPy9Saplnzetyu2VMgoc%2FP5oQ%3D%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}

It is always BYPASS… even with the redirect from non-www to www.

I will look into your Transform Rule for the smaller website with APO enabled.

All expected behavior. The 301 has a shorter TTL, so you’re more likely to get a MISS. The 200 gets a HIT, as it’s actually fetching the cache for example.com/ and not example.com/?ref.

wordpress is not on the list of query parameters, but on the second list, of cookie prefixes. These cookies will lead to BYPASS, as expected.
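
One way to picture the HIT on “?ref” versus the BYPASS on other parameters is the sketch below. It models only what the headers pasted in this thread show and is not APO’s actual code: the function name is hypothetical, and IGNORED_PARAMS is an assumption holding just the parameters this thread treats as ignored (the documented list is longer).

```python
from urllib.parse import urlsplit, parse_qsl

# Assumption: a subset based on this thread only, not the full documented list.
IGNORED_PARAMS = {"ref", "gclid", "utm_source"}

def apo_decision(url: str):
    """Return ("cache", lookup_url) or ("bypass", url) for a request URL."""
    parts = urlsplit(url)
    keys = [k for k, _ in parse_qsl(parts.query, keep_blank_values=True)]
    if all(k in IGNORED_PARAMS for k in keys):
        # Ignored parameters are dropped, so the lookup is for the bare URL.
        return "cache", f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"
    # Any other parameter sends the request to the origin.
    return "bypass", url
```

Under these assumptions, https://www.onlinetips.com/?ref is looked up as https://www.onlinetips.com/, while ?wordpress and ?index=teste go to the origin.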

Okay, understood… but then, on the first example with “?ref”, all expected behaviour, but it loses the query parameters on the redirect. Should it?

Because with “?wordpress”, like you said it’s on the second list of cookie prefixes, it keeps the query parameter even with the same redirect…

Monday bump

@aseure_cf did you have a chance to look into this again?

Hi again,

So I went ahead and disabled APO on the smaller website.
I cleared the cache and uninstalled the CF plugin on WordPress.

After doing this, the UTMs stopped being lost on the redirect.
Leaving the curl output for reference:

curl -sI "https://onlinetips.com/?utm_source=facebook"
HTTP/2 301 
date: Wed, 29 Mar 2023 17:10:22 GMT
content-type: text/html; charset=UTF-8
location: https://www.onlinetips.com/?utm_source=facebook
expires: Wed, 29 Mar 2023 18:10:23 GMT
cache-control: max-age=3600
x-redirect-by: WordPress
vary: Accept-Encoding
cf-cache-status: DYNAMIC
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=vSHlqrA0xMlyDgaU643Oal%2Ftz60tfI5%2BUGUCBlc%2FuziKxKr0hUS0idHarsxfhCLMbqU4FefoAL3z8nfZWMmZUQW%2F3Ht0dityI4zOgsgbEqP2DC3D1pZH2CTgVF0psCHs4ymMETSRiIAH3yDOtQ%3D%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
server: cloudflare
cf-ray: 7af9b9d8f98b9501-LIS
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400

We don’t have an update for you yet, but I wanted you to know we are still reviewing and looking at this.


I was able to talk to the team; they are continuing to investigate. We likely will have no updates until next week, though.


Hey @ncormier, should we still expect some news this week?
Best,