I’m looking to use transform rules to remove click ids (gclid, fbclid etc.) as these currently cause users from paid channels to bypass my full page caching.
I’ve set up a simplified version of my rule as follows, but can’t get it to kick in. Even switching to just a static query rewrite doesn’t seem to work. Is there something I’m missing? Transform rules never seem to kick in.
I am aware that I can use workers to do this, and I have a worker to do so, but it’s somewhat irritating to have to pay for every single request in front of the cache, even those without the query parameters, and transform rules seem designed to offer a simpler and more cost-effective solution.
Edit: This is on a Business plan, which I understand includes regex functionality.
I’m not fluent in regex, so I can’t spot any problems with your substitution. @matteo and @sandro probably know a lot more about it than I do.
To test, I’d start with a very basic Transform Rule, send my own requests with that query string, and replace it with nothing. Then I’d step it up to a non-regex dynamic replacement. The problem could just be the regex replacement itself.
May I ask, are you trying to match and remove them in each of these cases:
when the URL has only ?gclid=something,
when other parameters come before it, so it appears as &gclid=something,
and when it sits between other parameters, for example ?param=something&gclid=value&other=test?
Also, should it cover cases like the ones below, given that Facebook could add its own parameter if the URL is shared there and someone clicks on it:
In the example above, the aim would be to keep the first ?paramA=value, since it could be part of the application URL, and, if the URL is shared on Facebook, to keep the fbclid param as well, while removing the &gclid=value part that comes before the second or third &fbclid=value.
I’m just making sure and thinking ahead; I could be wrong here (if the URL wouldn’t be shared on Facebook).
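To make those cases concrete, here is a minimal JavaScript sketch for checking a pattern against each position locally. It assumes the parameters to strip are gclid, gclsrc and fbclid, and the names TRACKING and stripTrackingQuery are made up for illustration. JavaScript regex syntax may differ from what Transform Rules accept, so this only checks the matching logic, not the rule itself.

```javascript
// Hypothetical helper, not a Cloudflare API: removes tracking parameters from
// a raw query string (passed without the leading "?").
// The (^|&) prefix ensures only whole parameter names match, so a parameter
// such as "notgclid" is left alone.
const TRACKING = /(^|&)(?:gclid|gclsrc|fbclid)=[^&]*/g;

function stripTrackingQuery(query) {
  return query
    .replace(TRACKING, '') // drop each tracking pair together with its leading "&"
    .replace(/^&/, '');    // tidy up if the first parameter was removed
}

console.log(stripTrackingQuery('gclid=something'));
// prints ""
console.log(stripTrackingQuery('a=1&gclid=something'));
// prints "a=1"
console.log(stripTrackingQuery('param=something&gclid=value&other=test'));
// prints "param=something&other=test"
```

The second replace is there because removing a tracking parameter in first position would otherwise leave a stray leading "&".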
Yes, the goal is to remove any query parameter that is only relevant to the front end. This would be at least fbclid, gclid and gclsrc. They could indeed be accompanied by other query parameters that do cause content to change and so need to be retained.
It’s similar to custom cache keys, when you want to exclude some query parameters but not others.
If it helps, here is a worker that does the same thing:
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

/**
 * Respond to the request
 * @param {Request} request
 */
async function handleRequest(request) {
  // Query parameters that only matter to the front end
  const TRACKING_PARAMS = ['gclid', 'gclsrc', 'fbclid'];
  const url = new URL(request.url)
  TRACKING_PARAMS.forEach(param => {
    if (url.searchParams.has(param)) url.searchParams.delete(param)
  })
  // Forward the original request to the cleaned URL
  return fetch(url, request)
}
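
If it is useful for checking the logic outside the Workers runtime, the same parameter removal can be pulled into a plain function. This is only a sketch: stripTrackingParams is a made-up name and example.com is a placeholder host. Note that serialising the URL again may normalise the encoding of the remaining parameters.

```javascript
// Hypothetical standalone version of the worker's parameter removal,
// runnable anywhere the standard URL class is available.
const TRACKING_PARAMS = ['gclid', 'gclsrc', 'fbclid'];

function stripTrackingParams(urlString) {
  const url = new URL(urlString);
  // delete() is a no-op when the parameter is absent, so no has() check is needed
  TRACKING_PARAMS.forEach(param => url.searchParams.delete(param));
  return url.toString();
}

console.log(stripTrackingParams('https://example.com/page?param=something&gclid=value&other=test'));
// prints "https://example.com/page?param=something&other=test"
console.log(stripTrackingParams('https://example.com/page?gclid=value'));
// prints "https://example.com/page"
```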