Best way to strip some parts of the query string

Hello all!

I haven’t kept up with all the changes made to Cloudflare in recent months. I searched for the right answer but I’m not sure I found it, so I’d rather ask.

To improve caching performance, I’d like to strip some query string parameters from my URLs before they reach the Cloudflare cache and my origin server. Basically I want to keep only ?amp and ?page= (which never appear together). Everything else is useless to my origin and shouldn’t get through.

What is the cleanest and most efficient way to do that? A Worker? A Transform Rule?

Thanks in advance!

Greetings,

Thank you for asking.

Could you provide an example of a URL you have now and another showing what you want it to become? :thinking:

My best guess is that you’re using WordPress with the Google AMP plugin.
So you have it configured as https://example.com/article-title-here/?amp and want to exclude those /?amp URLs from the Cloudflare cache? :thinking:

Are you using a Page Rule with the Cache Level: Cache Everything setting, and is that why you’re asking how to “strip out”?

Using a Page Rule (or two), you could exclude those URLs from the cache (Bypass).

It could also be done using a Transform Rule, yes.

So, do you want to “strip out” that query string, or “bypass the cache” for those particular URLs? :thinking:

I could be wrong, though.


Hello fritex, thanks for the answer!

It’s not WordPress. We used to be on Drupal and now we’re on Rails, but we had to keep the old ?amp format :grimacing:

So we have the AMP articles, like: https://mydomain.com/my-article-slug?amp
And we have the paginated taxonomy pages, like: https://mydomain.com/top-news?page=23

I want to keep all of them in the Cloudflare cache like the rest of the website. However, some of the taxonomy pages are heavy to load, with a few thousand articles in them, and my website’s logs show requests getting past the Cloudflare cache because other query string parameters (like the utm_ ones for Analytics, or ad-related ones) are appended to the URL.

So what I’d like to do is strip everything but ?amp and ?page before the request hits the Cloudflare cache.
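
To make the goal concrete, here is a minimal Python sketch of the allowlist logic described above. It only illustrates the intended normalization and is not Cloudflare configuration; the function name `keep_allowed` and the sample tracking values are made up for the example.

```python
# Hypothetical allowlist; the two parameter names come from the post above.
ALLOWED_KEYS = {"amp", "page"}


def keep_allowed(query: str) -> str:
    """Return the query string with every parameter outside ALLOWED_KEYS removed."""
    kept = []
    for part in query.split("&"):
        if not part:
            continue  # skip empty segments such as a stray "&&"
        key = part.split("=", 1)[0]
        if key in ALLOWED_KEYS:
            # Keep the raw text so a bare "amp" is not rewritten to "amp=".
            kept.append(part)
    return "&".join(kept)


print(keep_allowed("amp&utm_source=newsletter"))
print(keep_allowed("page=23&fbclid=abc123"))
print(keep_allowed("utm_medium=email"))
```

An allowlist like this keeps the cache key stable no matter which new tracking parameters show up later, whereas a blocklist has to be updated each time a new one appears.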


Hey there,

As @fritex mentioned, a Transform Rule should work for what you are trying to accomplish, specifically a Rewrite URL rule. However, your domain will need to be on a Business plan or above, because filtering out the unwanted query strings requires regex.

If you do have access to a Business plan for your domain, below is an example Rewrite URL rule I created to filter out Facebook-related tracking query strings. You will need to modify this example rule to fit the query strings you want to filter out.

When incoming requests match...
(http.request.uri.query matches "(&?(utm_source|utm_medium|utm_campaign|fbclid|fb_action_ids|fb_action_types|fb_source)=[a-zA-Z0-9_]+)*")


Then... Query → Rewrite to...
Dynamic: regex_replace(http.request.uri.query,"(&?(utm_source|utm_medium|utm_campaign|fbclid|fb_action_ids|fb_action_types|fb_source)=[a-zA-Z0-9_]+)*","")

With this rule in place, the following happens to a matching request:

Starting Query string → Final Query String
?page=4&utm_source=1111111&utm_medium=222222&fbclid=3 → ?page=4

I would suggest first trying out the rule in a staging/development environment and confirming that it works correctly before applying it to your entire domain, to avoid potential issues with your live website.
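
Before staging, the pattern itself can be exercised offline. The sketch below uses Python's `re` module as a rough stand-in, which is an assumption: Cloudflare's regex engine and the exact behavior of `regex_replace` (for example, how many matches it replaces) may differ, so treat the results as hints about what to test rather than as proof. The helper name `strip_tracking` and the last two sample query strings are made up for illustration.

```python
import re

# Pattern taken from the rule above, without the outer (...)* wrapper,
# because re.sub already replaces every occurrence it finds.
TRACKING = re.compile(
    r"&?(utm_source|utm_medium|utm_campaign|fbclid"
    r"|fb_action_ids|fb_action_types|fb_source)=[a-zA-Z0-9_]+"
)


def strip_tracking(query: str) -> str:
    """Remove the listed tracking parameters from a query string."""
    return TRACKING.sub("", query)


samples = [
    "page=4&utm_source=1111111&utm_medium=222222&fbclid=3",  # example from the post
    "utm_source=1111111&page=4",        # tracking parameter comes first
    "page=4&utm_campaign=spring-sale",  # value contains a hyphen
]
for query in samples:
    print(repr(query), "->", repr(strip_tracking(query)))
```

In this Python approximation, the second sample keeps a leading `&` and the third leaves `-sale` behind, because the value class `[a-zA-Z0-9_]+` stops at the hyphen. Both cases are worth checking against the real rule in staging.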

For more information on the regex_replace function, I would suggest taking a look at this document:

