Does Cloudflare serve cached content to search crawlers?


#1

My origin server needs to generate content for search crawlers (Googlebot, Bingbot, etc.) in order for SEO to work properly. The origin server decides whether to prerender the content or send it as is, based on the User-Agent header in the request.
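
For context, the origin-side logic looks roughly like this (a simplified sketch; Express and the `prerender()` helper are stand-ins for our actual stack):

```ts
import express from "express";

const app = express();

// Illustrative crawler User-Agent patterns, not an exhaustive list.
const BOT_UA = /googlebot|bingbot|twitterbot|facebookexternalhit/i;

// Hypothetical helper that renders the SPA route to static HTML,
// e.g. via a headless browser.
async function prerender(url: string): Promise<string> {
  return `<html><!-- prerendered ${url} --></html>`;
}

app.get("*", async (req, res) => {
  const ua = req.headers["user-agent"] ?? "";
  if (BOT_UA.test(ua)) {
    // Crawlers get a static, fully rendered page.
    res.send(await prerender(req.originalUrl));
  } else {
    // Regular users get the normal dynamic app shell.
    res.sendFile("index.html", { root: "dist" });
  }
});

app.listen(3000);
```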

My question is: does Cloudflare serve cached content for requests from search crawlers?
If it does, we run the risk of serving non-prerendered content to the bots. In that case, is there an easy fix?


#2

Can you clarify the question? It sounds like the content here isn’t being cached, since it is rendered at the origin per User-Agent, so based on what you describe Googlebot would retrieve content the same way as any other user agent.


#3

We have a single-page app built in Angular. It could be aggressively cached with “Cache Everything”.

However, single-page apps aren’t always processed correctly by all search engine crawlers (or Twitter/Facebook link previews, etc.). Therefore we want to generate a static version of the page at the origin when one of these crawlers hits it. But we don’t want to spend the resources doing this unless the request actually comes from a crawler.

Now here’s the problem: what if a regular user hits the site, the normal dynamic version is sent from the origin and cached by Cloudflare, and then a web crawler hits the same page?
Will the web crawler get the previously cached version?

Is there an option to always let web crawlers (you have all their IPs, from what I understand) bypass the cache, or to cache their responses separately?


#4

Ah, thanks for the clarification. Today there isn’t a way using Page Rules or other cache settings to bypass the cache for a certain set of IPs or user agents. It is probably possible to achieve this using Cloudflare Workers, though.
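
Something along these lines should work (an untested sketch, not a definitive implementation; the bot list is illustrative, and as I understand the Workers docs, `cf.cacheTtl: 0` behaves like Cache Level: Bypass):

```ts
// Module-syntax Cloudflare Worker.
// Illustrative crawler User-Agent patterns, not an exhaustive list.
const BOT_UA =
  /googlebot|bingbot|yandex|duckduckbot|twitterbot|facebookexternalhit/i;

export default {
  async fetch(request: Request): Promise<Response> {
    const ua = request.headers.get("User-Agent") ?? "";

    if (BOT_UA.test(ua)) {
      // Send crawlers straight to the origin, skipping Cloudflare's cache,
      // so the origin sees the bot's User-Agent and can prerender.
      return fetch(request, { cf: { cacheTtl: 0 } });
    }

    // Everyone else gets the normal, cacheable response.
    return fetch(request);
  },
};
```

That way regular visitors still benefit from “Cache Everything”, while crawler requests always reach the origin for prerendering.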

