Is it possible to block meta searches and crawlers using your firewall or other rules? If so, can you show me an article? thanks
You could try with the following firewall rule expression.
(http.user_agent contains "facebookexternalhit") or (http.user_agent contains "facebookcatalog")
Blocking ASN 32934 might also work, though that could also block Facebook employees.
You could also find out if they honour robots.txt and use that as well.
Thanks, but I meant crawlers from search engines such as Google and Skyscanner.
What exactly do you want to achieve?
I don't want crawlers from Google, Skyscanner or TripAdvisor to crawl my page. They should be blocked when they try to access it.
Well, for starters, I would use robots.txt.
And if you really want to block something, you can use the cf.client.bot field (see "Fields reference" in the Cloudflare Ruleset Engine docs). Or rather, you will need to find out how each of the automated systems you want to block identifies itself and then configure blocks based on that data.
But again, I’d start with robots.txt.
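If it helps, here is a rough Python sketch for checking a robots.txt draft before you deploy it. It uses the standard library's urllib.robotparser. "Googlebot" is Google's documented crawler token; "ExampleTravelBot" is a made-up placeholder, since I don't know which tokens the other crawlers use, and is_allowed is just a helper name I picked. Keep in mind that robots.txt only works for crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Draft robots.txt. "ExampleTravelBot" is a hypothetical token standing in
# for whatever name the crawler you want to exclude actually uses.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: ExampleTravelBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "ExampleTravelBot", "SomeOtherClient"):
        print(agent, is_allowed(agent, "https://example.com/page"))
```

Pass the crawler's short token rather than its full User-Agent header, because the parser matches on the part before the first slash.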
Thanks - do you have any hints on how to identify the crawlers of a specific engine?
It can be the IP address, ASN, or user agent. You need to check your log files for that.
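As a starting point for the log check, something like the following Python sketch could tally user agents and the IPs they came from. It assumes your origin server writes the common "combined" access log format, where the user agent is the last quoted field on each line; the function name and the default file name are placeholders of mine, so adjust them to your setup.

```python
import re
import sys
from collections import Counter, defaultdict

# Assumption: combined log format, i.e. the client IP is the first field
# and the user agent is the last double-quoted field on the line.
LINE_RE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')


def summarize_user_agents(lines, limit=10):
    """Return (user_agent, hits, sorted_ips) for the most frequent agents."""
    hits = Counter()
    ips = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent")
        hits[agent] += 1
        ips[agent].add(match.group("ip"))
    return [(agent, count, sorted(ips[agent]))
            for agent, count in hits.most_common(limit)]


if __name__ == "__main__":
    # "access.log" is a placeholder path.
    path = sys.argv[1] if len(sys.argv) > 1 else "access.log"
    with open(path, encoding="utf-8", errors="replace") as handle:
        for agent, count, addresses in summarize_user_agents(handle):
            print(f"{count:6d}  {agent}  {', '.join(addresses[:5])}")
```

Once you know which user agent strings and IPs belong to the crawler, you can look up the ASN for those IPs and build the firewall rule from whichever identifier is most stable.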
thanks!