I run a favicon fetcher service (icon [dot] horse). It’s mostly used by people developing apps or websites who need an icon for a site they link to. For example, one client runs a private knowledge base service and uses the icons to visually identify links. I usually get this data from the site’s HTML, where <link> and <meta> tags contain the relevant info.
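For readers unfamiliar with this step, here is a minimal sketch of pulling icon URLs out of a page's HTML. It is not the service's actual code: the names `IconLinkParser` and `find_icon_urls` are made up for illustration. It only looks at `<link>` tags whose `rel` attribute mentions an icon (for example `rel="icon"`, `rel="shortcut icon"`, or `rel="apple-touch-icon"`) and ignores `<meta>` tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collects href values from <link> tags whose rel mentions an icon."""

    def __init__(self):
        super().__init__()
        self.icons = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated token list, e.g. "shortcut icon".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if attr.get("href") and any("icon" in tok for tok in rel_tokens):
            self.icons.append(attr["href"])


def find_icon_urls(html, base_url):
    """Return absolute icon URLs declared in the given HTML (hypothetical helper)."""
    parser = IconLinkParser()
    parser.feed(html)
    # Resolve relative hrefs such as "/favicon.ico" against the page URL.
    return [urljoin(base_url, href) for href in parser.icons]
```

When a challenge page is returned instead of the real page, a function like this simply finds nothing, which is why the result comes back blank.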
Lately, more and more of my scraper’s requests hit Cloudflare bot protection. Since the challenge page contains no icons, the result is blank and my API serves a fallback image.
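One thing a scraper can do is tell a challenge apart from a site that genuinely has no icon. The sketch below assumes the response carries the `cf-mitigated: challenge` header, which Cloudflare documents for challenge page responses. The function name and the plain `dict` of headers are assumptions for illustration, and this only detects a challenge, it does not get past it.

```python
def looks_like_cloudflare_challenge(headers):
    """Heuristic check for a Cloudflare challenge response.

    `headers` is assumed to be a plain mapping of response header
    names to values, as most HTTP clients can provide.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("cf-mitigated", "").strip().lower() == "challenge"
```

A fetcher could use this to keep "challenged" results out of its long-lived cache, so that a temporary block is not stored as "this site has no icon".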
I was wondering what can be done about this situation. I realise that bot protection is a good thing, and that it’s actually doing its job properly by detecting my scraper as a bot.
But I also would love to be able to get favicons.
So is there a way I can solve this issue? Some specifics:
I don’t make a request to the same domain often - I cache the result for some time
I currently send a user agent that identifies my bot, but I’ve also tried using a real browser user agent before
The IP of my service is static - it doesn’t change
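To make the "cache the result for some time" point concrete, here is a minimal sketch of a per-domain cache with a time-to-live. The class name `TTLCache` and the injectable `clock` parameter are assumptions for illustration, not how the service is actually built.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Entry is stale: drop it so the caller refetches.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl)
```

With a cache like this in front of the scraper, each domain is requested at most once per TTL window, which keeps the request rate to any single site low.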
You may want to submit an application to have your bot added to Cloudflare’s known good bots directory. You can read about the requirements and how to apply here: Verified Bots Policy · Cloudflare bot solutions docs.
Hi @AlphaK, the issue isn’t so much that my bot is being challenged on my own site(s); it’s that the bot is being challenged on other people’s sites. I’m guessing I cannot set up IP access rules for other people’s sites. Sorry if I’m misunderstanding the answer.