Getting 403 Forbidden in my good bot (link checker)

I built an app (a good bot) that checks for broken links in my customers' Help Centers. The Help Centers are hosted by Zendesk and use Cloudflare as added protection.

The app works by scanning each Help Center article, collecting all the links, and checking whether each one is broken (i.e. the HTTP call returns an error status). The app is written in Python and hosted on AWS EC2.
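For anyone following along, the flow above can be sketched roughly like this. It is a minimal illustration using only the standard library, not the actual app: the names `extract_links`, `fetch_status`, and `classify`, and the `example-link-checker/0.1` User-Agent, are made up for the example. Reporting 401/403/429 as "blocked" rather than "broken" is a suggestion of this sketch, not something the app described here does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collects absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the article URL.
                url = urljoin(self.base_url, value)
                if url.startswith(("http://", "https://")):
                    self.links.append(url)


def extract_links(html, base_url):
    """Return every http(s) link found in the article HTML, in order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-link-checker/0.1"}  # hypothetical UA
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is still useful.
        return err.code
    except OSError:
        # DNS failures, refused connections, timeouts (URLError is an OSError).
        return None


def classify(status):
    """Map a status code to a coarse result (assumed categories)."""
    if status is None:
        return "unreachable"
    if status in (401, 403, 429):
        return "blocked"  # access denied or rate limited, not necessarily broken
    if status >= 400:
        return "broken"
    return "ok"
```

A run over one article would then be something like `[(url, classify(fetch_status(url))) for url in extract_links(html, article_url)]`. Keeping "blocked" apart from "broken" means a 403 from a firewall does not get reported to customers as a dead link.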

In the last few days the app started getting a 403 Forbidden error for most of the links, which renders it useless. This happens only when I run it from the production server. I get the same error if I try to access the links from that server using curl.

However, everything works fine when running the app (or curl) from my local laptop. Therefore I suspect the production app/server was somehow blacklisted. I tried changing the IP address but got the same issue.

Any help is highly appreciated.
Thank you

Take a look at the Firewall Events Activity Log. You should see some entries for your bot. Click on one to see which Cloudflare setting is triggering the block.

Thank you for responding.
I don’t have access to those logs, or to the Cloudflare settings for the pages I’m getting the error for.

My bot is a 3rd party tool that scans and detects broken links (i.e. 404s) in Zendesk-hosted Help Centers. The 403 error is occurring for many of these links (Zendesk or not) that point to pages protected by Cloudflare.

The weird thing is that it only started happening a few days ago, without any code or configuration changes on the client side.

Any idea what it might be?

I looked at those suggestions but none of them helped.
The links my bot is getting the 403 error for open fine in the browser, or when running the bot on my local laptop. However, the error occurs when running in production, hosted on AWS EC2.

You don’t have access to the Cloudflare account that’s blocking your bot? If not, then that makes it impossible to debug from this end.

I don’t think the issue is caused by settings in a specific Cloudflare account.
The problem is occurring when the bot tries to access links from other domains, not just Zendesk.

I did some more investigation, and I believe Cloudflare started to block requests coming from AWS EC2 IP addresses. When calling the links from my home laptop using curl or the bot, they work. When calling from an EC2 VM (I tried different AWS accounts), I get the 403 error.
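One way to support that guess is to look at the response headers that come back with the 403. Responses that pass through Cloudflare normally carry a `CF-RAY` header and `Server: cloudflare`. The sketch below is only a heuristic, and the function names are made up for illustration. Seeing those headers shows the response came via Cloudflare, not that Cloudflare itself (rather than the origin behind it) made the decision to block.

```python
def looks_like_cloudflare_response(headers):
    """Heuristic: True if the headers suggest the response came via Cloudflare."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in headers.items()}
    server = normalized.get("server", "").lower()
    return "cf-ray" in normalized or server == "cloudflare"


def describe_403(status, headers):
    """Give a short label for a response, focusing on 403s."""
    if status != 403:
        return "not a 403"
    if looks_like_cloudflare_response(headers):
        return "403 served via Cloudflare"
    return "403 without Cloudflare headers"
```

Running the same check from the laptop and from the EC2 VM against the same URL, and comparing the labels, gives a slightly firmer basis than the status code alone. With curl, `curl -I <url>` prints the same headers.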

Does anybody know how I could get in touch with Cloudflare to whitelist my bot as a good bot? I filled out this form months ago but heard nothing back from them.

That form is the only option. Not everybody who asks is approved. In fact, it’s extremely rare considering how many of these requests they get.

Thanks, so I need to keep insisting then :slight_smile:

