Allow local Python script to parse protected website

Hello,

I am running a few local Python scripts (on the same dedicated server as the website) that scrape my website. They worked flawlessly until I added Cloudflare protection, which now stops the scripts' requests from reaching the site. This is how I scrape the website:

import urllib.request

from bs4 import BeautifulSoup

class AppURLopener(urllib.request.FancyURLopener):
    version = "Mozilla/5.0"

opener = AppURLopener()
......
with opener.open(url) as response:
    data = response.read()
    text = data.decode('utf-8')
    bs = BeautifulSoup(text, features="html.parser")

How can I configure Cloudflare so that it allows this Python script to parse the site without harming my website’s security? We’ve been hit with a pretty bad layer 7 botnet attack, so it’s important that I don’t lose any of the security Cloudflare provides. Bear in mind that this script only runs locally, on the same dedicated server where the website is hosted, and it scrapes only that website.

I am using Cloudflare’s free plan.

Thanks in advance!

I cheat by adding my hostname to /etc/hosts with the origin IP address. Local scripts hit the local server instead of Cloudflare.
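To confirm that such an override has taken effect, a small check along these lines could compare what the hostname resolves to against the origin IP. This is only a sketch: the function name `resolves_to_origin` and the sample values are hypothetical, and it assumes the system resolver consults /etc/hosts before DNS, which is the usual default on Linux.

```python
import socket


def resolves_to_origin(hostname, origin_ip, resolver=socket.gethostbyname):
    """Return True if hostname resolves to origin_ip.

    The resolver argument is injectable so the check can be exercised
    without performing a real lookup.
    """
    return resolver(hostname) == origin_ip


# Usage (placeholder values, substitute your own hostname and origin IP):
#   resolves_to_origin("example.com", "12.34.56.78")
```

If this returns False after editing the hosts file, the script is probably still being routed through Cloudflare.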

Or you can try whitelisting your server’s IP address in the Cloudflare firewall.

Why would you do this in the first place? Simply access the data straight from the source. What you are doing right now is adding a lot of complexity for no reason whatsoever.

I added my server’s IP address to the Cloudflare firewall, but the script still gets blocked.

I plan to update the script so it works that way in the future, but I need this to work in the next 48 hours. Unfortunately I can’t make that change in time…

Use the Tools section of the firewall to Whitelist an IP address.

I did that too, but it doesn’t help at all… Nothing changes…

Anything show up in the Firewall Events Log?

Again, it might just be easier to modify /etc/hosts

The Firewall Events Log shows a “JS Challenge” action every time I run the script.

What should I actually add to the hosts file? I’m not really familiar with editing it, sorry :blush:

Does that JS challenge match the IP address you Whitelisted?
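One way to make that comparison less error-prone is to check it in code, since a mismatch can be easy to miss by eye (for example, the log showing an IPv6 address while an IPv4 address was whitelisted). The sketch below is an assumption-laden illustration: `ip_is_allowlisted` is a hypothetical helper, and the addresses are placeholders, not values from your account.

```python
import ipaddress


def ip_is_allowlisted(client_ip, allowlist):
    """Return True if client_ip equals or falls inside any allowlist entry.

    Entries may be single addresses or CIDR ranges, IPv4 or IPv6.
    """
    addr = ipaddress.ip_address(client_ip)
    for entry in allowlist:
        # strict=False accepts entries such as "12.34.56.78/24" with host bits set.
        network = ipaddress.ip_network(entry, strict=False)
        # An IPv4 address never matches an IPv6 entry, and vice versa.
        if addr.version == network.version and addr in network:
            return True
    return False


# Usage: pass the IP shown in the Firewall Events Log and the whitelisted entries.
#   ip_is_allowlisted("12.34.56.78", ["12.34.56.78"])
```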

/etc/hosts should have an entry for the hostname that the Python script connects to, with the origin IP address first and the hostname second:

12.34.56.78 example.com
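For anyone unfamiliar with the file, each line of /etc/hosts is an IP address followed by one or more hostnames, with `#` starting a comment. The sketch below reads that format to show which IP a given name maps to. `parse_hosts` is a hypothetical helper written for illustration, and the sample text uses the placeholder address from this thread.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The first field is the IP address; the rest are hostnames.
        ip, *names = line.split()
        for name in names:
            # Assumption: the first entry for a name wins; later ones are ignored.
            mapping.setdefault(name, ip)
    return mapping


sample = """
127.0.0.1   localhost
12.34.56.78 example.com www.example.com  # origin override
"""
print(parse_hosts(sample)["example.com"])  # 12.34.56.78
```

To inspect the real file, the same function could be called on the contents of /etc/hosts read with `open("/etc/hosts").read()`.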