Allow the Wayback Machine to crawl my site that is using Cloudflare

What is the name of the domain?

www.hillbilly-music.com

What is the error number?

Unknown

What is the error message?

Website is not accessible via this address (www.hillbilly-music.com)

What is the issue you’re encountering?

The Wayback Machine cannot take a snapshot of my site.

What steps have you taken to resolve the issue?

I have enabled ‘always on’, but Cloudflare keeps saying the site is not accessible. I am tempted to take Cloudflare protection off the site for a day just to let archive.org take a snapshot, but I know that risks a massive bot attack, which is why I implemented Cloudflare in the first place. Has anyone using Cloudflare succeeded in letting archive.org take a snapshot of their site? This is rather frustrating.

Hi Dave,

You can implement a security rule to allow traffic from the Internet Archive to bypass your rules.

Security → Security Rules

Add a custom rule.
Field: Verified Bot Category
Operator: Equals
Value: Archiver
AND
Field: User Agent
Operator: Equals
Value: archive.org_bot

The final rule should look like:

(cf.verified_bot_category eq "Archiver" and http.user_agent eq "archive.org_bot")

You can further refine this rule to only include known Archive.org IPs,…

Choose action: Skip, then select which parts you want to skip.
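
If you would rather keep the rule in a script than click through the dashboard, here is a rough Python sketch that assembles the same expression and wraps it in a skip rule. Treat it as an illustration only: `build_expression` is a hypothetical helper, and the payload keys (`action`, `action_parameters`, `ruleset`) reflect my understanding of the Cloudflare Rulesets API, so check them against the current API docs before sending anything. Note also that `eq` only matches if the crawler's User-Agent header is exactly `archive.org_bot`; if the real header is longer, you would probably need `contains` instead.

```python
import json


def build_expression(conditions):
    """Join (field, operator, value) triples with 'and' into one rule expression.

    Hypothetical helper for illustration; it does no escaping of the values.
    """
    parts = [f'{field} {op} "{value}"' for field, op, value in conditions]
    return "(" + " and ".join(parts) + ")"


# The two conditions from the reply above.
expression = build_expression([
    ("cf.verified_bot_category", "eq", "Archiver"),
    ("http.user_agent", "eq", "archive.org_bot"),
])

# Assumed payload shape for a custom rule with the Skip action.
# "ruleset": "current" is meant to skip the remaining custom rules;
# verify the exact keys and values in the Rulesets API documentation.
rule = {
    "description": "Let the Internet Archive crawler skip custom rules",
    "expression": expression,
    "action": "skip",
    "action_parameters": {"ruleset": "current"},
}

print(json.dumps(rule, indent=2))
```

The script only prints the rule as JSON and makes no network calls, so it is safe to run while you compare the output with what the dashboard shows for the rule you created by hand.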
