What is the name of the domain?
What is the issue you’re encountering?
I can’t stop a scraper with the WAF.
What steps have you taken to resolve the issue?
Robots.txt: disallowed GPT and other bots
WAF: blocked a bunch of spiders, scrapers, and IPs
Blocked access to the sitemap
Tried rate limiting, but with a 10-second period it is useless when somebody scrapes a single post through a direct link.
When I test the scraper, I cannot see its User-Agent or request IP in my server log, so I can’t block it on that basis.
The WAF did not block requests by referer when I tested it.
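For reference, this is roughly how the robots.txt step can be checked offline. It is only a sketch: `build_robots_txt` and `is_allowed` are names made up for illustration, the URL is a placeholder, and the bot list is an assumption (GPTBot and ChatGPT-User are the tokens OpenAI documents). Robots.txt is advisory, so a scraper that ignores it is not stopped by this.

```python
from urllib.robotparser import RobotFileParser

# Assumed list of bot tokens to disallow; extend as needed.
AI_BOTS = ["GPTBot", "ChatGPT-User"]


def build_robots_txt(bots):
    """Return robots.txt text that disallows the whole site for each bot."""
    blocks = [f"User-agent: {bot}\nDisallow: /\n" for bot in bots]
    # An empty Disallow leaves every other client allowed.
    blocks.append("User-agent: *\nDisallow:\n")
    return "\n".join(blocks)


def is_allowed(robots_txt, user_agent, url):
    """Check whether a well-behaved client with this User-Agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots = build_robots_txt(AI_BOTS)
print(robots)
print("GPTBot allowed:", is_allowed(robots, "GPTBot", "https://example.com/some-post/"))
print("Browser allowed:", is_allowed(robots, "Mozilla/5.0", "https://example.com/some-post/"))
```

This only confirms that the rules are written correctly for bots that honor them; it says nothing about scrapers that send a browser-like User-Agent.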
Even The New York Times is not blocking it, but Angi(dot)com successfully blocks both the scraper and an OpenAI Python script. Against that site I get the error net::ERR_HTTP2_PROTOCOL_ERROR. I am looking for a way to implement the same behavior on WordPress + Cloudflare so that it blocks ChatGPT module scrapers like Webpilot and others. I think it has something to do with headers and origins. What are the Cloudflare options for this, please?
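Since I suspect headers, this is the kind of information I would want to capture at the origin to identify the scraper. It is a minimal sketch, not what my server does today: `describe_request` and the sample header values are hypothetical. The one fixed point is that behind Cloudflare the visitor's address arrives in the `CF-Connecting-IP` header, while the connecting IP seen by the origin is a Cloudflare address.

```python
def describe_request(headers):
    """Pick out the headers that help identify a client behind Cloudflare."""
    # Header names are case-insensitive, so normalize them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        # Cloudflare forwards the visitor's IP in CF-Connecting-IP.
        "client_ip": lowered.get("cf-connecting-ip", "unknown"),
        "user_agent": lowered.get("user-agent", "unknown"),
        "referer": lowered.get("referer", "unknown"),
    }


# Hypothetical request headers, for illustration only.
sample = {
    "CF-Connecting-IP": "203.0.113.7",
    "User-Agent": "python-requests/2.31.0",
}
info = describe_request(sample)
print(info)
```

If these fields come back as "unknown" for the scraper's requests, a WAF rule matching on User-Agent or referer has nothing to match against, which would be consistent with what I saw in testing.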
Was the site working with SSL prior to adding it to Cloudflare?
Yes
What is the current SSL/TLS setting?
Full