I used the live crawl tool in Google for https://www.shipmethis.com/robots.txt. It's showing a 403 error. I contacted my hosting support, and they said Cloudflare is blocking it. How do I fix this?
If you're seeing a 403 error without Cloudflare branding, it is always returned directly from the origin web server, not Cloudflare, and is generally related to permission rules on your server. The most common causes are:

1. Permission rules you have set, or an error in your .htaccess rules
2. mod_security rules
3. IP deny rules

Since Cloudflare cannot access your server directly, please contact your hosting provider for help resolving 403 errors and fixing these rules. You should also make sure that Cloudflare's IP ranges aren't being blocked on the origin.
Cloudflare will serve a 403 response if the request violated either a default WAF rule enabled for all orange-clouded Cloudflare domains or a WAF rule enabled for that particular zone. Read more at What does the Web Application Firewall do? Cloudflare will also serve a 403 Forbidden response for SSL connections to domains or subdomains that aren't covered by any Cloudflare or uploaded SSL certificate.
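A quick way to tell whether Cloudflare was even in the request path is to look at the response headers: responses proxied through Cloudflare carry a `CF-RAY` header and usually `Server: cloudflare`, while a 403 served directly by an unproxied origin has neither. A minimal sketch (the helper name and sample header values are my own, for illustration):

```python
def served_via_cloudflare(headers: dict) -> bool:
    """Return True if the response appears to have passed through
    Cloudflare's proxy, based on its headers. Note: an origin-generated
    403 that Cloudflare merely forwarded still carries CF-RAY, so the
    branding in the response *body* is the decisive signal for who
    generated the error -- headers only tell you Cloudflare was in the path.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    return "cf-ray" in lowered or lowered.get("server", "").lower() == "cloudflare"

# Hypothetical header sets:
proxied = {"Server": "cloudflare", "CF-RAY": "7d2f0a1b2c3d4e5f-SIN"}
direct = {"Server": "Apache/2.4.57"}
```

Against the live site, `curl -sI https://www.shipmethis.com/robots.txt` would show the same headers for manual inspection.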
If you’re seeing a 403 response that contains Cloudflare branding in the response body, this is the HTTP response code returned along with many of our security features:
- Web Application Firewall challenge and block pages
- Basic Protection level challenges
- Most 1xxx Cloudflare error codes
- The Browser Integrity Check
- If you’re attempting to access a second level of subdomains (e.g. `*.*.example.com`) through Cloudflare using the Cloudflare-issued certificate, an HTTP 403 error will be seen in the browser, as these hostnames are not present on the certificate.
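That last point comes down to how wildcard certificates match hostnames: a `*` in a certificate name covers exactly one DNS label, so `*.example.com` matches `www.example.com` but not `a.b.example.com`. A minimal sketch of that matching rule (the helper name is my own, not a Cloudflare API):

```python
def hostname_matches_san(hostname: str, san: str) -> bool:
    """Check a hostname against a certificate subject-alternative-name
    entry. Per RFC 6125, a '*' label matches exactly one DNS label, so
    '*.example.com' covers 'www.example.com' but not 'a.b.example.com'
    or the bare 'example.com'.
    """
    host_labels = hostname.lower().split(".")
    san_labels = san.lower().split(".")
    if len(host_labels) != len(san_labels):
        return False  # wildcard never spans extra labels
    return all(s == "*" or s == h for h, s in zip(host_labels, san_labels))
```

This is why a second-level subdomain needs its own certificate (e.g. via Advanced Certificate Manager) rather than the default Universal SSL certificate.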
If you have questions, contact Cloudflare Support and include a screenshot of the message you see, or copy all the text on the page into a support ticket.
I'm not familiar with the tool, but if it claims to be Google's crawler while not actually coming from IPs assigned to Google, then yes, it would be blocked by your WAF if you have it enabled (you'd see a corresponding event in the Firewall Events log). There's really nothing to fix if this is some kind of preview tool; a normal lookup of the robots.txt works just fine.
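For reference, Google's documented way to verify a real Googlebot is a reverse DNS lookup on the requesting IP (the name should end in `googlebot.com` or `google.com`), followed by a forward lookup to confirm that name resolves back to the same IP. A sketch of that check, with the resolver functions injectable so it can be tried without network access (the defaults use the standard library's real DNS calls):

```python
import socket


def verify_googlebot(ip: str,
                     reverse=lambda ip: socket.gethostbyaddr(ip)[0],
                     forward=socket.gethostbyname) -> bool:
    """Reverse-then-forward DNS verification, per Google's guidance."""
    try:
        name = reverse(ip)  # e.g. "crawl-66-249-66-1.googlebot.com"
    except OSError:
        return False
    if not name.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # Forward-confirm: the claimed hostname must map back to the IP.
        return forward(name) == ip
    except OSError:
        return False
```

With stub resolvers (hypothetical values), `verify_googlebot("66.249.66.1", reverse=lambda ip: "crawl-66-249-66-1.googlebot.com", forward=lambda name: "66.249.66.1")` passes, while an IP whose reverse name is outside Google's domains fails.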