Is my site currently blocking crawler bots like Google? (newbie)

Hi, I’m new to Cloudflare. I just wanted to confirm: is my site https://www.make-invoice.com/ currently blocking crawler bots, especially the main search engines like Google?

The reason I ask: the Site Audit in Semrush says their bot is currently blocked, so I’m confused, because my robots.txt allows everything. https://www.make-invoice.com/robots.txt
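One way to double-check the robots.txt side of this is to run the rules through Python's standard `urllib.robotparser`. This is only a minimal sketch: the `ROBOTS_TXT` content below is an assumed "allow everything" file, not a copy of the real one at the URL above, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content that allows every crawler; the real file on
# the site may differ, so paste its actual contents here to test it.
ROBOTS_TXT = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("Googlebot", "SemrushBot-SA"):
    print(agent, is_allowed(ROBOTS_TXT, agent, "https://www.make-invoice.com/"))
```

If this reports that the bot is allowed, robots.txt is not what is blocking it, and the block has to come from somewhere else (a firewall or the web server itself).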

By default, Cloudflare doesn’t block known good bots such as Semrush. Have you changed any firewall settings here?

These are my current firewall settings.

As for Firewall Rules, I haven’t used them yet, and there’s nothing under Managed Rules either. I’m on the Free plan, by the way.

@sdayman

Also, I’m not sure what to do about this?

If the Cloudflare firewall is blocking Semrush access, it would show up in the Firewall Events Log in the Firewall -> Overview tab. If you “Download Log” from Semrush, does it give you more detailed information? Hopefully an HTTP response code, like 403.
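To see the response code yourself, you can send a request with the crawler's User-Agent string and look at the status that comes back. The following is a rough sketch using only the Python standard library; `build_request` and `fetch_status` are hypothetical helper names, and the User-Agent value is the Semrush Site Audit string reported by the tool. It assumes the block depends on the User-Agent alone. If the block is based on the crawler's IP address, a request from your own machine will not reproduce it.

```python
import urllib.error
import urllib.request

# User-Agent string of the Semrush Site Audit bot.
SEMRUSH_UA = (
    "Mozilla/5.0 (compatible; SemrushBot-SA/0.97; "
    "+http://www.semrush.com/bot.html)"
)


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with user_agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url: str, user_agent: str, timeout: float = 5.0) -> int:
    """Return the final HTTP status code, following redirects."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (for example 403 or 406) are raised as
        # HTTPError; the status code is still available on the exception.
        return err.code
```

Calling `fetch_status("https://www.make-invoice.com/", SEMRUSH_UA)` and then repeating it with a normal browser User-Agent shows whether the two are treated differently.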

Here is the downloaded log.

IP addresses of Semrush Bot ( Site Audit ): 46.229.173.66, 46.229.173.67, 46.229.173.68

REQUEST:
curl -i -sS -L --proto-redir -all,http,https --max-time 5 -A 'Mozilla/5.0 (compatible; SemrushBot-SA/0.97; +http://www.semrush.com/bot.html)' http://www.make-invoice.com/

RESPONSE:
HTTP/1.1 301 Moved Permanently
Date: Mon, 07 Sep 2020 02:02:03 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Cache-Control: max-age=3600
Expires: Mon, 07 Sep 2020 03:02:03 GMT
Location: https://www.make-invoice.com/
cf-request-id: 0507e59814000028bfa78fb200000001
X-Content-Type-Options: nosniff
Server: cloudflare
CF-RAY: 5cecd86cecae28bf-IAD
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400

HTTP/2 406
date: Mon, 07 Sep 2020 02:02:04 GMT
content-type: text/html; charset=iso-8859-1
set-cookie: __cfduid=d70a428c8c38bf11c77dbee5c9efe37811599444123; expires=Wed, 07-Oct-20 02:02:03 GMT; path=/; domain=.make-invoice.com; HttpOnly; SameSite=Lax; Secure
cf-cache-status: DYNAMIC
cf-request-id: 0507e5984c0000c1755f8f0200000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
strict-transport-security: max-age=15552000; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 5cecd86d497ec175-IAD
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400

Not Acceptable!

An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.
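A log like this can be read mechanically: each `HTTP/...` status line is one hop in the redirect chain, and the error page text hints at what produced the final response. Below is a small sketch of that idea. `status_chain` and `likely_source` are hypothetical helpers, `SAMPLE` is a shortened copy of the log above, and the "Mod_Security" text match is a heuristic, not an official detection method.

```python
import re
from typing import List

# Matches status lines such as "HTTP/1.1 301 Moved Permanently" or "HTTP/2 406".
STATUS_LINE = re.compile(r"^HTTP/\d(?:\.\d)?\s+(\d{3})", re.MULTILINE)


def status_chain(transcript: str) -> List[int]:
    """Return every HTTP status code in a `curl -i -L` transcript, in order."""
    return [int(code) for code in STATUS_LINE.findall(transcript)]


def likely_source(transcript: str) -> str:
    """Guess what produced the final response, based on the page text."""
    chain = status_chain(transcript)
    if not chain or chain[-1] < 400:
        return "not blocked"
    if "mod_security" in transcript.lower():
        # The error page itself names Mod_Security, a web server firewall
        # module, so the block comes from the server behind Cloudflare.
        return "origin server (Mod_Security)"
    return "unknown; check the Cloudflare Firewall Events log"


# Shortened copy of the log above.
SAMPLE = """\
HTTP/1.1 301 Moved Permanently
Location: https://www.make-invoice.com/
Server: cloudflare
HTTP/2 406
server: cloudflare

Not Acceptable!
This error was generated by Mod_Security.
"""

print(status_chain(SAMPLE))
print(likely_source(SAMPLE))
```

Note that `server: cloudflare` appears on both responses simply because the site is proxied through Cloudflare; on its own it does not mean Cloudflare generated the error.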

@sdayman

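  1. For whatever reason, it’s trying to crawl the HTTP version of your site, so the output first shows a 301 (permanent) redirect to HTTPS. Hopefully Semrush will figure that out.
  2. The 406 error message says it was generated by Mod_Security. That is a firewall module running on your origin web server, not on Cloudflare, so the block is coming from your host, like in this post: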

Thank you for this! I will get in touch with Semrush.

@roshan.kumar1 Let me know once you’ve resolved your issue. I think we have the same problem. Thank you.