Googlebot 403 errors

What is the name of the domain?

What is the error message?

From Google Search Console: Page is not indexed: Blocked due to access forbidden (403)

What is the issue you’re encountering?

Googlebot is blocked with a 403 error no matter which features are disabled.

What steps have you taken to resolve the issue?

For troubleshooting, all features which can be turned off in a free Cloudflare account are turned off, aside from removing DNS redirection.

I have also tried adding an explicit WAF skip rule to allow known bots (cf.client.bot).

I cannot find anything in the logs (such as Security Events) to indicate why the block is occurring.

In Google Search Console, “Test Live URL” succeeds. However, after scheduling a re-index and checking back later, it reports that “Googlebot smartphone” is blocked with 403 errors.

It would seem that “Googlebot Smartphone” is not recognised as a “known” search index crawler and is thus being blocked.

Hopefully someone has some suggestions as I am at the point of turning the whole thing off.

If the block is due to Cloudflare, you can find the reason in your security event log or analytics…
https://dash.cloudflare.com/?to=/:account/:zone/security/events
https://dash.cloudflare.com/?to=/:account/:zone/security/analytics

The block is definitely due to Cloudflare. Unfortunately, as I said, nothing relevant shows up in the logs that you’ve linked to.

I don’t currently have any rules enabled. Everything that can be turned off is turned off, but Googlebot is still blocked. If I run a test with curl using the Googlebot user agent, it is blocked:

curl -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" -I https://www.staloysius.org
HTTP/2 403
date: Sun, 07 Jun 2026 15:56:42 GMT
content-type: text/html; charset=iso-8859-1
server: cloudflare
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=63072000; preload
nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
cf-cache-status: DYNAMIC
report-to: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=eu%2FB8W8VhKV5oLqD7J9rrKSZEkvdzdZCPF5pDZSdqYMzptduqLtRChvzF%2BJdzLWZXwxI0xcDb81Ce63FtPAg%2FBM1Uh8HPEMVlkjlk55izJ%2FzVqOvGy7%2FsPwsVsHtKgdgvuQHw7c%3D"}]}
cf-ray: a080d92bfdc3c722-MAN
alt-svc: h3=":443"; ma=86400

However, despite being blocked, this request does not show up in the log pages you linked to. (I had already run this test and checked the logs before posting.)

I’ve tried many different user agents - it seems to blindly block any user agent with the word “bot” in it. For example, not blocked:

curl -A "Google" -I https://www.staloysius.org
HTTP/2 200
date: Sun, 07 Jun 2026 16:00:20 GMT
content-type: text/html; charset=UTF-8
server: cloudflare
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=63072000; preload
set-cookie: PHPSESSID=7949de09f417c4010c73b6d936ae9856; path=/
expires: Thu, 19 Nov 1981 08:52:00 GMT
cache-control: no-store, no-cache, must-revalidate
pragma: no-cache
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
referrer-policy: strict-origin-when-cross-origin
permissions-policy: geolocation=(), microphone=(), camera=(), usb=(), payment=()
x-permitted-cross-domain-policies: none
cross-origin-embedder-policy: none
cross-origin-opener-policy: none
cross-origin-resource-policy: none
content-security-policy: frame-ancestors 'self'; object-src 'none';
nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
cf-cache-status: DYNAMIC
report-to: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=s0m9jvDVb%2F2JLry6UO4SGLUgw24TFe%2F8pSAOJRNmLwTg7P%2BuqpSR4NBQYPRP3KT7%2Fr1S4U0CgZ%2FJ1W5cKH12ryIQHn4OtqhdF1tV1qqh%2FnuZGzwt9HwHmyL6g%2BPYs%2BVP%2B4YFsrI%3D"}]}
cf-ray: a080de7d9cee6ab0-MAN
alt-svc: h3=":443"; ma=86400

Blocked:

curl -A "bot" -I https://www.staloysius.org
HTTP/2 403
date: Sun, 07 Jun 2026 16:00:07 GMT
content-type: text/html; charset=iso-8859-1
server: cloudflare
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=63072000; preload
nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
cf-cache-status: DYNAMIC
report-to: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=RYJIl0jH9d4oal2l%2FY7xv2F7FI72ptpA1SSVcKwFWEg2NgxeinawKqf%2BCsdw3L%2BwQwaMEpOciE6de5HVNQv50Y%2FMlxxSVuiC8vB%2ByJ%2BMvJP40YWOAgqVG0NBpS99d920uXU%2FXGI%3D"}]}
cf-ray: a080de310e35b328-MAN
alt-svc: h3=":443"; ma=86400

curl -A "Googlebot" -I https://www.staloysius.org
HTTP/2 403
date: Sun, 07 Jun 2026 16:00:15 GMT
content-type: text/html; charset=iso-8859-1
server: cloudflare
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=63072000; preload
nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
cf-cache-status: DYNAMIC
report-to: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=BMwkVMzXIfcLqjPN%2F%2B99qRZvJN%2FBetMzA7pQGX%2FdL%2FY%2BM%2Flo8QIyURdY2wJmS7y24NbwJCbYucaDNtrZOwbHHvBRZayMjrP4yrEjrUYiA%2FWAHaGFeA0XtnmgiTiduuZmebRfu2w%3D"}]}
cf-ray: a080de5ed844a935-MAN
alt-svc: h3=":443"; ma=86400
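
The manual user-agent checks above can be batched into a single pass. The following is a minimal sketch, assuming bash and curl are available. The names `probe_ua`, `probe_all`, and `TARGET_URL` are hypothetical helpers introduced here for illustration, not part of curl or Cloudflare.

```shell
#!/usr/bin/env bash
# Sketch: send a HEAD request per User-Agent and print the HTTP status for each.
# TARGET_URL, probe_ua and probe_all are illustrative names, not part of curl.

TARGET_URL="${TARGET_URL:-https://www.staloysius.org}"

# Print "<status><TAB><user agent>" for one User-Agent string.
probe_ua() {
  local ua="$1" code
  # -s silences progress, -I sends HEAD, -o /dev/null discards the headers,
  # -w prints only the numeric status code (000 if the request failed).
  code=$(curl -s -o /dev/null -I -A "$ua" -w '%{http_code}' "$TARGET_URL")
  printf '%s\t%s\n' "$code" "$ua"
}

# Probe every User-Agent given as an argument, one output line each.
probe_all() {
  local ua
  for ua in "$@"; do
    probe_ua "$ua"
  done
}
```

For example, `probe_all "Google" "bot" "Googlebot"` prints one status line per agent, which makes a pattern such as every string containing “bot” being rejected easy to spot.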

All bot-related features are disabled.

If I run a trace test specifying the Googlebot user agent (or any user agent with the word “bot” in it), it matches no rules but is blocked with a 403 anyway.

This can’t be normal? It seems like a bug to me.

Every single page Google tries to index fails with a 403.

The live test always works and a reindex is queued, but the indexing then fails.

I assume that “Test Live URL” uses a different user agent or source IP that is not blocked, while the actual crawling bot is blocked.

Digging through the AI Crawl Control security settings, it appears that “Search Engine Crawler” is a subset of the “AI” crawling settings, and the vast majority of requests from Microsoft and Google search engine crawlers are being rejected.

It’s now looking like a malfunction of the AI crawler blocking system (which I have never enabled or used)?

Block AI training bots is set to “Do not block (allow crawlers)”.

Sometimes the best way to solve a problem is to explain it to someone else…

It looks like the webhost itself is filtering on user agent and generating 403 blocks. Doh!

This would explain why Cloudflare is not logging anything. I’ve notified them to have a look and hopefully that’s all it is.
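
One way to confirm that the web host, rather than Cloudflare, is producing the 403 is to send the same request straight to the origin server. The following is a minimal sketch, assuming you know the origin's IP address and that it accepts direct HTTPS connections. `origin_probe` is a hypothetical helper name, and the IP address shown in the usage note is a documentation placeholder, not the real origin.

```shell
#!/usr/bin/env bash
# Sketch: request the site from the origin server directly, bypassing Cloudflare.
# origin_probe is an illustrative helper; the origin IP must be supplied by you.

# Usage: origin_probe <hostname> <origin-ip> <user-agent>
origin_probe() {
  local host="$1" origin_ip="$2" ua="$3"
  # --resolve pins host:443 to the origin IP, so DNS (and Cloudflare's proxy)
  # is skipped while the Host header and TLS SNI still carry the real hostname.
  curl -s -o /dev/null -I -A "$ua" \
    --resolve "${host}:443:${origin_ip}" \
    -w '%{http_code}\n' "https://${host}/"
}
```

With a placeholder address, `origin_probe www.staloysius.org 203.0.113.10 "Googlebot"` would print the status the origin itself returns. A 403 here, with Cloudflare out of the path, would point at the web host's own user-agent filtering. If the origin only accepts connections from Cloudflare or presents a certificate curl does not trust, the request fails and curl reports 000 instead.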