Google Search Console will report some pages/links Blocked due to access forbidden (403)
What steps have you taken to resolve the issue?
Occasionally, Google Search Console will report some pages/links Blocked due to access forbidden (403).
In the past, they were few and far between so I tolerated them, but recently I have been getting these reports with sometimes up to 59 pages “blocked” in a single report.
So, I finally decided to investigate it thoroughly today before posting my issue here.
On Cloudflare, for my domain, I visited this page:
Security > Analytics
Then picked the following filters:
Edge status code equals 403 Forbidden
Source ASN equals 15169 (Google)
Country equals United States
Mitigated equals No
I found 13 records.
Please see the attached partial screenshot of one record. (The source IP address is definitely Googlebot.)
Please verify if your Super Bot Fight mode is set up to block Verified bots or if Block AI Bots is enabled. This would block known good bots like Googlebot.
12 hours later, there appears to be no change. The report is now showing 73 records of Googlebot being blocked, with nearly 40 within the last 12 hours, that is, since I made the changes.
Because all Cloudflare features that issue a 403 should show up in the events log. I don’t think the same is true for the Mitigation field in the Analytics tab.
Are you maybe using any Workers? Or any Cloudflare optimization settings like Automatic Signed Exchanges, AMP Real URL, Early Hints, Smart Hints etc?
Because all Cloudflare features that issue a 403 should show up in the events log.
These are Googlebot IPs, so they are verified bots. In my Custom Rules, rule no. 1 matches verified bots and its action is set to Skip (which skips WAF features or disables specific Cloudflare security products for matching requests).
So in Security > Events page, these IPs just appear under the Skip action Events summary category and NOT anywhere else.
If you’ve got the “Block AI bots” feature enabled at Cloudflare, then a request whose User-agent contains GoogleOther would be detected and blocked or challenged for that reason, even though it is on the Verified Bot List.
They could also be fake Googlebots coming from Google ASNs, such as Google Cloud Platform.
Check Security tab → Events to see which paths they are trying to visit, and from there determine which Cloudflare service was triggered and blocked the requests coming from them, even though you’ve configured your Custom Rule to SKIP.
If they appear as SKIP, great, meaning your DNS records are proxied and your Custom Rule is working as expected.
Even though you SKIP Verified Bots, I am afraid Analytics still shows 403 for fake Googlebots, or for requests coming from Google ASNs and Google IPs.
Would you mind sharing a screenshot, or the fields you’ve used for your filter?
These are some bots, some verified and others not, coming from Google (their ASN and their IP addresses):
So from the above, I have quite a lot of Googlebots which I really don’t like, and it is better that they’re blocked or challenged from scraping and crawling my Website and generating unwanted traffic.
Regarding Google Search Console, I’d re-check my robots.txt file, a sitemap.xml file that I might be missing, and any robots meta tag on my Website that could be blocking something, in addition to the Cloudflare Security options available to me, which I would tune to allow only the real Googlebot to crawl my Website (no AI Googlebot).
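For the robots.txt part, a quick sanity check can be done with Python's built-in `urllib.robotparser`. This is only a rough sketch: the `SAMPLE_ROBOTS_TXT` rules, the `example.com` URLs and the `googlebot_allowed` helper are made-up placeholders, not anything taken from the site in question, and the standard-library parser may not match Google's own parser in every detail (wildcard rules, for example).

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules for illustration only; paste your real robots.txt here.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def googlebot_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = googlebot_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/private/page")
allowed = googlebot_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/blog/")
```

If a URL that Search Console reports as blocked comes back as allowed here, robots.txt is probably not the cause and the 403 is being issued somewhere else.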
Under ‘IP Access Rules’ for my domain I have ASN AS396982 Google LLC - which I believe is GCP - set to action: Managed Challenge.
Anyway, all the IP addresses appearing here in my issue/report are all verified Google bot IPs. I have even verified them manually, just to be sure, by running reverse and then forward DNS lookups on them.
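For anyone who wants to repeat that check, this is roughly what the reverse-then-forward lookup looks like in Python. It is a minimal sketch, not an official tool: the function name `is_verified_googlebot` and the injectable `reverse_lookup` / `forward_lookup` parameters are my own, the accepted suffixes are limited to `googlebot.com` and `google.com`, and the default forward lookup only handles IPv4.

```python
import socket

# Hostname suffixes accepted in this sketch for Google's crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip):
    # PTR lookup: IP address -> primary hostname.
    return socket.gethostbyaddr(ip)[0]


def _forward(host):
    # A lookup: hostname -> list of IPv4 addresses.
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(ip, reverse_lookup=_reverse, forward_lookup=_forward):
    """Reverse-resolve ip, check the domain, then forward-resolve back to ip."""
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record or lookup failure
    if not host.endswith(GOOGLE_SUFFIXES):
        return False  # hostname is not under a Google crawler domain
    try:
        addresses = forward_lookup(host)
    except OSError:
        return False  # hostname does not resolve
    # The forward lookup must return the original IP.
    return ip in addresses

# Usage (performs real DNS queries):
#   is_verified_googlebot("<source IP from the Cloudflare record>")
```

The lookups are passed in as parameters only so the logic can be exercised without network access; in normal use the defaults are fine.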
Yes, they all appear as “SKIP”. Every day Googlebot visits my site successfully to the tune of tens of thousands of requests, so this typically affects only a fraction of them, but it is certainly a “slowly growing issue” affecting my domain that I wanted to bring up. See the Google Search Console report for one site, for example:
Security/Analytics does not equal Security/Events, which is the problem.
Note: “IP” has been renamed to “Source IP” since the screenshots were taken.
That way is what both @GeorgeAppiah mentioned, and what @Laudian contributed further to above, with the Magic Link that takes you directly to the correct location.
The above (and following) Magic Link will take you directly to the correct place, where you can run such a search: