Google Search Console is reporting “Blocked due to other 4xx issue” when attempting to access https://guardiandigital.com/%7B%=file.url%%7D
I’ve created an Apache RewriteRule that redirects this URL to the site root, and it works from the command line, but GSC keeps failing on this URL. My guess is that the double percent signs are being treated as some type of security violation. How do I bypass this?
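To make the percent-sign guess concrete, here is a minimal Python sketch that lists every `%` in the URL path that is not followed by two hex digits, which is what a valid percent-escape requires. The function name `find_invalid_percent_escapes` is just something made up for illustration, and this only shows that the URL is malformed; it does not prove that this is why GSC reports a 4xx.

```python
import re
from urllib.parse import urlsplit

# A valid percent-escape is "%" followed by exactly two hex digits.
# This pattern matches any "%" that is NOT followed by two hex digits.
INVALID_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")

def find_invalid_percent_escapes(url):
    """Return the offsets (within the URL path) of malformed '%' characters.

    Hypothetical helper for illustration only.
    """
    path = urlsplit(url).path  # urlsplit keeps the path raw, without decoding it
    return [match.start() for match in INVALID_ESCAPE.finditer(path)]

if __name__ == "__main__":
    url = "https://guardiandigital.com/%7B%=file.url%%7D"
    print(find_invalid_percent_escapes(url))
```

For the URL in question this flags two characters: the `%` in front of `=` and the first of the two doubled `%` signs. The `%7B` and `%7D` escapes themselves are fine.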
I think the problem is that it’s only possible to review a 24-hour period at a time, so it’s very difficult to identify any issues.
Google says the last crawl date was the 13th, so I checked the 13th and 14th and only found two “bot challenges” on those dates that appear to be related to Google, but they aren’t related to this specific URL. I’m not even sure these are legitimate Google requests, since that is based only on the user agent and both IPs belong to AWS, not Google’s cloud.
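Since the user agent alone can be spoofed, one way to check those two IPs would be the reverse-then-forward DNS check that Google describes for verifying its crawlers. Below is a rough Python sketch of that idea. The function name `is_google_crawler_ip` and the injectable lookup parameters are my own assumptions for illustration, and the suffix list is what I understand Google to document, so it is worth confirming against their current guidance.

```python
import socket

# Hostname suffixes I understand Google to document for crawler reverse DNS.
# Treat this list as an assumption and check it against Google's docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")

def is_google_crawler_ip(ip,
                         reverse_lookup=socket.gethostbyaddr,
                         forward_lookup=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google hostname that
    forward-resolves back to the same ip.

    Hypothetical helper; the lookup functions are parameters so the
    logic can be exercised without real DNS queries.
    """
    try:
        hostname = reverse_lookup(ip)[0]          # PTR lookup
    except OSError:
        return False                              # no reverse DNS record
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False                              # e.g. an amazonaws.com host
    try:
        addresses = forward_lookup(hostname)[2]   # A lookup (IPv4 only)
    except OSError:
        return False
    return ip in addresses                        # must round-trip to the same IP
```

Calling `is_google_crawler_ip("<ip from the log>")` with the default lookups performs real DNS queries. An AWS address would normally reverse-resolve to an amazonaws.com hostname and so return False.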
I’ve now experimented with a number of WAF rules, including one that bypasses any user agent matching a “google” wildcard. After running the GSC live test for this specific URL, though, I don’t see any entries in the Cloudflare logs or in my Apache logs, yet the test still fails.
I’ve also tested other failing URLs in GSC, and for those I see activity in my Apache logs immediately.
Throughout these tests, I cleared the Cloudflare cache and enabled development mode.