I have been using Google Search Console and Yandex Webmaster Tools for my site for years without a hitch, including more than two months behind Cloudflare (CF). The site is verified with the HTML files supplied by Google and Yandex. Today I received a message from Yandex saying that my site verification had been reset and that I need to re-verify my ownership of the site. The Yandex file is in place and its content is correct: when I open it in Chrome, I see the correct content. But here’s what the Yandex bot sees when it opens the file, according to Yandex:
## Response "https://art.nouveau.world/yandex_b96d4f3bc492cfb4.html" → Yandex.Webmaster bot
|Field|Value|
|---|---|
|HTTP status code|200 OK|
|Server response time|78 ms|
|IP address|104.21.41.220|
|Encoding|UTF-8(unicode-1-1-utf-8, UTF8)|
|Page size|571 B|
* Date: Sun, 04 Apr 2021 06:30:09 GMT
* Content-Type: text/html
* Transfer-Encoding: chunked
* Connection: keep-alive
* Set-Cookie: __cfduid=d0531ac3745222ace74718543f16039051617517809; expires=Tue, 04-May-21 06:30:09 GMT; path=/; domain=.nouveau.world; HttpOnly; SameSite=Lax; Secure
* X-Content-Type-Options: nosniff
* Last-Modified: Wed, 27 Feb 2019 16:22:56 GMT
* Cache-Control: max-age=7776000
* Expires: Sat, 03 Jul 2021 06:30:09 GMT
* Vary: Accept-Encoding,User-Agent
* CF-Cache-Status: DYNAMIC
* cf-request-id: 093d2c2866000076a9e50be000000001
* Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
* Report-To: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report?s=1pTQ7JgrT%2Bs4P2jqovaNduCtZkIl%2BpLF1B8eYFi73UlDQu8J90f3Ul4nB%2BFLWecU6OTvL7%2Fq5quDBeHebqc9B34mmixLqj85Ru4SSWv%2BuHtzuw%3D%3D"}]}
* NEL: {"report_to":"cf-nel","max_age":604800}
* Server: cloudflare
* CF-RAY: 63a87c870bc076a9-DME
* Content-Encoding: gzip
* alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400
Page content
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<script async src='/cdn-cgi/bm/cv/6**********7/api.js'>
</script>
</head>
<body>
Verification: b**************4
<script type="text/javascript">
(function(){
window['__CF$cv$params']={
r:'63a87c870bc076a9',m:'0a858aba700bde3f1e8f150cf762727ae55aa1d9-1617517809-1800-ASdA/VoRLWQqgf2dmA1TKik5NM44g8GeLBcHjrpFmqGF3L+Xgi3EHfgAE4PCzsM9iHbJ8XxZdOx1y/tyDxlUOmAAW3X0yCs727eqiWaB/nSgRo2gqxXPVG9wX/S2ZrNctPnv75F9649cYptdPTrBR10=',s:[0x56823ba17b,0xb1001f8025],}
}
)();
</script>
</body>
</html>
The two scripts are clearly added by CF, even though Web Analytics and Browser Insights have always been off. Yandex won’t recognize the botched file as correct. There are no special page rules for the Yandex bot (it indexes everything with no problem). What’s going on?
PS Just in case, here’s the unbotched Yandex file:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>Verification: b*********************4</body>
</html>
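To make the difference between the two versions concrete, here is a minimal Python sketch that flags the extra `<script>` elements and checks whether the rest of the page still matches the original file. The function names are hypothetical, and treating `/cdn-cgi/bm/cv/` and `__CF$cv$params` as signatures of CF-injected scripts is an assumption based only on the response shown above.

```python
import re
from typing import List

# Substrings seen in the response Yandex reported. Using them as signatures
# of Cloudflare-injected scripts is an assumption, not a documented contract.
INJECTED_MARKERS = ("/cdn-cgi/bm/cv/", "__CF$cv$params")

# Matches a whole <script>...</script> element, including multi-line bodies.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def find_injected_scripts(html: str) -> List[str]:
    """Return every <script> element that contains one of the markers."""
    return [
        match.group(0)
        for match in SCRIPT_RE.finditer(html)
        if any(marker in match.group(0) for marker in INJECTED_MARKERS)
    ]


def strip_scripts(html: str) -> str:
    """Drop all <script> elements and all whitespace, for a rough comparison."""
    return re.sub(r"\s+", "", SCRIPT_RE.sub("", html))
```

Given the body the bot received and the file as stored on the origin, `find_injected_scripts(served)` lists the added elements, and `strip_scripts(served) == strip_scripts(origin)` tells whether anything besides the scripts changed.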
PPS What else might be important: I have Bot Fight Mode on (since forever) and I have a Firewall Rule that blocks all known bots except those I explicitly allow (the allowed ones include the Google and Yandex ASNs):
(cf.client.bot and not ip.geoip.asnum in {15169 13238 7941 43037 14618 32934 132892} and http.request.uri.path ne "/robots.txt")
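Read as plain boolean logic, the rule fires only when all three conditions hold at once. Here is a small Python sketch of that reading, for illustration only: the function and constant names are hypothetical, it assumes `not` applies to the whole `in` test, and it assumes 13238 is the Yandex ASN in the list.

```python
# ASNs copied verbatim from the rule above; the names here are hypothetical.
ALLOWED_ASNS = {15169, 13238, 7941, 43037, 14618, 32934, 132892}


def rule_matches(is_known_bot: bool, asn: int, path: str) -> bool:
    """Mirror the expression: known bot AND ASN not allowed AND not robots.txt."""
    return is_known_bot and asn not in ALLOWED_ASNS and path != "/robots.txt"


# A known bot from an allowed ASN is never matched (so never blocked).
print(rule_matches(True, 13238, "/yandex_b96d4f3bc492cfb4.html"))
# A known bot from any other ASN is matched, except on /robots.txt.
print(rule_matches(True, 64500, "/index.html"))
print(rule_matches(True, 64500, "/robots.txt"))
```

Under these assumptions, a request from the Yandex bot would not match the rule, which is consistent with the 200 OK it received; the rule by itself does not explain the added scripts.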