Crawler Hints sends bad URLs for Workers Sites

This has been reported before; see, e.g., the thread "Crawler hints submits wrong URLs".

We use Cloudflare Workers Sites, and today I turned on Crawler Hints.
Shortly after I published updates for our statically generated site to Cloudflare, Bing Webmaster Tools showed the URLs that Cloudflare had pinged Bing about (this is Crawler Hints in action).

Some of the URLs are not what I expected, and I do not want Bing to be notified about them.
For example: https://www.cdnplanet.com/static/img/4x3/li.7d2c31ff85.svg . What is that 7d2c31ff85 doing in the URL? The file name of this image as we use it on the site is just li.svg.

This happens for page URLs too, e.g. /index.8e3d39gt56.html.

This is bad because Bing is given a hint to crawl those URLs and this can lead to duplicate content issues.

Another Cloudflare user mentioned the same issue on Discord.

I have disabled Crawler Hints for now. I’m not risking duplicate content issues.

IMHO, this is a bug in Crawler Hints: /li.7d2c31ff85.svg is internal to Cloudflare Workers Sites.
Surely Cloudflare has not gotten any incoming requests for that URL, so why ping Bing about it?
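
For anyone who wants to check a list of pinged URLs for this problem, here is a rough sketch of how the hashed names could be mapped back to the public ones. It assumes, based only on the two examples above, that the hash is a 10-character lowercase alphanumeric segment placed right before the file extension. The name strip_asset_hash is made up for this post, and a legitimate file name with a 10-character middle segment would be rewritten by mistake.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: the internal name looks like <stem>.<10-char hash>.<ext>,
# as in li.7d2c31ff85.svg and index.8e3d39gt56.html from this thread.
HASHED_NAME = re.compile(r"^(?P<stem>.+)\.(?P<hash>[0-9a-z]{10})\.(?P<ext>[0-9A-Za-z]+)$")


def strip_asset_hash(url: str) -> str:
    """Return the URL with the hash segment removed from the file name.

    URLs whose file name does not match the assumed pattern are returned unchanged.
    """
    parts = urlsplit(url)
    directory, sep, filename = parts.path.rpartition("/")
    match = HASHED_NAME.match(filename)
    if match is None:
        return url
    clean_name = f"{match['stem']}.{match['ext']}"
    return urlunsplit(parts._replace(path=f"{directory}{sep}{clean_name}"))


def is_hashed(url: str) -> bool:
    """True if the URL's file name carries what looks like an internal hash."""
    return strip_asset_hash(url) != url
```

With the examples from this post, strip_asset_hash("https://www.cdnplanet.com/static/img/4x3/li.7d2c31ff85.svg") gives the li.svg URL, and "/index.8e3d39gt56.html" becomes "/index.html".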

Eventually I turned off Crawler Hints as well. There are no settings and no way to control what is sent to search engines, and I could not work out the logic behind the URL selection.

Luckily, IndexNow has a very simple API, and there are dozens of modules and plugins for any CMS that implement it. There is no need for Crawler Hints, especially while it’s broken.
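
To give an idea of how simple it is, here is a minimal sketch of a bulk submission using only the Python standard library. The host, key, and URL values are placeholders, and the function names are my own. It assumes the key file is already hosted at the site root as <key>.txt, which IndexNow uses to verify ownership.

```python
import json
import urllib.request

# Shared IndexNow endpoint; participating search engines exchange submissions.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a bulk IndexNow submission for the given URLs."""
    payload = {"host": host, "key": key, "urlList": list(urls)}
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def submit(host: str, key: str, urls: list[str]) -> int:
    """Send the submission and return the HTTP status code."""
    request = build_indexnow_request(host, key, urls)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder values: replace with your own host, key, and changed URLs.
    print(submit("www.example.com", "your-indexnow-key", ["https://www.example.com/"]))
```

The point is that you decide exactly which URLs go into urlList, so only the public URLs get submitted.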