Problems with web crawlers not respecting robots.txt file

I am sorry to hear that.
I also receive thousands of requests daily (from user agents like bing, yandex, iar-something, admantex, dotbot, semrush, ahrefs, MJ12bot, seekport, python-something, petalbot, huawei something, blexbot, etc.), but I block most of them either by their user agent or by their AS number.
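
Just as an illustration, a minimal sketch of such a blocking rule expression could look like the one below (the user-agent substrings are examples and the ASN 64496 is a reserved documentation number used purely as a placeholder; adjust both to whatever you actually see in your logs), with the action set to Block:

```
(http.user_agent contains "MJ12bot") or
(http.user_agent contains "AhrefsBot") or
(http.user_agent contains "SemrushBot") or
(ip.geoip.asnum eq 64496)
```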

Nevertheless, many crawlers do not respect what is written in the robots.txt file.
Furthermore, many of them go straight for the sitemap.xml or sitemap_index.xml file and, if they can access it, simply crawl the URLs listed there.
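
If that is what you are seeing, one option (a sketch only, assuming your sitemaps live at the default paths) is a rule that challenges any request for the sitemap files that does not come from a verified bot, paired with the Managed Challenge action:

```
(http.request.uri.path in {"/sitemap.xml" "/sitemap_index.xml"}) and not cf.client.bot
```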

I believe yes.
Kindly apply Firewall Rules as needed using the articles below, which cover most of the known “bad bots”, including the crawlers mentioned above:

A list of ASNs to block (not all of them, but at least some) is here:
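
Once you have picked the ASNs you want to block from such a list, the rule expression is just a set match; the numbers below are reserved documentation ASNs, shown only as placeholders:

```
(ip.geoip.asnum in {64496 64497 64498})
```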

You could also create a Firewall Rule to allow the good bots using the cf.client.bot field/option.
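
For example, a rule with the expression below and the Allow action, placed above the blocking rules, exempts the verified good bots from them (rule order matters here, since an Allow match stops the remaining Firewall Rules from being evaluated for that request):

```
(cf.client.bot)
```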