@seoworks, I fully understand that you are experiencing an issue that is beyond your control and that it might be annoying. What I do not appreciate, however, is your attitude and your apparent hostility.
As I said earlier, the community does not need to handle anything. We are volunteering our free time here to help people like you with problems and to come up with correct answers and ways forward to fix them, and nobody is in a position to tell us how bad of a job we are doing or - quoting you - accuse us of anything.
This was also the case here: you were told relatively early on what the situation was, why it was happening, whom to contact for a fix, and what to do in the meantime. Instead of listening to any of that or following the advice, you became argumentative. That is not a way to solve the issue.
I agree with this, but it’s also important to have this discussion here as a reference for everyone who’s running into the same problem. If I had discovered a topic like this one a couple of days ago, it could have saved me hours of debugging.
Currently, the top search result for “CloudFlare WAF #100202” brings up a closed topic with an answer from Cloudflare staff stating that “Those events and blocks issued by the WAF look correct, they are not Bingbot.” It’s not very helpful.
For the topic in question, however, that response might well have been valid. That’s also why I criticised the statement here about billions of users experiencing this problem, as this was seemingly taken entirely out of context.
The OP used previous threads where people complained about non-issues to back his case that this issue has been going on for a while, and that does not seem to be true. My assumption is that these two IP addresses are relatively new, and so this might be the issue. Microsoft might even automatically fix this in a few days.
I read it, and the explanation is not clear on why whitelisting the ASN is not a viable solution. That ASN is used only by Microsoft itself, not by Azure, which is where Bingbot can be faked. Surely nobody on IPs inside Microsoft offices and datacenters will be faking the user agent “for fun”.
Again, this is not an ASN used by the public through Microsoft services like Azure (as far as I could research).
This ASN is used only by Microsoft itself, not “public use”.
I did in fact. I have a Pro account.
And this is the answer from the support:
They didn’t even read my first ticket message, where I explained the issue, and the response provided is a “generic one” about an issue with 403 errors.
I understand that this is not Cloudflare’s fault and is indeed Microsoft’s fault. But as I said before, Cloudflare must adapt their rules, and there is no denying that “contact Microsoft” will not solve the issue, even more so when Cloudflare also bears responsibility here to some extent.
Because it would allow everyone else on the Microsoft network to pose as Microsoft crawler, unless …
… that statement is true. Where do you take this from? As far as I can tell Microsoft’s hosted services run off of 8075 as well.
You need to engage with support and clarify this.
First of all, why do you think Microsoft won’t fix this? Second, why does Cloudflare have to do something but not Microsoft? The logic seems a bit foreign, especially considering it is a Microsoft issue.
Microsoft controls the DNS here (as has been demonstrated in the thread). So ‘can’t do the DNS fix’ seems unlikely.
This doesn’t work for any other WAF or security company that follows Microsoft’s published specification on how to determine whether a request from their crawler is legitimate. So “everybody else” seems to mean anyone who isn’t blocking Bing bots that are not legitimate as defined by Microsoft’s specification.
I see no deficiency in @sandro’s answer from a technical perspective. The answer is in fact that Microsoft needs to update their documented mechanisms for determining valid bots or bring this bot into compliance with the published specification.
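For readers landing here later, this is roughly what the verification being discussed looks like. Bing's public guidance is a reverse DNS lookup on the requesting IP, a check that the returned name ends in `search.msn.com`, and then a forward lookup confirming that the name resolves back to the same IP. The sketch below is a minimal illustration of that idea and is not Cloudflare's actual rule; the function name `is_verified_bingbot` and the injectable `reverse`/`forward` resolvers are assumptions made for the example.

```python
import socket

# Suffix taken from Bing's public "verify Bingbot" guidance (assumed here).
BING_SUFFIX = ".search.msn.com"


def _reverse(ip):
    """Reverse DNS (PTR) lookup; returns the hostname, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def _forward(hostname):
    """Forward DNS lookup; returns a list of IPv4 addresses (possibly empty)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_bingbot(ip, reverse=_reverse, forward=_forward):
    """Return True only if reverse and forward DNS both confirm the IP."""
    hostname = reverse(ip)
    if not hostname:
        return False  # no PTR record: cannot be verified
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(BING_SUFFIX):
        return False  # PTR points somewhere other than Bing's crawler domain
    return ip in forward(hostname)  # forward lookup must round-trip to the same IP
```

Under a check like this, a genuine crawler running on an IP whose DNS records have not been set up yet fails in exactly the same way as a fake one, which appears to be the situation described in this thread.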
Not liking the answer you receive doesn’t make it wrong, nor does it call into question the intelligence of the poster. This community has a set of guidelines for behavior; please try to respect them.
I will pass along this information to our WAF team, perhaps they know someone at Microsoft and can put a bug in their ear, but it is perfectly reasonable for a vendor to follow the established mechanism from a company on how to validate their own tools and to expect a vendor to follow their own standards when deploying it.
It is unlikely we would take this approach as it represents a security risk on many levels to our customers.
This is absolutely an assumption you are welcome to make by whitelisting that ASN. I don’t think I have the same level of trust with regard to any network myself. Microsoft has the same risks with regard to viruses, malware and compromised machines as any other company, unfortunately.
The rule is built using the specification Microsoft provided to determine if a crawler is legitimate or not. I have not seen evidence in this thread that the rule has fired incorrectly based on that specification.
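To make the trade-off concrete, here is a toy comparison of the two policies. Everything in it is hypothetical: the prefix standing in for the ASN is a documentation range, not a real Microsoft network, and `verified_ips` is simply a set of addresses assumed to have passed Microsoft's documented verification.

```python
import ipaddress

# Hypothetical stand-in for "every prefix announced by the ASN".
ASN_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]


def in_asn(ip):
    """True if the address falls inside any prefix of the (hypothetical) ASN."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ASN_PREFIXES)


def allow_by_asn(ip, user_agent):
    """Policy A: trust the whole network whenever the user agent claims to be Bingbot."""
    return "bingbot" in user_agent.lower() and in_asn(ip)


def allow_by_verification(ip, user_agent, verified_ips):
    """Policy B: trust only addresses that individually passed verification."""
    return "bingbot" in user_agent.lower() and ip in verified_ips
```

With policy A, any machine inside the network (compromised or otherwise) that sends a Bingbot user agent is let through. Policy B rejects it, at the cost of also rejecting a genuine crawler whose address has not been made verifiable yet.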
Hopefully they will resolve the issue with their misconfigured crawler.
Eh I did nudge somebody who nudged somebody across the aisle. No idea if Bing had already escalated internally, but hopefully they had and ours was just an additional notice. Lots of reasons things get turned up on new IPs (dev/staging/pilots) and those types of deployments should be considered non-prod and not impact production… it’s just hard sometimes to coordinate moving those to production when they are green-lighted.
Imagine if a dev version of the tool had a bug loop and hammered a website repeatedly with 2x the number of additional requests for the same resource if it received a 200 response. How many seconds would it take to bring down someone’s production website with that dev/beta crawler? That’s why production versions are set to specific whitelists generally. We (the collective we) assume the prod versions won’t blow up websites based on the increased access they are granted.
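As a back-of-the-envelope illustration of that scenario, the sketch below assumes every 200 response triggers two further requests, so the number of in-flight requests doubles each round. The capacity and round-trip figures are made up for the example.

```python
def rounds_until_overload(capacity, fanout=2):
    """Count the rounds until in-flight requests exceed `capacity`.

    Starts from one request; every round each request spawns `fanout` new ones.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2 for the load to grow")
    in_flight = 1
    rounds = 0
    while in_flight <= capacity:
        in_flight *= fanout
        rounds += 1
    return rounds


if __name__ == "__main__":
    capacity = 1000      # hypothetical: concurrent requests the site can absorb
    round_trip_s = 0.1   # hypothetical: seconds per request/response round
    rounds = rounds_until_overload(capacity)
    print(f"overloaded after {rounds} rounds, about {rounds * round_trip_s:.1f}s")
```

Because the growth is exponential, even a site that can absorb a million concurrent requests is overwhelmed after only about twice as many rounds as one that can absorb a thousand.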
At 6 AM New York time, long before anybody suggested doing anything, I contacted a search engine authority who has excellent contacts with the search engine crawler teams at Google and Bing and sent him a link to this thread. Within one minute I got a reply saying he would forward the message and the link to this thread about blocked crawlers to the search engineers at Google and Bing.
It could very well be that this early morning email worked, I guess.
We will see.
Hey, I’m not saying there’s anything wrong with the rule itself - I trust the rule was written 100% according to the specification. I wouldn’t expect anything less from the engineering talent that Cloudflare has.
To be clear, I’m in no way blaming Cloudflare. The rule is unreliable not because of how Cloudflare implemented it, but because Microsoft is not following their own specification. My point is that the rule is still unreliable at this time, despite Cloudflare implementing it to the specification. It’s not Cloudflare’s fault that this rule is unreliable.
Because of wrong answers by sandro, I was misled for days into wrongly believing that all these Bingbot firewall blocks were fake bots and not legitimate.
Luckily I got a message from Microsoft Bing telling me to check my website; otherwise I would still believe these wrong answers marked as solutions, which clearly turned out not to be solutions.
The result of believing this wrong solution and not doing anything is mass deindexing of websites by Bing …
And now for some really interesting info: I blocked this IP in the Ubuntu UFW firewall on one of my websites, and after about a week the site was deindexed from Bing. It seems that Bing is much more strict, and if they cannot crawl your website they take drastic measures!