Microsoft Bing Crawler Blocked

Agreed.

@seoworks, I fully understand that you are experiencing an issue that is beyond your control and that it might be annoying. What I do not appreciate, however, is your attitude and your apparent hostility.

As I said earlier, the community does not need to handle anything. We are volunteering our free time here to help people like you with problems, come up with correct answers, and find ways forward to fix them. Nobody is in a position to tell us how bad a job we are doing or, quoting you, to accuse us of anything.

This was also the case here: you were told relatively early on what the situation was, why it was happening, whom to contact for a fix, and what to do in the meantime. Instead of listening to any of that or following the advice, you became argumentative. That is not a way to solve the issue.

I agree with this, but it’s also important to have this discussion here as a reference for everyone who’s running into the same problem. If I had discovered a topic like this one a couple of days ago, it could have saved me hours of debugging.

Currently, the top search result for “CloudFlare WAF #100202” brings up a closed topic with an answer from CloudFlare staff stating that “Those events and blocks issued by the WAF look correct, they are not Bingbot.”. It’s not very helpful.

For the topic in question that response might have been very valid however. That’s also why I criticised the statement here about billions of users experiencing this problem, as this was seemingly taken entirely out of context.

The OP used previous threads where people complained about non-issues to back his case that this issue has been going on for a while and that does not seem to be true. My assumption is these two IP addresses are relatively new, and so this might be the issue. Microsoft might even automatically fix this in a few days.

I read it, and the explanation does not make clear why this is not a viable solution. That ASN is used only by Microsoft itself, not by Azure, which is where Bingbot can be faked. Surely no one on IPs inside Microsoft offices and datacenters will be faking the user agent “for fun” :slight_smile:

Again, this is not an ASN used by the public through Microsoft services like Azure (as far as I could research).
This ASN is used only by Microsoft itself, not for “public use”.

I did in fact. I have a Pro account.
And this is the answer from the support:

They didn’t even read my first ticket message, where I explained the issue, and the response provided is a “generic one” about an issue with 403 errors.

I understand that this is not Cloudflare’s fault and is indeed Microsoft’s fault. But as I said before, Cloudflare must adapt their rules, and there is no denying that “contact Microsoft” will not solve the issue, even more so when Cloudflare also bears responsibility here to some extent. :slight_smile:

Because it would allow everyone else on the Microsoft network to pose as Microsoft crawler, unless …

… that statement is true. Where do you take this from? As far as I can tell Microsoft’s hosted services run off of 8075 as well.

You need to engage with support and clarify this.

First of all, why do you think Microsoft won’t fix this? Second, why does Cloudflare have to do something but not Microsoft? The logic seems a bit foreign, especially considering it is a Microsoft issue.

Guys, can we agree that BOTH Microsoft and CloudFlare have responsibility here?
Microsoft should identify Bingbot IP addresses;
CloudFlare should not enable an unreliable rule by default.

I did open a support ticket with Bing Webmasters.

Edit: After giving this some thought, I think it’s totally on Microsoft to fix this.

:+1:t2:

Microsoft controls the DNS here (as has been demonstrated in the thread). So ‘can’t do the DNS fix’ seems unlikely.

This doesn’t work for any other WAF or security company which follows Microsoft’s published specifications on how to determine if a request from their crawler is legitimate or not. So everybody else seems to be anyone who isn’t blocking Bing bots which are not legitimate as defined by Microsoft’s specification.

I see no deficiency in @sandro’s answer from a technical perspective. The answer is in fact that Microsoft needs to update their documented mechanisms for determining valid bots or bring this bot into compliance with the published specification.
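For anyone landing here later, the published check discussed above (reverse DNS lookup on the requesting IP, a hostname ending in `search.msn.com`, then a forward lookup that returns the same IP) can be sketched roughly as follows. This is only an illustration of that general mechanism, not Cloudflare’s actual WAF rule. The function name `is_verified_bingbot` and the injectable resolver parameters are made up for the example so it can be exercised without network access.

```python
import socket


def is_verified_bingbot(ip, reverse_lookup=None, forward_lookup=None):
    """Sketch of a reverse-then-forward DNS check for a claimed Bingbot IP.

    reverse_lookup and forward_lookup are hypothetical hooks added for this
    example; by default they fall back to the standard socket resolvers.
    """
    # Default resolvers: PTR lookup, then A lookup (gethostbyname_ex is IPv4 only).
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda name: socket.gethostbyname_ex(name)[2])

    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        # No PTR record (or resolver failure): the IP cannot be verified.
        return False

    # The reverse name must sit under search.msn.com.
    if not hostname.endswith(".search.msn.com"):
        return False

    try:
        # The forward lookup must point back at the original IP.
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

An IP that has no matching reverse DNS record fails this check no matter who actually operates it, which is the situation described in this thread.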

Not liking the answer you receive doesn’t make it wrong, nor does it call into question the intelligence of the poster. This community has a set of guidelines for behavior, please try to respect them.

I will pass along this information to our WAF team, perhaps they know someone at Microsoft and can put a bug in their ear, but it is perfectly reasonable for a vendor to follow the established mechanism from a company on how to validate their own tools and to expect a vendor to follow their own standards when deploying it.

It is unlikely we would take this approach as it represents a security risk on many levels to our customers.

This is absolutely an assumption you are welcome to make by whitelisting that ASN. I don’t think I have the same level of trust with regards to any network myself. Microsoft has the same risks with regards to viruses, malware and compromised machines as any other company, unfortunately.
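To illustrate the trust trade-off, here is a minimal sketch of a network-level allow. The prefixes are documentation address ranges standing in for one network’s announced ranges, not real Microsoft prefixes, and `allowed_by_network` is a hypothetical helper.

```python
import ipaddress

# Hypothetical prefixes standing in for a single network's announced ranges.
# These are reserved documentation ranges, NOT real Microsoft prefixes.
NETWORK_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def allowed_by_network(ip):
    """Return True if ip falls inside any of the allowed prefixes."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NETWORK_PREFIXES)


crawler_ip = "192.0.2.10"      # imagined legitimate crawler
other_host_ip = "192.0.2.200"  # imagined unrelated or compromised host, same network

# Both pass, because the check only looks at network membership.
print(allowed_by_network(crawler_ip), allowed_by_network(other_host_ip))
```

A membership check like this cannot tell the crawler apart from any other machine in the same ranges, so the decision comes down to how much you trust the whole network.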

The rule is built using the specification Microsoft provided to determine if a crawler is legitimate or not. I have not seen evidence in this thread that the rule has fired incorrectly based on that specification.

Hopefully they will resolve the issue with their misconfigured crawler.

Sweet, I even missed that part. A real cutie, we have here :wink:

To be fair, there are other indicators that may call into question the intelligence of Sandro. :smiley: However s/he still appears to be smarter than the average bear (on average anyway)

Microsoft has acknowledged this is a legit crawler and that this is an issue on their side. They are working to resolve it (by setting it to conform to their published expected behavior).

That’s not how you talk to a Lady.

I’d know certain people on certain forums who’d strongly disagree (@eva2000 probably knows).

And we could have settled that so peacefully ten hours ago.

Eh I did nudge somebody who nudged somebody across the aisle. No idea if Bing had already escalated internally, but hopefully they had and ours was just an additional notice. Lots of reasons things get turned up on new IPs (dev/staging/pilots) and those types of deployments should be considered non-prod and not impact production… it’s just hard sometimes to coordinate moving those to production when they are green-lighted.

Imagine if a dev version of the tool had a bug loop and hammered a website repeatedly with 2x the number of additional requests for the same resource if it received a 200 response. How many seconds would it take to bring down someone’s production website with that dev/beta crawler? That’s why production versions are set to specific whitelists generally. We (the collective we) assume the prod versions won’t blow up websites based on the increased access they are granted.
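As a rough back-of-the-envelope for that scenario, suppose every 200 response triggers two further requests, so the load doubles each round. The sketch below counts how many rounds it takes to exceed a site’s capacity. The capacity figure and the round-based model are assumptions made for illustration, and `rounds_until_overload` is a hypothetical helper.

```python
def rounds_until_overload(capacity, initial_requests=1, fanout=2):
    """Count request/response rounds until in-flight requests exceed capacity.

    Assumes each successful response spawns `fanout` new requests, i.e. the
    number of in-flight requests is multiplied by `fanout` every round.
    """
    rounds = 0
    in_flight = initial_requests
    while in_flight <= capacity:
        in_flight *= fanout
        rounds += 1
    return rounds


# With an assumed capacity of 10,000 concurrent requests, one initial request
# is enough to overload the site after 14 doubling rounds (2**14 = 16,384).
print(rounds_until_overload(10_000))
```

If a round trip takes a fraction of a second, that is a matter of seconds, which is the point being made about keeping non-production crawlers off production whitelists.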

As written before, but not explained in detail:

At 6 AM New York time, long before anybody suggested doing anything, I contacted a search engine authority who has excellent contacts with the search engine crawler teams at Google and Bing, and sent him a link to this thread. Within a minute I got a reply that he would forward the message and the link to this thread about blocked crawlers to the search engineers at Google and Bing.

It could very well be that this early-morning email worked, I guess.
We will see.

Hey, I’m not saying there’s anything wrong with the rule itself - I trust the rule was written 100% according to the specification. I wouldn’t expect anything less from engineering talent that CloudFlare has.

To be clear, I’m in no way blaming CloudFlare. The rule is unreliable not because of how CloudFlare implemented it, but because of Microsoft not following their own specification. My point is, the rule is still unreliable at this time, despite CloudFlare implementing it to the specification. It’s not CloudFlare’s fault that this rule is unreliable.

Thanks for the update!

Thank you!

Not saying you asked for it, however that request was made on several occasions in this thread.

This is what has been said from the beginning :slight_smile:

Because of wrong answers by sandro, I was misled for days into wrongly believing that all these Bingbot firewall blocks were fake bots and not legitimate.

Luckily, I got a message from Microsoft Bing telling me to check my website. Otherwise I would still believe these wrong answers marked as solutions, which clearly turned out not to be solutions.

The result of believing this wrong solution and not doing anything is mass deindexing of websites by Bing …

And now for some really interesting info: I blocked this IP in the Ubuntu UFW firewall on one of my websites, and after about a week the site was deindexed from Bing. It seems that Bing is much more strict, and if they cannot crawl your website they take drastic measures!

Excuse me? This thread is barely a day old.

Should you be referring to other threads, then I would suggest you re-read this thread once more very carefully as I addressed that as well.

Apart from that, I don’t have anything else to say beyond what I already said here.

Please be assured that you most certainly won’t get any answers from my side on this forum any more, so no worries about getting “wrong answers” :roll_eyes:

</thread>

This topic was automatically closed 24 hours after the last reply. New replies are no longer allowed.