I noticed that when I check my traffic analytics I see x visitors over 7 days, but when I switch over to the new Web Traffic analytics the number of visitors is higher.

If I look through the paths visited, I can see entries like //news/wp-includes/wlwmanifest.xml, a page that does not exist on my site. I don’t use WordPress at all, but rather a minimal setup written from scratch.

This makes me think that some crawlers are being included in the results. I am wondering if the only solution is to exclude these obvious outliers every time I want to find out the number of human visitors.

I figure that some crawlers/bots are also showing up in the traffic logs but appear human to me because they only request the / path. I was hoping there is a good way to easily filter out all of these fake views.

Note: I tend to use the new analytics and filter for the / path and look at the total unique IPs. I would expect my site to see very few visitors at the moment, so any fake impressions would heavily skew my results.
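
For anyone doing this by hand, here is a rough sketch of the "exclude the obvious outliers" idea in Python. It assumes you already have the requests as (IP, path) pairs, for example from a downloaded export. The pattern list and the function names are made up for illustration, and the heuristic (drop every IP that ever requested a WordPress-style probe path, then count unique IPs on /) is only a guess at what separates scanners from people, not anything the analytics does itself.

```python
import re

# Hypothetical patterns: paths that only scanners would request on a
# site that does not run WordPress. Adjust to what shows up in your data.
PROBE_PATTERNS = [
    re.compile(r"wp-includes|wp-admin|wp-login|xmlrpc\.php", re.IGNORECASE),
    re.compile(r"wlwmanifest\.xml$", re.IGNORECASE),
]

def is_probe_path(path: str) -> bool:
    """Return True if the path matches any known scanner pattern."""
    return any(p.search(path) for p in PROBE_PATTERNS)

def likely_human_ips(requests):
    """Given (ip, path) pairs, return the IPs that requested "/" and
    never requested a probe path."""
    requests = list(requests)
    probing_ips = {ip for ip, path in requests if is_probe_path(path)}
    return {ip for ip, path in requests if path == "/" and ip not in probing_ips}

# Made-up sample data using documentation IP ranges.
sample = [
    ("203.0.113.5", "/"),
    ("203.0.113.5", "//news/wp-includes/wlwmanifest.xml"),
    ("198.51.100.7", "/"),
    ("192.0.2.9", "/about"),
]
print(len(likely_human_ips(sample)))
```

A bot that only ever requests / would still slip through this, so it complements the dashboard settings rather than replacing them.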

You can start by enabling Bot Fight Mode on your site. It’s under Security > Bots in the Dashboard.

If you have a paid plan, the Bots screen will have a link at the top right called “Configure Super Bot Fight Mode”, where you can set the two categories “Definitely Automated” and “Likely Automated” to use Managed Challenge.

That would be a good start to help with crawlers we don’t trust.
If you want to block even the Trusted crawlers, you can also create a WAF Rule and set Verified Bots to “Block” on that Bots page.

I see the option for “Definitely Automated” and it was not set to block them.
I don’t see the option for “Likely Automated”.

Should it be set to “block”, or just use the “managed challenge” option? It was set to allow by default.

I wonder what “Verified bots” includes, but I am inclined to allow those, as I fear blocking them might mess with SEO and such.

Also, I noticed I was reading the stats wrong. The total IPs figure is not actually the number of unique IPs.
I realized I could set the filters and download the data to see all the IPs, but many of them have 0 for the count.
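
In case it helps someone else, this is roughly how the unique IPs could be counted from the downloaded data while skipping the rows with a count of 0. It is only a sketch: the column names `clientIP` and `count` are assumptions on my part, so check the header row of your own export and change them to match.

```python
import csv
import io

def unique_active_ips(csv_text, ip_col="clientIP", count_col="count"):
    """Return the set of IPs that have at least one row with count > 0.

    ip_col and count_col are assumed column names, not a documented format.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # An empty count cell is treated as 0.
    return {row[ip_col] for row in reader if int(row[count_col] or 0) > 0}

# Made-up export with a repeated IP and a zero-count row.
export = """clientIP,count
203.0.113.5,3
198.51.100.7,0
203.0.113.5,1
192.0.2.9,2
"""
print(len(unique_active_ips(export)))
```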

I am guessing there is no way to filter out the known and verified good bots from my data?

I’d love to know more.

