Looks like a complex way of doing it.
For Googlebot we use a Block rule with this expression:

```
(ip.geoip.asnum in {15169 396982} and not cf.client.bot and not ip.src in $googlebot_ip_lists)
```
where we created the googlebot_ip_lists list from Google's published list of Googlebot IP ranges - this needs to be checked and updated regularly.
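Since the ranges change, the refresh can be automated. Below is a minimal sketch (Python) that pulls Google's documented Googlebot ranges JSON and replaces the contents of the list via the Cloudflare Lists API; the environment variable names are placeholders you'd supply yourself, and you'd need an API token with permission to edit account filter lists.

```python
# Sketch: refresh a Cloudflare IP list from Google's published Googlebot ranges.
# CF_ACCOUNT_ID, CF_LIST_ID and CF_API_TOKEN are placeholder names, not real config.
import os
import requests

GOOGLEBOT_RANGES = "https://developers.google.com/search/apis/ipranges/googlebot.json"
CF_API = "https://api.cloudflare.com/client/v4"
ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]
LIST_ID = os.environ["CF_LIST_ID"]    # ID of the googlebot_ip_lists list
TOKEN = os.environ["CF_API_TOKEN"]

def refresh_googlebot_list():
    # Google publishes {"prefixes": [{"ipv4Prefix": ...} or {"ipv6Prefix": ...}, ...]}
    prefixes = requests.get(GOOGLEBOT_RANGES, timeout=30).json()["prefixes"]
    # Cloudflare IP list items are objects of the form {"ip": "<addr or CIDR>"}
    items = [{"ip": p.get("ipv4Prefix") or p.get("ipv6Prefix")} for p in prefixes]
    # PUT on .../items replaces the entire contents of the list in one call
    resp = requests.put(
        f"{CF_API}/accounts/{ACCOUNT_ID}/rules/lists/{LIST_ID}/items",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=items,
        timeout=30,
    )
    resp.raise_for_status()

if __name__ == "__main__":
    refresh_googlebot_list()
```

Run on a cron schedule, this keeps the list current without manual checks.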
For Bing / DuckDuckGo we use a Block rule with this expression:

```
(ip.geoip.asnum eq 8075 and not cf.client.bot and not ip.src in {157.55.39.0/24 207.46.13.0/24 40.77.167.0/24 13.66.139.0/24 13.66.144.0/24 52.167.144.0/24 13.67.10.16/28 13.69.66.240/28 13.71.172.224/28 139.217.52.0/28 191.233.204.224/28 20.36.108.32/28 20.43.120.16/28 40.79.131.208/28 40.79.186.176/28 52.231.148.0/28 51.8.235.176/28 51.105.67.0/28 20.125.163.80/28} and not ip.src in {20.191.45.212 40.88.21.235 40.76.173.151 40.76.163.7 20.185.79.47 52.142.26.175 20.185.79.15 52.142.24.149 40.76.162.208 40.76.163.23 40.76.162.191 40.76.162.247})
```
The first IP list is Bing, the second is DuckDuckGo - but we found “not cf.client.bot” doesn't work reliably with the DuckDuckGo bot, which regularly gets blocked.
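When cf.client.bot is unreliable, one way to vet IPs before adding them to an allow list is the forward-confirmed reverse DNS check that Google and Bing both document for verifying their crawlers (DuckDuckGo instead publishes a static IP list). A minimal out-of-band sketch, stdlib only; the hostname suffixes are the documented ones for Googlebot and Bingbot and would need adjusting per bot:

```python
# Sketch: forward-confirmed reverse DNS check for a crawler IP.
# Googlebot should resolve under googlebot.com/google.com, Bingbot under
# search.msn.com.
import socket

CRAWLER_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")

def is_verified_crawler(ip: str) -> bool:
    try:
        host, _, _ = socket.gethostbyaddr(ip)           # reverse lookup
    except OSError:
        return False
    if not host.endswith(CRAWLER_SUFFIXES):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]  # forward-confirm
    except OSError:
        return False
    return ip in forward_ips

print(is_verified_crawler("157.55.39.1"))  # an IP from the Bingbot /24 above
```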
We are on the free plan so can only have one list; on paid plans it would be much neater to keep lists of verified Bing, DuckDuckGo, and Google IPs at account level so they can be re-used by each site in the account.
Even verified Bingbot IPs will try to trawl non-existent paths (/wp, /dev, /blog, etc.) for some reason - we block these in .htaccess by returning a 403 Forbidden rather than a 404 Not Found.
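For reference, the 403 can be returned with a single mod_alias directive; this is just a sketch using the example paths above, and the exact rule will depend on your setup:

```apache
# Sketch: return 403 Forbidden (not 404) for probes of paths that
# don't exist on this site; path list mirrors the examples above.
<IfModule mod_alias.c>
    RedirectMatch 403 ^/(wp|dev|blog)(/.*)?$
</IfModule>
```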