Workers Fighting Printer Bot Net - not any more


#1

We’ve been getting HEAD requests to ‘/’ from Designjet user agents; it looks like a botnet. When I first wrote the worker it worked: no more of these requests came through to the log. But today they are back. What have I done wrong?

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const agent = request.headers.get('user-agent')
  // Reject HEAD requests whose user agent starts with "Designjet"
  if (agent && request.method === 'HEAD' && agent.startsWith('Designjet')) {
    return new Response('', {
      status: 403,
      headers: {
        'Location': '/',
        'User-Agent': 'Designjet'
      }
    })
  }
  // Everything else is passed through to the origin
  return fetch(request)
}


#2

You could also look into a User Agent blocking rule and block that UA across your entire zone, without using Workers.

In any case, make sure the firewall on your origin server is set up to only allow incoming connections from Cloudflare IPs, as otherwise someone could bypass Cloudflare and hit your origin server directly.


#3

Can I block wildcard agent strings? Each client has a unique reference, e.g.

Designjet/1.0 (HP Latex 335 Printer; V7L47A; NEXUS_03_14_00.8; 4ecf0ceb-18d5-5642-9264-3001abadf4bb) X-Middleton/1
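
On the wildcard question: inside a Worker, the unique reference at the end of each string does not matter if only the leading product token is tested. A minimal sketch, assuming every unwanted client starts its user agent with "Designjet/" as in the sample above; the helper name isDesignjetAgent is made up for illustration.

```javascript
// Hypothetical helper: true when the User-Agent begins with the "Designjet/"
// product token, whatever printer model or client reference follows it.
function isDesignjetAgent(userAgent) {
  // headers.get('user-agent') yields null when the header is missing
  if (typeof userAgent !== 'string') return false;
  return /^Designjet\//i.test(userAgent);
}
```

In a Worker this would be called as isDesignjetAgent(request.headers.get('user-agent')) before deciding whether to return a 403 or pass the request on.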


#4

Can you try simplifying the code to see if it’s something in the logic?

addEventListener('fetch', event => {
  event.respondWith(fetchAndApply(event.request))
})

async function fetchAndApply(request) {
  // headers.get() returns null when the header is absent, so default to ''
  const agent = request.headers.get('user-agent') || ''
  if (agent.includes('Designjet')) {
    return new Response('Sorry, this page is not available.',
      { status: 403, statusText: 'Forbidden' })
  }

  return fetch(request)
}

#5

My code now looks just like yours.

The worker status shows “success” (about 500 per hour), but I still see the printer bot entries coming through in my server log.

{
  latency: "0.002s"
  referer: "-"
  remoteIp: "181.223.102.238"
  requestMethod: "HEAD"
  requestUrl: "/"
  status: 403
  userAgent: "Designjet/1.0 (HP Latex 570 Printer; N2G70A; STORM_00_07_00.7; 240fa3cc-bd59-5447-bf62-b75e9ecc2931) X-Middleton/1"
}

The 403 in this log entry comes from my back-end code, but obviously I really want to fend these things off with a worker. On the route I put: mydomain/. (trailing slash)


#6

If you’re not using mod_cloudflare or the real IP module, the IP address in that request does not look like it was proxied through Cloudflare:

whois 181.223.102.238

owner:       CLARO S.A.

This can be fixed by only allowing requests that originate from Cloudflare IPs; some methods are described in this SE post.

If you are using mod_cf/realip then disregard this.


#7

I’m using app-engine, would be great to somehow only allow CF proxied traffic in…


#8

See this documentation on creating app engine firewalls. You’ll need to whitelist the IP ranges listed at https://www.cloudflare.com/ips/ then have a final rule for denying *, something like the “example firewall” in those docs.
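
To see what such an allow list would do with a given address before committing to firewall rules, the range check can be sketched in a few lines. This is only an illustration: SAMPLE_CF_RANGES is a partial, possibly outdated subset of the IPv4 ranges and the function names are made up here; the live list at https://www.cloudflare.com/ips/ is the one to use.

```javascript
// Assumption: an illustrative subset of Cloudflare IPv4 ranges, not the full
// or current list. Always take the real ranges from https://www.cloudflare.com/ips/
const SAMPLE_CF_RANGES = [
  '173.245.48.0/20',
  '103.21.244.0/22',
  '141.101.64.0/18',
  '108.162.192.0/18',
  '162.158.0.0/15',
  '172.64.0.0/13',
];

// Convert dotted-quad IPv4 text to an unsigned 32-bit integer (null if invalid).
function ipv4ToInt(ip) {
  const parts = ip.split('.').map(Number);
  if (parts.length !== 4 || parts.some(p => !Number.isInteger(p) || p < 0 || p > 255)) {
    return null;
  }
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// True when ip falls inside the CIDR block, e.g. inCidr('10.0.0.5', '10.0.0.0/8').
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split('/');
  const bits = Number(bitsText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return ((ipInt & mask) >>> 0) === ((baseInt & mask) >>> 0);
}

// True when ip is inside any of the supplied ranges.
function isInRanges(ip, ranges = SAMPLE_CF_RANGES) {
  return ranges.some(range => inCidr(ip, range));
}
```

Against this subset, the address from the log above (181.223.102.238) is not in any listed range, so a final deny rule would drop it if it really were the connecting address.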


#9

Thanks for this, I’ve read the docs. I would like to implement a CF-only source firewall, but in my app engine log I never see non-CF originating IPs (even though the proxy is working fine), so I can’t be sure if adding CF IP whitelist will work.

My original worker did reject bad user agents. But now it does not. I doubt that these evil printer bots are now routing around Cloudflare. Unfortunately I don’t know how to verify exactly what is happening: only origin IPs are presented to me in the app engine log.
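
One rough way to check this from the origin side is to log whether each request carries the headers Cloudflare adds when it proxies, such as CF-Ray and CF-Connecting-IP. A minimal sketch, assuming the request headers are available as a plain object; the function name is invented for illustration, and because a direct client can forge these headers this is a diagnostic aid, not a security control.

```javascript
// Hypothetical diagnostic: true when the request carries the headers that
// Cloudflare adds to proxied requests. Header names are matched
// case-insensitively. These headers can be spoofed, so use for logging only.
function looksProxiedByCloudflare(headers) {
  const lower = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value;
  }
  return Boolean(lower['cf-ray'] && lower['cf-connecting-ip']);
}
```

Logging this flag next to the user agent would show whether the Designjet entries arrive with or without the Cloudflare headers.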


#10

Is it possible these are coming in on another hostname, like www? Looking at the route you have, the worker wouldn’t fire there.


#11

Thanks, this turned out to be the correct answer: I added an * before my domain name in the route, and now the bots are being turned away.

Only one very long day of floundering…

Thanks very much everyone who chipped in to help.

I really think the worker docs should be updated to inform users that route URL patterns are not just domain names. Unless that really is obvious to most.
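
For anyone landing here later, the difference can be illustrated with a toy matcher. This is a simplified model, not Cloudflare's implementation: it treats * as "zero or more of any character" and compares the pattern against hostname plus path. The hostname mydomain.example is a stand-in and routeMatches is a made-up name.

```javascript
// Simplified model of route pattern matching (assumption: * matches zero or
// more of any character; the pattern is compared to hostname + path).
function routeMatches(pattern, hostAndPath) {
  const escaped = pattern
    .split('*')
    // Escape regex metacharacters in the literal pieces between wildcards
    .map(piece => piece.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp('^' + escaped + '$').test(hostAndPath);
}

// Without a leading *, only the bare domain is covered.
const bareOnly = routeMatches('mydomain.example/*', 'www.mydomain.example/');
// With a leading *, www and other subdomains are covered as well.
const withWildcard = routeMatches('*mydomain.example/*', 'www.mydomain.example/');
console.log(bareOnly, withWildcard);
```

Under this model a pattern without the leading * never sees requests to www, which is consistent with the worker reporting success while the bots still reached the origin.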