Issue With Worker-To-Worker HTTPS request

Looks like there’s a worker-to-worker fetch issue and here’s how to reproduce it.

  1. Create two workers, worker1 and worker2.
  2. Use the default “hello world” code for worker2.
  3. Use the following code for worker1 (changing worker2 to the actual domain name for worker2).
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const url = new URL(request.url);
  let originUrl;
  switch (url.searchParams.get('test')) {
    case '0': originUrl = 'https://www.cloudflare.com/robots.txt'
    case '1':
      originUrl = originUrl || 'https://worker2/';
      return fetch(originUrl)
    default:
      return new Response(null, {status: 403})
  }
}
  4. Go to https://worker1/?test=0 and there’s no problem.
  5. Go to https://worker1/?test=1 in a browser (not in the sandbox) and you’ll get an “Error 522” page after a long timeout period.

I have the same issue and confirm 522 on worker to worker fetch requests (not sandbox).

Hi @signalnerve, I hope @hero gave an exhaustive explanation of the issue. Is there any news from Cloudflare’s side? I think there’s a huge use case for such setups, and not having them available limits what Cloudflare Workers can do…


@signalnerve

There was a protection against a possible attack vector in place back in March. I’m not sure whether it still stands, but I presume it does.

Worker fetching another worker within the same account: returns error 1101
Worker fetching another worker in different user account: returns 200 OK

The vector could potentially generate a burst attack on a target by doing a chained fetch of workers and, in the last step, a targeted fetch.

Or to avoid a fetch loop.

You will still be billed for that.

That would be 50^5 requests for a five-step attack (assuming 50 subrequests per worker at each step). Purely theorised and not tested.
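
To put a number on the 50^5 figure: a quick sketch, assuming every worker in the chain fans out to 50 subrequests and each of those triggers another worker. The helper name is made up for illustration, and like the estimate itself this is arithmetic only, not a tested attack.

```javascript
// Hypothetical helper: number of requests hitting the final target if every
// worker in the chain issues `perWorker` subrequests, over `steps` chained steps.
function chainedFanOut(perWorker, steps) {
  return perWorker ** steps;
}

console.log(chainedFanOut(50, 5)); // 312500000
```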

@harris can you help out with this one?


Hi @hero, @adaptive,

Regarding @hero’s original problem statement: it’s not clear to me whether the workers in question are part of the same zone or not. If they’re on the same zone, then the first worker’s subrequest would go to the “origin”, which doesn’t exist for workers.dev, explaining the 522 error. If they’re on different zones, then the first worker’s subrequest would invoke the second worker, and so we’d need to know what that worker’s behavior is, and how its origin, if any, is configured.

Regarding worker composition (one worker invoking another worker): such request chaining isn’t possible on same-zone subrequests. Instead, we always send those to the origin. This limitation is in place to resolve an ambiguity: how do we know that a subrequest is “trusted” and should be sent directly to the origin, versus “untrusted” and should go in the front door, running all Cloudflare features (including Workers) from the start? (Note that this ambiguity doesn’t exist for cross-zone subrequests: such subrequests are obviously untrusted from the perspective of the other zone.)
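
As a toy model of the rule just described (not Cloudflare's actual implementation), the routing decision could be pictured like this. It assumes a zone can be identified by its apex domain and that a hostname belongs to the zone if it equals the apex or is a subdomain of it; `inZone` and `routeSubrequest` are names invented for this sketch.

```javascript
// Assumption: a hostname is "in" a zone if it is the apex or a subdomain of it.
function inZone(hostname, zone) {
  return hostname === zone || hostname.endsWith('.' + zone);
}

// Where a worker's subrequest ends up, per the explanation above:
// same-zone subrequests skip Workers and go straight to the origin;
// cross-zone subrequests go in the "front door" and may invoke a worker.
function routeSubrequest(callerZone, targetUrl) {
  const { hostname } = new URL(targetUrl);
  return inZone(hostname, callerZone) ? 'origin' : 'front-door';
}

console.log(routeSubrequest('example.com', 'https://api.example.com/x')); // origin
console.log(routeSubrequest('example.com', 'https://other.net/'));        // front-door
```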

For example, consider a worker which just modifies request URL paths, and is deployed on *example.com/*. Should its subrequests go to the origin, or to itself? It may actually have a legitimate recursive design, with a base case that should eventually go to the origin. But how do we know the difference between that base case and the general case?

One possibility might be to require users to explicitly disable workers on their routes that should pass through to the origin, and make recursion be the default. This would make worker configuration painfully complex – essentially broken by default – and does not address the more general question of what other Cloudflare features we should run.

Another possibility might be a policy of only allowing worker composition involving more than 1 worker script. This, too, feels like a footgun: we would have no way of expressing whether a user actually intends for two workers to interact like this or not.

Instead, it seems that we need some sort of support in the Fetch API itself to indicate whether a subrequest should be considered “trusted” and go to the origin, or “untrusted”, and go in the front door, running all Cloudflare features, including Workers. We’ve made progress on designing such a system internally, but it’s slow going.

Note that the lack of Worker composition on same-zone subrequests does not really mitigate an attack vector, because an attacker could just use two zones and accomplish the same thing.

Harris


Finally. Thanks for the explanation.

I guess you can (or should) make a special case for URLs on workers.dev domains, since they are originless and there’s no ambiguity as to where the requests should go. Having this (fail-slow) limitation for workers.dev domains precludes a lot of useful modular (not just recursive) designs without achieving anything.

Hi there,

Is there any progress on this @harris? I am facing the same issue, a worker trying to call another worker under the same account.

Thank you 🙂

Regards,
Victor Fernandez

No, I’m afraid I don’t have any updates on this. Same-zone worker composition is still something we’d like to enable, but we don’t have a timeline for when it might happen. For now, you’ll need to bundle logic into a single worker script, or use multiple zones.

Harris
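
For anyone wondering what "bundle logic into a single worker script" might look like in practice, here is a minimal sketch. The handler names and the `/worker2/` path prefix are hypothetical; the idea is only that what used to be a fetch to worker2 becomes a plain function call inside the same script, so no same-zone subrequest is involved.

```javascript
// What used to be worker2 (the default "hello world") becomes a function.
async function handleWorker2(request) {
  return new Response('hello world', { status: 200 });
}

// What used to be worker1 calls that function directly instead of
// fetching worker2's URL.
async function handleWorker1(request) {
  const inner = await handleWorker2(request);
  const body = await inner.text();
  return new Response('worker1 got: ' + body, { status: inner.status });
}

// Pick a handler from the path (the '/worker2/' prefix is an assumption).
function pickHandler(pathname) {
  return pathname.startsWith('/worker2/') ? handleWorker2 : handleWorker1;
}

async function handleRequest(request) {
  const { pathname } = new URL(request.url);
  return pickHandler(pathname)(request);
}

// Register the listener only where a Workers-style runtime provides it.
if (typeof addEventListener === 'function') {
  addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request));
  });
}
```

The trade-off is that both pieces of logic now deploy together, which is exactly the modularity people in this thread are asking to keep.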


I just stumbled across this thread while implementing the first thing anyone would do while playing around with a distributed system: Fibonacci numbers…

Anyway, here is my crude (and, due to the limitation stated above, dysfunctional) attempt to implement it:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

/**
 * Respond to the request
 * @param {Request} request
 */
async function handleRequest(request) {
  const init = {
    headers: {
      'content-type': type,
    },
  }
  var [base, n] = request.url.split('?')
  n = n || '0'
  n = parseInt(n)
  if (n <= 1) {
    return new Response(JSON.stringify({'fib': n}), {status: 200, headers: {'content-type': type }})
  } else {
    var [n1url, n2url] = [base + '?' + (n - 1), base + '?' + (n - 2)]
    var [n1r, n2r] = await Promise.all([fetch(n1url, init), fetch(n2url, init)]);
    var [n1, n2] =  await Promise.all([n1r.json(), n2r.json()])
    return new Response(JSON.stringify({'fib': (n1.fib + n2.fib)}), {status: 200, headers: {'content-type': type}})
  }
  
}

const type = 'application/json;charset=UTF-8'

Robert

I think being able to call a Worker from another Worker became more relevant once Workers Sites was released. Without this feature, apps that use Workers as a proxy between users and content cannot use Sites as the hosting platform.


You can still modify the Worker itself and run whatever you want while generating the response via Workers Sites; that doesn’t change. There are some issues where additional services on Cloudflare have trouble with Worker-hosted sites.

Scenario: we deploy our assets to Sites via Wrangler. This creates a new Worker that serves anything at /*. The problem with our existing entry point Worker (e.g.: /entry.html) is that it cannot call the Worker created by Wrangler. It can access the created KV, but it doesn’t have access to the asset manifest, which maps the hashed asset names to the regular ones. So it cannot fetch the actual entry.html asset (as it is named entry${hash}.html).

Do you know of a workaround that doesn’t involve significant complexity (when compared to hosting the assets on a regular storage service)?

As I said previously, merge the two Workers into one. The Worker being deployed is the index.js file in the workers-site directory. Change that and merge all your custom logic there, also because you can’t run more than one worker per path (unless it’s a Cloudflare App Worker, but that’s a different story).
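
A rough sketch of that merge, under some assumptions: the generated script is taken to have some function that returns the asset response for an event (something like `event => getAssetFromKV(event)` from `@cloudflare/kv-asset-handler`), passed in here as `serveAsset`. The `/entry.html` check and the `x-entry-worker` header are placeholders for whatever the separate entry worker used to do.

```javascript
// Hypothetical check standing in for the old entry worker's route.
function isEntryRequest(url) {
  return new URL(url).pathname === '/entry.html';
}

// `serveAsset` is an assumption: whatever the generated workers-site script
// uses to turn a fetch event into an asset response.
async function handleEvent(event, serveAsset) {
  const response = await serveAsset(event);
  if (!isEntryRequest(event.request.url)) {
    return response;
  }
  // Custom logic runs in the same worker, so the asset manifest lookup is
  // handled by serveAsset and no worker-to-worker fetch is needed.
  const headers = new Headers(response.headers);
  headers.set('x-entry-worker', 'merged');
  return new Response(response.body, { status: response.status, headers });
}
```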


Any update on this?

We need to communicate between workers on different subdomains, but it only works across entirely separate domains.

Hi @thomas4, sorry, nothing new to report yet.


I think I may be running into this, but not quite sure. If it’s a separate issue please let me know and I’ll be sure to open a separate ticket.

I’m getting “522 Connection timed out” when trying to invoke workers across separate domains.
Both domains are on Cloudflare and both have all DNS records proxied.

There’s a CNAME for foo.domain1.com to point to bar.domain2.com. This bar.domain2.com contains the Worker script (with “bar.domain2.com/*” as its route) and works correctly when accessed directly. However, when trying through foo.domain1.com it hangs.

I’ve also tried disabling CF proxying on the foo.domain1.com CNAME record.

Any help greatly appreciated!

If you’re using fetch, check out the redirect option.

The docs on this are still incomplete, but there’s this option:

fetch(url, { redirect: "manual" }) which makes sure the redirect isn’t followed.

There’s also error and follow options.

Ref: https://github.com/github/fetch/issues/137
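
A small sketch of how the `redirect: "manual"` option could be used to see where a request would be redirected instead of following it. The function names are made up for illustration, and this assumes the runtime exposes the `Location` header on the 3xx response when the redirect is not followed.

```javascript
// Returns the redirect target of a response, or null if it is not a 3xx.
function redirectTarget(response) {
  const isRedirect = response.status >= 300 && response.status < 400;
  return isRedirect ? response.headers.get('location') : null;
}

// Fetch without following redirects and report what came back.
async function fetchWithoutFollowing(url) {
  // 'manual' hands back the 3xx response; 'follow' would chase it and
  // 'error' would reject on any redirect.
  const response = await fetch(url, { redirect: 'manual' });
  return { status: response.status, location: redirectTarget(response) };
}
```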

Thank you, but the worker code should not change. bar.domain2.com is already written as a general-purpose standalone service. foo.domain1.com is supposed to be able to just set up a DNS record and nothing more.

This would have been the case had bar.domain2.com been a standard server instead of a worker script.

I think the redirect from foo.domain1.com is triggering a GET request on bar.domain2.com, which could count as a Worker-to-Worker trigger. I would try having a worker on foo.domain1.com tell the browser to make the redirect itself, via a 301 or 302 response with a Location header.

Something like this:

addEventListener('fetch', event => {
  event.respondWith(Redirect(event.request))
})

/**
 * Respond with a 302 redirect so the browser itself requests bar.domain2.com
 * @param {Request} request
 */
async function Redirect(request) {
    return new Response('', {
        status: 302,
        headers: {
          'Location': 'https://bar.domain2.com'
        }
      })
}