Issue With Worker-To-Worker HTTPS request

cadienvan · June 26, 2019, 4:19pm

Hi everyone, here’s my simple worker:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

/**
 * Respond to the request
 * @param {Request} request
 */
async function handleRequest(request) {
  return fetch('https://FAKE_URL.workers.dev/')
    .then(response => response) // Here's the issue
    .then(response => {
      return new Response("ok5", {status: 200});
    });
}

This works perfectly fine (I know I could just remove the “Here’s the issue” line; please keep reading to see why it is there). But as soon as I change the “Here’s the issue” step from response to response.json():

My sandbox/editor preview works fine.
Calling my worker directly by URL gives a 1101 error.

What should I do to resolve this issue? response.json() returns a Promise, so I need to “then-ify” it in order to get the data I need.

signalnerve · July 1, 2019, 2:39pm

Hey @cadienvan, I might need a bit more info to start diagnosing the problem, but it may help to use async/await to debug a little easier. Here’s a rewrite that you can start playing with:

async function handleRequest(request) {
  const response = await fetch('https://FAKE_URL.workers.dev/')
  const parsed = await response.json()
  return new Response("ok5", { status: 200 })
}

Can you give me more details about where you’re running this code, in comparison to the FAKE_URL you have in the code?
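
One way to see what the subrequest actually returned is to read the body as text and parse it defensively. Error 1101 is the page Cloudflare shows when a Worker throws an uncaught exception, so a plausible cause (an assumption, not confirmed in this thread) is that response.json() rejects because the body is not JSON, for example an HTML error page. The sketch below works under that assumption; tryParseJson is a hypothetical helper and the 502 status is an arbitrary choice.

```javascript
// Hypothetical helper: parse text as JSON without throwing.
function tryParseJson(text) {
  try {
    return { ok: true, value: JSON.parse(text) }
  } catch (err) {
    return { ok: false, error: String(err) }
  }
}

async function handleRequest(request) {
  const response = await fetch('https://FAKE_URL.workers.dev/')
  const text = await response.text()
  const parsed = tryParseJson(text)
  if (!parsed.ok) {
    // Report what the subrequest sent back instead of letting the worker throw.
    return new Response(
      `Upstream status ${response.status}, body was not JSON: ${parsed.error}`,
      { status: 502 }
    )
  }
  return new Response('ok5', { status: 200 })
}
```

If the assumption holds, calling the worker by URL should then show the upstream status and the parse error rather than the 1101 page.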

hero · July 2, 2019, 10:10am

Looks like there’s a worker-to-worker fetch issue and here’s how to reproduce it.

Create two workers, worker1 and worker2.
Use the default “hello world” code for worker2.
Use the following code for worker1 (changing worker2 to the actual domain name for worker2).

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const url = new URL(request.url);
  let originUrl;
  switch (url.searchParams.get('test')) {
    case '0': originUrl = 'https://www.cloudflare.com/robots.txt'; // intentionally falls through to case '1'
    case '1':
      originUrl = originUrl || 'https://worker2/';
      return fetch(originUrl)
    default:
      return new Response(null, {status: 403})
  }
}

Go to https://worker1/?test=0 and there’s no problem.
Go to https://worker1/?test=1 in a browser (not in the sandbox) and you’ll get an “Error 522” page after a long timeout period.

alexi · July 4, 2019, 7:53pm

I have the same issue and confirm 522 on worker to worker fetch requests (not sandbox).

cadienvan · July 5, 2019, 7:32am

Hi @signalnerve, I hope @hero’s explanation of the issue was thorough enough. Is there any news from the Cloudflare side? I think there are many use cases for this, and not supporting it limits what Cloudflare Workers can do…

hero · July 16, 2019, 2:56pm

@signalnerve

adaptive · July 17, 2019, 7:06pm

There was a protection against a possible attack vector in place back in March. I’m not sure whether it still stands, but I presume it does.

Worker fetching another worker within the same account: returns error 1101
Worker fetching another worker in different user account: returns 200 OK

The vector could potentially generate a burst attack on a target by doing a chained fetch of workers and, in the last step, a targeted fetch.

Or to avoid a fetch loop.

You will still be billed for that.

50^5 for a five-step attack. Purely theorised and not tested.
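
The 50^5 figure can be worked out as below. This is only the arithmetic, and it assumes that 50 is the number of subrequests each worker in the chain issues (the post does not say where the 50 comes from); chainedFetchCount is a hypothetical name.

```javascript
// Hypothetical model: every worker in the chain issues `perStep` subrequests,
// and the chain is `steps` levels deep, so the last level sees perStep^steps fetches.
function chainedFetchCount(perStep, steps) {
  return Math.pow(perStep, steps)
}

console.log(chainedFetchCount(50, 5)) // 312500000
```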

@harris can you help out with this one?

harris · July 18, 2019, 4:54pm

Hi @hero, @adaptive,

Regarding @hero’s original problem statement: it’s not clear to me whether the workers in question are part of the same zone or not. If they’re on the same zone, then the first worker’s subrequest would go to the “origin”, which doesn’t exist for workers.dev, explaining the 522 error. If they’re on different zones, then the first worker’s subrequest would invoke the second worker, and so we’d need to know what that worker’s behavior is, and how its origin, if any, is configured.

Regarding worker composition (one worker invoking another worker): such request chaining isn’t possible on same-zone subrequests. Instead, we always send those to the origin. This limitation is in place to resolve an ambiguity: how do we know that a subrequest is “trusted” and should be sent directly to the origin, versus “untrusted” and should go in the front door, running all Cloudflare features (including Workers) from the start? (Note that this ambiguity doesn’t exist for cross-zone subrequests: such subrequests are obviously untrusted from the perspective of the other zone.)

For example, consider a worker which just modifies request URL paths, and is deployed on *example.com/*. Should its subrequests go to the origin, or to itself? It may actually have a legitimate recursive design, with a base case that should eventually go to the origin. But how do we know the difference between that base case and the general case?

One possibility might be to require users to explicitly disable workers on their routes that should pass through to the origin, and make recursion be the default. This would make worker configuration painfully complex – essentially broken by default – and does not address the more general question of what other Cloudflare features we should run.

Another possibility might be a policy of only allowing worker composition involving more than 1 worker script. This, too, feels like a footgun: we would have no way of expressing whether a user actually intends for two workers to interact like this or not.

Instead, it seems that we need some sort of support in the Fetch API itself to indicate whether a subrequest should be considered “trusted” and go to the origin, or “untrusted”, and go in the front door, running all Cloudflare features, including Workers. We’ve made progress on designing such a system internally, but it’s slow going.

Note that the lack of Worker composition on same-zone subrequests does not really mitigate an attack vector, because an attacker could just use two zones and accomplish the same thing.
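
The routing rule described above can be modelled roughly as follows. This is a sketch of the behaviour as stated in this post, not Cloudflare’s implementation; subrequestDestination is a hypothetical name, and treating a zone as a hostname suffix is a simplifying assumption.

```javascript
// Hypothetical model of the rule described in this post.
// `workerZone` is the zone the calling worker runs on, e.g. 'example.com'.
function subrequestDestination(workerZone, targetUrl) {
  const host = new URL(targetUrl).hostname
  const sameZone = host === workerZone || host.endsWith('.' + workerZone)
  // Same zone: treated as trusted and sent straight to the origin (Workers are skipped).
  // Other zone: treated as untrusted and enters through the front door.
  return sameZone ? 'origin' : 'front-door'
}

console.log(subrequestDestination('example.com', 'https://api.example.com/'))
console.log(subrequestDestination('example.com', 'https://other.org/'))
```

Under this model a worker that fetches another hostname in its own zone never reaches a second worker, which matches the 522 seen when there is no origin to answer.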

Harris

hero · July 19, 2019, 11:17am

Finally. Thanks for the explanation.

harris:

Regarding worker composition (one worker invoking another worker): such request chaining isn’t possible on same-zone subrequests. Instead, we always send those to the origin. This limitation is in place to resolve an ambiguity: how do we know that a subrequest is “trusted” and should be sent directly to the origin, versus “untrusted” and should go in the front door, running all Cloudflare features (including Workers) from the start? (Note that this ambiguity doesn’t exist for cross-zone subrequests: such subrequests are obviously untrusted from the perspective of the other zone.)

For example, consider a worker which just modifies request URL paths, and is deployed on *example.com/* . Should its subrequests go to the origin, or to itself? It may actually have a legitimate recursive design, with a base case that should eventually go to the origin. But how do we know the difference between that base case and the general case?

I guess you can (or should) make a special case for URLs on workers.dev domains, since they are originless and there’s no ambiguity as to where the requests should go. Having this (fail-slow) limitation for workers.dev domains precludes a lot of useful modular (not just recursive) designs without achieving anything.

victor7 · October 3, 2019, 4:02pm

Hi there,

Is there any progress on this @harris? I am facing the same issue, a worker trying to call another worker under the same account.

Thank you

Regards,
Victor Fernandez

harris · October 3, 2019, 5:51pm

No, I’m afraid I don’t have any updates on this. Same-zone worker composition is still something we’d like to enable, but we don’t have a timeline for when it might happen. For now, you’ll need to bundle logic into a single worker script, or use multiple zones.
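
A minimal sketch of the “bundle logic into a single worker script” option, assuming the two former workers can be told apart by path. handleApi, handleSite, selectHandler and the '/api/' prefix are all hypothetical; the fetch listener is registered the same way as in the earlier posts.

```javascript
// Hypothetical handlers standing in for what used to be two separate workers.
async function handleApi(request) {
  return new Response(JSON.stringify({ hello: 'api' }), {
    status: 200,
    headers: { 'content-type': 'application/json;charset=UTF-8' },
  })
}

async function handleSite(request) {
  return new Response('hello world', { status: 200 })
}

// Pick a handler by path prefix; '/api/' is an assumed convention.
function selectHandler(pathname) {
  return pathname.startsWith('/api/') ? handleApi : handleSite
}

async function handleRequest(request) {
  const { pathname } = new URL(request.url)
  // A direct function call replaces the worker-to-worker fetch.
  return selectHandler(pathname)(request)
}
```

Because the second “worker” is now just a function, no subrequest is made and the same-zone limitation does not come into play.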

Harris

robert.steuck · October 14, 2019, 8:22pm

I just stumbled across this thread while implementing the first thing anyone would do while playing around with a distributed system: Fibonacci numbers…

Anyway, here is my crude (and, due to the limitation stated above, dysfunctional) attempt to implement it:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

/**
 * Respond to the request
 * @param {Request} request
 */
async function handleRequest(request) {
  const init = {
    headers: {
      'content-type': type,
    },
  }
  // Everything after '?' is taken as n, e.g. https://host/?7 gives '7'
  var [base, n] = request.url.split('?')
  n = n || '0'
  n = parseInt(n, 10)
  if (n <= 1) {
    return new Response(JSON.stringify({'fib': n}), {status: 200, headers: {'content-type': type }})
  } else {
    var [n1url, n2url] = [base + '?' + (n - 1), base + '?' + (n - 2)]
    var [n1r, n2r] = await Promise.all([fetch(n1url, init), fetch(n2url, init)]);
    var [n1, n2] =  await Promise.all([n1r.json(), n2r.json()])
    return new Response(JSON.stringify({'fib': (n1.fib + n2.fib)}), {status: 200, headers: {'content-type': type}})
  }
  
}

const type = 'application/json;charset=UTF-8'

Robert
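
For comparison, here is a sketch of the same idea with the recursion kept inside one worker, along the lines of the “bundle logic into a single worker script” advice above. It is not a distributed version; fib and parseN are hypothetical helpers, and unlike the code above it treats negative or non-numeric input as 0.

```javascript
const type = 'application/json;charset=UTF-8'

// Plain recursion inside the worker, mirroring the fetch-based version:
// fib(n) = fib(n - 1) + fib(n - 2), with fib(0) = 0 and fib(1) = 1.
function fib(n) {
  return n <= 1 ? n : fib(n - 1) + fib(n - 2)
}

// Read n from a query string of the form '?7'; anything else becomes 0.
function parseN(url) {
  const query = url.split('?')[1] || '0'
  const n = parseInt(query, 10)
  return Number.isNaN(n) || n < 0 ? 0 : n
}

async function handleRequest(request) {
  const n = parseN(request.url)
  return new Response(JSON.stringify({ fib: fib(n) }), {
    status: 200,
    headers: { 'content-type': type },
  })
}
```

The naive recursion is exponential in n, so a real worker would probably want to cap n or compute the value iteratively.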

tomas6 · November 12, 2019, 10:28pm

I think being able to call a Worker from another Worker got more relevant since Workers Sites got released. Without this feature apps that leverage Workers as a proxy between users and content cannot use Sites as the hosting platform.

matteo · November 13, 2019, 12:14am

You can still modify the Worker itself and run whatever you want while generating the response via Workers Sites; that doesn’t change. There are some issues where additional services on Cloudflare have trouble with Worker-hosted sites.

tomas6 · November 13, 2019, 1:07pm

Scenario: we deploy our assets to Sites via Wrangler. This creates a new Worker that serves anything at /*. The problem with our existing entry point Worker (e.g.: /entry.html) is that it cannot call the Worker created by Wrangler. It can access the created KV, but it doesn’t have access to the asset manifest, which maps the hashed asset names to the regular ones. So it cannot fetch the actual entry.html asset (as it is named entry${hash}.html).

Do you know of a workaround that doesn’t involve significant complexity (when compared to hosting the assets on a regular storage service)?

matteo · November 13, 2019, 2:04pm

As I said previously, merge the two Workers into one. The Worker being deployed is the index.js file in the workers-site directory. Change that and merge all your custom logic there, not least because you can’t run more than one Worker per path (unless it’s a Cloudflare App Worker, but that’s a different story).

thomas4 · January 26, 2020, 8:34pm

Any update on this?

We need to communicate between workers on different subdomains, but it only works across entirely separate domains.

harris · January 27, 2020, 9:14pm

Hi @thomas4, sorry, nothing new to report yet.

lukeed · February 2, 2020, 5:25am

I think I may be running into this, but not quite sure. If it’s a separate issue please let me know and I’ll be sure to open a separate ticket.

I’m getting “522 Connection timed out” when trying to invoke workers across separate domains.
Both domains are on Cloudflare and both have all DNS records proxied.

There’s a CNAME for foo.domain1.com to point to bar.domain2.com. This bar.domain2.com contains the Worker script (with “bar.domain2.com/*” as its route) and works correctly when accessed directly. However, when trying through foo.domain1.com it hangs.

I’ve also tried disabling CF proxying on the foo.domain1.com CNAME record.

Any help greatly appreciated!

thomas4 · February 2, 2020, 10:20am

If you’re using fetch, check out the redirect option.

The docs on this are still incomplete, but there’s this option:

fetch(url, { redirect: "manual" }) which makes sure the redirect isn’t followed.

There are also error and follow options.

Ref: Disable follow redirect · Issue #137 · github/fetch · GitHub
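
A small sketch of how the redirect option might be used to check whether the target answers with a redirect. It assumes the Workers runtime hands the 3xx response back to the script when redirect is "manual" (browsers return an opaque response instead); describeRedirect is a hypothetical helper and the URL is the one from the post above.

```javascript
// Hypothetical helper: summarise a response fetched with redirect: 'manual'.
function describeRedirect(response) {
  const isRedirect = response.status >= 300 && response.status < 400
  return isRedirect
    ? { redirected: true, location: response.headers.get('location') }
    : { redirected: false, location: null }
}

async function handleRequest(request) {
  // 'manual' returns the redirect response instead of following it.
  const response = await fetch('https://bar.domain2.com/', { redirect: 'manual' })
  const info = describeRedirect(response)
  return new Response(JSON.stringify({ status: response.status, ...info }), {
    headers: { 'content-type': 'application/json;charset=UTF-8' },
  })
}
```

Returning the status and Location header this way makes it easier to tell a redirect from a timeout while debugging.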