Websocket support

jdavis · March 15, 2018, 11:02am

@KentonVarda when will we be able to read/write websockets (within workers), and not just proxy them? We also need UDP support from the workers. With UDP support I can take advantage of sendmmsg/recvmmsg or specialized kernel-bypass NICs within my backend.

KentonVarda · March 15, 2018, 4:36pm

Hi @jdavis,

There are two tricky questions to answer with WebSocket support:

What’s the API? Service Workers in the browser today actually can’t intercept WebSockets, and so there’s no standard API defined for doing so. We’ll need to make up something new, hopefully with some chance that it will be accepted into the standard later.
How should WebSockets affect the per-request CPU time limit? Is a whole WebSocket just one request, with the same CPU limit as any other? Or do we treat it as multiple requests (and bill accordingly)?

I can’t make any firm promises, but I would expect WebSocket termination support to show up in the next month or two, probably treated as a single request with a single CPU time limit. From there we’ll have to see in practice whether changes to the time limits are needed.

jdavis · March 15, 2018, 9:25pm

Wouldn’t it be better to leave websocket open and bill max(bandwidth, time) per minute?

KentonVarda · March 15, 2018, 11:05pm

@jdavis Workers are billed on number of “requests”, not on bandwidth nor CPU time. I don’t think we’d want to add CPU or bandwidth metering that applies only to WebSocket. So we need to turn WebSockets into “requests” somehow…

jdavis · March 16, 2018, 8:47am

Wasted effort, probably best to just forget websockets in that case. Maybe push this effort into TCP anything?

KentonVarda · March 16, 2018, 4:13pm

I think we can find a reasonable way to count WebSockets as requests for billing purposes.

jdavis · March 17, 2018, 10:24am

So we’re talking about being able to keep websockets open for an extended period of time? If not, what’s the point?

jdavis · March 17, 2018, 10:25am

On another note, when is Cloudflare going to purchase ZeroTier and use its kernel-bypass expertise to make ZT run screaming fast? In addition, making ZT run within websocket framing on 443 would also be quite disruptive.

janusz · July 26, 2018, 1:17pm

Hello @KentonVarda,

Has there been any progress internally on WebSockets? I would be curious whether we can expect it soon, or whether you found complications and decided to postpone it for the time being.

Many thanks for your reply!

KentonVarda · July 28, 2018, 5:00pm

@janusz We have a lot on our plate and unfortunately this is not our top priority, but in the time between other things I have managed to create a prototype implementation allowing a Worker to terminate WebSockets as a client or a server. There are still problems with this prototype:

In my prototype, an outgoing WebSocket can only be used in the context of the FetchEvent that initiated it. You can’t store it to a global variable and use it across multiple incoming requests. This is because subrequests are intrinsically tied to the FetchEvent that created them in our implementation’s concurrency model. For some use cases this is fine, but for others it may be disappointing. It’ll take a fair amount of work to change this, but it can be done.
There’s still no way to extend the CPU timeout beyond 50ms, as mentioned above. Solving this is more of a product question than an engineering question, so requires some discussion within the organization.
I’ve only implemented the bare minimum components of the API. The standard defines a wider API surface that needs to be filled out.
I have had to extend the Fetch, Service Workers, and WebSocket APIs in non-standard ways, and would like to run my changes past WHATWG before we commit to anything.

Unfortunately since we have a lot of other high-priority things competing for our time, it’s hard for me to say how long it will take to finish this up. It seems like at least a few months away.

I’d be interested to know which of the above constraints would be problems for your use case. If none of them are, then maybe we can have something working for you relatively sooner.

jan3 · October 16, 2018, 4:20pm

My initial idea would be to treat websocket termination more as a high-level Cloudflare service, similar to how http2 and ssl services are currently handled, and expose just message events to worker functions, where the message events have additional metadata properties to know everything relevant about the connection. This way it could be easy to bill 3 or 4 websocket messages as one request, and the concurrency and timeout characteristics of workers would play nicely.

nesh · December 26, 2018, 5:45pm

@KentonVarda

I was wondering if there will be any updates for WebSocket support any time soon?

Thanks in advance

KentonVarda · January 7, 2019, 9:24pm

Hi @nesh,

The status is still the same as in my previous comment. We need to figure out how to allow long-running requests and properly bill for them. This has moved slower than I’d hoped, but it’s still a priority. Sorry, I don’t have a date I can tell you.

nesh · January 9, 2019, 7:44pm

Ah, I see. No worries.

maxwell.gerber42 · March 28, 2020, 8:43pm

Is there any update on this in 2020?

sklabnik · March 30, 2020, 2:44pm

Not specifically; this is still something we would like to support but do not currently support.

nonother · July 31, 2020, 5:03am

Congrats on all the recent launches this week. Any updates in terms of how Cloudflare is thinking about adding (or not) WebSocket support, particularly in light of the new billing options?

arunesh90 · July 31, 2020, 6:12am

guy.barnard · October 1, 2020, 1:58pm

Looks like websocket support has been added alongside Durable Objects. While the documentation for Durable Objects is live, there is not yet any documentation on the WebSocketPair additions to the fetch spec.

Can you comment on pricing for websockets, and how long and how many websockets are supported?

github.com

cloudflare/workers-chat-demo/blob/9b4d4d230744c2a28580fa603a82891526588708/chat.mjs#L243

case "/websocket": {
  // The request is to `/api/room/<name>/websocket`. A client is trying to establish a new
  // WebSocket session.
  if (request.headers.get("Upgrade") != "websocket") {
    return new Response("expected websocket", {status: 400});
  }

  // Get the client's IP address for use with the rate limiter.
  let ip = request.headers.get("CF-Connecting-IP");

  // To accept the WebSocket request, we create a WebSocketPair (which is like a socketpair,
  // i.e. two WebSockets that talk to each other), we return one end of the pair in the
  // response, and we operate on the other end. Note that this API is not part of the
  // Fetch API standard; unfortunately, the Fetch API / Service Workers specs do not define
  // any way to act as a WebSocket server today.
  let pair = new WebSocketPair();

  // We're going to take pair[1] as our end, and return pair[0] to the client.
  await this.handleSession(pair[1], ip);

  // Now we return the other end of the pair to the client.
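
For readers who want to see the whole accept-and-return flow in one place, here is a minimal echo sketch. It is not the code from the chat demo. It assumes the Workers runtime behaviour of `WebSocketPair`, `accept()`, and a `101` response carrying a `webSocket` property; the function name `handleWebSocketUpgrade` and its `runtime` parameter are made up for this example.

```javascript
// Sketch of a Worker-side WebSocket echo handler (hypothetical helper name).
// `runtime` defaults to the Workers globals; it is a parameter only so the
// sketch can be exercised elsewhere with stand-ins. It is not a Workers API.
function handleWebSocketUpgrade(request, runtime = globalThis) {
  if (request.headers.get("Upgrade") !== "websocket") {
    return new runtime.Response("expected websocket", { status: 400 });
  }

  // Two connected ends: we keep pair[1] and hand pair[0] back to the client.
  const pair = new runtime.WebSocketPair();
  const client = pair[0];
  const server = pair[1];

  // Tell the runtime that this Worker terminates the server end itself.
  server.accept();
  server.addEventListener("message", (event) => {
    server.send(event.data); // echo each message back to the client
  });

  // Status 101 plus the `webSocket` property returns the client end.
  return new runtime.Response(null, { status: 101, webSocket: client });
}
```

Inside a Worker this would be called from the fetch handler with the incoming request, leaving `runtime` at its default.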

denis.truffaut · December 18, 2020, 8:29am

–

What we know from the beta:

WebSockets are billed per connection (1 connection = 1 Worker Request)
It does not matter how much data you send one way or the other through the socket
You are limited by CPU (50ms with regular Worker), or billed by CPU (Unbound Worker)
If the Worker crashes, all sessions become invalid and all clients need to reconnect
Periodically (TBD), old clients will probably be disconnected for being idle
Periodically (TBD), old clients will probably be disconnected because of new clients
If a client is disconnected, it might need to reconnect (1 connection billed)

–

If you use Websockets for WebRTC, it is probably fine.
Video/Audio will require asking the user to validate a user gesture in their browser.

–

If you use Websockets for push notifications, it is probably fine.
Depending on your appetite for GAFA dependency, you might want to switch to the Apple / Google push notification systems, but that requires asking the user to validate a user gesture in their browser. Ground truth: 90% of push notification requests from websites are refused by users. It means you can offload 10% of your push notification load to GAFA. Of course you’ll need to encrypt those messages to ensure their confidentiality. Depending on volumes, it might be worth it (or not).

–

If you use Websockets as a replacement for HTTP, the best strategy would be:

const doRequest = async (data) => {
  // If there is no WebSocket connection currently active
  //   Open a Websocket connection 
  // Send data over Websocket
}

That way you create a WebSocket connection only when needed, reuse it until it is disconnected, and reconnect only when needed.
If you are exclusively using this mode (no push notifications, no chat, no WebRTC), then avoid auto-reconnect. It would be a waste of connections.
If you are mixing this usage with other usages (push notifications, chat, WebRTC), then you’ll have to auto-reconnect, because you will need pushes from the server.
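
The doRequest outline above can be fleshed out roughly as follows. This is only a sketch under assumptions: `createLazySocket` is a made-up helper name, messages are assumed to be JSON, and the WebSocket constructor is passed in as a parameter purely so the logic can be tried without a network.

```javascript
// Hypothetical helper: connects on first send(), reuses the socket afterwards,
// and does not auto-reconnect. The next send() after a close reconnects.
function createLazySocket(url, WebSocketImpl = globalThis.WebSocket) {
  const OPEN = 1; // readyState value defined by the WebSocket API
  let socket = null;
  let pending = []; // messages queued while the connection is still opening

  function connect() {
    const ws = new WebSocketImpl(url);
    ws.addEventListener("open", () => {
      for (const message of pending) ws.send(message);
      pending = [];
    });
    ws.addEventListener("close", () => {
      if (socket === ws) socket = null; // forget it; reconnect lazily later
    });
    socket = ws;
  }

  return {
    send(data) {
      const message = JSON.stringify(data);
      if (socket === null) connect();
      if (socket.readyState === OPEN) socket.send(message);
      else pending.push(message);
    },
    close() {
      if (socket !== null) socket.close();
    },
    isConnected: () => socket !== null && socket.readyState === OPEN,
  };
}
```

With this shape, a page that never sends anything never opens a connection, so it never costs a billed request.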

–

Spare connections when user is idle

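const detector = new IdleDetector();

const f = async () => {
  const {userState, screenState} = detector;
  if (userState === 'idle' || screenState === 'locked') {
    // If Websocket connected
    //   Close WebSocket
  }
  else {
    // If there is no WebSocket connection currently active
    //   Open a Websocket connection
  }
};

detector.addEventListener('change', f, { passive:true });

// IdleDetector only fires 'change' events once permission is granted and
// start() has been called. requestPermission() must run from a user gesture
// (for example a click handler); 60000 ms is the minimum allowed threshold.
const startIdleDetection = async () => {
  if (await IdleDetector.requestPermission() !== 'granted') return;
  await detector.start({ threshold: 60000 });
};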

–

Spare connections when page is hidden (minimized, etc)

let hidden, visibilityChange;
if (typeof document.hidden !== 'undefined') { // Opera 12.10 and Firefox 18 and later support
  hidden = 'hidden';
  visibilityChange = 'visibilitychange';
} else if (typeof document.msHidden !== 'undefined') {
  hidden = 'msHidden';
  visibilityChange = 'msvisibilitychange';
} else if (typeof document.webkitHidden !== 'undefined') {
  hidden = 'webkitHidden';
  visibilityChange = 'webkitvisibilitychange';
}

const f2 = async () => {
  if (document[hidden]) {
    // If Websocket connected
    //   Close WebSocket
  } else {
    // If there is no WebSocket connection currently active
    //   Open a Websocket connection 
  }
}

document.addEventListener(visibilityChange, f2, { passive:true });
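
To fill in the commented branches above, one possible wiring looks like this. It is a sketch only: `bindConnectionToVisibility` is a made-up name, `connection` is assumed to be any object with `open()` and `close()` methods, and it assumes a browser that exposes the unprefixed `document.hidden` and `visibilitychange`.

```javascript
// Hypothetical helper: opens the connection while the page is visible and
// closes it while the page is hidden. `doc` is normally `document`.
function bindConnectionToVisibility(doc, connection) {
  const sync = () => {
    if (doc.hidden) connection.close();
    else connection.open();
  };
  doc.addEventListener("visibilitychange", sync);
  sync(); // apply the current state immediately
  // Return a function that stops following visibility changes.
  return () => doc.removeEventListener("visibilitychange", sync);
}
```

In a page this would be called as `bindConnectionToVisibility(document, myConnection)`, where `myConnection` is whatever wrapper you use around your WebSocket.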

–

Keep in mind that a closed connection means you are out of sync, so you don’t receive real-time updates. If you need 100% uptime on real-time updates, just stay connected (functionality being more important than cost impact).

–

Last point: if you are receiving lots of messages (e.g. gaming: positions of players), you might want to apply backpressure on the client to reduce CPU usage. In this case, using the WebSocketStream API is one option.

It works by pulling locally: you read what has been received into the buffer only when you are ready to process it. Because it is a pull strategy, you won’t receive updates in real time, since you decide when to read your local data.

The example below uses a while (true) loop, which is common in game rendering loops, but not so much in classic web programming.

// This code serializes the events:
// no more parallelization or real time.
// This is how client backpressure works.
// Use it only if a high volume of events is received
// and your processing function is too slow
// to keep up with the cadence
// (like game engine rendering).

// Note: the promise is called `opened` in current Chrome builds;
// early origin-trial builds exposed it as `connection`.
const wss = new WebSocketStream(WSS_URL);
const {readable, writable} = await wss.opened;
const reader = readable.getReader();
const writer = writable.getWriter();

while (true) {
  const {value, done} = await reader.read();
  if (done) {
    break;
  }
  const result = await renderMyGamingEngine(value);
  await writer.write(result);
}
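
WebSocketStream is not available everywhere, so here is a sketch of the same pull behaviour using a plain ReadableStream as a stand-in for the socket's readable side. The names `createTickSource` and `consumeSlowly` are made up for this example. The point it illustrates is that the producer is only asked for more data when the consumer reads, so a slow consumer keeps the producer from running far ahead.

```javascript
// Hypothetical stand-in for a busy socket: produces the numbers 1..total,
// but only when the stream asks for more via pull().
function createTickSource(total) {
  let produced = 0;
  const stream = new ReadableStream(
    {
      pull(controller) {
        produced += 1;
        controller.enqueue(produced);
        if (produced === total) controller.close();
      },
    },
    { highWaterMark: 1 } // buffer at most one chunk ahead of the reader
  );
  return { stream, producedSoFar: () => produced };
}

// Same loop shape as above: read one value, finish processing it, read again.
async function consumeSlowly(stream, process) {
  const reader = stream.getReader();
  const results = [];
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    results.push(await process(value));
  }
  return results;
}
```

Calling `consumeSlowly(source.stream, slowFn)` with a deliberately slow `slowFn` and checking `source.producedSoFar()` inside it shows the producer staying within a chunk or so of the consumer instead of racing to the end.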