I have been using Cloudflare Tunnel to serve a website from a Windows IIS server. The site has been up for a few months and is working well. However, beginning last week, the cloudflared process has been consistently using about 20% CPU on a 4 vCPU server. The site serves roughly 20 GB of traffic per day. cloudflared has historically used 5-10% CPU, and now it is using 15-25%. I'm curious to know what affects the CPU usage of that process and where I should begin troubleshooting what caused this increase over the last week. The increased CPU cannot be tied to an increase in traffic; it is quite the opposite. Approximately a week ago, total traffic to my site decreased 10-20% at the same time my CPU usage increased 10-20%.
I've confirmed my theory: your Tunnel was automatically (and transparently) enrolled into `protocol: quic` roughly one week ago.
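You can verify which transport your connector negotiated by checking what kind of connections cloudflared has open to the edge: cloudflared reaches Cloudflare on port 7844 for both transports, over UDP for QUIC and over TCP for HTTP/2. A quick check on Windows:

```powershell
# cloudflared connects to the Cloudflare edge on port 7844 for both transports:
# UDP 7844 connections indicate QUIC, TCP 7844 connections indicate HTTP/2.
netstat -ano | findstr 7844
```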
It is not surprising that QUIC-based proxying uses a bit more CPU, since congestion control, reliable delivery, etc. all happen in user space, whereas plain old HTTP/1 or HTTP/2 over TCP delegates a lot of that to the kernel's network stack.
Interesting! I did notice a connection error last week when I didn't have the QUIC protocol opened outbound from my web servers. Once I opened it, the tunnel started successfully.
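For reference, in case anyone else hits the same connection error: cloudflared's QUIC transport goes out to the Cloudflare edge over UDP on port 7844 (HTTP/2 uses TCP on the same port). On our Windows servers the outbound rule I added looked something like this; the display name is just illustrative:

```powershell
# Allow cloudflared's QUIC transport (UDP to port 7844) out to the Cloudflare edge.
# The rule name is illustrative; adjust it to your own naming conventions.
New-NetFirewallRule -DisplayName "cloudflared QUIC outbound" `
    -Direction Outbound -Protocol UDP -RemotePort 7844 -Action Allow
```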
For my deeper understanding, is there somewhere I can read about these differences?
No offense, but your idea of a ‘pretty good discussion’ and mine are very different.
This discussion thoroughly explains how expensive QUIC is in terms of CPU usage and packet exchange, all of which is above my head, and probably above the heads of 99% of server/network admins.
I think I will look into configuring my tunnels not to use QUIC (see the sketch below). Otherwise I will have to justify to my boss why we are spending 50% more CPU on these servers.
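From the docs, pinning the transport back to HTTP/2 should just be a matter of the `protocol` key in cloudflared's config file (or the equivalent `--protocol` flag when running the tunnel). A minimal sketch, assuming a named tunnel with the usual config.yml; the tunnel UUID and credentials path are placeholders:

```yaml
# config.yml - pin the tunnel transport to HTTP/2 over TCP instead of QUIC
tunnel: <your-tunnel-UUID>
credentials-file: C:\path\to\<your-tunnel-UUID>.json
protocol: http2   # accepted values are auto, http2, and quic
```

After restarting the cloudflared service, CPU usage should settle back toward the old HTTP/2-over-TCP numbers.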
The general idea is that the usual HTTP over TCP we're used to has had a long time to be optimized, hardware-accelerated, and deeply integrated into kernels, whereas QUIC is a relatively new technology that hasn't had as long to get there yet.