Using Profile-Guided Optimization (PGO) for Cloudflare products

Hello.

I am currently investigating Profile-Guided Optimization (PGO) and its usage across the IT industry. According to my findings and benchmarks (all of them are available in my repo: https://github.com/zamazan4ik/awesome-pgo), PGO helps achieve better performance on CPU-bound tasks in many kinds of applications: databases, networking, compilers, developer tooling, and much more.

Since Cloudflare cares about the end-customer experience and its own TCO (as discussed, for example, in https://blog.cloudflare.com/cloudflare-gen-12-server-bigger-better-cooler-in-a-2u1n-form-factor/), I want to suggest that the Cloudflare dev team evaluate PGO for their internal infrastructure.

There are multiple things to consider with PGO: which kind of PGO to use (instrumentation vs. sampling), how to run PGO continuously (as Google has done for years with its AutoFDO approach), and many others.

All of these questions deserve careful consideration. As a first step, though, I suggest simply starting to evaluate this optimization approach on Cloudflare systems. I believe it will bring value not only to Cloudflare itself but to its customers as well.

I would be happy to answer any questions!

P.S. I created this post here since I didn’t find a better way to reach the Cloudflare devs. If you know of one, please let me know.