(Cross post from Discord, didn’t receive a reply there.)
Hi all, I’m wondering if someone can help me debug why my D1 latency is so high, or how to improve it?
Context: I’m experimenting with a very small D1 table (25 rows), and I’m doing a SELECT on an indexed unique column.
I’m seeing 500-1500ms latency doing a read from a Worker, but the console is telling me the query latency is always < 1ms. If the console is telling the truth, then the query is fast and the network is slow.
My query latencies seem to be bimodal:
if I’ve made a request in the last ~minute, my observed latency is 60-120ms (lower w/ smart placement turned on)
if I haven’t, my first request will be very slow, always above 500ms, often 1300ms - 1500ms, sometimes 1900ms+
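The bimodal pattern above can be reproduced by timing repeated calls. A minimal sketch (the helper names are mine, and the endpoint URL in the usage comment is a placeholder, not the poster's actual Worker):

```typescript
// Time a single invocation of an async operation, in milliseconds.
async function timeOnce(op: () => Promise<unknown>): Promise<number> {
  const start = Date.now();
  await op();
  return Date.now() - start;
}

// Time n sequential invocations and return the observed latencies.
async function run(op: () => Promise<unknown>, n: number): Promise<number[]> {
  const latencies: number[] = [];
  for (let i = 0; i < n; i++) {
    latencies.push(await timeOnce(op));
  }
  return latencies;
}

// Example against a deployed Worker (URL is hypothetical):
//   const latencies = await run(
//     () => fetch("https://example.workers.dev/?hash=abc"),
//     10,
//   );
// Per the post, the first entry would be >500ms after a minute of
// idleness, with the rest in the 60-120ms range.
```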
The complete code I’m measuring is:
let d1ReadResults = await env.DB.prepare("SELECT * FROM Entries WHERE hash = ?1")
.bind(hash)
.first();
I was expecting D1 latency from a worker to be very fast, but right now it’s often slower than my network calls to services outside of Cloudflare! I’d love to improve this if I can.
The return object should include a meta object with a duration attribute (docs). That will tell you how long the D1 query itself took to complete. What is the time there? My guess is that it is quite low, and your database may simply be far away, which causes long round-trip times.
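A runnable sketch of the comparison being suggested here: wall-clock time around the call vs. the query time D1 reports in meta.duration. Note that in the D1 client API, .first() returns only the row, while .all() returns both results and meta, so a query like the poster's would use .all() to see meta. The fakeAll stub below stands in for env.DB.prepare("SELECT ...").bind(hash).all() so the timing pattern can run on its own; its numbers are simulated, not real measurements.

```typescript
// Shape of a D1 .all() result (simplified).
interface D1Result {
  results: Array<Record<string, unknown>>;
  meta: { duration: number };
}

// Stub standing in for env.DB.prepare(...).bind(hash).all():
// simulates ~50ms of network latency around a sub-millisecond query.
async function fakeAll(): Promise<D1Result> {
  await new Promise((r) => setTimeout(r, 50));
  return { results: [{ hash: "abc" }], meta: { duration: 0.3 } };
}

// Compare D1's self-reported query time with total round-trip time.
// The gap (total - query) is network/routing overhead, which is what
// this thread is chasing.
async function measure(): Promise<{ query: number; total: number }> {
  const start = Date.now();
  const { meta } = await fakeAll();
  const total = Date.now() - start;
  return { query: meta.duration, total };
}
```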
As you suggested, I checked out the meta object on the D1 query to verify it was reporting the same thing as the dashboard. Sure enough, all the queries run in < 1ms. Here’s a handful of requests w/ query time and the total time the request took:
You can see the pattern I described: after some very high latency requests (> 500ms), there are some moderately fast ones (< 80ms). I waited a minute, queried again, and the pattern repeated, with very high latency (>1600ms) followed by lower (< 60ms).
I don’t think it’s possible that the database is far away from the worker, because distance would impose a minimum latency: no request could complete faster than that round-trip time. Instead, we see some requests completing quickly and others taking a very long time.
Hmm, that is strange. It almost seems like your worker is getting evicted very quickly after running. If you’re able to, I would open a ticket, as I am stumped; maybe another @MVP would know.
It’s weird, right? Yeah, I’m starting profiling after the worker has already begun execution and run some simple code (an auth check), so I’m not even measuring how long the worker takes to start.
I wonder whether there’s something involved in Cloudflare assigning the worker a D1 server to talk to. Maybe there’s some initial latency there, followed by short (~1 minute) caching.
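If that hypothesis holds, one workaround would be a Cron Trigger that issues a trivial query every minute to keep the path warm (e.g. crons = ["* * * * *"] in wrangler.toml, calling this from the scheduled() handler). This is speculation based on the observed ~1-minute warm window, not documented behavior. The function is written against a minimal interface so it can run here with a stub; in a real Worker, env.DB (the D1 binding) would be passed in:

```typescript
// Minimal slice of the D1 binding interface that keepWarm needs.
interface MiniDB {
  prepare(sql: string): { first(): Promise<unknown> };
}

// Issue a trivial query. The cost is negligible, and on the theory
// above it may keep the Worker-to-D1 path from going cold
// (an assumption, not documented behavior).
async function keepWarm(db: MiniDB): Promise<void> {
  await db.prepare("SELECT 1").first();
}
```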
Thank you again for the reply, and I’m glad you suggested opening a ticket, because I didn’t realize I could do that! I will give it a shot when I have bandwidth and report back if I learn anything.
I think this is due to D1 not being distributed unless it receives larger amounts of traffic (like with KV) - so yes, you’d get very high latency if the request goes from China, or more commonly Russia, to the US.