Non-cached request logs with OriginResponseTime: 0

My team is using LogPush to stream Cloudflare logs into S3, then parse them to monitor traffic health.

Some of these requests have timing data that I don’t understand. I’ve included a full log below, but the relevant parts are:

  • CacheCacheStatus: unknown
  • EdgeStartTimestamp:1616025598083000000 (2021-03-17 18:59:58)
  • EdgeEndTimestamp:1616025600399000000 (2021-03-17 19:00:00)
  • OriginResponseDurationMs: 0
  • OriginResponseTime: 0

Specifically we’re trying to use OriginResponseDurationMs to monitor round-trip latencies.

Here’s the full log:

{
"Cookies":{"id_token":"REDACTED"},
"RequestHeaders":{},
"ResponseHeaders":{},
"ParentRayID":"00",
"RayID":"631a2d940fe4ecee",
"SecurityLevel":"med",
"ZoneID":76556440,
"CacheCacheStatus":"unknown",
"CacheTieredFill":false,
"CacheResponseBytes":94570,
"CacheResponseStatus":200,
"ClientASN":46562,
"ClientCountry":"ca",
"ClientDeviceType":"desktop",
"ClientIP":"REDACTED",
"ClientIPClass":"noRecord",
"ClientRequestProtocol":"HTTP/2",
"ClientSrcPort":59136,
"ClientSSLCipher":"AEAD-AES128-GCM-SHA256",
"ClientSSLProtocol":"TLSv1.3",
"ClientXRequestedWith":"",
"ClientRequestBytes":13273,
"ClientRequestHost":"REDACTED.com",
"ClientRequestMethod":"POST",
"ClientRequestPath":"REDACTED.aspx",
"ClientRequestReferer":"REDACTED.aspx",
"ClientRequestURI":"REDACTED.aspx",
"ClientRequestUserAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0",
"EdgeColoCode":"YUL",
"EdgeColoID":70,
"EdgeEndTimestamp":1616025600399000000,
"EdgePathingOp":"wl",
"EdgePathingSrc":"macro",
"EdgePathingStatus":"nr",
"EdgeRateLimitAction":"",
"EdgeRateLimitID":0,
"EdgeRequestHost":"REDACTED.com",
"EdgeResponseBytes":27672,
"EdgeResponseCompressionRatio":3.78,
"EdgeResponseContentType":"text/html",
"EdgeResponseStatus":200,
"EdgeStartTimestamp":1616025598083000000,
"FirewallMatchesActions":[],
"FirewallMatchesRuleIDs":[],
"FirewallMatchesSources":[],
"OriginIP":"",
"OriginSSLProtocol":"unknown",
"OriginResponseBytes":0,
"OriginResponseDurationMs":0,
"OriginResponseHTTPExpires":"",
"OriginResponseHTTPLastModified":"",
"OriginResponseStatus":0,
"OriginResponseTime":0,
"WAFAction":"unknown",
"WAFFlags":"0",
"WAFMatchedVar":"",
"WAFProfile":"unknown",
"WAFRuleID":"",
"WAFRuleMessage":"",
"WorkerCPUTime":18290,
"WorkerStatus":"ok",
"WorkerSubrequest":false,
"WorkerSubrequestCount":4
}

This example came from a log file with 3,765 entries. Of those, 403 logs have "OriginResponseDurationMs":0, "CacheCacheStatus":"unknown". Some more example Ray IDs are 631a2dfbdf76e35a, 631a2d9d586d02b8, 631a2d433a64c510, and 631a2d0718e0023b.

Can anyone help me interpret these logs? Thank you!

Looks like a CF Worker request. Are you doing any form of caching/cache bypass within a CF Worker? Where bypass is via method other than cache control headers? i.e. bypass cache on cookie

Do you have similar requests with for GET requests? Unless you’ve written a CF Worker to cache POST requests, then Cloudflare usually doesn’t cache "ClientRequestMethod":"POST" POST requests only GET.

Though don’t have an explanation of originresponse duration/time being 0. However, from my logpush logs only time I see origin response time/duration not equal != 0 using CF Workers, is for Worker subrequests (WorkerSubrequest = true) and not the parent request (WorkerSubrequest = false) for my CF Worker guest full HTML page cache based Worker setup which makes sense. So check and filter requests where WorkerSubrequest = true

edit: actually try filtering the request on EdgeStartTimestamp to see the request’s flow/trail through CF.

I did this for my Wordpress blog with custom CF Worker guest full HTML page caching and can see the the reason why origin response duration/time maybe 0 is that for parent request/parent rayid 635470e884a0173c (where logging output for \(.RayID)-\(.ParentRayID)) there was an exception which forced the Worker to go back to my origin for a response and thus finally a subrequest which connected back to the origin. So in this context cache status unknown might just mean a cache bypass to origin via CF worker subrequest i.e. Worker exceptions ?

pzcat /home/cfcmm-logs/20210325/*.log.gz | jq -r 'select(.BotScore <=100 and .ClientRequestHost == "blog.domain.com" and .EdgeResponseStatus == 200 and .EdgeStartTimestamp == "2021-03-25T01:42:10Z") | "\(.EdgeStartTimestamp) \(.ClientIP) \(.RayID)-\(.ParentRayID) \(.ClientRequestURI) \(.ClientRequestMethod) \(.ClientRequestReferer) \(.EdgeResponseStatus)-\(.EdgeResponseContentType) \(.OriginResponseStatus) \(.OriginResponseDurationMs)-\(.OriginResponseTime) \(.EdgeRequestHost) ctf-\(.CacheTieredFill) cs-\(.CacheCacheStatus) crs-\(.CacheResponseStatus)-\(.CacheResponseBytes) \(.EdgeColoCode) \(.ClientCountry) \(.ClientIPClass) \(.WorkerStatus)-\(.WorkerSubrequest)-\(.WorkerSubrequestCount) \(.BotScore) x \(.BotScoreSrc) \(.ClientRequestUserAgent) d-\(.ClientDeviceType) edgeip-\(.EdgeServerIP)"'

2021-03-25T01:42:10Z 2a06:98c0:3600::103 635470e8f4de173c-635470e884a0173c / GET  200-text/html 0 0-0 blog.domain.com ctf-false cs-hit crs-200-34738 SIN au unknown unknown-true-0 10 x Machine Learning Statusflare beta d-desktop edgeip-

2021-03-25T01:42:10Z 2a06:98c0:3600::103 635470e884a0173c-635470e6d847173c / GET  200-text/html 0 0-0 blog.domain.com ctf-false cs-unknown crs-200-34333 SIN au unknown exception-true-2 10 x Machine Learning Statusflare beta d-desktop edgeip-

2021-03-25T01:42:10Z 2a06:98c0:3600::103 635470e904e7173c-635470e884a0173c / GET  200-text/html 200 290-71000000 blog.domain.com ctf-false cs-unknown crs-200-36934 SIN au unknown unknown-true-0 10 x Machine Learning Statusflare beta d-desktop edgeip-2400:cb00:115:1024::ac44:4915

2nd request was the first with exception-true-2 for \(.WorkerStatus)-\(.WorkerSubrequest)-\(.WorkerSubrequestCount) mean Worker exception occurred and the 2 subrequests are the 1st and 3rd from output where only 3rd showed origin duration and response time 290-71000000 (\(.OriginResponseDurationMs)-\(.OriginResponseTime) )

1 Like

I’m curious how did you enable “Cookies” field in Logpush? I can’t find it anywhere in my Logpush configuration. Or, maybe you have WAF payload logging enabled?

If you are using Workers to handle POST request and return the results to your visitor then I think * OriginResponseDurationMs : 0 makes sense - as the request was not reaching your origin server anyway.

1 Like

If 403 is returned from the WAF or other Cloudflare security features then this also makes sense. Cache status of “unknown” is usually refers to “dynamic”, and blocking page generated by Cloudflare is “dynamic”.

2 Likes

This is a beta feature, I believe, which allows for us to define specific headers to pass through. Cookies just happened to be one of those headers.

2 Likes

Sorry for the confusion, I see how what I said was misleading.

When I said 403, I was referring to the number of logs with "OriginResponseDurationMs":0, "CacheCacheStatus":"unknown", not the 403 Forbidden status. In other words, 403 / 3,765 = 10% of our logs have these unexpected values (in one sample file, at least).

I see :laughing:

Thanks everyone for your great suggestions so far.

I decided to divide the 3,765 requests in one log file into three categories:

  1. "CacheCacheStatus":"hit" - since these requests are expected to have "OriginResponseTime":0, I filtered them out of the rest of the analysis.
  2. "OriginResponseTime":0 - these logs (n=403) are confusing me so I’ll call them “bad” requests
  3. "OriginResponseTime":>0 - the remaining logs (n=921) I’ll call the “good” requests

Log fields that seem unrelated to the issue

ClientRequestMethod bad good
GET 312 630
POST 89 282
PUT 2 4
DELETE 0 5
WAFAction bad good
unknown 403 921
OriginResponseBytes bad good
0 403 921
EdgeResponseContentType bad good
(blank) 97 117
application/json 131 620
application/octet-stream 7 0
application/pdf 0 2
application/x-javascript 5 7
image/gif 14 13
image/jpeg 2 4
text/css 4 6
text/html 133 131
text/javascript 1 0
text/plain 9 10
text/xml 0 11
FirewallMatchesActions bad good
(blank) 378 913
allow 4 0
block 2 0
log 15 6
log,log 4 2

Log fields that seem relevant to this issue

CacheResponseStatus bad good
0 89 0
200 264 749
201 0 18
202 0 66
204 0 6
301 24 18
302 11 16
304 1 0
400 1 6
401 1 22
403 0 3
404 4 7
504 0 9
524 8 1

For the “good” requests, EdgeResponseStatus mirrors CacheResponseStatus. They differ for “bad” requests.

EdgeResponseStatus bad good
200 274 749
201 0 18
202 0 66
204 6 6
301 61 18
302 11 16
304 1 0
400 1 6
401 1 22
403 2 3
404 4 7
499 34 0
504 0 9
524 8 1
OriginResponseStatus bad good
0 403 1
200 0 749
201 0 18
202 0 66
204 0 6
301 0 18
302 0 16
400 0 6
401 0 22
403 0 3
404 0 7
504 0 9
ParentRayID bad good
00 387 573
(not 00) 16 348
WorkerSubrequest bad good
true 16 348
false 387 573
WorkerStatus bad good
ok 308 0
unknown 95 921
WorkerSubrequestCount bad good
0 95 921
3 4 0
4 274 0
5 8 0
6 13 0
7 6 0
8 2 0
9 1 0

I think everything points at our workers as @eva2000 and @erictung suggest above.

In one instance, we have a worker on POST /foo that makes a sub-request to POST /bar. These are logged like:

{
    "ParentRayID":"00",
    "RayID":"631a2d293edded2f",
    "ClientRequestPath":"/foo",
    "OriginResponseDurationMs":0,
    "OriginResponseStatus":0,
    "OriginResponseTime":0,
    "EdgeStartTimestamp":1616025580993000000, // 2021-03-17 18:59:40
    "EdgeEndTimestamp":1616025592197000000, // 2021-03-17 18:59:52
    "WorkerStatus":"ok",
    "WorkerSubrequest":false,
    "WorkerSubrequestCount":4
}
{
    "ParentRayID":"631a2d293edded2f",
    "RayID":"631a2d2a525eed2f",
    "ClientRequestPath":"/bar",
    "OriginResponseTime":10949000000,
    "EdgeStartTimestamp":1616025581176000000, // 2021-03-17 18:59:41
    "EdgeEndTimestamp":1616025592193000000, // 2021-03-17 18:59:52
    "WorkerStatus":"unknown",
    "WorkerSubrequest":true,
    "WorkerSubrequestCount":0
}

I guess my confusion boils down to the fact that the OriginResponseTime of the initial request to /foo does not include the OriginResponseTime of the sub-request to /bar, even though the timestamps show that the child request blocked the parent. Are the timestamps the only way to calculate the duration of the parent request?

maybe that is a by product of how long your origin takes to process the POST request. With GET requests, most of my EdgeStartTimestamps are closer together.

Hi there @dkeller,

I’ve escalated your case over to our engineering team for further review.

We’ll provide you an update via the support ticket you raised as soon as one becomes available.

Cheers,

1 Like

Blas and the engineering team helped me via support ticket to understand the expected logs when a worker intercepts a request.

In those logs, we are seeing that subrequests have the WorkerSubrequest field set to True . All of these requests also have a OriginResponseTime and OriginResponseDurationMs metrics populated.

This falls in line with our expectations. The Parent request’s origin is technically the Cloudflare Worker. We don’t find much value in the OriginResponseTime metric for that kind of request because we aren’t measuring latency to a customer origin.

We will have those origin timing fields for subrequests though, which is what the logs reflect. The subrequest reaches the customer’s actual origin, so we will log origin metrics.

You may want to look into EdgeTimeToFirstByteMs on the parent request, though it won’t contain the time it takes to complete streaming the response to the client.

Per their suggestion I’ve added EdgeTimeToFirstByteMs to my LogPush configuration. In my testing, it’s roughly equal to the difference between EdgeEndTimestamp and EdgeStartTimestamp on both the parent request and the worker subrequests. So now it’s my go-to for tracking request latencies.

Thanks everyone for your help understanding these logs!

1 Like

Thanks for updating us :slight_smile: