SXG subresources not found in webpkgcache.com

I enabled SXG on my website. The Signed Exchange of the HTML is correctly prefetched from Google results and on the prefetch test page (https://signed-exchange-testing.dev/prefetch). However, subresources are, at best, only partially prefetched. For privacy reasons, a partial prefetch means that none of the subresources are actually used (https://github.com/WICG/webpackage/blob/main/explainers/signed-exchange-subresource-substitution.md). So I’m left with prefetching HTML only.

Here is a demo SXG-enabled page:
https://www.planujemywesele.pl/experiments/worker_tests/10

It correctly generates an SXG when asked for one, and the SXG includes subresources with signatures. The main document gets cached by webpkgcache.com when I use the SXG Validator Chrome extension (https://chrome.google.com/webstore/detail/sxg-validator/hiijcdgcphjeljafieaejfhodfbpmgoe).

However, when I try to preload it using the prefetch page (https://signed-exchange-testing.dev/prefetch/https://www.planujemywesele.pl/experiments/worker_tests/10), I get CORS errors for the subresources.

I found that the CORS errors happen because the resource being prefetched doesn’t exist in webpkgcache.com, so the cache returns HTML with a client-side redirect to the actual resource:

> curl -siH "Accept: application/signed-exchange;v=b3" https://www-planujemywesele-pl.webpkgcache.com/sub/5S5BVKziDyaa/s/www.planujemywesele.pl/experiments/worker_tests/10.jpeg
HTTP/2 200
location: https://www.planujemywesele.pl/experiments/worker_tests/10.jpeg
cache-control: private
x-silent-redirect: true
content-type: text/html; charset=UTF-8
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
date: Thu, 26 Oct 2023 19:16:17 GMT
server: sffe
content-length: 363
x-xss-protection: 0
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000

<HTML><HEAD>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>Redirecting</TITLE>
<META HTTP-EQUIV="refresh" content="0; url=https://www.planujemywesele.pl/experiments/worker_tests/10.jpeg">
</HEAD>
<BODY onLoad="location.replace('https://www.planujemywesele.pl/experiments/worker_tests/10.jpeg'+document.location.hash)">
</BODY></HTML>

The browser gets HTML instead of the SXG version of the subresource. This HTML response doesn’t contain an Access-Control-Allow-Origin header, which causes the CORS error.
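
To tell the two cases apart when checking many subresources, a small helper can classify a cache response by its headers. This is only a sketch based on the single response shown above: the function name classify_cache_response is made up for illustration, and relying on x-silent-redirect as the marker of a cache miss is an assumption drawn from that one example.

```python
def classify_cache_response(headers: dict[str, str]) -> str:
    """Return 'sxg', 'silent-redirect', or 'other' for a cache response.

    Assumption: a cache miss looks like the response quoted in this thread
    (text/html plus "x-silent-redirect: true").
    """
    # HTTP header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}
    # Drop parameters such as ";v=b3" or "; charset=UTF-8".
    content_type = h.get("content-type", "").split(";")[0].strip().lower()

    if content_type == "application/signed-exchange":
        return "sxg"
    if content_type == "text/html" and h.get("x-silent-redirect", "").lower() == "true":
        return "silent-redirect"
    return "other"


miss = {
    "content-type": "text/html; charset=UTF-8",
    "x-silent-redirect": "true",
    "cache-control": "private",
}
print(classify_cache_response(miss))
```

The headers can come from any HTTP client; the helper itself does no network access.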

How can I get the subresources into webpkgcache.com? I would appreciate any help.

I have some more info to share. I installed a Cloudflare Worker as a proxy to intercept and log requests, so I can see what Googlebot does. Specifically, I wanted to know whether Googlebot downloads subresources. And it does!

"request": {
      "url": "https://www.planujemywesele.pl/experiments/worker_tests/10.gif",
      "method": "GET",
      "headers": {
        "accept": "*/*;q=0.8,application/signed-exchange;v=b3",
        "cf-connecting-ip": "66.249.65.67",
        "cf-ipcountry": "US",
        "cf-ray": "81cadc774a250b76",
        "cf-rum-ctag": "sxg_enabled",
        "cf-visitor": "{\"scheme\":\"https\"}",
        "user-agent": "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.5993.117 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "via": "sxgrs"
      },

As you can see in the above log excerpt, the user agent contains the "Googlebot" substring. It asks specifically for the "application/signed-exchange" content type.

So Googlebot downloads the 10.gif file, but doesn’t put it in webpkgcache.com.
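
For anyone filtering similar logs, the check I did by eye can be written down as a small predicate. It is a rough sketch: the function name is_googlebot_sxg_request is hypothetical, it assumes lowercase header names as in the log excerpt above, and it only looks for substrings, so it does not prove the request really came from Google.

```python
def is_googlebot_sxg_request(headers: dict[str, str]) -> bool:
    """True if the logged request claims to be Googlebot and accepts SXG.

    Assumption: header names are lowercase, as in the Worker log excerpt.
    """
    user_agent = headers.get("user-agent", "")
    accept = headers.get("accept", "")
    return "Googlebot" in user_agent and "application/signed-exchange" in accept


logged = {
    "accept": "*/*;q=0.8,application/signed-exchange;v=b3",
    "user-agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "via": "sxgrs",
}
print(is_googlebot_sxg_request(logged))
```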

Another idea was to test the URL using the live inspection tool in Google Search Console.
Here are the HTTP headers of the response Googlebot got from Cloudflare. The link header clearly lists all the subresources.

HTTP/1.1 200 OK
age: 33
cache-control: max-age=600, public
cf-as-number: 15169
cf-cache-status: HIT
cf-ray: 81cae9acb3fa2c94-DFW
content-encoding: mi-sha256-03
content-type: text/html; charset=utf-8
date: Fri, 27 Oct 2023 12:23:53 GMT
digest: mi-sha256-03=QIzS1L5UIW6WLMF0haik9lJ5aGjaGa5eeQBwkPcdZhw=
link: <https://www.planujemywesele.pl/experiments/worker_tests/10.gif>;as=image;rel=preload,<https://www.planujemywesele.pl/experiments/worker_tests/10.gif>;rel=allowed-alt-sxg;header-integrity="sha256-KyBKX3EuDQ+G9fHfffaHoEgLGU6IJgPouhtlLZMTQ3M=",<https://www.planujemywesele.pl/experiments/worker_tests/10.svg>;as=image;rel=preload,<https://www.planujemywesele.pl/experiments/worker_tests/10.svg>;rel=allowed-alt-sxg;header-integrity="sha256-NqWoVvKrYhz0C0ScoQTIQu5gPqx5ivG2ZRbVm2nXHUk=",<https://www.planujemywesele.pl/experiments/worker_tests/10.jpeg>;as=image;rel=preload,<https://www.planujemywesele.pl/experiments/worker_tests/10.jpeg>;rel=allowed-alt-sxg;header-integrity="sha256-5S5BVKziDyaapTTcFpVM4yo1worif9jiL2/alEyLFyQ=",<https://www.planujemywesele.pl/experiments/worker_tests/10.css>;as=style;rel=preload,<https://www.planujemywesele.pl/experiments/worker_tests/10.css>;rel=allowed-alt-sxg;header-integrity="sha256-d4HNY6uGf25QgAB7vg4p9agBoZsUZ9WTg0oMEKvUGLE=",<https://www.planujemywesele.pl/experiments/worker_tests/10.js>;as=script;rel=preload,<https://www.planujemywesele.pl/experiments/worker_tests/10.js>;rel=allowed-alt-sxg;header-integrity="sha256-lwJjHKH3bwisNomE3Qs4818TTdhN+FuVcLwGiDp948M="
referrer-policy: origin-when-cross-origin
server: cloudflare
status: 200 OK
x-app-id: 1
x-content-type-options: nosniff
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-request-id: 7753979e-9dd1-4d1a-99dd-6c675f521449
x-xss-protection: 1; mode=block
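
The link header is hard to read by eye, so here is a rough Python sketch that pulls out the allowed-alt-sxg entries and builds the cache URL to test with curl. Both function names are made up for illustration. The URL layout (/sub/, then the first 12 characters of the hash, then /s/ and the original host and path) is inferred only from the one .jpeg example earlier in this thread; replacing + and / with their URL-safe equivalents is my assumption, and query strings are not handled.

```python
import re
from urllib.parse import urlsplit


def allowed_alt_sxg(link_header: str) -> dict[str, str]:
    """Map each subresource URL to its header-integrity value."""
    result = {}
    # Each entry looks like <url>;param=value;param="value", up to the next "<".
    for match in re.finditer(r"<([^>]+)>([^<]*)", link_header):
        url, raw_params = match.group(1), match.group(2)
        params = {}
        for part in raw_params.strip(" ,").split(";"):
            key, _, value = part.strip().partition("=")
            if key:
                params[key.lower()] = value.strip('"')
        if params.get("rel") == "allowed-alt-sxg" and "header-integrity" in params:
            result[url] = params["header-integrity"]
    return result


def cache_subresource_url(cache_host: str, url: str, integrity: str) -> str:
    """Build the cache URL for a subresource (layout inferred, see above)."""
    digest = integrity.removeprefix("sha256-")
    # Assumption: the cache uses the URL-safe base64 alphabet in the path.
    prefix = digest[:12].replace("+", "-").replace("/", "_")
    parts = urlsplit(url)
    return f"https://{cache_host}/sub/{prefix}/s/{parts.netloc}{parts.path}"


link = (
    "<https://www.planujemywesele.pl/experiments/worker_tests/10.jpeg>;as=image;rel=preload,"
    "<https://www.planujemywesele.pl/experiments/worker_tests/10.jpeg>;rel=allowed-alt-sxg;"
    'header-integrity="sha256-5S5BVKziDyaapTTcFpVM4yo1worif9jiL2/alEyLFyQ="'
)
for url, integrity in allowed_alt_sxg(link).items():
    print(cache_subresource_url("www-planujemywesele-pl.webpkgcache.com", url, integrity))
```

The cache host is passed in rather than derived from the origin, because I don't know the exact rule the cache uses to turn a domain into its subdomain.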

I did the same test for one of the subresources. It looks correct too:

HTTP/1.1 200 OK
access-control-allow-origin: *
cache-control: max-age=31536000, public
cf-as-number: 15169
content-encoding: mi-sha256-03
content-type: text/css; charset=utf-8
digest: mi-sha256-03=lQHqU5uF6lGlhpspGrpINoTM/1tu4BfADBOeurXzsrQ=
etag: W/"ade7444c67f528da46ad738cc387ca3e"
referrer-policy: origin-when-cross-origin
server: cloudflare
status: 200 OK
x-content-type-options: nosniff
x-download-options: noopen
x-xss-protection: 1; mode=block

I have two JS subresources. One is correctly saved to webpkgcache.com, while the other is not.
The HTTP headers of the two resources differ. Maybe some headers are problematic? However, I don’t see any of the headers that the Cloudflare documentation lists as prohibited.

Good headers (url: https://www.planujemywesele.pl/scripts-proxy/gtag.js?id=G-60MLFKYS4J):

Age: 97069
Alt-Svc: h3=":443"; ma=86400
Cache-Control: max-age=604800
Cf-Cache-Status: HIT
Cf-Ray: 81cc14c068f735cb-MAN
Content-Encoding: br
Content-Type: application/javascript; charset=UTF-8
Cross-Origin-Resource-Policy: cross-origin
Date: Fri, 27 Oct 2023 15:48:03 GMT
Expires: Fri, 27 Oct 2023 12:11:25 GMT
Last-Modified: Fri, 20 Oct 2023 12:11:25 GMT
Nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=bQ1Q9zADga74iflpAoTtuEWES4ANt6SaHC9uhG5xQusv14BlwempRrMIoFhcBzorLxZczGSXygb%2BQX1mqkPVhHO7FgXyiXQgXC9sj%2FxyB%2BbOTD992ISl7YBQea7ILc5EorhzTamknCg%3D"}],"group":"cf-nel","max_age":604800}
Server: cloudflare
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Xss-Protection: 0

Bad headers (url: https://www.planujemywesele.pl/assets/public-1d226fba0fe711932c9a6f17e92734b776840082b5b69c9f3f10b5c2e4291afc.js):

Access-Control-Allow-Origin: *
Age: 65230
Alt-Svc: h3=":443"; ma=86400
Cache-Control: max-age=31536000, public
Cf-Cache-Status: HIT
Cf-Ray: 81cc17e42c6cb2e7-MAN
Content-Encoding: br
Content-Type: application/javascript
Date: Fri, 27 Oct 2023 15:50:11 GMT
Etag: W/"651c3765-5efb6"
Expires: Wed, 02 Oct 2024 15:47:18 GMT
Last-Modified: Tue, 03 Oct 2023 15:46:45 GMT
Nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=T%2FbydV7LwsSAqrODn916NnFzHglR7vXezxJslRok1Wfipz1TunSSMQen03zY7IzdXwMggp2YY0kHTN%2BBY6u4FS9YDvn90r0Yc0kqXiLXQXPjFWgRNJLlX8uCt%2BQMF3Q%2BYu7jje2%2B2cQ%3D"}],"group":"cf-nel","max_age":604800}
Server: cloudflare
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
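
To compare the two header sets without scanning them line by line, a small diff helper is enough. This is just a sketch: diff_headers is a made-up name, and the example uses only a subset of the headers listed above, leaving out values that naturally differ per request (Date, Age, Cf-Ray, Report-To).

```python
def diff_headers(good: dict[str, str], bad: dict[str, str]) -> dict[str, list[str]]:
    """Return header names present in only one response or with different values."""
    # Header names are case-insensitive, so compare them in lowercase.
    g = {k.lower(): v for k, v in good.items()}
    b = {k.lower(): v for k, v in bad.items()}
    return {
        "only_good": sorted(g.keys() - b.keys()),
        "only_bad": sorted(b.keys() - g.keys()),
        "changed": sorted(k for k in g.keys() & b.keys() if g[k] != b[k]),
    }


good = {
    "Cache-Control": "max-age=604800",
    "Content-Type": "application/javascript; charset=UTF-8",
    "Cross-Origin-Resource-Policy": "cross-origin",
    "Vary": "Accept-Encoding",
    "X-Xss-Protection": "0",
}
bad = {
    "Access-Control-Allow-Origin": "*",
    "Cache-Control": "max-age=31536000, public",
    "Content-Type": "application/javascript",
    "Etag": 'W/"651c3765-5efb6"',
    "Vary": "Accept-Encoding",
}
for group, names in diff_headers(good, bad).items():
    print(group, names)
```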

For the people struggling with SXG, I found a solution.

The issue is caused by HTTP headers that change over time. I describe how to fix them in my blog posts:

  1. https://blog.pawelpokrywka.com/p/fixing-sxg-prefetching-errors-caused-by-mutable-subresources
  2. https://blog.pawelpokrywka.com/p/debugging-complex-signed-exchanges-subresource-issue

tl;dr: disable ETags for static assets, set the Vary header to Accept-Encoding, disable Enhanced HTTP/2 Prioritization or route static assets through a worker.
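
The first two points of the tl;dr can be checked from response headers alone. Here is a minimal sketch of such a check; sxg_header_warnings is a made-up name, and the rules are only my reading of the tl;dr, not an official list. The Enhanced HTTP/2 Prioritization setting cannot be verified this way.

```python
def sxg_header_warnings(headers: dict[str, str]) -> list[str]:
    """Flag response headers that conflict with the tl;dr advice in this post."""
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): v.strip() for k, v in headers.items()}
    warnings = []
    if "etag" in h:
        warnings.append("ETag present: disable ETags for static assets")
    if h.get("vary", "").lower() != "accept-encoding":
        warnings.append("Vary should be set to Accept-Encoding")
    return warnings


asset_headers = {
    "Cache-Control": "max-age=31536000, public",
    "Etag": 'W/"651c3765-5efb6"',
    "Vary": "Accept-Encoding",
}
for warning in sxg_header_warnings(asset_headers):
    print(warning)
```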
