Caching issue with Custom Domain for preview branches

For Workers & Pages, what is the name of the domain?

relatecx.com

What is the issue or error you’re encountering

When we request our lower environments (dev/staging), Cloudflare Pages intermittently serves production files instead of the files deployed to that environment.

What steps have you taken to resolve the issue?

Purging cache, setting up cache bypass rules, redoing our DNS records to ensure accuracy

What are the steps to reproduce the issue?

We followed this guide: Add a custom domain to a branch · Cloudflare Pages docs

Setup

Preview Branches: dev & staging
Production branch: prod

  • We have set this under our Workers & Pages / / Settings.
  • We are not using the git integration. Our CI pipelines run wrangler pages deploy with --branch and --commit-hash to ensure we push to the appropriate CF branch.
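For anyone trying to reproduce this, here is a minimal sketch of what such a CI deploy step could look like. The project name my-pages-project, the build directory dist, and the commit hash are placeholders, not our real values, and it assumes wrangler is available through npx.

```python
import subprocess


def build_deploy_command(directory, project, branch, commit_hash):
    """Build the argv for a direct-upload deploy to one Pages branch."""
    return [
        "npx", "wrangler", "pages", "deploy", directory,
        "--project-name", project,   # hypothetical project name
        "--branch", branch,          # dev, staging or prod
        "--commit-hash", commit_hash,
    ]


def deploy(directory, project, branch, commit_hash):
    """Run the deploy and raise if wrangler exits with a non-zero status."""
    subprocess.run(
        build_deploy_command(directory, project, branch, commit_hash),
        check=True,
    )


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_deploy_command("dist", "my-pages-project", "dev", "abc1234")))
```

Deploying with --branch set to the production branch name updates production; any other branch name produces a preview deployment with its own branch alias.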

Each of our 3 environments has a custom domain assigned:

  • Each custom domain is a hostname whose DNS we manage in Cloudflare. Following the instructions, we created the custom domain from our Workers & Pages app. Then, for the preview branches, we went to our DNS records and updated the CNAME targets to dev.<page-name>.pages.dev and staging.<page-name>.pages.dev.
  • As per the CF documentation we have left them proxied:

Currently, this setup is only supported when using Cloudflare DNS.

If you attempt to follow this guide using an external DNS provider, your custom alias will be sent to the production branch of your Pages project.

Our custom domains look like:

  • app. → prod (CF branch name)
  • app-dev. → dev
  • app-staging. → staging
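To make the mapping explicit, here is a small sketch of the CNAME target each hostname should point at under the guide's convention. The project name my-pages-project is a placeholder, and the helper assumes simple lowercase branch names such as dev and staging.

```python
from typing import Optional


def expected_cname_target(project: str, branch: Optional[str] = None) -> str:
    """Return the pages.dev hostname a custom domain's CNAME should target.

    With no branch this is the production alias; with a branch it is the
    branch alias described in the custom-domain-for-a-branch guide.
    """
    if branch is None:
        return f"{project}.pages.dev"
    return f"{branch}.{project}.pages.dev"


if __name__ == "__main__":
    project = "my-pages-project"  # hypothetical project name
    for label, branch in [("app", None), ("app-dev", "dev"), ("app-staging", "staging")]:
        print(label, "->", expected_cname_target(project, branch))
```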

Problem

On our preview sites (app-dev.<domain> and app-staging.<domain>) we have noticed that Cloudflare returns inconsistent files. Most of the time it returns the correct file deployed to that environment's branch, but not always:

  • Sometimes the request returns a 404 status when the file exists only on the preview branch and has not yet made it to prod.
  • Sometimes the request returns the prod version of the file instead of the dev or staging version. For example, one of our bundle files has a static name, service-worker.js, and contains a unique identifier per release. That identifier lets us confirm we are getting the prod file when we should be getting the preview version.
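The check we do by hand can be sketched roughly as follows. The URL and the two release identifiers are made up for illustration; the only assumption is that each release embeds its identifier somewhere in the body of service-worker.js.

```python
import urllib.error
import urllib.request


def classify_response(status, body, preview_id, prod_id):
    """Label one response as missing, preview, production or unknown."""
    if status == 404:
        return "missing"
    if preview_id in body:
        return "preview"
    if prod_id in body:
        return "production"
    return "unknown"


def fetch(url):
    """Return (status, body); an HTTP error such as 404 yields an empty body."""
    req = urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, ""


if __name__ == "__main__":
    # Hypothetical hostname and identifiers; substitute real values.
    url = "https://app-dev.example.com/service-worker.js"
    for _ in range(20):
        status, body = fetch(url)
        print(classify_response(status, body, "release-dev-123", "release-prod-456"))
```

Any line other than "preview" in the output for a preview hostname is an instance of the problem described above.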

We’ve tried:

  • purging cache (everything)
  • cache rules to bypass everything

I’m having the same problem. I opened my own topic before I found this one: Pages confuse preview domain and ship production deployment


I’m dealing with the same issue on my site https://timhortonsmenus.com/. Even after purging the cache and setting bypass rules, changes don’t reflect properly. Sometimes old versions of files load instead of the updated ones. Has anyone found a solid fix for this?


We see it in browsers (Chrome, Safari, and Firefox), Postman, and curl. We really need some help with this, as Cloudflare doesn’t expose some of these settings and the routing just happens automagically.


Just wanted to report we’re seeing the same issue as of a few days ago.

It manifested itself as intermittent but regular 404s in our SvelteKit app, seemingly only on the preview branch, ultimately breaking the app due to a Content-Type mismatch as well.

The underlying cause does indeed seem to be sporadic routing to the prod deployment.

We generate multiple builds from the same underlying git ref for our test and production deployments. SvelteKit automatically inserts the build timestamp into version.json.

Repeated GET requests to /_app/version.json on our preview domain occasionally return the timestamp of the last production build.

These particular timestamps have never been deployed to a preview branch, only production, so it’s definitely not just a case of stale data.
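A rough way to quantify this is to poll the file and count how often each version comes back. The sketch below assumes version.json has the shape {"version": "..."}, and the preview hostname is a placeholder.

```python
import json
import time
import urllib.request
from collections import Counter


def fetch_version(url):
    """Fetch version.json and return its "version" field."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read())["version"]


def tally_versions(get_version, attempts, delay=0.0):
    """Call get_version() repeatedly and count each distinct value seen."""
    counts = Counter()
    for _ in range(attempts):
        counts[get_version()] += 1
        time.sleep(delay)
    return counts


if __name__ == "__main__":
    # Hypothetical preview hostname; substitute the real one.
    url = "https://preview.example.com/_app/version.json"
    for version, seen in tally_versions(lambda: fetch_version(url), 50, 0.5).items():
        print(version, seen)
```

A healthy preview domain should report exactly one version; a second entry matching the production build timestamp points at misrouting rather than stale cache.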


I’m getting sporadic Content-Type mismatches too. Most of the time my end-to-end tests fail because of my content security policy (the preview domain is not allowed to fetch from production). WebKit is affected as well, e.g.

   3) [webkit] › e2e/index.e2e.ts:8:3 › e2e › title ─────────────────────────────────────────────────

    Refused to load https://rvt.app/_build/assets/roboto-latin-500-normal-CkrA1NAy.woff2 because it does not appear in the font-src directive of the Content Security Policy.

    Error: page.goto: Test ended.
    Call log:
      - navigating to "https://preview.rvt.app/", waiting until "load"

Another breadcrumb: I have another project where the production domain is a subdomain (i.e. not apex), and it doesn’t seem to be affected by this issue: GitHub - phi-ag/solid-pages: Opinionated demo app running SolidStart on Cloudflare Pages

The affected project (where the production domain is apex) now fails almost always right after deployment (e2e tests): Preview · Workflow runs · phi-ag/rvt-app · GitHub

Starting 9 hours ago I got three consecutive passing pipelines :astonished_face: Maybe this was fixed by Cloudflare?

Can confirm we haven’t seen the issue in the last ~48-60h.


I just had a failing pipeline: preview tried to fetch from production again. Retrying the same deployment fixed it, but this issue is not completely resolved. :person_shrugging: