[Resolved] Edge Cache Servers seem to be stale/out of sync

pagerules
loadbalancing
caching

#1

I have been attempting to deal with a frustrating issue for over a week now.

I’m seeing multiple problems with stale content:

  • Multiple Cloudflare edge cache servers seem to be out of sync with each other.
  • A hard refresh (Command+Shift+R on Chrome+Mac) shows a different page each time.
  • GET API calls also seem to be cached, but I expect them to be dynamic.
  • Most of the logged-in experience consists of dynamically generated pages, yet a lot of that content is being served as if it were static.
  • It is hard to get a page load where everything is fresh.

Here is the page/website that I am having problems with.

https://maskil.church/songs

Setup:

  • Single origin server, not load balanced
  • Everything always works in development when hitting the machine directly; these problems are only observed in production (behind Cloudflare)

In my deploy script, I make an API call to clear the cache (Purge Everything) as a post-deploy step. I have also manually pressed the Purge Everything button in the Caching settings multiple times, but the problem still persists.
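For reference, a minimal sketch of what such a post-deploy purge step might look like, assuming Cloudflare's v4 `purge_cache` endpoint and API token (Bearer) authentication. The zone ID and token are placeholders, and the function names are hypothetical, not taken from the actual deploy script.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_purge_request(zone_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a Purge Everything request for one zone."""
    body = json.dumps({"purge_everything": True}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def purge_everything(zone_id: str, api_token: str) -> bool:
    """Send the purge and report whether the response says success=true."""
    req = build_purge_request(zone_id, api_token)
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    return bool(payload.get("success"))
```

Checking the returned `success` flag in the deploy script at least rules out a purge call that silently fails.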

I created this page rule:

*maskil.church/*
Disable Security, Browser Cache TTL: 30 minutes, Always Online: Off, Cache Level: Bypass, Edge Cache TTL: 2 hours, Origin Cache Control: Off, Disable Apps, Disable Performance

Still, what gets served varies on each reload and is stale.


#2

Right now, it’s not using Cloudflare, so I can’t test it out.

That Page Rule you have only applies to the home page. If you want it to apply to everything, you need to match maskil.church/* (with that wildcard).

Edge Cache servers don’t sync to each other. They independently pull from your origin, then cache until it expires or gets purged. A visitor might not hit the same edge server every time.

Purge Everything will purge all the edge servers, but give it a minute or so to do this.
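One way to see what an individual edge server did with a given response is to look at the `CF-Cache-Status` and `Age` response headers. The helper below is only a sketch: the function name is made up, and it covers just the most common status values.

```python
def describe_cache_status(headers: dict) -> str:
    """Summarize how Cloudflare handled a response, based on its headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    status = lowered.get("cf-cache-status", "").upper()
    age = lowered.get("age")

    if status == "HIT":
        suffix = f" (Age: {age}s)" if age is not None else ""
        return "served from edge cache" + suffix
    if status in {"MISS", "EXPIRED"}:
        return "fetched from origin, cacheable"
    if status == "BYPASS":
        return "cache bypassed, fetched from origin"
    if status == "DYNAMIC":
        return "not eligible for caching, fetched from origin"
    return "no recognized CF-Cache-Status header"
```

If a page that should bypass the cache reports anything other than a bypass or dynamic status, the Page Rule is probably not matching that URL.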


#3

Okay, I just turned on Cloudflare proxying again.

I had turned it off to debug. It seems that Cloudflare doesn’t refresh immediately.

Also, my page rule pattern is

*maskil.church/*

Either I accidentally left out the * in my original post, or the forum treated the trailing * as markdown for italics. Do I need the leading *?


#4

Additional details:

  • On my origin server I am running Apache, web application is Python/Django.
  • Server is a VPS hosted at Linode
  • Google Chrome Developer Tools > Network Tab > Disable Cache is open and checked.
  • The issue is observable from multiple devices and multiple locations

I am not running any caching plugins on Apache (I double-checked that mod_cache is disabled), and in Python/Django I verified that I am not caching HTTP responses.

I am tailing my origin server logs, and when I do a hard refresh on the page, I can confirm that only a fraction of the files on that page actually reach the origin server as requests, so the rest must be served from a cache somewhere.
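The log comparison described above could be sketched roughly like this, assuming Apache's common/combined access log format. The function names and the paths in the usage example are hypothetical.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it,
# e.g. "GET /songs HTTP/1.1" 200
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def origin_hits(log_lines):
    """Count how many times each request path appears in the access log."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            hits[match.group("path")] += 1
    return hits


def never_reached_origin(expected_paths, log_lines):
    """Return the expected paths that do not appear in the log at all."""
    hits = origin_hits(log_lines)
    return [path for path in expected_paths if hits[path] == 0]
```

For example, `never_reached_origin(["/songs", "/api/songs"], open("access.log"))` lists the paths that this particular log never saw. A path missing from the log only shows that the request was answered somewhere else: an edge cache, a browser cache, or a process that does not write to this log file.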

So I’m perplexed – is Cloudflare cache purge not working, or is there something else that is doing intermediate caching?

Some of the HTML being served today (very obviously stale) is over a week old. If you refresh enough times, sometimes the top nav strip on the site will display text navigation links in the top right, and sometimes it will display only a menu icon. The latter is the updated and correct behavior.


#5

@sdayman I updated the details to my original post and additional comments.

Would you be able to take another look please?

Thanks!


#6

Weird. It works in Chrome, but not Firefox. Chrome has the hamburger for your menu, but Firefox has the full text. Firefox also struggles with loading the song list. I got it to load once. Safari also struggles with the site.

Now it’s extra weird. Chrome bounces back and forth with the menu and songlist between reloads.

If someone else can reproduce this: Load the site in Chrome and see if the top menu fluctuates between hamburger and text, and Song List loading intermittently. Maybe @MarkMeyer is around.

If he can confirm the strange behavior, it would be time to open a Support ticket.


#7

I had to think about what you meant by hamburger for a few seconds…

Chrome, fresh install, no add-ons

I’d say that a script responsible for the responsive layout is not working correctly :thinking:

It’s different with Chrome Mobile. No text but the song list is not loading. Or very slow.

Same behavior with Safari mobile.


#8

Thanks for looking into it. Basically over a week ago we decided to just simplify the menu and not have text in the navigation bar, and always show the “hamburger” menu.

So, not sure where the cached content would be coming from.


#9

@sdayman @MarkMeyer

If you look at the menu on the left hand side, if you reload the page a lot, sometimes it will show icons on all the menu items, sometimes it won’t. The ones missing menu icons are due to a change in the code where we upgraded to FontAwesome 5.0 from v. 4.0.

So the HTML being served back on every request is changing, as that’s where CSS is being loaded.
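To put a number on "the HTML changes between reloads", one option is to fetch the page repeatedly and count distinct response bodies by hash. This is only a sketch with made-up function names. It also assumes the page body is otherwise identical between requests; a page that embeds a per-request value (for example a CSRF token) would look different every time, so in that case it is better to hash only a stable fragment.

```python
import hashlib
import urllib.request
from collections import Counter
from typing import Callable


def count_versions(fetch: Callable[[], bytes], attempts: int = 20) -> Counter:
    """Call fetch() repeatedly and count how often each distinct body appears."""
    digests = Counter()
    for _ in range(attempts):
        body = fetch()
        # A short hash prefix is enough to tell versions apart by eye.
        digests[hashlib.sha256(body).hexdigest()[:12]] += 1
    return digests


def fetch_songs_page() -> bytes:
    """Fetch the page from this thread, asking caches to revalidate."""
    req = urllib.request.Request(
        "https://maskil.church/songs",
        headers={"Cache-Control": "no-cache"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()
```

Running `count_versions(fetch_songs_page)` against the proxied hostname and again directly against the origin would show whether the variation already exists before Cloudflare is involved.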


#10

You wouldn’t happen to have duplicate DNS records here at Cloudflare, would you? Like an A record for maskil.church that points to an IP address, then another A record for maskil.church that points to some other IP address? Maybe a duplicate CNAME?

I know you said it’s a single origin server that’s not load balanced, but it’s sure behaving like it’s hitting two different versions of the website.

If you open a Support Ticket, maybe they can account for the differing responses.


#11

@sdayman no, I don’t have multiple A records.

Here is a screenshot

CloudApp

What details should I put in the Support Ticket?

Thanks!


#12

When you open a Support ticket, let them know the page source is inconsistent when browsing from Chrome. Multiple reloads yield different versions.

I think it’s a weird interaction between Cloudflare and how your site handles different visitors. I can see that the menu sometimes loads in “mobile-menu-button” mode, and sometimes not. It makes me think your site has some mobile-friendly toggle that switches between views, and Cloudflare is confusing it.


#13

@sdayman @MarkMeyer thanks for helping to investigate.

I was able to find the root cause. It turns out that multiple instances of Apache were running on the system, and somehow a stuck zombie process of Apache was intercepting and serving requests some of the time.
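For anyone hitting something similar, here is a rough sketch of how a second Apache instance might be spotted. It assumes a Linux host where `ps -eo pid,ppid,comm` is available and where Apache runs as `apache2` or `httpd`; the function names are hypothetical. The idea is that a healthy setup normally has one Apache parent process with workers under it, so more than one Apache process whose parent is not itself Apache is worth a closer look.

```python
import subprocess
from typing import List

APACHE_NAMES = {"apache2", "httpd"}  # assumed process names


def find_apache_roots(ps_output: str) -> List[int]:
    """Return PIDs of Apache processes whose parent is not an Apache process."""
    parents = {}
    for line in ps_output.splitlines():
        parts = line.split()
        # Skip the header row and anything that is not "pid ppid command".
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        if parts[2] in APACHE_NAMES:
            parents[int(parts[0])] = int(parts[1])
    return sorted(pid for pid, ppid in parents.items() if ppid not in parents)


def list_processes() -> str:
    """Capture the process table as text."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

If `find_apache_roots(list_processes())` returns more than one PID, there may be a leftover instance still bound to the listening port alongside the current one.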


#14

Yay! I’m glad you got that figured out. That was a weird one. Thanks for following up!