It looks like, right now, a Worker is executed on every request to a route, before the edge cache is checked. Is that correct?
Are there any plans to allow Workers to run between the edge cache and the origin, so that a Worker would only run if the route wasn’t already in the edge cache, and the Worker’s response would be stored in the edge cache?
That’s correct: Workers run “before” cache, so that subrequests can hit the cache. We’ve found that this is preferable for most use cases, because one of the main use cases for Workers is to transform and break up queries in order to improve the edge cache hit rate. Moreover, most Workers run very quickly (in under 1 ms), so there’s not much benefit to caching their output.
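For example, here’s a minimal sketch of that transform-to-improve-hit-rate pattern (the parameter names are invented, and it assumes simple GET requests):

addEventListener("fetch", event => {
  // Strip cache-busting query parameters so that more requests map
  // onto the same edge cache entry.
  const url = new URL(event.request.url);
  url.searchParams.delete("utm_source");
  url.searchParams.delete("utm_campaign");
  // The subrequest below passes through the edge cache, so the
  // normalized URL improves the hit rate there.
  event.respondWith(fetch(url.toString(), { headers: event.request.headers }));
});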
With that said, there are obviously some use cases where caching the output of a worker could be useful, e.g. if it performs a CPU-intensive operation like resizing an image. To that end, we plan to implement the standard Cache API. However, we don’t currently have any estimate when that might become available.
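For what it’s worth, here is a rough sketch of what that could look like with the standard Cache API (not available on Workers yet, as noted above; the cache name and expensiveTransform are placeholders):

addEventListener("fetch", event => {
  event.respondWith(handle(event));
});

async function handle(event) {
  const cache = await caches.open("worker-output");
  const cached = await cache.match(event.request);
  if (cached) return cached;

  // Placeholder for CPU-intensive work such as image resizing.
  const response = await expensiveTransform(event.request);

  // Store a copy, since a Response body can only be read once.
  event.waitUntil(cache.put(event.request, response.clone()));
  return response;
}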
Do you have a use case that would benefit from caching output?
I’m about to start playing heavily tomorrow, testing different scenarios, and was wondering a few things:
During the beta, are there any guidelines on public write-ups of performance benchmarking? I’m assuming something like “run it by us first to make sure you’re not doing something stupid that would falsely represent how kickass our product is.”
I see that Workers run before the cache is checked. Can you clarify where they run with respect to things like a) Page Rules, b) WAF rules, c) IP Firewall, and d) Rate Limiting?
@nick6 If you do performance benchmarks I’d love to see what you end up with. I’m pretty happy with what we have so far, but I also expect there’s lots of room for improvement. So, if you see a performance number you’re disappointed with, it’s likely we can improve it.
The rule of thumb for workers is that they run “after security but before everything else”. Note that whether they come before or after page rules depends on the rule – a page rule disabling WAF runs before workers, but most page rules run after.
Thanks. So would the following be a reasonable expectation:
CF DNS > WAF (IP or Rules) > Workers > Rate Limiting > Load Balancing > Argo
Assume we are doing filtering like https://developers.cloudflare.com/workers/recipes/return-403/ and also have Rate Limiting enabled. If we 403’d the customer (or similar), would it be correct to assume it would not impact Rate Limiting, Argo, or Load Balancing metrics/spend?
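For reference, the recipe linked above is roughly this shape (the blocked path is an invented example):

addEventListener("fetch", event => {
  const url = new URL(event.request.url);
  // Invented condition; the actual recipe's logic may differ.
  if (url.pathname.startsWith("/admin")) {
    event.respondWith(new Response("Forbidden", { status: 403 }));
  }
  // If respondWith() is never called, the request proceeds to the
  // origin as normal.
});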
Because if the request happens to be a POST, it’s gonna need its body!
This is something that tripped me up a little; while the bug was easy enough to identify, it happened in part because I based my worker code on one of the examples.
Looks like there can only be one script per zone ID. I feel that this is quite limiting. A given zone ID can contain many different sites with drastically different requirements. I know you can add a conditional at the beginning of the script to handle this case, but that introduces unnecessary complexity and increases the likelihood of bugs (via shared global scope, etc.). So what I would like to see is the ability to add multiple scripts per zone, with routes dictating when each script executes, and each script running in its own worker instance.
The trouble is, each script needs to be set up in its own sandbox, which has some fixed overhead. Although our implementation is very efficient, it nevertheless tends to be the case that a site with two scripts will take twice the resources compared to the same site with one script containing all the logic. We figure most customers would rather merge their code than pay twice the price. We do, however, allow enterprise customers to use multiple scripts – but this is somewhat of a temporary measure until we can come up with better tools.
Note that you can keep your code organized by splitting it into multiple event handlers. For example, you can write:
addEventListener("fetch", event => {
// This handler is for foo.example.com only.
if (new URL(event.request.url).hostname != "foo.example.com") return;
// ... handle /foo ...
});
addEventListener("fetch", event => {
// This handler is for bar.example.com only.
if (new URL(event.request.url).hostname != "bar.example.com") return;
// ... handle /bar ...
});
All event handlers will run on every request, but at most one can call event.respondWith(). If none call respondWith() then the request is sent to the origin as normal.
@nick6 I think Rate Limiting is considered a “security feature” and thus comes before Workers, though I could be wrong. We should have better documentation on this soon.
Regarding how Workers affects usage-based billing for other features, unfortunately we have not worked out the details yet. At present, please assume the other features will be charged regardless of what the Worker does. I agree, though, that that’s probably not the right answer in the long term; we just need some more time to figure it out.
@dan42 I agree, but unfortunately, due to a bug in our implementation, the code you wrote will currently throw an exception for GET requests (which have no body). Under the spec it’s supposed to work; we’ll fix this soon.
I’ve also proposed to the Service Worker spec authors that we really ought to have a nicer way to rewrite a request URL, one that doesn’t require listing all the members of Request. method, headers, and body are the most important ones, but technically, to make a perfect copy, you also want redirect and possibly other fields.
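In the meantime, the copy-the-fields approach looks something like this (the rewrite target is made up, and the body is copied conditionally to sidestep the GET bug mentioned above):

addEventListener("fetch", event => {
  const url = new URL(event.request.url);
  url.hostname = "other-origin.example.com";  // made-up rewrite target

  const init = {
    method: event.request.method,
    headers: event.request.headers,
    redirect: event.request.redirect
  };
  // Only copy the body for methods that can have one; copying it for a
  // GET currently throws, per the bug above.
  if (event.request.method != "GET" && event.request.method != "HEAD") {
    init.body = event.request.body;
  }

  event.respondWith(fetch(url.toString(), init));
});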
In the editor, the response is as expected: a bunch of XML is spat out. But when requesting the URL directly (for example, in my browser or using curl), an “Error 1101” (rendering error) is returned.
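If I instead construct the response myself and set the Content-Type header explicitly, roughly like this:

addEventListener("fetch", event => {
  event.respondWith(handle());
});

async function handle() {
  const upstream = await fetch("https://httpbin.org/xml");
  const body = await upstream.text();
  // Force the Content-Type; text/xml here is an assumption, based on
  // the discussion of MIME types further down.
  return new Response(body, {
    status: upstream.status,
    headers: { "Content-Type": "text/xml" }
  });
}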
And that works fine. I guess the assumption I was making was that the headers from the fetch('https://httpbin.org/xml') would be preserved when responding.
The actual URL I’m trying to retrieve is https://news.google.com/news/rss/?ned=en_za&gl=ZA&hl=en which works in the editor using the httpbin example code block but not with this Google News RSS…
I think what you are observing might be browser-side behavior. Using Google Chrome, I see something different from what you described: In the preview, I see just the unstyled text “Wake up to WonderWidgets! Overview Why WonderWidgets are great Who buys WonderWidgets” – this appears to be the textual content of the XML file with all the styling removed. But when deployed to prod, I see Chrome’s XML renderer, which gives me a tree view of the XML AST. It looks like the difference is because XML in an iframe is rendered differently from XML outside an iframe. The preview is shown inside an iframe, but if I extract the frame’s address and load that directly in a new tab, I see exactly what I see when the script is deployed.
What browser are you using? Could this “Error 1101” actually be your browser complaining that it doesn’t know how to render the XML? But in an iframe, it decides to dump it as text instead?
Note that https://httpbin.org/xml returns a Content-Type of application/xml, not text/xml. text/xml is not the correct MIME type for XML, so when you change it to that, your browser’s XML rendering behavior is probably disabled, hence “fixing” the problem.
“Using Google Chrome, I see something different from what you described”
I think the issue was actually the RSS feed I was trying to consume. I’ve now run through a whole bunch of other feeds that all work perfectly.
“But when deployed to prod, I see Chrome’s XML renderer, which gives me a tree view of the XML AST”
So I see the same behavior with most of the feeds with the exception of the Google News one. That one renders without any styling in the editor preview but results in an error when deployed to prod.
This code, essentially the httpbin example with the Google News feed URL swapped in, will cause my prod deployment to return that Error 1101:
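addEventListener("fetch", event => {
  // Presumably the same pattern as the httpbin example, with the feed
  // URL substituted in.
  event.respondWith(fetch("https://news.google.com/news/rss/?ned=en_za&gl=ZA&hl=en"));
});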
I suspect that the problem may actually be with some other Cloudflare feature you’ve turned on, but unfortunately I don’t know what it could be. Could you try going through the dashboard and turning off things like email obfuscation, Rocket Loader, minification, etc. and see if any of those fix the problem? If so, please let me know which feature was the offender so we can alert the appropriate team.
This is a brand new Cloudflare setup (I set it up specifically to try out Workers).
I haven’t really changed much from the default configuration, with the exception of some of the SSL/TLS settings (turning on TLS 1.3 support, redirecting HTTP to HTTPS) and redirecting www.* to the www-less domain. I haven’t touched anything else. I’ve tried toggling these settings and retrying, but I’m still getting the same error when deployed to prod.
Could it be that your site has something special turned on that’s making it work?
Would it help if I gave you the domain I’m working with? It is southafricanne.ws. I’m not sure if you have access to the logs?
Also, since the Error 1101 page is a 500, is this a Cloudflare error or something wrong with my worker?
@kieran.hunt92 We’ve tracked this down to an internal bug which we hope to have fixed within a few days. FWIW, error 1101 means that the JavaScript threw an exception (I should have known that, but forgot). In this case it’s complaining about a redirect loop, because it thinks the remote site is returning a 301 redirect to itself. This is due to a bug in the way we handle subrequests.