It doesn’t look like I can fetch the HTML from a URL, parse it / turn it into a DOM object that I could operate over, and then output the HTML from my modified DOM.
Do I have that correct?
Do I need to do simple searches and replaces against the text / HTML to modify the contents of a webpage? Is there no CSS or tag targeting possible?
If I can turn it into a DOM to operate on, do you have an example?
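For reference, the plain-text approach asked about above might look roughly like the sketch below. `replaceTitle` is a hypothetical helper, not something from the gist, and it assumes the page has a simple `<title>` element; a regex like this is fragile compared to real DOM or selector targeting.

```javascript
// Hypothetical helper: swap the contents of the first <title> element
// by treating the HTML as plain text.
function replaceTitle(html, newTitle) {
  // A replacer function is used so that "$&" and similar sequences in
  // newTitle are inserted literally instead of being expanded.
  return html.replace(
    /<title[^>]*>[\s\S]*?<\/title>/i,
    () => `<title>${newTitle}</title>`
  )
}

// Rough usage inside a Worker's fetch handler (sketch only):
//   const response = await fetch(request)
//   const html = await response.text()
//   return new Response(replaceTitle(html, "New title"), {
//     status: response.status,
//     headers: response.headers,
//   })
```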
Hey @mattp - glad to see you’re finding this useful, and thanks @zack for sharing the post!
I’ve just tested this against an actual site (as opposed to the Playground) and oddly I’m not seeing the same error - could I check if you copy/pasted the source from the raw gist?
If you’re still having trouble, let me know and I’ll try a few other things this evening
Awesome, happy to hear it worked! I’ve updated the link in my blog post to go directly to the raw gist, so hopefully that helps anybody stumbling across this in the future.
Also, you are probably already doing this but to make sure you don’t mess with images, css, etc, I am doing this hack right now:
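// Make sure we only modify text, not images.
const type = response.headers.get("Content-Type") || ""
if (!type.startsWith("text/")) {
  // Not text. Don't modify.
  return response
}
// The header can carry parameters (e.g. "text/css; charset=utf-8"),
// so match on the prefix rather than the exact string.
if (type.startsWith("text/css")) {
  // Leave stylesheets untouched.
  return response
}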
Images were getting broken and it took me a bit to figure it out…
Thanks for sharing that, that’s a really good shout. I’ll look into modifying the worker to incorporate that too and let you know! In this case I think we can just bypass unless the Content-Type is exactly text/html?
Yeah, that was my thought, too! I can probably narrow my check down to just text/html. I was worried that if the headers weren’t set right, I wouldn’t catch the right content, but that’s probably a silly concern.
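A minimal sketch of that "only touch text/html" check, in case it helps anyone else. `isHtml` is a made-up helper name, and treating a missing header as "not HTML" is an assumption on my part; it strips parameters like the charset before comparing, since the header is rarely the bare media type.

```javascript
// Hypothetical helper: decide whether a response body is safe to rewrite.
function isHtml(contentType) {
  // A missing header is treated as "not HTML", so the response passes through.
  if (!contentType) return false
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mediaType = contentType.split(";")[0].trim().toLowerCase()
  return mediaType === "text/html"
}
```

In the worker this would replace both checks above with something like `if (!isHtml(response.headers.get("Content-Type"))) return response`.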
Native DOM parsing is an absolute MUST HAVE feature for me.
For the past 5+ years, I have written all my websites in raw HTML5/CSS3/JS, using only the standards and consistently refusing to use any of the plethora of frameworks like Node.js or jQuery or anything like that. I really like to write all my stuff by hand and have full control.
Now I want to port my global status page from client-side JavaScript to server-side JavaScript using Cloudflare Workers. This would be very easy to do if Cloudflare Workers natively supported DOM manipulation.
Without native DOM manipulation, the simplest approach I can think of that would technically work would be to use a huge template string and generate the HTML as text, but this would be a very painful approach compared to simple DOM manipulation. I simply DO NOT want to bring in random dependencies for this.
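To illustrate the template-string approach, here is a rough dependency-free sketch. The `services` shape (`name`, `up`) and both function names are invented for the example; the real status page will differ. Note that every dynamic value has to be escaped by hand, which is part of what makes this more painful than DOM manipulation.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
}

// Hypothetical renderer: build the whole page as one template string.
function renderStatusPage(services) {
  const rows = services
    .map(s => {
      const state = s.up ? "up" : "down"
      const label = s.up ? "OK" : "DOWN"
      return `<li class="${state}">${escapeHtml(s.name)}: ${label}</li>`
    })
    .join("\n")
  return `<!DOCTYPE html>
<html>
<head><title>Status</title></head>
<body>
<ul>
${rows}
</ul>
</body>
</html>`
}
```

A Worker could then return it with `new Response(renderStatusPage(services), { headers: { "Content-Type": "text/html; charset=utf-8" } })`.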
Please implement native DOM manipulation soon! This would open a world of possibilities!