Replace or insert HTML tag

I’m using Workers to update some content I need in webpages, and generally it’s working great. I want to ensure `<title>` tags are set from some KV data I have. On some pages the tag is missing. If it’s missing I would like to add the element, and if it’s there I just want to replace its content.

I can replace the contents of an element fine, and if I assume it’s not on the page I can insert a new element too, but then I get duplicates on pages where the title is already there. What I can’t work out is the logic to say “add it only if it’s not there”. I’ve tried creating a closure to record anything I have replaced and then insert whatever is still left, but because .on() is async, what happens first seems to depend on how quickly each ElementHandler runs. I am loading a bunch of ElementHandlers depending on what I get back from the KV data, which could be partly causing my issue. I don’t see much documentation on the ordering, so I’m assuming I’m hitting a timing issue.

I’ve also tried grabbing the HTML of the page and running some regex checks on it, but I seem to lock the ReadableStream when I do that.

Is it possible to do what I want: update an element if it exists, and insert it into the page if it doesn’t? As mentioned, both things work on their own, but the replace and the insert both run, so for example I get two `<title>` tags on my page. They have the correct content, but I only want one.


I got some help in the Cloudflare Discord. The underlying problem is that when you create an HTMLRewriter() you can add handlers using .on(), but the order they run in seems to be determined by the layout of the HTML rather than the order in which you add them. I was adding one handler to replace the `<title>` and then flag that it had been done, and another to insert a `<title>` with the content into the HEAD only if that flag was not set. Because the HEAD tag comes first in the document, its handler ran first, so it always inserted a new element, and then the existing one had its content replaced as well, which was not what I wanted.

The solution was to create two HTMLRewriters: one chain of handlers that only checks for existence, and a second chain that does the replaces/inserts as required. One catch is that the handlers only run when the HTMLRewriter’s output is actually consumed, so I had to call .text() on the result of the first rewriter and ignore what it returned. That forced the existence checks to run and fill in the closure I’d created, which the insert/replace handlers in the second rewriter could then rely on.
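For anyone landing here later, this is a minimal sketch of the two-rewriter approach described above, assuming the tag in question is `<title>`. The names `upsertTitle` and `escapeHtml`, the `head > title` selector, and the use of `response.clone()` so the body can be read twice are illustrative assumptions, not necessarily what the original code did. The `Rewriter` parameter is only there so the function can be exercised outside the Workers runtime.

```javascript
// Minimal HTML escaping for text placed inside the inserted <title>.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: make sure the page ends up with one <title> set to `title`.
// `Rewriter` defaults to the Workers runtime's global HTMLRewriter.
async function upsertTitle(response, title, Rewriter = HTMLRewriter) {
  // Pass 1: only record whether a <title> exists. Handlers fire while the
  // body is being consumed, so read the transformed clone and discard it.
  let hasTitle = false;
  await new Rewriter()
    .on("head > title", {
      element() {
        hasTitle = true;
      },
    })
    .transform(response.clone())
    .text();

  // Pass 2: the flag is final now, so register only the handler that is needed.
  const rewriter = new Rewriter();
  if (hasTitle) {
    rewriter.on("head > title", {
      element(el) {
        // Content is treated as text (escaped) by default.
        el.setInnerContent(title);
      },
    });
  } else {
    rewriter.on("head", {
      element(el) {
        // Inserted just before </head>.
        el.append(`<title>${escapeHtml(title)}</title>`, { html: true });
      },
    });
  }
  return rewriter.transform(response);
}
```

In a Worker this would be called with the origin response and the value read from KV, for example `return upsertTitle(await fetch(request), titleFromKV)`, where `titleFromKV` is a placeholder for however you load the value. Note that cloning the response means the body is buffered for the second pass, so this costs some memory on large pages.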
