If you click on articles in non-Latin languages (e.g. Arabic, Russian and Chinese), you get a 404 error, even though those articles are available in the KV store. The other articles work fine.
The same site works without issue on Netlify, including the non-Latin languages.
But with debugging on, I get the error: could not find 10-%D0%B6%D0%B8%D0%B7%D0%BD%D0%B5%D0%BD%D0%BD%D1%8B%D1%85-%D1%83%D1%80%D0%BE%D0%BA%D0%BE%D0%B2-%D0%BA%D0%BE%D1%82%D0%BE%D1%80%D1%8B%D0%B5-%D1%8F-%D0%BF%D0%BE%D0%BB%D1%83%D1%87%D0%B8%D0%BB-%D0%B2/index.html in your content namespace
So perhaps the error happens because it looks for the encoded URL above, and I store the decoded URL.
Precisely, that is the encoding mismatch I mentioned. You’ll need to decode the string before looking it up in KV.
I am not quite sure what mapRequestToAsset does, but you seem to decode the string only to pass it to a Request object, where it will be re-encoded. Where do you actually access KV? That’s where you need to make sure you have the right string. If mapRequestToAsset is some internal function, it might always use the encoded string, in which case you’d need to save the string in its encoded form in KV too.
What it comes down to is that you need to debug your code and make sure the strings match. If you do not actively fetch it from the database yourself, you will most likely have to store it in its encoded form. I already addressed the potential issues you might have with that in posting #4.
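As a minimal sketch of that decode-before-lookup step, assuming the KV keys are stored in decoded form and without a leading slash. The helper name `toKvKey` is hypothetical and is not part of kv-asset-handler:

```typescript
// Hypothetical helper: turn a request pathname into the decoded key
// form assumed to be stored in KV (no leading slash, no percent-escapes).
function toKvKey(pathname: string): string {
  const trimmed = pathname.replace(/^\/+/, "");
  try {
    return decodeURIComponent(trimmed);
  } catch {
    // Malformed escape sequence (URIError): fall back to the raw path.
    return trimmed;
  }
}

// A shortened version of the encoded path from the error message.
const encodedPath =
  "/10-%D0%B6%D0%B8%D0%B7%D0%BD%D0%B5%D0%BD%D0%BD%D1%8B%D1%85/index.html";
const kvKey = toKvKey(encodedPath); // "10-жизненных/index.html"
```

Note that `decodeURIComponent` also decodes `%2F` into a slash, so this only fits if no key relies on an escaped slash.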
Sorry about my slowness here. I did not think of editing the imported files before.
Some things I have found:
In the documentation, mapRequestToAsset is recommended for altering the URL, after which you make a new Request with the altered URL. But Request() encodes its URL, so decoding in mapRequestToAsset is futile.
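The re-encoding is easy to reproduce in isolation. The sketch below uses the standard URL class rather than Request (an assumption made for brevity, since a Request parses its URL by the same rules), and example.com is a placeholder host:

```typescript
// A decoded (human-readable) path, as a custom mapRequestToAsset might build it.
const decodedPath = "/10-жизненных/index.html";

// Parsing it as a URL percent-encodes the non-ASCII characters again.
const reparsed = new URL("https://example.com" + decodedPath);
const reEncodedPath = reparsed.pathname;

// Decoding only helps if it happens right before the key lookup.
const lookupPath = decodeURIComponent(reEncodedPath);
```

Here `reEncodedPath` no longer equals `decodedPath`, while `lookupPath` does.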
I then decoded the URL in kv-asset-handler itself. Now, in the previewer and playground, things work perfectly. However, on the live worker the non-latin language pages load their content for a split second, after which they switch to a botched up mixture of the post and a 404 page. (This may be due to caching on my end. I’ll test some more. EDIT: YES, see EDIT2 below.)
When I used encoded file names instead of decoded file names, I still got the error because the script searches for the encoded file names in upper case (all caps), while they are stored in lower case. I do not know why it would search for the upper-case path, since the link goes to the lower-case version. Maybe this is even the issue behind the overall problem I’m having. Again, I edited kv-asset-handler, this time to make the pathKey lower case.
However, this now results in the page loading for a split second, and then going white.
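One possible explanation for the case mismatch, offered as an assumption rather than a confirmed diagnosis: the standard URL parser and encodeURIComponent emit percent-escapes with upper-case hex digits, so a key stored with lower-case escapes will not match character for character. A sketch of normalizing only the escapes (the helper name and the sample keys are hypothetical):

```typescript
// Hypothetical helper: upper-case the hex digits of every %xx escape,
// leaving the rest of the key untouched.
function normalizeEscapes(key: string): string {
  return key.replace(/%[0-9a-fA-F]{2}/g, (match) => match.toUpperCase());
}

const storedKey = "my-post-%d0%b6/index.html"; // lower-case escapes, as stored
const requestedKey = "my-post-%D0%B6/index.html"; // upper-case escapes, as requested
const sameAfterNormalizing = normalizeEscapes(storedKey) === requestedKey;
```

Changing the case of the whole pathKey would also alter any letters outside the escapes, so touching only the escapes, or decoding both sides before comparing, is likely the safer route.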
EDIT2: The weird results after decoding the URL before were due to browser caching on my end. So things now work by using decoded keys and decoding the pathKey in kv-asset-handler.
@sklabnik, do any of these things sound like bugs? On one hand there is the fact that things only work if I decode the pathKey (if I use decoded filenames/keys), and on the other hand, if I use encoded filenames/keys, kv-asset-handler for some reason looks for the key in all caps, while it is stored in lowercase.
One thing I noticed is that you sent 10-жизненных-уроков-которые-я-получил-в/index.html, but the key in your screenshot shows a completely different value.
As I said earlier, if you decode only to have it encoded again, there is little point in decoding it in the first place. I don’t think there is a bug involved anywhere here; it was simply that encoding issue. Both approaches (encoded and decoded) might be possible, but I’d avoid the encoded value wherever possible, use the proper string, and make sure all the strings match.
Takes any path that ends in / or evaluates to an html file and appends index.html or /index.html for lookup in your Workers KV namespace.
That would essentially mean a request for 10-жизненных-уроков-которые-я-получил-в/ should be a lookup for 10-жизненных-уроков-которые-я-получил-в/index.html, and that wouldn’t match the KV key.
The point is you were trying to look up 10-жизненных-уроков-которые-я-получил-в/index.html, which didn’t exist, because that entry had that hash placed in the middle. I can’t say where that hash came from, but it would explain why that particular lookup failed even when you had the encoding right.
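The quoted rule can be approximated as follows. This is a rough sketch of the behaviour described in the documentation, not the actual kv-asset-handler source; the function name `withIndexHtml` and the extension check are assumptions:

```typescript
// Hypothetical approximation of the documented rule.
function withIndexHtml(pathname: string): string {
  // Paths ending in "/" get "index.html" appended.
  if (pathname.endsWith("/")) {
    return pathname + "index.html";
  }
  const lastSegment = pathname.slice(pathname.lastIndexOf("/") + 1);
  // Assumption: a last segment without a dot is treated as a page directory.
  if (!lastSegment.includes(".")) {
    return pathname + "/index.html";
  }
  // Anything with a file extension is left as it is.
  return pathname;
}

const mappedPath = withIndexHtml("/10-жизненных/");
```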
… To fix this, on publish or preview, Wrangler walks the entry-point directory you’ve declared in your wrangler.toml and creates an asset manifest: a map of your filenames to a hash of their content. We use this asset manifest to map requests for a particular filename, say index.html, to the content hash of the most recently uploaded static asset.
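To illustrate the manifest lookup described in that quote, here is a small sketch. The manifest entries and hash values are made up, and `resolveKvKey` is a hypothetical name; the real manifest is generated by Wrangler:

```typescript
// Made-up manifest: requested filename -> KV key that includes a content hash.
const assetManifest: Record<string, string> = {
  "index.html": "index.0a1b2c3d4e.html",
  "10-жизненных/index.html": "10-жизненных/index.5f6a7b8c9d.html",
};

// Hypothetical lookup: the filename must match a manifest entry exactly,
// so an encoded filename will not find a decoded entry (and vice versa).
function resolveKvKey(filename: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(assetManifest, filename)
    ? assetManifest[filename]
    : undefined;
}

const hit = resolveKvKey("10-жизненных/index.html");
const miss = resolveKvKey(
  "10-%D0%B6%D0%B8%D0%B7%D0%BD%D0%B5%D0%BD%D0%BD%D1%8B%D1%85/index.html"
);
```

In this sketch `hit` is the hashed key and `miss` is `undefined`, which would mirror a lookup that fails for the unhashed or differently encoded name.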