Workers + AI (slow)

I’m calling Workers + AI (Llama), but it takes around ~20 seconds to perform the task. I’ve seen other realtime chats using the same Llama model. I’m on the free tier; is that related, or should it really take that long to answer a simple question?

Hi @fausto.alemao - Are you streaming the response (by passing stream: true)? Without streaming, the Worker waits for the full generation to finish before returning anything, so a delay of this length is possible.
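For reference, here’s a minimal sketch of a Worker that streams the model output back as server-sent events instead of waiting for the whole response. This assumes a Workers AI binding named `AI` in your wrangler config and uses a Llama model ID as an example; swap in whichever model you’re actually calling. This isn’t runnable outside the Workers runtime.

```javascript
export default {
  async fetch(request, env) {
    // stream: true makes env.AI.run() return a ReadableStream of
    // server-sent events, so tokens arrive as they are generated
    // instead of after the full ~20s completion.
    const stream = await env.AI.run(
      "@cf/meta/llama-3.1-8b-instruct", // example model ID; use your own
      {
        messages: [{ role: "user", content: "What is a Worker?" }],
        stream: true,
      }
    );

    // Pass the event stream straight through to the client.
    return new Response(stream, {
      headers: { "content-type": "text/event-stream" },
    });
  },
};
```

On the client side you’d read the `text/event-stream` body chunk by chunk (e.g. with `EventSource` or a streamed `fetch`), which is how the realtime chats you’ve seen get their fast first-token latency even on the same model.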