Streaming is buffered even when Cloudflare is paused

Hello,

I have a web application. Users send requests to the backend at www.backend.io, which is managed by Cloudflare (all the A and CNAME DNS records for backend.io are set to DNS Only). The backend calls the OpenAI API's chat.completions.create with stream: true.

Tests on localhost during development show that streaming works well; the response appears on the screen gradually. In the production environment, however, the entire response arrives all at once at the end.

I tried pausing Cloudflare on www.backend.io, but streaming still did not work. I also set up another server hosting a smaller temporary backend (back.temp.tech). When it was not behind Cloudflare, streaming worked; when I paused Cloudflare on it, streaming worked as well.

So I don't know which Cloudflare setting might be preventing streaming from working with backend.io, even when Cloudflare is paused. Does anyone have any insights?
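
In case it helps to reproduce: below is a minimal probe script (a sketch assuming Node 18+ with a global fetch; the URL and request body match the frontend call further down) that logs when each response chunk arrives, so the buffering can be observed independently of the browser.

// probe.js - logs the arrival time and size of each response chunk.
// A sketch assuming Node 18+ (global fetch); run with: node probe.js
const { randomUUID } = require("crypto");

async function probe(url) {
    const started = Date.now();
    const res = await fetch(url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ requestId: randomUUID(), userMessage: "Hello" }),
    });

    const reader = res.body.getReader();
    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        console.log(`+${Date.now() - started} ms: ${value.byteLength} bytes`);
    }
}

probe("https://www.backend.io/httpOnly/complete");

If the timestamps are spread out, the origin is streaming and something closer to the browser is buffering; if everything shows up in a single burst at the end, the buffering happens before the response reaches the client.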

Here is the frontend code in React:

import React, { useState } from "react";
import { v4 as uuidv4 } from "uuid"; // import the uuid function

export default function Page() {
    const [userMessage, setUserMessage] = useState("");
    const [response, setResponse] = useState("");
    const [requestId, setRequestId] = useState(""); // State to keep track of the current requestId

    const handleMessageChange = (e) => {
        setUserMessage(e.target.value);
    };

    const handleButtonClick = async () => {
        const newRequestId = uuidv4(); // Generate a new unique requestId
        setRequestId(newRequestId); // Set the requestId in the state

        const res = await fetch("https://www.backend.io/httpOnly/complete", {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({ requestId: newRequestId, userMessage }), // Send the new unique requestId with the request
        });
    
        if (res.body) {
            const reader = res.body.getReader();
            const decoder = new TextDecoder("utf-8"); // one decoder for the whole stream so multi-byte characters split across chunks decode correctly
            let text = "";

            return reader.read().then(function processText({ done, value }) {
                if (done) {
                    setResponse(text);
                    setRequestId(""); // Clear the requestId after completing the request
                    return;
                }

                const v = decoder.decode(value, { stream: true });
                console.log(v);
                text = text + v;
                setResponse(text);
                return reader.read().then(processText);
            });
        }
    };

    return (
        <div className="App">
            <input type="text" value={userMessage} onChange={handleMessageChange} />
            <button onClick={handleButtonClick}>Send</button>
            <div style={{ whiteSpace: "pre-wrap", textAlign: "left" }}>{response}</div>
        </div>
    );
}

Here is the backend code:

const express = require('express');
const OpenAI = require('openai');

const app = express();
app.use(express.json()); // needed so req.body.requestId and req.body.userMessage are populated

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const controllers = {};

app.post('/complete', async (req, res) => {
    const requestId = req.body.requestId; // You need to send a unique identifier with each request
    const controller = new AbortController();
    controllers[requestId] = controller;

    const userMessage = req.body.userMessage;
    const stream = await openai.chat.completions.create({
        model: 'gpt-4-1106-preview',
        messages: [{ role: 'user', content: userMessage }],
        stream: true,
    }, { signal: controllers[requestId].signal });

    for await (const part of stream) {
        console.log("part", part);
        console.log("part.choices[0]?.delta?.content", part.choices[0]?.delta?.content);
        res.write(part.choices[0]?.delta?.content || ''); // write each delta as soon as it arrives
    }
    console.log("stream", stream);
    delete controllers[requestId]; // drop the finished controller so the map does not grow forever
    res.end();
});
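
To help isolate whether the problem is specific to the OpenAI call, here is a small test route that could sit next to /complete (a sketch; the /stream-test path and the header choices are my own assumptions, not a documented Cloudflare fix). The headers are ones commonly used to discourage intermediaries from buffering; X-Accel-Buffering is an nginx convention and may be irrelevant here.

// Sketch: a trivial streaming endpoint on the same Express app, to test
// whether chunked output makes it through the proxy without OpenAI involved.
app.get('/stream-test', (req, res) => {
    res.setHeader('Content-Type', 'text/plain; charset=utf-8');
    res.setHeader('Cache-Control', 'no-cache, no-transform');
    res.setHeader('X-Accel-Buffering', 'no'); // nginx hint; harmless elsewhere
    res.flushHeaders(); // push the headers out immediately

    let i = 0;
    const timer = setInterval(() => {
        res.write(`chunk ${i++}\n`); // one chunk every 500 ms
        if (i === 10) {
            clearInterval(timer);
            res.end();
        }
    }, 500);
});

If this route streams in production, the path through Cloudflare is fine and the problem is in the completion handler; if it also arrives all at once after ~5 seconds of silence, something between the server and the browser is buffering chunked responses.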
