Enable streaming by setting the stream: true parameter on chat completions. Tokens are delivered as Server-Sent Events (SSE) as soon as the model generates them, so your UI can start rendering immediately instead of waiting for the full response.
Streaming is available across chat models and is compatible with the OpenAI SDK’s built-in stream helpers, which handle the SSE parsing for you.
Code examples
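A minimal, self-contained sketch of consuming the stream at the wire level, using only the standard library. The parse_sse helper name and the payload contents are illustrative (made up for this example); in practice the SDK's stream helpers do this parsing for you.

```python
import json

# Illustrative SSE payload as it appears on the wire; the JSON fields
# mirror chat-completion chunks, but the values here are made up.
RAW_STREAM = (
    'data: {"choices": [{"delta": {"content": "Hel"}, "finish_reason": null}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo"}, "finish_reason": null}]}\n\n'
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}\n\n'
    "data: [DONE]\n\n"
)

def parse_sse(raw: str):
    """Yield the JSON object from each data: line, stopping at [DONE]."""
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # sentinel: stream is finished
        yield json.loads(payload)

text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in parse_sse(RAW_STREAM)
)
print(text)  # -> Hello
```

Accumulating the delta fragments as they arrive is what lets the UI render partial output token by token.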
SSE format
Each chunk arrives on a line beginning with the prefix data: , followed by a JSON object. The stream ends with the sentinel value data: [DONE].
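As a concrete illustration of the wire format (chunk bodies abridged and field values made up for the example):

```
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"Hi"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```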
Chunk structure
The delta.content field contains the new token(s) for that chunk. When finish_reason is set (e.g., "stop"), the model has finished generating and no more content chunks will follow.
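The consuming loop can be sketched as follows, using plain dicts shaped like streamed chunks (content values are illustrative): append each delta.content, and stop once finish_reason is set.

```python
# Plain dicts shaped like streamed chat-completion chunks;
# the content values are illustrative.
chunks = [
    {"choices": [{"delta": {"content": "Once"}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": " upon"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]

parts = []
for chunk in chunks:
    choice = chunk["choices"][0]
    content = choice["delta"].get("content")
    if content is not None:
        parts.append(content)  # new token(s) for this chunk
    if choice["finish_reason"] is not None:
        break  # model is done; no more content chunks will follow

print("".join(parts))  # -> Once upon
```

Note the checks: delta may omit content entirely on the final chunk, so guard with .get() rather than indexing directly.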