Streaming

All generative endpoints support streaming — set stream: true on the request and the response is delivered as Server-Sent Events (SSE).

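# Assumes `client` is an already-configured OpenAI-compatible SDK client.
stream = client.chat.completions.create(
    model="...",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    # With include_usage, the usage-only chunk may have an empty choices list.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")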
  • OpenAI shape: data: {...} events carrying choices[].delta chunks, terminated by a final data: [DONE] event.
  • Anthropic shape: typed events — message_start, content_block_delta, message_delta, message_stop.
  • ai& trailer (both shapes): after the model’s terminal event, ai& appends a metadata event with cost, request ID, and final usage. See Streaming Events.
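For clients that read the raw SSE stream instead of using an SDK, the OpenAI shape described above can be handled roughly as follows. This is a minimal sketch, not a reference implementation: iter_sse and collect_openai_text are hypothetical helper names, and the code assumes the ai& trailer arrives after data: [DONE]. Because the trailer's payload is not specified here, it is kept as an unparsed string.

```python
import json
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_name, data) pairs from raw SSE lines (hypothetical helper)."""
    event: Optional[str] = None
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, a single leading space is dropped.
            data.append(value[1:] if value.startswith(" ") else value)
    if data:
        yield event, "\n".join(data)


def collect_openai_text(lines: Iterable[str]) -> Tuple[str, List[str]]:
    """Join delta text from OpenAI-shape chunks; return events seen after [DONE]."""
    parts: List[str] = []
    after_done: List[str] = []
    saw_done = False
    for _event, data in iter_sse(lines):
        if data == "[DONE]":
            saw_done = True
            continue
        if saw_done:
            # Assumption: anything after [DONE] is the trailer; keep it raw.
            after_done.append(data)
            continue
        chunk = json.loads(data)
        # A usage-only chunk has no choices, so this loop simply skips it.
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts), after_done
```

Continuing to read after [DONE] rather than closing the connection is the design choice that matters here. A client that stops at [DONE] would never see a trailer sent afterwards.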

In the OpenAI shape, set stream_options.include_usage to true to receive a final usage block before the ai& trailer event. Anthropic-shape streams always carry usage in message_delta.
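For the Anthropic shape, text and usage can be accumulated from the typed events after each event's JSON has been decoded into a dict. The sketch below assumes the payloads follow the usual Anthropic Messages layout: text in delta.text on content_block_delta, and usage on message_start.message and message_delta. summarize_anthropic_events is a hypothetical helper, and unrecognized event types, including any ai& trailer, are ignored.

```python
from typing import Any, Dict, Iterable, Tuple


def summarize_anthropic_events(
    events: Iterable[Dict[str, Any]],
) -> Tuple[str, Dict[str, Any], bool]:
    """Return (text, merged usage, saw message_stop) for decoded events."""
    text_parts = []
    usage: Dict[str, Any] = {}
    stopped = False
    for ev in events:
        kind = ev.get("type")
        if kind == "message_start":
            # Initial usage (for example, input tokens) rides on the message.
            usage.update(ev.get("message", {}).get("usage", {}))
        elif kind == "content_block_delta":
            delta = ev.get("delta", {})
            if delta.get("type") == "text_delta":
                text_parts.append(delta.get("text", ""))
        elif kind == "message_delta":
            # Later usage values overwrite earlier ones with the same key.
            usage.update(ev.get("usage", {}))
        elif kind == "message_stop":
            stopped = True
        # Any other event type is skipped without error.
    return "".join(text_parts), usage, stopped
```

Merging usage with update means the values reported in message_delta replace those from message_start, so the result reflects the last figures the stream sent.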