Streaming

All generative endpoints support streaming — set stream: true on the request and the response is delivered as Server-Sent Events (SSE).

Enabling streaming

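# OpenAI SDK (Python); client is an openai.OpenAI instance pointed at the ai& API.
stream = client.chat.completions.create(
    model="...",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    # With include_usage, the final usage chunk can arrive with an empty
    # choices list, so guard before indexing.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")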

# Anthropic SDK (Python); client is an anthropic.Anthropic instance pointed at the ai& API.
with client.messages.stream(
    model="...",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")

curl -N https://api.aiand.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"...","stream":true,"messages":[{"role":"user","content":"Hello"}]}'

What you receive

OpenAI shape: data: {...} events with choices[].delta chunks; final data: [DONE].
Anthropic shape: typed events — message_start, content_block_delta, message_delta, message_stop.
ai& trailer (OpenAI shape, opt-in): with the X-Aiand-Metrics: true request header, ai& appends a metrics event with token counts, cost, and currency after the model’s terminal event. See Streaming Events.
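
If you read the OpenAI-shape stream without an SDK, the data: lines can be decoded by hand. The following is a minimal sketch, not a reference client. The helper names iter_openai_chunks and collect_text are hypothetical, and the sample lines are hand-written for illustration rather than captured from a real response.

```python
import json
from typing import Iterable, Iterator


def iter_openai_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from OpenAI-shape SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments, and any non-data SSE fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate delta.content across chunks, tolerating empty choices."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample lines (illustrative only).
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]
print(collect_text(iter_openai_chunks(sample)))
```

This sketch stops reading at [DONE], so it does not handle the opt-in metrics trailer. The trailer's exact position and format are described in Streaming Events and are not assumed here.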

Usage and cost in streams

For the OpenAI shape, set stream_options.include_usage to true to receive a final usage block, which arrives before the ai& trailer event. Anthropic streams always carry usage in message_delta.
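
As a rough sketch of how the OpenAI-shape usage block can be separated from the text, the function below walks already-decoded chunks (plain dicts) and keeps the last non-empty usage value it sees. It assumes ai& follows the OpenAI convention: usage is null on content chunks, and the final usage chunk has an empty choices list. The function name split_text_and_usage and the sample values are hypothetical, and the token field names are the OpenAI ones.

```python
from typing import Iterable, Optional, Tuple


def split_text_and_usage(chunks: Iterable[dict]) -> Tuple[str, Optional[dict]]:
    """Return the concatenated text and the last usage block seen, if any."""
    parts = []
    usage = None
    for chunk in chunks:
        # Assumption: content chunks carry "usage": null; only the final one is set.
        if chunk.get("usage"):
            usage = chunk["usage"]
        # The usage chunk is assumed to have an empty choices list.
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


# Illustrative chunks with made-up token counts.
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}], "usage": None},
    {"choices": [{"delta": {"content": "lo"}}], "usage": None},
    {
        "choices": [],
        "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11},
    },
]
text, usage = split_text_and_usage(chunks)
print(text, usage)
```

If include_usage is not set, usage stays None and only the text is returned.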