Streaming
All generative endpoints support streaming. Set `stream: true` on the request and the response is delivered as Server-Sent Events (SSE).
Enabling streaming
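OpenAI shape (Python SDK):

    # Assumes `client` is an already-configured OpenAI SDK client.
    stream = client.chat.completions.create(
        model="...",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
        stream_options={"include_usage": True},
    )
    for chunk in stream:
        # With include_usage set, the usage-only chunk can arrive with an
        # empty `choices` list, so guard before indexing into it.
        if chunk.choices:
            print(chunk.choices[0].delta.content or "", end="")

Anthropic shape (Python SDK):

    # Assumes `client` is an already-configured Anthropic SDK client.
    with client.messages.stream(
        model="...",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="")

curl:

    curl https://api.aiand.com/v1/chat/completions \
      -H "Authorization: Bearer sk-..." \
      -H "Content-Type: application/json" \
      -d '{"model":"...","stream":true,"messages":[{"role":"user","content":"Hello"}]}'

What you receive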
- OpenAI shape: `data: {...}` events with `choices[].delta` chunks; final `data: [DONE]`.
- Anthropic shape: typed events (`message_start`, `content_block_delta`, `message_delta`, `message_stop`).
- ai& trailer (both shapes): after the model’s terminal event, ai& appends a `metadata` event with cost, request ID, and final usage. See Streaming Events.
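As a rough illustration of how a client might tell these events apart, the sketch below parses raw SSE lines into (event, data) pairs using only the standard library. It assumes the trailer arrives as `event: metadata` with a JSON payload; the helper names (`iter_sse_events`, `classify`), the payload fields (`request_id`, `cost`), and the sample values are hypothetical, so check Streaming Events for the actual schema.

```python
import json
from typing import Any, Iterable, Iterator, List, Optional, Tuple

DONE = "[DONE]"  # OpenAI-shape end-of-stream sentinel


def _field_value(line: str, name: str) -> str:
    """Return the value of an SSE field, dropping one leading space if present."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_name, data) per SSE event; event_name is None if unnamed."""
    event: Optional[str] = None
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment line
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))
    if data:
        # Lenient: also emit a trailing event that lacks a final blank line.
        yield event, "\n".join(data)


def classify(event: Optional[str], data: str) -> Tuple[str, Any]:
    """Label an event as 'done', 'trailer', 'openai', or 'anthropic'."""
    if data == DONE:
        return "done", None
    payload = json.loads(data)
    if event == "metadata":  # assumed name of the ai& trailer event
        return "trailer", payload
    if event is None:
        return "openai", payload
    return "anthropic", payload


# Hypothetical OpenAI-shape stream followed by the trailer.
SAMPLE = (
    'data: {"choices":[{"delta":{"content":"Hi"}}]}\n'
    "\n"
    "data: [DONE]\n"
    "\n"
    "event: metadata\n"
    'data: {"request_id":"req_123","cost":0.0001}\n'
)

for name, body in iter_sse_events(SAMPLE.splitlines()):
    print(classify(name, body))
```

Because the trailer follows the model's terminal event, a client written along these lines should keep reading after `data: [DONE]` or `message_stop` rather than closing the connection immediately.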
Usage and cost in streams
Set `stream_options.include_usage = true` (OpenAI shape) to get the final usage block before the trailer event. Anthropic streams always carry usage in `message_delta`.
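A minimal sketch of collecting both the text and the usage block from OpenAI-shaped chunks that have already been decoded into dicts. It assumes the usage block arrives as a chunk with an empty `choices` list and a populated `usage` field, as in the OpenAI API; `collect_stream` and the sample chunks are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def collect_stream(
    chunks: Iterable[Dict[str, Any]],
) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Concatenate delta text from the first choice and keep the last usage block."""
    parts: List[str] = []
    usage: Optional[Dict[str, Any]] = None
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if choices:
            # Content deltas live on the first choice; content may be absent or None.
            content = (choices[0].get("delta") or {}).get("content")
            if content:
                parts.append(content)
        if chunk.get("usage"):
            # Assumed: the usage-only chunk has an empty `choices` list.
            usage = chunk["usage"]
    return "".join(parts), usage


# Hypothetical decoded chunks, in arrival order.
sample = [
    {"choices": [{"index": 0, "delta": {"content": "Hel"}}], "usage": None},
    {"choices": [{"index": 0, "delta": {"content": "lo"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3}},
]

text, usage = collect_stream(sample)
print(text, usage)
```

Guarding on `choices` before indexing keeps the loop from failing on the usage-only chunk.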