# Chat Completions

`POST /v1/chat/completions`

Creates a model response for the given conversation. This endpoint is fully compatible with the OpenAI Chat Completions API.
## Request body

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID to use (see Models) |
| messages | array | Yes | Conversation messages (see Message types) |
| stream | boolean | No | Stream partial deltas as SSE. Default: false |
| stream_options | object | No | { "include_usage": true } to include token usage in the stream |
| temperature | number | No | Sampling temperature, 0–2. Default: model-dependent |
| top_p | number | No | Nucleus sampling threshold, 0–1 |
| n | integer | No | Number of choices to generate, 1–128 |
| max_tokens | integer | No | Max tokens to generate (deprecated; use max_completion_tokens) |
| max_completion_tokens | integer | No | Upper bound for generated tokens, including reasoning tokens |
| stop | string \| string[] | No | Up to 4 stop sequences |
| frequency_penalty | number | No | Frequency penalty, -2 to 2 |
| presence_penalty | number | No | Presence penalty, -2 to 2 |
| logprobs | boolean | No | Return log probabilities of output tokens |
| top_logprobs | integer | No | Number of most likely tokens per position, 0–20 |
| logit_bias | object | No | Map of token IDs to bias values (-100 to 100) |
| response_format | object | No | { "type": "text" }, { "type": "json_object" }, or { "type": "json_schema", "json_schema": {...} } |
| seed | integer | No | Seed for deterministic sampling |
| tools | array | No | Function tools the model may call |
| tool_choice | string \| object | No | "none", "auto", "required", or a specific tool |
| parallel_tool_calls | boolean | No | Enable parallel function calling |
| reasoning_effort | string | No | "none", "minimal", "low", "medium", "high", "xhigh" |
| top_k | integer | No | Top-k sampling (provider-specific) |
| min_p | number | No | Min-p sampling threshold, 0–1 (provider-specific) |
| repetition_penalty | number | No | Repetition penalty (provider-specific) |
| user | string | No | End-user identifier for abuse tracking |
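Only model and messages are required; everything else is optional. As a minimal sketch, a request body can be assembled as a plain dict and serialized to JSON (the parameter values shown here are illustrative, not defaults):

```python
import json

# Minimal Chat Completions request body: "model" and "messages" are the
# only required fields; the rest are optional tuning parameters.
payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    "temperature": 0.7,            # optional, 0-2
    "max_completion_tokens": 256,  # optional upper bound, including reasoning tokens
}

body = json.dumps(payload)  # this JSON string is what gets POSTed
```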
## Message types

### System message

```json
{ "role": "system", "content": "You are a helpful assistant." }
```

### User message

```json
{ "role": "user", "content": "What is the capital of France?" }
```

User messages also support multimodal content arrays:

```json
{
  "role": "user",
  "content": [
    { "type": "text", "text": "What's in this image?" },
    { "type": "image_url", "image_url": { "url": "https://...", "detail": "auto" } }
  ]
}
```

Content part types: text, image_url, input_audio, file
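A multimodal message can be built from these content parts. The helper below is a sketch, not part of the API, and the image URL is a placeholder:

```python
# Sketch: build a user message with text + image_url content parts.
# image_question is an illustrative helper name, not an API function.
def image_question(text, url, detail="auto"):
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": url, "detail": detail}},
        ],
    }

msg = image_question("What's in this image?", "https://example.com/photo.png")
```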
### Assistant message

```json
{ "role": "assistant", "content": "The capital of France is Paris." }
```

### Tool message

```json
{ "role": "tool", "tool_call_id": "call_abc123", "content": "{\"result\": 42}" }
```

## Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help you today?" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19 }
}
```

## Usage object

| Field | Type | Description |
|---|---|---|
| prompt_tokens | integer | Input tokens consumed |
| completion_tokens | integer | Output tokens generated |
| total_tokens | integer | Total tokens (input + output) |
| prompt_tokens_details | object | Optional. { cached_tokens, audio_tokens } |
| completion_tokens_details | object | Optional. { reasoning_tokens, audio_tokens } |
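Since total_tokens is the sum of the prompt and completion counts, a usage object can be sanity-checked with one comparison. A minimal sketch, using the usage values from the example response above:

```python
# Sketch: verify the additive invariant of a usage object:
# total_tokens == prompt_tokens + completion_tokens.
def usage_is_consistent(usage):
    return usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

usage = {"prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19}
```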
## Streaming

Set stream: true to receive partial responses as server-sent events:

```bash
curl https://api.aiand.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5"}]
  }'
```

Each SSE event contains a data: line with a JSON chunk. The stream ends with data: [DONE].
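Consuming the stream means reading each data: line, stopping at [DONE], and concatenating the content deltas. A minimal sketch, assuming the standard Chat Completions chunk shape (choices[0].delta.content); the sample lines are hand-written for illustration:

```python
import json

def collect_stream(lines):
    """Accumulate assistant text from SSE 'data:' lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # e.g. a final usage-only chunk when include_usage is set
        delta = choices[0].get("delta", {})
        if delta.get("content"):
            text.append(delta["content"])
    return "".join(text)

# Hand-written sample chunks in the delta format:
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "1, "}}]}',
    'data: {"choices": [{"delta": {"content": "2"}}]}',
    "data: [DONE]",
]
```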
To include token usage in the final stream event:
```json
{ "stream": true, "stream_options": { "include_usage": true } }
```

## Tool calling
```json
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```

When the model decides to call a tool, the response includes tool_calls in the assistant message:
```json
{
  "choices": [{
    "message": {
      "role": "assistant",
      "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": { "name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}" }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}
```

Send the tool result back with a tool message to continue the conversation.
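The second half of the round trip can be sketched as follows: run each requested tool locally, then append the assistant turn plus one tool message per call (matched by tool_call_id) before the next request. get_weather here is a stub standing in for your own implementation:

```python
import json

def get_weather(location):
    """Stub tool implementation; replace with a real lookup."""
    return {"location": location, "temp_c": 21}

def answer_tool_calls(assistant_msg):
    """Return the messages to append: the assistant turn, then one
    'tool' message per tool call, each echoing its tool_call_id."""
    followups = [assistant_msg]
    for call in assistant_msg.get("tool_calls", []):
        args = json.loads(call["function"]["arguments"])
        result = get_weather(**args)  # real code would dispatch on call["function"]["name"]
        followups.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return followups

# The assistant message from the response above:
assistant_msg = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"},
    }],
}
msgs = answer_tool_calls(assistant_msg)
```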