
Chat Completions

POST /v1/chat/completions

Generate a model response for a given conversation. Fully compatible with the OpenAI Chat Completions API.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID (see Models) |
| messages | array | Yes | Conversation messages (see Message types) |
| stream | boolean | No | Stream partial deltas as SSE. Default: false |
| stream_options | object | No | { "include_usage": true } to include token counts in the final stream event |
| temperature | number | No | Sampling temperature, 0–2. Default: model-dependent |
| top_p | number | No | Nucleus sampling threshold, 0–1 |
| n | integer | No | Number of choices to generate, 1–128 |
| max_tokens | integer | No | Maximum tokens to generate (deprecated; use max_completion_tokens) |
| max_completion_tokens | integer | No | Upper bound on generated tokens, including reasoning tokens |
| stop | string or string[] | No | Up to 4 stop sequences |
| frequency_penalty | number | No | Frequency penalty, -2 to 2 |
| presence_penalty | number | No | Presence penalty, -2 to 2 |
| logprobs | boolean | No | Return log probabilities of output tokens |
| top_logprobs | integer | No | Most likely tokens per position, 0–20 |
| logit_bias | object | No | Map of token IDs to bias values (-100 to 100) |
| response_format | object | No | { "type": "text" }, { "type": "json_object" }, or { "type": "json_schema", "json_schema": {...} } |
| seed | integer | No | Seed for deterministic sampling |
| tools | array | No | Function tools the model may call |
| tool_choice | string or object | No | "none", "auto", "required", or a specific tool |
| parallel_tool_calls | boolean | No | Allow parallel function calling |
| reasoning_effort | string | No | "none", "minimal", "low", "medium", "high", "xhigh" |
| top_k | integer | No | Top-k sampling (provider-specific) |
| min_p | number | No | Min-p sampling threshold, 0–1 (provider-specific) |
| repetition_penalty | number | No | Repetition penalty (provider-specific) |
| user | string | No | End-user identifier for abuse tracking |
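
As a rough illustration of how these parameters combine into a request, the sketch below assembles a body and defines a sender using only the Python standard library. The endpoint URL and model ID are the ones used in the curl example on this page; the helper names `build_request` and `send_request` are hypothetical and not part of any SDK.

```python
import json
import urllib.request

# Endpoint as it appears in the curl example on this page.
API_URL = "https://api.aiand.com/v1/chat/completions"


def build_request(model, messages, **params):
    """Assemble a request body, dropping optional parameters left as None."""
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in params.items() if v is not None})
    return body


def send_request(body, api_key):
    """POST the body and return the decoded JSON response (network call)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


body = build_request(
    "openai/gpt-oss-120b",
    [{"role": "user", "content": "What is the capital of France?"}],
    temperature=0.2,
    max_completion_tokens=256,
    seed=None,  # left out of the body because it is None
)
print(json.dumps(body, indent=2))
```

Calling `send_request(body, "sk-your-api-key")` with a real key would submit the request; the reply text would then be found under `choices[0].message.content`.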
Message types

{ "role": "system", "content": "You are a helpful assistant." }
{ "role": "user", "content": "What is the capital of France?" }

User messages also accept multimodal content arrays:

{
  "role": "user",
  "content": [
    { "type": "text", "text": "What's in this image?" },
    { "type": "image_url", "image_url": { "url": "https://...", "detail": "auto" } }
  ]
}

Supported content types: text, image_url, video_url, audio_url, input_audio, file.

For images, prefer uploading via the Files API and referencing the upload by file_id rather than sending inline base64, so that multi-turn conversations and retries do not re-send the image bytes from your client:

{
  "role": "user",
  "content": [
    { "type": "text", "text": "What's in this image?" },
    { "type": "file", "file": { "file_id": "file-abc123" } }
  ]
}

The model must have the matching capability (e.g. vision); otherwise the request is rejected with 400 model_capability_mismatch.
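
The two multimodal shapes above can be produced by small helper functions, which keeps the nesting consistent across a codebase. This is a minimal sketch; the function names `image_url_part`, `file_part`, and `user_message` are hypothetical, and only the `image_url` and `file` content types shown above are covered.

```python
def image_url_part(url, detail="auto"):
    """Content part that references an image by URL."""
    return {"type": "image_url", "image_url": {"url": url, "detail": detail}}


def file_part(file_id):
    """Content part that references a previously uploaded file by its ID."""
    return {"type": "file", "file": {"file_id": file_id}}


def user_message(text, *parts):
    """User message whose content array starts with text, then extra parts."""
    return {"role": "user", "content": [{"type": "text", "text": text}, *parts]}


# Mirrors the file_id example above.
message = user_message("What's in this image?", file_part("file-abc123"))
```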

{ "role": "assistant", "content": "The capital of France is Paris." }
{ "role": "tool", "tool_call_id": "call_abc123", "content": "{\"result\": 42}" }
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "openai/gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 9,
    "total_tokens": 19
  }
}
| Field | Type | Description |
| --- | --- | --- |
| prompt_tokens | integer | Input tokens consumed |
| completion_tokens | integer | Output tokens generated |
| total_tokens | integer | Sum of input and output tokens |
| prompt_tokens_details | object | Optional. { cached_tokens, audio_tokens } |
| completion_tokens_details | object | Optional. { reasoning_tokens, audio_tokens } |
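
Because the two details objects are optional, client code that reads them should tolerate their absence. The sketch below shows one way to do that; `summarize_usage` is a hypothetical helper, and reporting a missing count as 0 is a choice made for this example rather than documented API behavior.

```python
def summarize_usage(usage):
    """Flatten a usage object, tolerating absent optional details."""
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}
    return {
        "input": usage["prompt_tokens"],
        "output": usage["completion_tokens"],
        "total": usage["total_tokens"],
        # Assumption of this sketch: a missing detail is reported as 0.
        "cached_input": prompt_details.get("cached_tokens", 0),
        "reasoning_output": completion_details.get("reasoning_tokens", 0),
    }


summary = summarize_usage(
    {"prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19}
)
```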

Set stream: true to receive partial responses as server-sent events.

curl https://api.aiand.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5"}]
  }'

Each event contains a data: line with a JSON chunk. The stream ends with data: [DONE].
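
A client can consume this stream by reading it line by line, decoding each data: payload, and stopping at [DONE]. The sketch below is one possible way to do so. The chunk layout is not shown on this page, so the code assumes the OpenAI-style shape in which text arrives under `choices[].delta.content`; the function names and the sample lines are made up for illustration.

```python
import json


def iter_chunks(lines):
    """Yield decoded JSON chunks from SSE lines, stopping at data: [DONE]."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)


def collect_text(chunks):
    """Concatenate the streamed text of the first choice."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # ignore extra choices when n > 1
            # Assumption: OpenAI-style deltas with an optional "content" key.
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts)


# Hand-written sample lines, not captured from a real response.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "1, 2"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": ", 3"}}]}',
    "",
    "data: [DONE]",
]
text = collect_text(iter_chunks(sample))
```

The response object returned by `urllib.request.urlopen` can be iterated line by line as bytes, so it could be passed to `iter_chunks` directly.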

To include token usage in the final event:

{
  "stream": true,
  "stream_options": { "include_usage": true }
}
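
On the client side, the usage object can be picked out while iterating over the decoded chunks. This is a hedged sketch: it assumes, as in the OpenAI API, that chunks without usage either omit the `usage` key or set it to null. Both helper names are hypothetical.

```python
def with_usage(body):
    """Return a copy of a request body with streaming and usage enabled."""
    return {**body, "stream": True, "stream_options": {"include_usage": True}}


def final_usage(chunks):
    """Return the last non-empty usage object seen in decoded chunks, or None."""
    usage = None
    for chunk in chunks:
        # Assumption: chunks without usage omit the key or set it to null.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Hand-written chunks for illustration only.
chunks = [
    {"choices": [{"index": 0, "delta": {"content": "hi"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19}},
]
usage = final_usage(chunks)
```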

Define tools in the request and the model can choose to call them:

{
  "model": "openai/gpt-oss-120b",
  "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}

When the model calls a tool, the response includes tool_calls:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"Tokyo\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}

Send the result back as a tool message to continue the conversation.
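
The round trip can be sketched as follows: decode each call's arguments, run the matching local function, and wrap the result in a tool message that carries the same tool_call_id. The `get_weather` implementation, the `TOOLS` registry, and `run_tool_calls` are hypothetical stand-ins written for this example.

```python
import json


def get_weather(location):
    """Hypothetical local implementation of the get_weather tool."""
    return {"location": location, "forecast": "sunny"}


# Maps tool names declared in the request to local Python callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message, registry):
    """Execute each tool call and return the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = registry[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not as an object.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results


# Assistant message copied from the tool_calls response above.
assistant_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"},
    }],
}
history = [{"role": "user", "content": "What's the weather in Tokyo?"}]
# Next request: prior messages, the assistant's tool call, then the results.
next_messages = history + [assistant_message] + run_tool_calls(assistant_message, TOOLS)
```

Sending `next_messages` in a follow-up request, with the same tools list, lets the model turn the tool output into its final answer.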