Chat Completions

POST /v1/chat/completions

Creates a model response for the given conversation. This endpoint is fully compatible with the OpenAI Chat Completions API.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID to use (see Models) |
| messages | array | Yes | Conversation messages (see Message types) |
| stream | boolean | No | Stream partial deltas as SSE. Default: false |
| stream_options | object | No | Set { "include_usage": true } to include token usage in the stream |
| temperature | number | No | Sampling temperature, 0–2. Default: model-dependent |
| top_p | number | No | Nucleus sampling threshold, 0–1 |
| n | integer | No | Number of choices to generate, 1–128 |
| max_tokens | integer | No | Max tokens to generate (deprecated; use max_completion_tokens) |
| max_completion_tokens | integer | No | Upper bound for generated tokens, including reasoning |
| stop | string or string[] | No | Up to 4 stop sequences |
| frequency_penalty | number | No | Frequency penalty, -2 to 2 |
| presence_penalty | number | No | Presence penalty, -2 to 2 |
| logprobs | boolean | No | Return log probabilities of output tokens |
| top_logprobs | integer | No | Number of most likely tokens per position, 0–20 |
| logit_bias | object | No | Map of token IDs to bias values (-100 to 100) |
| response_format | object | No | { "type": "text" }, { "type": "json_object" }, or { "type": "json_schema", "json_schema": {...} } |
| seed | integer | No | Seed for deterministic sampling |
| tools | array | No | Function tools the model may call |
| tool_choice | string or object | No | "none", "auto", "required", or a specific tool |
| parallel_tool_calls | boolean | No | Enable parallel function calling |
| reasoning_effort | string | No | "none", "minimal", "low", "medium", "high", "xhigh" |
| top_k | integer | No | Top-k sampling (provider-specific) |
| min_p | number | No | Min-p sampling threshold, 0–1 (provider-specific) |
| repetition_penalty | number | No | Repetition penalty (provider-specific) |
| user | string | No | End-user identifier for abuse tracking |
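
The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch and not part of the API: the `check_params` helper and the `RANGES` table are hypothetical names, the bounds are copied from the parameter table, and the server may apply further model-specific limits that this sketch does not know about.

```python
from typing import Any

# Inclusive bounds copied from the parameter table; the helper itself is hypothetical.
RANGES: dict[str, tuple[float, float]] = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "n": (1, 128),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "top_logprobs": (0, 20),
    "min_p": (0, 1),
}


def check_params(body: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems found in a request body."""
    problems = []
    # model and messages are the only required fields.
    for required in ("model", "messages"):
        if required not in body:
            problems.append(f"missing required field: {required}")
    # Range checks apply only to parameters that are actually present.
    for name, (low, high) in RANGES.items():
        if name in body and not (low <= body[name] <= high):
            problems.append(f"{name} must be between {low} and {high}")
    # stop accepts one string or a list of up to 4 strings.
    stop = body.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")
    return problems


body = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 2.5,
}
print(check_params(body))
```

An empty list means the body passed these local checks only; the endpoint remains the authority on what it accepts.
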

Message types

System and user messages:

{ "role": "system", "content": "You are a helpful assistant." }
{ "role": "user", "content": "What is the capital of France?" }

User messages also support multimodal content arrays:

{
  "role": "user",
  "content": [
    { "type": "text", "text": "What's in this image?" },
    { "type": "image_url", "image_url": { "url": "https://...", "detail": "auto" } }
  ]
}

Content part types: text, image_url, input_audio, file
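
As an illustration of the two content shapes, the sketch below builds a user message that uses a plain string when there is only text and switches to a content array when an image is attached. The `user_message` helper and the `https://example.com/cat.png` URL are placeholders invented for this example; only the `text` and `image_url` part layouts come from the samples above.

```python
import json
from typing import Any, Optional


def user_message(text: str, image_url: Optional[str] = None, detail: str = "auto") -> dict[str, Any]:
    """Build a user message; use a content array only when an image is attached."""
    if image_url is None:
        return {"role": "user", "content": text}
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url, "detail": detail}},
        ],
    }


# Placeholder URL, for illustration only.
message = user_message("What's in this image?", "https://example.com/cat.png")
print(json.dumps(message, indent=2))
```
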

Assistant and tool messages:

{ "role": "assistant", "content": "The capital of France is Paris." }
{ "role": "tool", "tool_call_id": "call_abc123", "content": "{\"result\": 42}" }

A non-streaming request returns a chat.completion object:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 9,
    "total_tokens": 19
  }
}

The usage object contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| prompt_tokens | integer | Input tokens consumed |
| completion_tokens | integer | Output tokens generated |
| total_tokens | integer | Total tokens (input + output) |
| prompt_tokens_details | object | Optional. { cached_tokens, audio_tokens } |
| completion_tokens_details | object | Optional. { reasoning_tokens, audio_tokens } |
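
Because the two detail objects are optional, client code should not assume they are present. A minimal sketch of reading a usage object defensively follows; `summarize_usage` and its output keys are hypothetical names chosen for this example, and missing detail counts are treated as zero by assumption.

```python
from typing import Any


def summarize_usage(usage: dict[str, Any]) -> dict[str, int]:
    """Flatten a usage object, tolerating absent or null detail objects."""
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}
    return {
        "input": usage["prompt_tokens"],
        "output": usage["completion_tokens"],
        "total": usage["total_tokens"],
        # Assumption: a missing detail count is reported here as 0.
        "cached_input": prompt_details.get("cached_tokens", 0),
        "reasoning_output": completion_details.get("reasoning_tokens", 0),
    }


usage = {"prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19}
summary = summarize_usage(usage)
print(summary)
```
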

Set stream: true to receive partial responses as server-sent events:

curl https://api.aiand.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5"}]
  }'

Each SSE event contains a data: line with a JSON chunk. The stream ends with data: [DONE].
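
A client can reassemble the reply by reading the data: lines in order and stopping at the [DONE] sentinel. The sketch below works on a list of already-received lines so that it runs without a network connection. It assumes each chunk follows the OpenAI streaming shape, with text under choices[].delta.content, which this page does not spell out; `iter_chunks` and `collect_text` are hypothetical helper names, and multi-line SSE events are not handled.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines, stopping at data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the content deltas of the first choice (assumed chunk shape)."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) == 0:
                # content may be missing or null in some chunks.
                parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


# Hand-written sample lines standing in for a real response stream.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "1, 2, "}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "3, 4, 5"}}]}',
    "",
    "data: [DONE]",
]
print(collect_text(sample))
```
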

To include token usage in the final stream event:

{
  "stream": true,
  "stream_options": { "include_usage": true }
}
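
With include_usage set, a client can pick the usage object out of the stream. The following sketch assumes the OpenAI convention that usage arrives as a top-level usage field on a chunk near the end of the stream, and that earlier chunks carry no usage or a null one; `stream_usage` is a hypothetical helper and the sample lines are hand-written.

```python
import json
from typing import Iterable, Optional


def stream_usage(lines: Iterable[str]) -> Optional[dict]:
    """Return the last non-null usage object seen before data: [DONE], if any."""
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


# Hand-written sample: one content chunk, then a usage-only chunk (assumed shape).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hi"}}], "usage": null}',
    'data: {"choices": [], "usage": {"prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19}}',
    "data: [DONE]",
]
usage = stream_usage(sample)
print(usage)
```
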

To let the model call functions, describe them in the tools array of the request:

{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}

When the model decides to call a tool, the response includes tool_calls in the assistant message:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"Tokyo\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}

Send the tool result back with a tool message to continue the conversation.
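
The round trip can be sketched as follows: decode each call's arguments string, run the matching local function, and append one tool message per call with the same tool_call_id. This is an illustration under stated assumptions, not a definitive client. The local `get_weather` implementation returns canned data, `TOOLS` and `tool_messages` are hypothetical names, and appending the assistant message before the tool messages follows the OpenAI convention rather than anything stated on this page.

```python
import json
from typing import Any, Callable


def get_weather(location: str) -> dict[str, Any]:
    """Hypothetical local implementation; returns canned data for illustration."""
    return {"location": location, "forecast": "placeholder"}


# Maps tool names from the request to local callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def tool_messages(assistant_message: dict[str, Any]) -> list[dict[str, Any]]:
    """Run each requested tool call and build the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        function = call["function"]
        # arguments arrives as a JSON-encoded string, not an object.
        arguments = json.loads(function["arguments"])
        output = TOOLS[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results


assistant_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"},
    }],
}
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
# Keep the assistant turn in the history, then add one tool message per call.
messages.append(assistant_message)
messages.extend(tool_messages(assistant_message))
print(json.dumps(messages[-1], indent=2))
```

The resulting messages list would then be sent in a new request, with the same tools array, so the model can produce its final answer.
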