Chat Completions

POST /v1/chat/completions

Creates a model response for the given conversation. This endpoint is compatible with the OpenAI Chat Completions API and additionally accepts a few provider-specific sampling parameters, marked as such in the table below.

Request body

Parameter	Type	Required	Description
`model`	string	Yes	Model ID to use (see Models)
`messages`	array	Yes	Conversation messages (see Message types)
`stream`	boolean	No	Stream partial deltas as server-sent events (SSE). Default: `false`
`stream_options`	object	No	Set to `{ "include_usage": true }` to include token usage in the final stream event
`temperature`	number	No	Sampling temperature, 0–2. Default: model-dependent
`top_p`	number	No	Nucleus sampling threshold, 0–1
`n`	integer	No	Number of choices to generate, 1–128
`max_tokens`	integer	No	Max tokens to generate. Deprecated; use `max_completion_tokens`
`max_completion_tokens`	integer	No	Upper bound on generated tokens, including reasoning tokens
`stop`	string | string[]	No	Up to 4 stop sequences
`frequency_penalty`	number	No	Frequency penalty, -2 to 2
`presence_penalty`	number	No	Presence penalty, -2 to 2
`logprobs`	boolean	No	Return log probabilities of output tokens
`top_logprobs`	integer	No	Number of most likely tokens per position, 0–20
`logit_bias`	object	No	Map of token IDs to bias values (-100 to 100)
`response_format`	object	No	`{ "type": "text" }`, `{ "type": "json_object" }`, or `{ "type": "json_schema", "json_schema": {...} }`
`seed`	integer	No	Seed for deterministic sampling
`tools`	array	No	Function tools the model may call
`tool_choice`	string | object	No	`"none"`, `"auto"`, `"required"`, or an object naming a specific tool
`parallel_tool_calls`	boolean	No	Enable parallel function calling
`reasoning_effort`	string	No	`"none"`, `"minimal"`, `"low"`, `"medium"`, `"high"`, `"xhigh"`
`top_k`	integer	No	Top-k sampling (provider-specific)
`min_p`	number	No	Min-p sampling threshold, 0–1 (provider-specific)
`repetition_penalty`	number	No	Repetition penalty (provider-specific)
`user`	string	No	End-user identifier for abuse tracking
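
The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validate_request` and `RANGES` are hypothetical names, and the checks cover only the limits listed in the table above.

```python
from typing import Any

# Inclusive ranges for numeric parameters, copied from the table above.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "n": (1, 128),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "top_logprobs": (0, 20),
    "min_p": (0, 1),
}


def validate_request(body: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    # `model` and `messages` are the only required parameters.
    for required in ("model", "messages"):
        if required not in body:
            problems.append(f"missing required parameter: {required}")
    # Range checks apply only to parameters that are actually present.
    for name, (low, high) in RANGES.items():
        if name in body and not (low <= body[name] <= high):
            problems.append(f"{name} must be between {low} and {high}")
    # `stop` may be a single string or a list of up to 4 strings.
    stop = body.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")
    # Each bias value in `logit_bias` must lie between -100 and 100.
    for token_id, bias in body.get("logit_bias", {}).items():
        if not (-100 <= bias <= 100):
            problems.append(f"logit_bias[{token_id}] must be between -100 and 100")
    return problems
```

The server remains the authority on what it accepts; a check like this only catches obvious mistakes early.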

Message types

System message

{ "role": "system", "content": "You are a helpful assistant." }

User message

{ "role": "user", "content": "What is the capital of France?" }

User messages also support multimodal content arrays:

{
  "role": "user",
  "content": [
    { "type": "text", "text": "What's in this image?" },
    { "type": "image_url", "image_url": { "url": "https://...", "detail": "auto" } }
  ]
}

Content part types: `text`, `image_url`, `input_audio`, `file`
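
A small helper can build either form of user message. This is a sketch under the assumption that only `text` and `image_url` parts are needed; `user_message` is a hypothetical name, and the other part types are not covered.

```python
from typing import Sequence


def user_message(text: str, image_urls: Sequence[str] = (), detail: str = "auto") -> dict:
    """Build a user message, using plain string content when there are no images."""
    if not image_urls:
        return {"role": "user", "content": text}
    # With images, content becomes an array of typed parts: the text first,
    # then one image_url part per URL.
    parts = [{"type": "text", "text": text}]
    for url in image_urls:
        parts.append({"type": "image_url", "image_url": {"url": url, "detail": detail}})
    return {"role": "user", "content": parts}
```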

Assistant message

{ "role": "assistant", "content": "The capital of France is Paris." }

Tool message

{ "role": "tool", "tool_call_id": "call_abc123", "content": "{\"result\": 42}" }

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 9,
    "total_tokens": 19
  }
}

Usage object

Field	Type	Description
`prompt_tokens`	integer	Input tokens consumed
`completion_tokens`	integer	Output tokens generated
`total_tokens`	integer	Total tokens (input + output)
`prompt_tokens_details`	object	Optional. `{ cached_tokens, audio_tokens }`
`completion_tokens_details`	object	Optional. `{ reasoning_tokens, audio_tokens }`
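
As an illustration of reading these fields, the sketch below pulls the reply text and token counts out of a parsed response. `summarize` is a hypothetical helper; it treats the optional detail objects as possibly absent and reports 0 in that case.

```python
def summarize(response: dict) -> dict:
    """Extract the first choice's text and the main usage counters."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    # The *_details objects are optional, so fall back to empty dicts.
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}
    return {
        "text": choice["message"].get("content"),
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
        "cached_tokens": prompt_details.get("cached_tokens", 0),
        "reasoning_tokens": completion_details.get("reasoning_tokens", 0),
    }
```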

Streaming

Set `stream` to `true` to receive partial responses as server-sent events (SSE):

curl https://api.aiand.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5"}]
  }'

Each SSE event contains a `data:` line with a JSON chunk. The stream ends with `data: [DONE]`.
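
The framing described above can be parsed with a few lines of code. The sketch below assumes each chunk follows the OpenAI streaming shape, with text fragments in `choices[].delta.content`; that shape is not shown in this document, so treat it as an assumption. `iter_chunks` and `collect_text` are hypothetical names, and the input is any iterable of already-decoded lines.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed JSON chunks from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream; later lines are ignored
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the content deltas of choice 0 into the full reply text."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            # Assumed chunk shape: {"choices": [{"index": 0, "delta": {"content": "..."}}]}
            if choice.get("index", 0) != 0:
                continue
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Only choice 0 is collected here; a request with `n` greater than 1 would need one buffer per `index`.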

To include token usage in the final stream event:

{
  "stream": true,
  "stream_options": { "include_usage": true }
}
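
With `include_usage` enabled, the usage object can be picked out of the stream as it is read. This sketch assumes the OpenAI convention that earlier chunks carry `"usage": null` and one late chunk carries the populated object; `stream_usage` is a hypothetical name.

```python
import json
from typing import Iterable, Optional


def stream_usage(lines: Iterable[str]) -> Optional[dict]:
    """Return the last non-null usage object seen in the stream, or None."""
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Assumption: chunks without usage omit the field or set it to null.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```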

Tool calling

{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}

When the model decides to call a tool, the response includes `tool_calls` in the assistant message and sets `finish_reason` to `"tool_calls"`:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"Tokyo\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}

To continue the conversation, append the assistant message and a tool message whose `tool_call_id` matches the call's `id`, then send the request again.
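
The round trip can be sketched as follows. The local `get_weather` function and the `TOOLS` registry are hypothetical stand-ins that return canned data; the point is the handling of `arguments`, which arrives as a JSON string, and the matching of `tool_call_id` to the call's `id`.

```python
import json
from typing import Callable


def get_weather(location: str) -> dict:
    """Hypothetical local tool; returns canned data for illustration only."""
    return {"location": location, "forecast": "unknown"}


# Maps tool names declared in the request to local implementations.
TOOLS: dict[str, Callable[..., dict]] = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Run each requested tool and return the messages to append to the conversation."""
    # The assistant message that requested the calls goes first,
    # followed by one tool message per call.
    follow_up = [assistant_message]
    for call in assistant_message.get("tool_calls", []):
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # a JSON string, not an object
        result = fn(**args)
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return follow_up
```

Appending the returned messages to `messages` and sending the request again lets the model produce its final answer. A production version would also handle unknown tool names and malformed arguments.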