Responses

POST /v1/responses

Creates a model response for the given input. Compatible with the OpenAI Responses API.

Parameter             Type               Required  Description
model                 string             Yes       Model ID to use
input                 string | array     Yes       Text string (treated as a user message) or array of input items
instructions          string             No        System/developer message injected into context
stream                boolean            No        Stream the response as server-sent events (SSE). Default: false
temperature           number             No        Sampling temperature, 0–2
top_p                 number             No        Nucleus sampling, 0–1
max_output_tokens     integer            No        Upper bound on generated tokens
tools                 array              No        Function tools the model may call
tool_choice           string | object    No        "none", "auto", "required", or a specific tool
parallel_tool_calls   boolean            No        Allow parallel tool calls
reasoning             object             No        { "effort": "low"|"medium"|"high", "summary": "auto"|"concise"|"detailed" }
truncation            string             No        "auto" (truncate to fit the context window) or "disabled"
previous_response_id  string             No        Continue a multi-turn conversation
store                 boolean            No        Store the response for later retrieval
metadata              object             No        Key-value pairs for tracking
text                  object             No        Text response format configuration
seed                  integer            No        Seed for deterministic sampling
stop                  string | string[]  No        Up to 4 stop sequences
top_k                 integer            No        Top-k sampling (provider-specific)
repetition_penalty    number             No        Repetition penalty (provider-specific)
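A minimal sketch of assembling a request body from these parameters, with range checks matching the table above (the helper name is illustrative; sending the body over HTTP is left out):

```python
def build_request(model, input, *, temperature=None, top_p=None,
                  max_output_tokens=None, stream=False):
    """Assemble a /v1/responses request body, validating documented ranges."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be in [0, 1]")
    body = {"model": model, "input": input}
    # Optional parameters are omitted entirely rather than sent as null.
    if temperature is not None:
        body["temperature"] = temperature
    if top_p is not None:
        body["top_p"] = top_p
    if max_output_tokens is not None:
        body["max_output_tokens"] = max_output_tokens
    if stream:
        body["stream"] = True
    return body

body = build_request("gpt-4o-mini",
                     "Explain quantum computing in one paragraph.",
                     temperature=0.7)
```

Omitting unset optional fields keeps the payload minimal and lets the server apply its own defaults.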

Simple string input:

{
  "model": "gpt-4o-mini",
  "input": "Explain quantum computing in one paragraph."
}

Structured input items:

{
  "model": "gpt-4o-mini",
  "input": [
    {
      "role": "user",
      "content": [
        { "type": "input_text", "text": "Describe this image" },
        { "type": "input_image", "image_url": "https://..." }
      ]
    }
  ]
}

Input item types:

  • Message: { "role": "user"|"assistant"|"developer"|"system", "content": string | content[] }
  • Function call output: { "type": "function_call_output", "call_id": "...", "output": "..." }
  • Item reference: { "type": "item_reference", "id": "..." }

Content types within a message: input_text, input_image, input_file
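To return a tool result on a follow-up turn, a function call output item can be built like this (a sketch; the item shape follows the list above, and JSON-encoding non-string results is an assumption since the output field is a string):

```python
import json

def function_call_output(call_id, result):
    """Wrap a tool result as a function_call_output input item.

    The output field is a string, so non-string results are JSON-encoded.
    """
    output = result if isinstance(result, str) else json.dumps(result)
    return {"type": "function_call_output", "call_id": call_id, "output": output}

item = function_call_output("call_123", {"temp_c": 21})
```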

Example response:

{
  "id": "resp_abc123",
  "object": "response",
  "status": "completed",
  "created_at": 1700000000,
  "model": "gpt-4o-mini",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Quantum computing uses quantum bits (qubits)..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 15,
    "output_tokens": 80,
    "total_tokens": 95
  }
}
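Because output is an array of items whose message content is itself an array of parts, collecting the assistant text takes a small traversal. A sketch, using the field names shown above (error handling omitted):

```python
def output_text(response):
    """Concatenate all output_text parts from message items in the output array."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") == "message":
            for content in item.get("content", []):
                if content.get("type") == "output_text":
                    parts.append(content["text"])
    return "".join(parts)

resp = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text",
                 "text": "Quantum computing uses quantum bits (qubits)..."}
            ],
        }
    ]
}
text = output_text(resp)
```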
Status       Description
completed    Generation finished successfully
failed       Generation failed (see the error field)
in_progress  Still generating (streaming)
incomplete   Stopped early (see incomplete_details.reason)
cancelled    Request was cancelled
queued       Waiting to be processed
Usage object fields:

Field                  Type     Description
input_tokens           integer  Input tokens consumed
output_tokens          integer  Output tokens generated
total_tokens           integer  Total tokens
input_tokens_details   object   { cached_tokens }
output_tokens_details  object   { reasoning_tokens }
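The nested detail objects may be absent from a given response, so reading them defensively is safer. A sketch that flattens the usage object, with field names following the table above:

```python
def usage_summary(usage):
    """Flatten the usage object, defaulting missing detail fields to 0."""
    return {
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
        "cached_tokens": usage.get("input_tokens_details", {}).get("cached_tokens", 0),
        "reasoning_tokens": usage.get("output_tokens_details", {}).get("reasoning_tokens", 0),
    }

summary = usage_summary({"input_tokens": 15, "output_tokens": 80, "total_tokens": 95})
```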

Use previous_response_id to continue a conversation without resending the full history:

{
  "model": "gpt-4o-mini",
  "input": "Now explain it to a 5-year-old.",
  "previous_response_id": "resp_abc123"
}
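Chained this way, each turn carries only the new input plus the ID of the prior response. A sketch of building the follow-up body (the helper name is illustrative; this assumes the earlier response was stored server-side):

```python
def follow_up(model, new_input, previous_response_id):
    """Build a follow-up request that continues a stored conversation."""
    return {
        "model": model,
        "input": new_input,
        "previous_response_id": previous_response_id,
    }

body = follow_up("gpt-4o-mini", "Now explain it to a 5-year-old.", "resp_abc123")
```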