
Video Understanding

Video-capable models accept video clips alongside text. You can send video two ways: reference an upload by file_id, or pass an inline URL.

Option 1: Upload first, reference by file_id (recommended)

Best for clips you’ll use across multiple requests, and the only path for files you don’t already host somewhere reachable.

  1. Upload the file via the Uploads API (chunked, up to 8 GB) with purpose: "video", or let the purpose be inferred from the file's MIME type.
  2. Reference the returned file-... id in a chat message:
{
  "role": "user",
  "content": [
    { "type": "text", "text": "Summarize this clip." },
    { "type": "file", "file": { "file_id": "file-abc123" } }
  ]
}

The bytes are pulled from storage at inference time — your client never re-uploads.
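
As a minimal sketch, the message above could be assembled in Python as follows. The helper name `video_file_message` is hypothetical, and the check that ids start with `file-` is an assumption based on the example id, not a documented rule.

```python
import json


def video_file_message(file_id: str, prompt: str) -> dict:
    """Build a user message that references an uploaded video by file_id."""
    # Assumption: upload ids start with "file-", as in the example id above.
    if not file_id.startswith("file-"):
        raise ValueError(f"expected an id starting with 'file-', got {file_id!r}")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "file", "file": {"file_id": file_id}},
        ],
    }


if __name__ == "__main__":
    message = video_file_message("file-abc123", "Summarize this clip.")
    print(json.dumps(message, indent=2))
```

Because the message carries only the id, the same `file_id` can be reused across requests without sending the video again.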

Option 2: Inline URL

If the video is already hosted at a public URL, you can reference it directly with a video_url content part:

{
  "role": "user",
  "content": [
    { "type": "text", "text": "Summarize this clip." },
    { "type": "video_url", "video_url": { "url": "https://example.com/clip.mp4" } }
  ]
}

The URL must be reachable from ai& and serve the bytes with a supported Content-Type.
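
A rough client-side pre-check for the URL path might look like the sketch below. Both function names are hypothetical. Restricting URLs to `http`/`https` is an assumption, and the Content-Type check only inspects a header value you have already obtained (for example from your own HEAD request); it does not prove the URL is reachable from the service.

```python
from urllib.parse import urlparse

# MIME types listed as supported in this document.
SUPPORTED_VIDEO_TYPES = {"video/mp4", "video/webm", "video/quicktime"}


def is_supported_content_type(header_value: str) -> bool:
    """Check a Content-Type header value, ignoring parameters such as codecs."""
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in SUPPORTED_VIDEO_TYPES


def video_url_message(url: str, prompt: str) -> dict:
    """Build a user message that references a hosted video by URL."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are useful here.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "video_url", "video_url": {"url": url}},
        ],
    }
```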

  • Supported MIME types: video/mp4, video/webm, video/quicktime.
  • Uploaded files: up to 8 GB via the chunked Uploads API. Single-shot /v1/files is capped at 100 MB.
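
The limits above suggest a simple decision before uploading: pick the single-shot endpoint for small files and the chunked Uploads API otherwise. The sketch below illustrates that choice. The function name, the route labels, and the extension-to-MIME table are hypothetical, and treating "100 MB" and "8 GB" as binary units (MiB/GiB) is an assumption; adjust the constants if the service counts in decimal units.

```python
from pathlib import PurePath

# Assumption: limits interpreted as binary units.
SINGLE_SHOT_LIMIT = 100 * 1024**2  # single-shot /v1/files cap
CHUNKED_LIMIT = 8 * 1024**3        # chunked Uploads API cap

# Common file extensions for the supported MIME types (illustrative mapping).
EXTENSION_TO_MIME = {
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".mov": "video/quicktime",
}


def plan_upload(filename: str, size_bytes: int) -> tuple[str, str]:
    """Return (route, mime_type), where route is 'single_shot' or 'chunked'."""
    suffix = PurePath(filename).suffix.lower()
    mime_type = EXTENSION_TO_MIME.get(suffix)
    if mime_type is None:
        raise ValueError(f"unsupported video extension: {suffix!r}")
    if size_bytes > CHUNKED_LIMIT:
        raise ValueError("file exceeds the 8 GB upload limit")
    route = "single_shot" if size_bytes <= SINGLE_SHOT_LIMIT else "chunked"
    return route, mime_type
```

For example, `plan_upload("talk.mov", 2 * 1024**3)` selects the chunked route with `video/quicktime`.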

Both options require the target model to declare the video capability — otherwise the request returns 400 invalid_request_error. List models and their capabilities via Models.
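
To avoid the 400 response, a client can filter the model list before sending video. The sketch below assumes each model record is a dict with an `id` string and a `capabilities` list of strings; that shape, the function name, and the sample model ids are all hypothetical, so check the actual Models response before relying on it.

```python
def video_capable_models(models: list[dict]) -> list[str]:
    """Return the ids of models whose capabilities include "video".

    Assumed record shape: {"id": str, "capabilities": [str, ...]}.
    Records without a capabilities list are treated as not video-capable.
    """
    return [
        model["id"]
        for model in models
        if "video" in model.get("capabilities", [])
    ]


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = [
        {"id": "example-multimodal", "capabilities": ["text", "video"]},
        {"id": "example-text-only", "capabilities": ["text"]},
    ]
    print(video_capable_models(sample))
```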