Vision Extraction
This recipe pulls structured data out of an image — receipts, invoices, ID cards, business cards. Three pieces:
- Upload the image via Files.
- Reference it in a chat message by
file_id. - Constrain the output with a JSON Schema.
Full example
Section titled “Full example”from openai import OpenAIfrom pydantic import BaseModelfrom typing import List
client = OpenAI(base_url="https://api.aiand.com/v1", api_key="sk-...")
class LineItem(BaseModel): description: str quantity: int unit_price_jpy: float
class Receipt(BaseModel): merchant: str date: str items: List[LineItem] total_jpy: float
uploaded = client.files.create( file=open("receipt.jpg", "rb"), purpose="vision",)
response = client.chat.completions.parse( model="google/gemma-4-31b-it", messages=[ {"role": "system", "content": "Extract structured data from receipts."}, { "role": "user", "content": [ {"type": "text", "text": "Parse this receipt into JSON."}, {"type": "file", "file": {"file_id": uploaded.id}}, ], }, ], response_format=Receipt,)
receipt = response.choices[0].message.parsedfor item in receipt.items: print(f"{item.description}: ¥{item.unit_price_jpy}")print(f"Total: ¥{receipt.total_jpy}")What’s happening
Section titled “What’s happening”client.files.create(..., purpose="vision")uploads the image to ai& storage. The purpose is also inferred from MIME if omitted.- The
filecontent part —{type: "file", file: {file_id}}— tells ai& to resolve the file. The platform generates a short-lived signed URL and rewrites the part toimage_urlbefore forwarding to the model. See Vision for details. response_format=Receiptships the Pydantic schema as a strict JSON Schema.
Cleanup
Section titled “Cleanup”Files cost storage. Delete them when you’re done:
client.files.delete(uploaded.id)Or let the 30-day expiry handle it.