Structured Extraction
Extract structured records from free-form text — invoices, emails, transcripts, scraped pages. Use a strict JSON Schema so the result parses cleanly every time.
Example: parse contact info
from openai import OpenAI

client = OpenAI(base_url="https://api.aiand.com/v1", api_key="sk-...")

text = """From: Jane Doe <jane@example.com>
Re: Coffee next week
Phone: +1 555-123-4567"""

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[
        {"role": "system", "content": "Extract contact info from the user's text."},
        {"role": "user", "content": text},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "contact",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string"},
                    "phone": {"type": ["string", "null"]},
                },
                "required": ["name", "email", "phone"],
                "additionalProperties": False,
            },
        },
    },
)

import json

contact = json.loads(response.choices[0].message.content)
print(contact)
# {'name': 'Jane Doe', 'email': 'jane@example.com', 'phone': '+1 555-123-4567'}

With Pydantic (Python)
The OpenAI Python SDK can derive the schema from a Pydantic model:
from pydantic import BaseModel
from typing import Optional

class Contact(BaseModel):
    name: str
    email: str
    phone: Optional[str]

response = client.chat.completions.parse(
    model="openai/gpt-oss-120b",
    messages=[
        {"role": "system", "content": "Extract contact info."},
        {"role": "user", "content": text},
    ],
    response_format=Contact,
)
contact = response.choices[0].message.parsed
print(contact.name, contact.email)

- Make every field `required` and use `["string", "null"]` for optional values: `strict` mode rejects missing keys, but it accepts a key whose value is null.
- Set `additionalProperties: false` to prevent the model from sneaking in extra fields.
- For nested arrays (multiple contacts), wrap in `{"type": "array", "items": {...}}` inside an outer object.
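The nested-array tip can be sketched as follows. This is a minimal illustration, not part of the original example: the names `contact_schema`, `contacts_schema`, the `contacts` key, and the sample payload are all assumptions chosen for the sketch. It reuses the single-contact schema as the array's `items` and wraps the array in an outer object.

```python
import json

# Schema for one contact, same shape as the single-record example.
contact_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "phone": {"type": ["string", "null"]},
    },
    "required": ["name", "email", "phone"],
    "additionalProperties": False,
}

# Outer object holding the array; the "contacts" key is a hypothetical name.
contacts_schema = {
    "type": "object",
    "properties": {
        "contacts": {"type": "array", "items": contact_schema},
    },
    "required": ["contacts"],
    "additionalProperties": False,
}

# Value that would be passed as response_format= in the request.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "contacts", "strict": True, "schema": contacts_schema},
}

# Hypothetical response content, written by hand to show how it would be read.
sample_content = (
    '{"contacts": ['
    '{"name": "Jane Doe", "email": "jane@example.com", "phone": "+1 555-123-4567"},'
    '{"name": "John Roe", "email": "john@example.com", "phone": null}'
    "]}"
)

names = [c["name"] for c in json.loads(sample_content)["contacts"]]
print(names)
```

Each item still lists every field as required, so a contact without a phone number comes back with `"phone": null` (Python `None` after `json.loads`) rather than a missing key.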