Streaming Chat UI

Streaming UIs feel instant because tokens render as they arrive. The pattern is the same everywhere: a server route opens a stream to ai&, forwards each text delta to the browser as it arrives, and the client appends those deltas to the output.

Server route (Next.js App Router)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.aiand.com/v1",
  apiKey: process.env.AIAND_API_KEY!,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await client.chat.completions.create({
    model: "openai/gpt-oss-120b",
    messages,
    stream: true,
    stream_options: { include_usage: true },
  });

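  const encoder = new TextEncoder();
  const body = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of stream) {
          // Chunks without a choice (such as the final usage chunk) yield "".
          const delta = chunk.choices[0]?.delta?.content ?? "";
          if (delta) controller.enqueue(encoder.encode(delta));
        }
        controller.close();
      } catch (err) {
        // Surface upstream failures instead of leaving the response hanging.
        controller.error(err);
      }
    },
    cancel() {
      // The browser disconnected: stop the upstream request.
      stream.controller.abort();
    },
  });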

  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}

Client component

"use client";
import { useState } from "react";

export default function Chat() {
  const [input, setInput] = useState("");
  const [output, setOutput] = useState("");

  async function send() {
    setOutput("");
    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: [{ role: "user", content: input }] }),
    });
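    if (!res.ok || !res.body) {
      setOutput(`Request failed: ${res.status}`);
      return;
    }
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact when they span two chunks.
      // Decode outside the state updater so the updater stays pure.
      const text = decoder.decode(value, { stream: true });
      setOutput((prev) => prev + text);
    }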
  }

  return (
    <div>
      <input value={input} onChange={(e) => setInput(e.target.value)} />
      <button onClick={send}>Send</button>
      <pre>{output}</pre>
    </div>
  );
}

Capturing cost and request_id

The Node SDK’s for await loop iterates content chunks but doesn’t expose the metrics trailer event. To capture cost and request_id:

Read the request ID from the X-Request-ID response header.
For cost, send the X-Aiand-Metrics: true request header and drop down to fetch + SSE parsing to read event: metrics after [DONE] — its payload carries token counts, cost, and currency. See Streaming Events.
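
The two steps above can be sketched with plain fetch. This is a minimal sketch under stated assumptions, not an official client: parseSse and streamChat are hypothetical helper names, the endpoint path is inferred from the base URL used earlier plus the OpenAI-compatible /chat/completions route, Bearer authentication is assumed to match what the SDK sends, and the metrics payload is assumed to be JSON. Its field names are not assumed, so it is returned as-is.

```typescript
// One parsed Server-Sent Event: the `event:` name plus the joined `data:` lines.
interface SseEvent {
  event: string; // "message" when the block has no `event:` field
  data: string;
}

// Parse every complete SSE block in `buffer`. Returns the parsed events and the
// incomplete trailing block, which should be prepended to the next network chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? "";
  const events: SseEvent[] = [];
  for (const block of blocks) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return { events, rest };
}

// Hypothetical helper: streams a completion, forwards text deltas to `onDelta`,
// and resolves with the request ID header and the metrics trailer (if any).
async function streamChat(
  apiKey: string,
  messages: { role: string; content: string }[],
  onDelta: (text: string) => void,
): Promise<{ requestId: string | null; metrics: unknown }> {
  // Assumption: URL is the base URL from the server route plus /chat/completions.
  const res = await fetch("https://api.aiand.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
      "X-Aiand-Metrics": "true", // ask for the metrics trailer event
    },
    body: JSON.stringify({ model: "openai/gpt-oss-120b", messages, stream: true }),
  });
  if (!res.ok || !res.body) throw new Error(`Upstream error: ${res.status}`);

  const requestId = res.headers.get("x-request-id"); // header names are case-insensitive
  let metrics: unknown = null;
  let buffer = "";
  const reader = res.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const parsed = parseSse(buffer);
    buffer = parsed.rest;
    for (const ev of parsed.events) {
      if (ev.event === "metrics") {
        metrics = JSON.parse(ev.data); // assumption: the payload is JSON
      } else if (ev.data !== "[DONE]") {
        const delta = JSON.parse(ev.data).choices?.[0]?.delta?.content ?? "";
        if (delta) onDelta(delta);
      }
    }
  }
  return { requestId, metrics };
}
```

Because the loop keeps reading after [DONE], the metrics event is picked up even though it arrives last. Call streamChat from the server route only, so the API key never reaches the browser.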

Notes

The upstream response body matches OpenAI’s format exactly — you can swap base URLs without changing the parser.
Never paste the API key into client code; always route through your server.
For a Vercel AI SDK integration (with useChat etc.), see Vercel AI SDK.