Streaming

How streaming works through the Noirdoc proxy with real-time pseudonymization.

Overview

Noirdoc supports streaming for all providers. Pseudonymization and restoration happen on each chunk in real time — there is no buffering of the full response before delivery. To enable streaming, pass "stream": true in your request body.

The proxy intercepts each Server-Sent Events (SSE) chunk, restores any pseudonym tokens the model produced to their original values, and forwards the restored text to your application. From the client’s perspective, streaming through Noirdoc behaves identically to streaming directly from the provider.
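Conceptually, the restoration step is a per-chunk substitution. A minimal Python sketch of the idea; note that the placeholder format ("[PERSON_1]") and the mapping are hypothetical, chosen only to illustrate the mechanics:

```python
# Hypothetical pseudonym-to-original mapping; the real placeholder format
# used by the proxy is not shown in this document.
mapping = {"[PERSON_1]": "Max Mustermann"}

def restore(chunk: str) -> str:
    """Replace any pseudonym tokens in a streamed chunk with their originals."""
    for token, original in mapping.items():
        chunk = chunk.replace(token, original)
    return chunk

chunks = ["Dear ", "[PERSON_1]", ", thank you for your application."]
print("".join(restore(c) for c in chunks))
```

A real implementation also has to handle tokens that straddle chunk boundaries; this sketch ignores that detail.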

Python — OpenAI SDK

stream = client.chat.completions.create(
    model="gpt-5.4-mini",
    messages=[
        {"role": "user", "content": "Write a cover letter for Max Mustermann."}
    ],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Python — Anthropic SDK

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a cover letter for Anna Schmidt."}
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Node.js — OpenAI SDK

const stream = await client.chat.completions.create({
  model: "gpt-5.4-mini",
  messages: [
    { role: "user", content: "Write a cover letter for Max Mustermann." },
  ],
  stream: true,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}

Node.js — Anthropic SDK

const stream = client.messages.stream({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Write a cover letter for Anna Schmidt." },
  ],
});

for await (const event of stream) {
  if (
    event.type === "content_block_delta" &&
    event.delta.type === "text_delta"
  ) {
    process.stdout.write(event.delta.text);
  }
}

cURL

curl https://api.noirdoc.de/v1/chat/completions \
  -H "Authorization: Bearer px-your-noirdoc-key" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gpt-5.4-mini",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a cover letter for Anna Schmidt, email anna@example.com."}
    ]
  }'

The -N flag disables output buffering so SSE chunks appear in real time.

SSE format

Proxy endpoints return text/event-stream responses following the Server-Sent Events specification. Each event is a data: line containing a partial response object as JSON:

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"Dear"},"index":0}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":" Hiring"},"index":0}]}

data: [DONE]

The stream terminates with data: [DONE]. This sentinel is plain text, not JSON, so treat it as an end-of-stream marker rather than parsing it as a chunk.
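For clients that consume the stream without an SDK, the framing above can be parsed with a few lines of standard-library Python. A minimal sketch that walks already-received data: lines (no network I/O; the sample payloads are the ones shown above):

```python
import json

def iter_deltas(lines):
    """Yield delta text from raw SSE data lines, stopping at the [DONE] sentinel."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk",'
    '"choices":[{"delta":{"content":"Dear"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk",'
    '"choices":[{"delta":{"content":" Hiring"},"index":0}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # Dear Hiring
```

The SDK examples earlier on this page do the same parsing internally; this is only useful when you are reading the text/event-stream response yourself.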