Chat Completions

POST /v1/chat/completions is an OpenAI-wire-compatible proxy. If you have code that talks to the OpenAI Chat Completions API, point its base URL at AIUS and use an aius_… bearer token; the request and response shapes stay the same, so no other client changes are needed. The provider key is held server-side; you never send one.

The model is server-owned and not selectable. The gateway ignores any model you send and forces the platform-selected model (resolved from a server-side catalog), then scrubs the model identity to aius-default on every response. Any tools you include are stripped before the upstream call — tools are server-owned too. You do not pick the model or inject tools over this API.

POST https://aius.co/api/v1/chat/completions

Request

Headers

Authorization: Bearer aius_xxxxxxxx...
Content-Type: application/json

Body

The body is forwarded to the upstream model in OpenAI Chat Completions shape.

{
  "model": "aius-default",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "Say hello in one sentence." }
  ],
  "stream": false,
  "max_tokens": 256,
  "temperature": 0.7
}

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `model` | string | No | Ignored. The gateway forces the server-selected model and discards whatever you send. SDKs require the field, so pass any placeholder (e.g. `aius-default`). |
| `messages` | array | Yes | OpenAI message objects (`role` + `content`). |
| `stream` | boolean | No | `true` streams Server-Sent Events. Default `false`. |
| `max_tokens` | integer | No | Upper bound on generated tokens. |
| `temperature` | number | No | Sampling temperature. |
| `tools` | array | No | Stripped before the upstream call; tools are server-owned on AIUS, not client-supplied. |

Standard OpenAI Chat Completions fields (top_p, stop, tool_choice, etc.) are forwarded upstream as-is. The two exceptions are model (forced server-side) and tools (stripped) — see the note above.
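
To make the two exceptions concrete, the sketch below mimics in client-side Python what the text says happens to a request and to a response. It is an illustration only, not the gateway's code: `apply_gateway_rules`, `scrub_response`, and the upstream id `upstream-model-x` are hypothetical names chosen for the example.

```python
import copy

SCRUBBED_MODEL = "aius-default"  # label the docs say appears on every response


def apply_gateway_rules(body: dict, server_model: str) -> dict:
    """Return the body as it would look after the documented rewrite.

    Hypothetical illustration: `model` is forced, `tools` is dropped,
    and every other field passes through unchanged.
    """
    forwarded = copy.deepcopy(body)
    forwarded["model"] = server_model  # whatever the client sent is discarded
    forwarded.pop("tools", None)       # client-supplied tools never go upstream
    return forwarded


def scrub_response(response: dict) -> dict:
    """Replace the upstream model id with the public label."""
    scrubbed = copy.deepcopy(response)
    scrubbed["model"] = SCRUBBED_MODEL
    return scrubbed


request = {
    "model": "gpt-anything",
    "messages": [{"role": "user", "content": "Hi"}],
    "top_p": 0.9,
    "tools": [{"type": "function", "function": {"name": "lookup"}}],
}
forwarded = apply_gateway_rules(request, server_model="upstream-model-x")
print(sorted(forwarded))  # ['messages', 'model', 'top_p']
```

The practical consequence for client code is that sending `model` or `tools` is harmless but has no effect, so there is no need to special-case an existing OpenAI integration.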

Non-streaming response

Returns the standard Chat Completions object. Note the model field is the scrubbed aius-default label, never the real upstream model id:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1717123456,
  "model": "aius-default",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello there!" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 18, "completion_tokens": 4, "total_tokens": 22 }
}

The proxy also echoes correlation headers on the response so you can tie a call to a run-model trace: x-aius-session-id, x-aius-run-id, x-aius-step-run-id.
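
A small helper can pull those three headers out of a response for logging. This is a minimal sketch: `correlation_ids` is a hypothetical name, and the header values in the example are made up.

```python
CORRELATION_HEADERS = ("x-aius-session-id", "x-aius-run-id", "x-aius-step-run-id")


def correlation_ids(headers) -> dict:
    """Collect the correlation headers that are present, ignoring case."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in CORRELATION_HEADERS if name in lowered}


# Made-up header values, for illustration only.
example = {
    "Content-Type": "application/json",
    "X-AIUS-Session-Id": "sess_example",
    "x-aius-run-id": "run_example",
}
print(correlation_ids(example))
```

Any mapping with an `items()` method works as input, so the `headers` attribute of an `httpx` response can be passed in directly.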

Streaming response

Set "stream": true. The response is text/event-stream of OpenAI chunk objects, terminated by data: [DONE]:

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" there!"},"finish_reason":null}]}

data: [DONE]
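
To turn that stream back into text, a client reads each `data:` line, stops at `[DONE]`, and concatenates the `delta.content` fragments. The sketch below does this for an iterable of lines; `iter_text_deltas` is a hypothetical helper name, and the sample input is the stream shown above.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by a stream of SSE lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # blank separator lines and anything that is not a data event
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # terminator: no more chunks follow
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:  # the first chunk carries only the role
                yield text


sample = [
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" there!"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_text_deltas(sample)))  # Hello there!
```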

Examples

Non-streaming

curl -X POST https://aius.co/api/v1/chat/completions \
  -H "Authorization: Bearer aius_xxxxxxxx..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "aius-default",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'

Using the OpenAI SDK

Because the wire format matches, the official OpenAI SDKs work directly:

from openai import OpenAI

client = OpenAI(
    base_url="https://aius.co/api/v1",
    api_key="aius_xxxxxxxx...",
)
resp = client.chat.completions.create(
    model="aius-default",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
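
The SDK can also consume the streaming variant. The sketch below wraps it in a generator; `stream_reply` is a hypothetical helper, and it assumes `client` is an `openai.OpenAI` instance configured with the AIUS base URL and token as in the example above.

```python
from typing import Iterator


def stream_reply(client, prompt: str) -> Iterator[str]:
    """Yield text fragments from a streamed completion.

    `client` is expected to be an `openai.OpenAI` instance pointed at AIUS.
    """
    stream = client.chat.completions.create(
        model="aius-default",  # placeholder; the gateway ignores it
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices or only a role, so guard both.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

A caller would then print fragments as they arrive, for example `for piece in stream_reply(client, "Hello!"): print(piece, end="", flush=True)`.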

Streaming

import httpx

with httpx.stream(
    "POST",
    "https://aius.co/api/v1/chat/completions",
    headers={"Authorization": "Bearer aius_xxxxxxxx..."},
    json={
        "model": "aius-default",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True,
    },
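    timeout=None,  # no client-side timeout; a long generation keeps the stream open
) as r:
    r.raise_for_status()  # fail fast on 4xx/5xx instead of parsing an error body
    for line in r.iter_lines():
        # Print the JSON payload of each chunk; skip blank lines and the [DONE] terminator.
        if line.startswith("data: ") and line != "data: [DONE]":
            print(line[len("data: "):])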

About `GET /v1/models`

GET /v1/models is not a list of selectable LLMs — the LLM is server-owned and forced (see the note at the top). It is the per-client model registry (the ML/AI model cards your runs produce as deliverables), scoped to an organization via a required org_id query parameter:

curl "https://aius.co/api/v1/models?org_id=client_xyz789" \
  -H "Authorization: Bearer aius_xxxxxxxx..."

There is no endpoint to choose or list chat models — the gateway decides.
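
The same registry call can be made from Python. The sketch below uses only the standard library; `build_models_request` and `fetch_model_registry` are hypothetical names, and decoding the body as JSON is an assumption, since the response shape is not described here.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://aius.co/api/v1"


def build_models_request(org_id: str, token: str) -> urllib.request.Request:
    """Build the GET /v1/models request for one organization."""
    query = urllib.parse.urlencode({"org_id": org_id})  # org_id is required
    return urllib.request.Request(
        f"{BASE_URL}/models?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_model_registry(org_id: str, token: str):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_models_request(org_id, token)) as resp:
        return json.load(resp)


req = build_models_request("client_xyz789", "aius_xxxxxxxx...")
print(req.full_url)  # https://aius.co/api/v1/models?org_id=client_xyz789
```

Building the request separately from sending it keeps the URL and header logic easy to check without touching the network.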

When to use the run loop instead

/v1/chat/completions is a stateless proxy: you own the conversation, the tool loop, and tool execution. If you want the server to drive the agent loop and just hand you tool calls to execute locally, use the run-loop WebSocket instead.
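
Because the endpoint keeps no state, the client has to resend the full message history on every call. A minimal sketch of that bookkeeping follows; the `Conversation` class is hypothetical, and the response passed to `record_reply` is a canned object in the documented non-streaming shape rather than a real API result.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Client-side message history for a stateless chat endpoint."""

    system_prompt: str
    messages: list = field(default_factory=list)

    def __post_init__(self):
        if not self.messages:
            self.messages.append({"role": "system", "content": self.system_prompt})

    def request_body(self, user_text: str) -> dict:
        """Record the user turn and return the full body to POST."""
        self.messages.append({"role": "user", "content": user_text})
        return {"model": "aius-default", "messages": list(self.messages)}

    def record_reply(self, response: dict) -> str:
        """Store the assistant message from a non-streaming response."""
        message = response["choices"][0]["message"]
        self.messages.append(message)
        return message["content"]


chat = Conversation("You are a concise assistant.")
first = chat.request_body("Say hello in one sentence.")
# Canned response for illustration; a real one comes back from the POST.
chat.record_reply(
    {"choices": [{"index": 0, "message": {"role": "assistant", "content": "Hello there!"}, "finish_reason": "stop"}]}
)
second = chat.request_body("Now say goodbye.")
print(len(first["messages"]), len(second["messages"]))  # 2 4
```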
