/v1/ prefix. Any client
library, proxy, or tool that speaks the OpenAI Chat Completions protocol can
connect to Comis without modification. The API supports chat completions
(streaming and non-streaming), model listing, embeddings, and the OpenAI
Responses API. Authentication uses the same gateway bearer tokens as JSON-RPC,
so a token with rpc or api scope works.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Send messages and receive agent responses |
| GET | /v1/models | List all configured models |
| GET | /v1/models/:id | Retrieve a single model by ID |
| POST | /v1/embeddings | Generate text embeddings |
This page does not cover POST /v1/responses (OpenAI Responses API). It is listed in the HTTP Gateway routes reference but is intentionally not described here because the OpenAI Responses spec is separate from the Chat Completions spec.
Authentication
All endpoints require a valid bearer token configured in gateway.tokens[].
Pass the token in the Authorization header using the Bearer scheme.
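A minimal sketch of the header construction, assuming the standard `Authorization: Bearer <token>` form. The helper name `authHeaders` and the token value are hypothetical and not part of Comis.

```typescript
// Hypothetical helper (not part of Comis): builds headers for any /v1/ call.
function authHeaders(token: string): Record<string, string> {
  if (token.trim() === "") {
    throw new Error("A gateway bearer token is required");
  }
  return {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  };
}

// Placeholder token; use a value configured in gateway.tokens[].
const headers = authHeaders("my-gateway-token");
```

The resulting object can be passed as the headers of whichever HTTP client is in use.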
Chat Completions
Request
POST /v1/chat/completions
The request body is validated against ChatCompletionRequestSchema. All fields
must match exactly — unknown fields are rejected (strictObject).
| Field | Type | Required | Constraints | Default | Description |
|---|---|---|---|---|---|
| model | string | Yes | min(1) | — | Model identifier. Accepted forms (validated against the configured agents, the same catalog /v1/models advertises): "provider/model" (e.g. "anthropic/claude-sonnet-4-5-20250929"), the bare model id, or an agent id (e.g. "default"). An unknown model returns 404 Model not found. |
| messages | ChatMessage[] | Yes | min(1) array | — | Conversation messages in order |
| stream | boolean | No | — | false | Enable Server-Sent Events streaming |
| temperature | number | No | min(0), max(2) | — | Sampling temperature |
| max_tokens | number | No | Positive integer | — | Maximum tokens to generate |
| stream_options | StreamOptions | No | — | — | Streaming behavior options |
Source: ChatCompletionRequestSchema in packages/gateway/src/openai/openai-types.ts
Message Format
Each message in the messages array follows ChatMessageSchema:
| Field | Type | Required | Description |
|---|---|---|---|
| role | "system" \| "user" \| "assistant" | Yes | Message role |
| content | string | Yes | Message text content |
The user message becomes the agent input. If no user message is found, the request returns 400.
Stream Options
| Field | Type | Required | Description |
|---|---|---|---|
| include_usage | boolean | No | Include token usage in the final stream chunk |
Example Request
POST /v1/chat/completions
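As a rough sketch, a request body matching the field tables above could be built and pre-checked on the client like this. The types and `checkChatRequest` are hypothetical names that mirror the documented constraints; they are not the actual ChatCompletionRequestSchema. Because the schema is strict, fields outside the tables are left out.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
  temperature?: number;
  max_tokens?: number;
  stream_options?: { include_usage?: boolean };
}

// Hypothetical client-side check mirroring the documented constraints.
// Returns a list of problems; an empty list means the body looks valid.
function checkChatRequest(req: ChatCompletionRequest): string[] {
  const problems: string[] = [];
  if (req.model.length < 1) problems.push("model must be non-empty");
  if (req.messages.length < 1) problems.push("messages must not be empty");
  if (!req.messages.some((m) => m.role === "user")) {
    problems.push("at least one user message is required");
  }
  if (req.temperature !== undefined && (req.temperature < 0 || req.temperature > 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  if (req.max_tokens !== undefined &&
      (!Number.isInteger(req.max_tokens) || req.max_tokens <= 0)) {
    problems.push("max_tokens must be a positive integer");
  }
  return problems;
}

// "default" is the agent id used as an example in the field table.
const request: ChatCompletionRequest = {
  model: "default",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize the latest deployment log." },
  ],
  temperature: 0.7,
  max_tokens: 512,
};

// JSON string to send as the POST body.
const body = JSON.stringify(request);
```

The server remains the authority on validation; a local check like this only catches obvious mistakes before the request is sent.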
Non-Streaming Response
When stream is false (default), the endpoint returns a single JSON object
following the ChatCompletion interface:
ChatCompletion Response
| Field | Type | Description |
|---|---|---|
| id | string | Unique completion ID with chatcmpl- prefix and UUID |
| object | "chat.completion" | Fixed object type |
| created | number | Unix timestamp (seconds) |
| model | string | Model identifier from request |
| choices | Choice[] | Array with single choice (index 0) |
| choices[].message.role | "assistant" | Always "assistant" |
| choices[].message.content | string \| null | Response text |
| choices[].finish_reason | string | Mapped finish reason (see table below) |
| usage.prompt_tokens | number | Input token count |
| usage.completion_tokens | number | Output token count |
| usage.total_tokens | number | Total token count |
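The response fields above can be modeled on the client roughly as follows. This is a sketch: the interface is transcribed from the table rather than copied from openai-types.ts, `replyText` is a hypothetical helper, and the sample values are illustrative, not captured from a real Comis response.

```typescript
interface ChatCompletion {
  id: string;
  object: "chat.completion";
  created: number;
  model: string;
  choices: {
    index: number;
    message: { role: "assistant"; content: string | null };
    finish_reason: string;
  }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Hypothetical helper: returns the reply text, or "" when content is null
// or the choices array is unexpectedly empty.
function replyText(completion: ChatCompletion): string {
  const first = completion.choices[0];
  return first?.message.content ?? "";
}

// Illustrative values only.
const sample: ChatCompletion = {
  id: "chatcmpl-00000000-0000-0000-0000-000000000000",
  object: "chat.completion",
  created: 1700000000,
  model: "default",
  choices: [
    { index: 0, message: { role: "assistant", content: "Hello!" }, finish_reason: "stop" },
  ],
  usage: { prompt_tokens: 12, completion_tokens: 8, total_tokens: 20 },
};

const text = replyText(sample);
```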
Source: ChatCompletion interface in packages/gateway/src/openai/openai-types.ts
Streaming Response
When stream is true, the endpoint returns Server-Sent Events (SSE). Each
event contains a ChatCompletionChunk serialized as JSON. The streaming
sequence follows five steps:
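Step 1: Role chunk. Announces the assistant role.
The later steps include content chunks carrying delta.content, a finish chunk with finish_reason set and an empty delta, and a usage chunk when stream_options.include_usage is true.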
ChatCompletionChunk Interface
| Field | Type | Description |
|---|---|---|
| id | string | Same chatcmpl- ID for all chunks in one completion |
| object | "chat.completion.chunk" | Fixed chunk object type |
| created | number | Unix timestamp (seconds) |
| model | string | Model identifier from request |
| choices | ChunkChoice[] | Array with delta content |
| choices[].index | number | Always 0 |
| choices[].delta.role | "assistant" \| undefined | Present only in role chunk |
| choices[].delta.content | string \| undefined | Present in content chunks |
| choices[].finish_reason | string \| null | null until finish chunk |
| usage | Usage \| undefined | Present only in usage chunk |
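A client can fold the chunks of one completion into a final result along these lines. This is a sketch under two assumptions that this page does not confirm: each event arrives as a single `data: <json>` line, and the stream may end with OpenAI's `data: [DONE]` sentinel. `collectStream` and `StreamResult` are hypothetical names.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number;
  model: string;
  choices: {
    index: number;
    delta: { role?: "assistant"; content?: string };
    finish_reason: string | null;
  }[];
  usage?: Usage;
}

interface StreamResult {
  text: string;
  finishReason: string | null;
  usage: Usage | null;
}

// Hypothetical helper: folds the SSE body of one completion into a result.
function collectStream(sseBody: string): StreamResult {
  const result: StreamResult = { text: "", finishReason: null, usage: null };
  for (const line of sseBody.split(/\r?\n/)) {
    const match = /^data:\s?(.*)$/.exec(line);
    if (!match) continue; // blank separators and other SSE fields
    const payload = match[1] ?? "";
    if (payload === "") continue;
    if (payload === "[DONE]") break; // assumed end-of-stream sentinel
    const chunk = JSON.parse(payload) as ChatCompletionChunk;
    const choice = chunk.choices[0];
    if (choice) {
      result.text += choice.delta.content ?? "";
      result.finishReason = choice.finish_reason ?? result.finishReason;
    }
    if (chunk.usage) result.usage = chunk.usage;
  }
  return result;
}

// Illustrative stream: role chunk, two content chunks, finish chunk.
const base = { id: "chatcmpl-example", object: "chat.completion.chunk" as const, created: 0, model: "default" };
const events: ChatCompletionChunk[] = [
  { ...base, choices: [{ index: 0, delta: { role: "assistant" }, finish_reason: null }] },
  { ...base, choices: [{ index: 0, delta: { content: "Hel" }, finish_reason: null }] },
  { ...base, choices: [{ index: 0, delta: { content: "lo" }, finish_reason: null }] },
  { ...base, choices: [{ index: 0, delta: {}, finish_reason: "stop" }] },
];
const sseBody = events.map((e) => `data: ${JSON.stringify(e)}\n\n`).join("") + "data: [DONE]\n\n";
const streamed = collectStream(sseBody);
```

In a real client the body arrives incrementally, so the same per-line logic would run inside a read loop instead of over a complete string.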
Source: ChatCompletionChunk interface in packages/gateway/src/openai/openai-types.ts, streaming sequence in packages/gateway/src/openai/openai-completions.ts
Finish Reason Mapping
Comis maps its internal finish reasons to OpenAI-compatible values. Unknown reasons default to "stop".
| Comis Reason | OpenAI Reason | Description |
|---|---|---|
| stop | stop | Agent completed normally |
| max_steps | length | Maximum execution steps reached |
| budget_exceeded | stop | Token budget exhausted |
| circuit_open | stop | Circuit breaker tripped |
| context_loop | stop | Context loop detected |
| error | stop | Execution error occurred |
| (unknown) | stop | Any unrecognized reason |
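The table can be expressed as a small lookup with a fallback. This is an illustrative transcription of the rows above, not the actual FINISH_REASON_MAP, whose shape is not shown on this page; `toOpenAIFinishReason` is a hypothetical name.

```typescript
type OpenAIFinishReason = "stop" | "length";

// Illustrative copy of the mapping table; a Map avoids accidental hits on
// inherited object properties such as "constructor".
const finishReasons = new Map<string, OpenAIFinishReason>([
  ["stop", "stop"],
  ["max_steps", "length"],
  ["budget_exceeded", "stop"],
  ["circuit_open", "stop"],
  ["context_loop", "stop"],
  ["error", "stop"],
]);

// Unrecognized reasons fall back to "stop", as the table states.
function toOpenAIFinishReason(comisReason: string): OpenAIFinishReason {
  return finishReasons.get(comisReason) ?? "stop";
}
```

Note that only max_steps surfaces as "length"; every other reason, including errors, reaches the client as "stop".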
Source: FINISH_REASON_MAP in packages/gateway/src/openai/openai-types.ts
Session Key
Each OpenAI API request creates an ephemeral session with an auto-generated key. The peerId uses the completion ID, making each request a unique session. As a result, no conversation history carries over between requests: each call is stateless from the agent’s perspective.
Source: Session key construction in packages/gateway/src/openai/openai-completions.ts
Model Resolution
If the gateway has a resolveModel function wired, the model field is validated against the model catalog. If the model is not found, the endpoint returns a 404 error (Model not found).
Models
List Models
GET /v1/models
Returns all configured models in OpenAI’s List Models format.
GET /v1/models Response
Get Model
GET /v1/models/:model_id
Retrieves a single model by its ID. Model IDs use provider/modelId format
(e.g., anthropic/claude-sonnet-4-5-20250929). The route uses a wildcard param
to accommodate the slash in the ID.
Success:
GET /v1/models/anthropic/claude-sonnet-4-5-20250929
Model Format
Each model object has the following fields:
| Field | Type | Description |
|---|---|---|
| id | string | "provider/modelId" format |
| object | "model" | Fixed object type |
| created | number | Always 0 (no creation timestamp tracked) |
| owned_by | string | Provider name from model catalog |
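To make the shape concrete, here is a sketch of how a model object could be assembled and how a client might split the ID back apart. `describeModel` and `splitModelId` are hypothetical stand-ins; the real toOpenAIModel() signature is not shown on this page. Splitting on the first slash only is an assumption, made in case a model ID itself contains a slash.

```typescript
interface OpenAIModel {
  id: string;
  object: "model";
  created: number;
  owned_by: string;
}

// Hypothetical stand-in for the conversion described by the table.
function describeModel(provider: string, modelId: string): OpenAIModel {
  return {
    id: `${provider}/${modelId}`,
    object: "model",
    created: 0, // no creation timestamp is tracked
    owned_by: provider,
  };
}

// Splits "provider/modelId" on the first slash; returns null for values
// without a provider part (for example a bare model id or an agent id).
function splitModelId(id: string): { provider: string; modelId: string } | null {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) return null;
  return { provider: id.slice(0, slash), modelId: id.slice(slash + 1) };
}

const model = describeModel("anthropic", "claude-sonnet-4-5-20250929");
const parts = splitModelId(model.id);
```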
Source: toOpenAIModel() in packages/gateway/src/openai/openai-models.ts
Embeddings
Request
POST /v1/embeddings
The request body is validated against EmbeddingsRequestSchema.
| Field | Type | Required | Constraints | Default | Description |
|---|---|---|---|---|---|
| model | string | Yes | min(1) | — | Embedding model identifier |
| input | string \| string[] | Yes | — | — | Text to embed (single string or array) |
| encoding_format | "float" \| "base64" | No | — | "float" | Output encoding format |
Example Requests
Single string input
Batch input
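The two input forms could be built as follows. This is a sketch: `embeddingsRequest` is a hypothetical helper, and "my-embedding-model" is a placeholder, since this page does not name a specific embedding model.

```typescript
type EncodingFormat = "float" | "base64";

interface EmbeddingsRequest {
  model: string;
  input: string | string[];
  encoding_format?: EncodingFormat;
}

// Hypothetical helper mirroring the request table: model must be non-empty,
// input is a single string or an array, encoding_format defaults to "float".
function embeddingsRequest(
  model: string,
  input: string | string[],
  encodingFormat: EncodingFormat = "float",
): EmbeddingsRequest {
  if (model.length < 1) throw new Error("model must be non-empty");
  return { model, input, encoding_format: encodingFormat };
}

// Single string input.
const single = embeddingsRequest("my-embedding-model", "The quick brown fox");

// Batch input: one embedding is returned per array element, and
// data[].index gives each element's position in the input array.
const batch = embeddingsRequest("my-embedding-model", ["first document", "second document"]);

// JSON strings to send as the POST body of /v1/embeddings.
const singleBody = JSON.stringify(single);
const batchBody = JSON.stringify(batch);
```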
Response
POST /v1/embeddings Response
| Field | Type | Description |
|---|---|---|
| object | "list" | Fixed collection type |
| data[].object | "embedding" | Fixed embedding type |
| data[].embedding | number[] | Embedding vector |
| data[].index | number | Position in input array |
| model | string | Model ID from the embedding port |
| usage.prompt_tokens | number | Estimated token count (ceil(chars / 4)) |
| usage.total_tokens | number | Same as prompt_tokens (no completion tokens) |
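The usage estimate is simple arithmetic, sketched below. Two details are assumptions, since the table only gives ceil(chars / 4): characters are counted as JavaScript string length (UTF-16 code units), and for batch input the characters are summed across all strings before dividing. `estimatePromptTokens` is a hypothetical name.

```typescript
// Hypothetical re-implementation of the documented estimate ceil(chars / 4).
// Assumption: batch inputs are summed first, then divided once.
function estimatePromptTokens(input: string | string[]): number {
  const texts = Array.isArray(input) ? input : [input];
  const chars = texts.reduce((sum, text) => sum + text.length, 0);
  return Math.ceil(chars / 4);
}

// "hello world" has 11 characters: ceil(11 / 4) = 3.
const singleEstimate = estimatePromptTokens("hello world");

// "abcd" + "ef" is 6 characters in total: ceil(6 / 4) = 2.
const batchEstimate = estimatePromptTokens(["abcd", "ef"]);
```

Because this is an estimate and not a tokenizer count, the reported usage can differ from what the embedding provider actually bills.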
404: returned when there is no embedding provider.
Source: EmbeddingsRequestSchema and createOpenaiEmbeddingsRoute() in packages/gateway/src/openai/openai-embeddings.ts
Error Responses
All endpoints return errors in the OpenAIErrorResponse format:
Error Response Format
| Field | Type | Description |
|---|---|---|
| error.message | string | Human-readable error description |
| error.type | string | Error category (see mapping below) |
| error.param | string \| null | The parameter that caused the error (if applicable) |
| error.code | string \| null | Always null |
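A sketch of how such an error body could be assembled from an HTTP status, using the status-to-type mapping documented under Error Type Mapping. The names `statusToErrorType` and `buildError` are hypothetical and only approximate STATUS_TO_ERROR_TYPE and createOpenAIError(), whose real signatures are not shown on this page. The fallback to "server_error" for unlisted statuses is inferred from this page and should be treated as an assumption.

```typescript
interface OpenAIErrorResponse {
  error: {
    message: string;
    type: string;
    param: string | null;
    code: string | null;
  };
}

// Illustrative copy of the documented status mapping.
const statusToErrorType = new Map<number, string>([
  [400, "invalid_request_error"],
  [401, "authentication_error"],
  [403, "permission_error"],
  [404, "not_found_error"],
  [429, "rate_limit_error"],
  [500, "server_error"],
]);

// Hypothetical builder; code is always null per the format table.
function buildError(
  status: number,
  message: string,
  param: string | null = null,
): OpenAIErrorResponse {
  return {
    error: {
      message,
      type: statusToErrorType.get(status) ?? "server_error", // assumed fallback
      param,
      code: null,
    },
  };
}

const notFound = buildError(404, "Model not found", "model");
```

Clients can branch on error.type for retry logic, for example backing off on rate_limit_error and failing fast on authentication_error.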
Error Type Mapping
| HTTP Status | OpenAI Error Type | Typical Cause |
|---|---|---|
| 400 | invalid_request_error | Missing required field, invalid value |
| 401 | authentication_error | Missing or invalid bearer token |
| 403 | permission_error | Insufficient token scopes |
| 404 | not_found_error | Model not found, no embedding provider |
| 429 | rate_limit_error | Rate limit exceeded |
| 500 | server_error | Internal error during execution |

Any other status code maps to "server_error".
Source: STATUS_TO_ERROR_TYPE and createOpenAIError() in packages/gateway/src/openai/openai-types.ts
Configuration
The OpenAI-compatible API is enabled automatically when the gateway is running. The relevant configuration fields are defined in GatewayConfigSchema.
Source: GatewayConfigSchema in packages/core/src/config/schema-gateway.ts
Related
HTTP Gateway
Full gateway endpoint reference
JSON-RPC
JSON-RPC method reference
WebSocket
WebSocket streaming reference
Rate Limiting
Rate limiting configuration
