What this is for: point any OpenAI-API-compatible client at Comis and have it talk to your local agents.
Who it’s for: developers integrating Comis into existing OpenAI-shaped tooling (LangChain, custom GPTs, third-party UIs).
Comis exposes an OpenAI-compatible REST API under the /v1/ prefix. Any client library, proxy, or tool that speaks the OpenAI Chat Completions protocol can connect to Comis without modification. The API supports chat completions (streaming and non-streaming), model listing, embeddings, and the OpenAI Responses API.
Authentication uses the same gateway bearer tokens as JSON-RPC, so a token with rpc or api scope works.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/chat/completions | Send messages and receive agent responses |
| GET | /v1/models | List all configured models |
| GET | /v1/models/:id | Retrieve a single model by ID |
| POST | /v1/embeddings | Generate text embeddings |
The gateway also mounts POST /v1/responses (OpenAI Responses API). It is listed in the HTTP Gateway routes reference but is intentionally not described here because the OpenAI Responses spec is separate from the Chat Completions spec.

Authentication

All endpoints require a valid bearer token configured in gateway.tokens[]. Pass the token via the Authorization header:
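curl http://localhost:4766/v1/chat/completions \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "messages": [{"role": "user", "content": "Hello"}]}'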
Tokens are defined in config with a minimum 32-character secret and optional scope restrictions:
gateway:
  tokens:
    - id: "my-client"
      secret: "a-secure-token-at-least-32-characters-long"
      scopes: ["rpc", "ws"]
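The same header can be attached from code. The following is a minimal sketch using only the Python standard library; the helper name build_request is hypothetical, and the base URL assumes the default host and port shown in the Configuration section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:4766/v1"  # assumed default; adjust to your gateway


def build_request(path, token, body=None):
    """Build a urllib Request that carries the gateway bearer token."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        # A JSON body turns the request into a POST.
        req.add_header("Content-Type", "application/json")
    return req


# Sending needs a running gateway, so it is left commented out here:
# with urllib.request.urlopen(build_request("/models", "your-token")) as resp:
#     print(json.load(resp))
```

A request built without a body is a GET (suitable for /v1/models); passing a body produces a POST.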

Chat Completions

Request

POST /v1/chat/completions
The request body is validated against ChatCompletionRequestSchema. All fields must match exactly; unknown fields are rejected (strictObject).
| Field | Type | Required | Constraints | Default | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | Yes | min(1) | | Model identifier. Accepted forms (validated against the configured agents, the same catalog /v1/models advertises): "provider/model" (e.g. "anthropic/claude-sonnet-4-5-20250929"), the bare model id, or an agent id (e.g. "default"). An unknown model returns 404 Model not found. |
| messages | ChatMessage[] | Yes | min(1) array | | Conversation messages in order |
| stream | boolean | No | | false | Enable Server-Sent Events streaming |
| temperature | number | No | min(0), max(2) | | Sampling temperature |
| max_tokens | number | No | Positive integer | | Maximum tokens to generate |
| stream_options | StreamOptions | No | | | Streaming behavior options |
Source: ChatCompletionRequestSchema in packages/gateway/src/openai/openai-types.ts
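Because unknown fields are rejected, it can help to check a request body on the client before sending it. The sketch below restates the constraints from the table in Python; it is not the gateway's schema, and the function name validate_chat_request is hypothetical.

```python
ALLOWED_FIELDS = {
    "model", "messages", "stream", "temperature", "max_tokens", "stream_options",
}


def validate_chat_request(body):
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []
    unknown = set(body) - ALLOWED_FIELDS
    if unknown:
        problems.append(f"unknown fields: {sorted(unknown)}")
    model = body.get("model")
    if not isinstance(model, str) or len(model) < 1:
        problems.append("model must be a non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        problems.append("messages must be a non-empty array")
    if "stream" in body and not isinstance(body["stream"], bool):
        problems.append("stream must be a boolean")
    if "temperature" in body:
        t = body["temperature"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(t, bool) or not isinstance(t, (int, float)) or not 0 <= t <= 2:
            problems.append("temperature must be a number between 0 and 2")
    if "max_tokens" in body:
        m = body["max_tokens"]
        if isinstance(m, bool) or not isinstance(m, int) or m <= 0:
            problems.append("max_tokens must be a positive integer")
    return problems
```

A body carrying a field such as top_p, which many OpenAI clients send by default, would be flagged here before the gateway rejects it.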

Message Format

Each message in the messages array follows ChatMessageSchema:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | "system" \| "user" \| "assistant" | Yes | Message role |
| content | string | Yes | Message text content |
System messages are concatenated (joined with newline) and passed as the system prompt. The last user message becomes the agent input. If no user message is found, the request returns 400.
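The handling described above can be sketched as follows. This is an illustration of the stated rules, not the gateway's code; the function name split_messages is hypothetical.

```python
def split_messages(messages):
    """Return (system_prompt, agent_input) from a list of chat messages."""
    # All system messages are joined with a newline to form the system prompt.
    system_prompt = "\n".join(
        m["content"] for m in messages if m["role"] == "system"
    )
    user_messages = [m["content"] for m in messages if m["role"] == "user"]
    if not user_messages:
        # The gateway answers such a request with HTTP 400.
        raise ValueError("no user message found")
    # Only the last user message becomes the agent input.
    return system_prompt, user_messages[-1]
```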

Stream Options

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| include_usage | boolean | No | Include token usage in the final stream chunk |

Example Request

POST /v1/chat/completions
{
  "model": "anthropic/claude-sonnet-4-5-20250929",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the weather like today?" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}

Non-Streaming Response

When stream is false (default), the endpoint returns a single JSON object following the ChatCompletion interface:
ChatCompletion Response
{
  "id": "chatcmpl-550e8400-e29b-41d4-a716-446655440000",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "anthropic/claude-sonnet-4-5-20250929",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I don't have access to real-time weather data..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 128,
    "total_tokens": 170
  }
}
| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique completion ID with chatcmpl- prefix and UUID |
| object | "chat.completion" | Fixed object type |
| created | number | Unix timestamp (seconds) |
| model | string | Model identifier from request |
| choices | Choice[] | Array with single choice (index 0) |
| choices[].message.role | "assistant" | Always "assistant" |
| choices[].message.content | string \| null | Response text |
| choices[].finish_reason | string | Mapped finish reason (see table below) |
| usage.prompt_tokens | number | Input token count |
| usage.completion_tokens | number | Output token count |
| usage.total_tokens | number | Total token count |
Source: ChatCompletion interface in packages/gateway/src/openai/openai-types.ts
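Reading the reply out of this object takes only a few lines. The sketch below assumes the response shape documented above; the function name extract_reply is hypothetical.

```python
import json


def extract_reply(raw):
    """Return (text, finish_reason, total_tokens) from a ChatCompletion JSON string."""
    body = json.loads(raw)
    choice = body["choices"][0]  # the array holds a single choice at index 0
    content = choice["message"]["content"]  # typed as string | null
    return content or "", choice["finish_reason"], body["usage"]["total_tokens"]
```

Treating a null content as an empty string keeps downstream string handling simple.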

Streaming Response

When stream is true, the endpoint returns Server-Sent Events (SSE). Each event contains a ChatCompletionChunk serialized as JSON. The streaming sequence follows five steps:
Step 1: Role chunk. Announces the assistant role.
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1710000000,"model":"anthropic/claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
Step 2: Content chunks. One event per text delta (N chunks).
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1710000000,"model":"anthropic/claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
Step 3: Finish chunk. Contains the finish_reason and an empty delta.
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1710000000,"model":"anthropic/claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
Step 4: Usage chunk. Token counts with an empty choices array.
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1710000000,"model":"anthropic/claude-sonnet-4-5-20250929","choices":[],"usage":{"prompt_tokens":42,"completion_tokens":128,"total_tokens":170}}
Step 5: Terminal marker. Signals end of stream.
data: [DONE]
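A client can consume this sequence by reading data: lines until the terminal marker. The sketch below works on any iterable of already-decoded text lines (for example, lines read from an HTTP response); the function name collect_stream is hypothetical, and the parser deliberately ignores SSE features that the sequence above does not use.

```python
import json


def collect_stream(lines):
    """Accumulate a streamed completion; returns (text, finish_reason, usage)."""
    parts, finish_reason, usage = [], None, None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # terminal marker: nothing follows
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]  # usage chunk has an empty choices array
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, usage
```

The role chunk contributes no text, so it passes through the loop without effect.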

ChatCompletionChunk Interface

| Field | Type | Description |
| --- | --- | --- |
| id | string | Same chatcmpl- ID for all chunks in one completion |
| object | "chat.completion.chunk" | Fixed chunk object type |
| created | number | Unix timestamp (seconds) |
| model | string | Model identifier from request |
| choices | ChunkChoice[] | Array with delta content |
| choices[].index | number | Always 0 |
| choices[].delta.role | "assistant" \| undefined | Present only in role chunk |
| choices[].delta.content | string \| undefined | Present in content chunks |
| choices[].finish_reason | string \| null | null until finish chunk |
| usage | Usage \| undefined | Present only in usage chunk |
Source: ChatCompletionChunk interface in packages/gateway/src/openai/openai-types.ts, streaming sequence in packages/gateway/src/openai/openai-completions.ts

Finish Reason Mapping

Comis maps its internal finish reasons to OpenAI-compatible values. Unknown reasons default to "stop".
| Comis Reason | OpenAI Reason | Description |
| --- | --- | --- |
| stop | stop | Agent completed normally |
| max_steps | length | Maximum execution steps reached |
| budget_exceeded | stop | Token budget exhausted |
| circuit_open | stop | Circuit breaker tripped |
| context_loop | stop | Context loop detected |
| error | stop | Execution error occurred |
| (unknown) | stop | Any unrecognized reason |
Source: FINISH_REASON_MAP in packages/gateway/src/openai/openai-types.ts
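The mapping amounts to a lookup with a default. The following is a Python restatement of the table for illustration; the real map lives in the TypeScript source named above, and the function name to_openai_finish_reason is hypothetical.

```python
# Python mirror of the table above, for illustration only.
FINISH_REASON_MAP = {
    "stop": "stop",
    "max_steps": "length",
    "budget_exceeded": "stop",
    "circuit_open": "stop",
    "context_loop": "stop",
    "error": "stop",
}


def to_openai_finish_reason(comis_reason):
    """Map a Comis finish reason to its OpenAI value; unknown reasons become "stop"."""
    return FINISH_REASON_MAP.get(comis_reason, "stop")
```

Since only max_steps maps to "length", a client cannot distinguish an execution error from a normal stop through finish_reason alone.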

Session Key

Each OpenAI API request creates an ephemeral session with an auto-generated key:
{
  "userId": "openai-api",
  "channelId": "openai",
  "peerId": "chatcmpl-<uuid>"
}
The peerId uses the completion ID, making each request a unique session. This means there is no conversation history between requests — each call is stateless from the agent’s perspective.
Source: Session key construction in packages/gateway/src/openai/openai-completions.ts
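To make the statelessness concrete, here is a sketch of how such a key could be built. It assumes the ID is a random (version 4) UUID, which the source does not state; the function name new_session_key is hypothetical.

```python
import uuid


def new_session_key():
    """Build an ephemeral session key whose peerId is a fresh completion ID."""
    completion_id = f"chatcmpl-{uuid.uuid4()}"  # assumption: random UUID per request
    return {
        "userId": "openai-api",
        "channelId": "openai",
        "peerId": completion_id,
    }
```

Two calls never share a peerId, so two requests never land in the same session.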

Model Resolution

If the gateway has a resolveModel function wired, the model field is validated against the model catalog. If the model is not found, the endpoint returns a 404 error:
{
  "error": {
    "message": "Model not found: invalid/model",
    "type": "not_found_error",
    "param": null,
    "code": null
  }
}

Models

List Models

GET /v1/models
Returns all configured models in OpenAI’s List Models format.
GET /v1/models Response
{
  "object": "list",
  "data": [
    {
      "id": "anthropic/claude-sonnet-4-5-20250929",
      "object": "model",
      "created": 0,
      "owned_by": "anthropic"
    },
    {
      "id": "openai/gpt-4o",
      "object": "model",
      "created": 0,
      "owned_by": "openai"
    }
  ]
}

Get Model

GET /v1/models/:model_id
Retrieves a single model by its ID. Model IDs use provider/modelId format (e.g., anthropic/claude-sonnet-4-5-20250929). The route uses a wildcard param to accommodate the slash in the ID.
Success:
GET /v1/models/anthropic/claude-sonnet-4-5-20250929
{
  "id": "anthropic/claude-sonnet-4-5-20250929",
  "object": "model",
  "created": 0,
  "owned_by": "anthropic"
}
Not found:
{
  "error": {
    "message": "Model not found",
    "type": "not_found_error",
    "param": null,
    "code": null
  }
}
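When building the lookup URL in code, the slash in the model ID should stay literal so the provider/modelId pair reaches the wildcard route as written. The sketch below assumes that form; whether the gateway also accepts a percent-encoded slash (%2F) is not stated here. The function name model_url is hypothetical.

```python
from urllib.parse import quote


def model_url(base_url, model_id):
    """Build the Get Model URL, escaping everything except the slash."""
    # safe="/" keeps the provider/modelId separator unescaped.
    return f"{base_url.rstrip('/')}/models/{quote(model_id, safe='/')}"
```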

Model Format

Each model object has the following fields:
| Field | Type | Description |
| --- | --- | --- |
| id | string | "provider/modelId" format |
| object | "model" | Fixed object type |
| created | number | Always 0 (no creation timestamp tracked) |
| owned_by | string | Provider name from model catalog |
Source: toOpenAIModel() in packages/gateway/src/openai/openai-models.ts

Embeddings

Request

POST /v1/embeddings
The request body is validated against EmbeddingsRequestSchema.
| Field | Type | Required | Constraints | Default | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | Yes | min(1) | | Embedding model identifier |
| input | string \| string[] | Yes | | | Text to embed (single string or array) |
| encoding_format | "float" \| "base64" | No | | "float" | Output encoding format |

Example Requests

Single string input
{
  "model": "openai/text-embedding-3-small",
  "input": "Hello, world!"
}
Batch input
{
  "model": "openai/text-embedding-3-small",
  "input": ["Hello", "World"],
  "encoding_format": "float"
}

Response

POST /v1/embeddings Response
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023, -0.009, 0.015, ...],
      "index": 0
    }
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {
    "prompt_tokens": 3,
    "total_tokens": 3
  }
}
| Field | Type | Description |
| --- | --- | --- |
| object | "list" | Fixed collection type |
| data[].object | "embedding" | Fixed embedding type |
| data[].embedding | number[] | Embedding vector |
| data[].index | number | Position in input array |
| model | string | Model ID from the embedding port |
| usage.prompt_tokens | number | Estimated token count (ceil(chars / 4)) |
| usage.total_tokens | number | Same as prompt_tokens (no completion tokens) |
If no embedding provider is configured, the endpoint returns 404:
{
  "error": {
    "message": "No embedding provider configured",
    "type": "not_found_error",
    "param": null,
    "code": null
  }
}
Source: EmbeddingsRequestSchema and createOpenaiEmbeddingsRoute() in packages/gateway/src/openai/openai-embeddings.ts
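Two details of the response are worth a sketch: the usage estimate and the index field. The code below reproduces the ceil(chars / 4) estimate and pairs each vector with its input text. It assumes that, for batch input, characters are summed across all strings before rounding, which the source does not spell out; both function names are hypothetical.

```python
import math


def estimate_prompt_tokens(input_value):
    """Approximate usage.prompt_tokens as ceil(chars / 4).

    Assumption: for a batch, characters are summed over all strings first.
    """
    texts = [input_value] if isinstance(input_value, str) else list(input_value)
    return math.ceil(sum(len(t) for t in texts) / 4)


def pair_embeddings(input_value, response):
    """Return (text, vector) pairs ordered by data[].index."""
    texts = [input_value] if isinstance(input_value, str) else list(input_value)
    items = sorted(response["data"], key=lambda item: item["index"])
    return [(texts[item["index"]], item["embedding"]) for item in items]
```

Because the count is an estimate derived from character length, it should not be used for exact token accounting.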

Error Responses

All endpoints return errors in the OpenAIErrorResponse format:
Error Response Format
{
  "error": {
    "message": "Description of what went wrong",
    "type": "invalid_request_error",
    "param": "model",
    "code": null
  }
}
| Field | Type | Description |
| --- | --- | --- |
| error.message | string | Human-readable error description |
| error.type | string | Error category (see mapping below) |
| error.param | string \| null | The parameter that caused the error (if applicable) |
| error.code | string \| null | Always null |

Error Type Mapping

| HTTP Status | OpenAI Error Type | Typical Cause |
| --- | --- | --- |
| 400 | invalid_request_error | Missing required field, invalid value |
| 401 | authentication_error | Missing or invalid bearer token |
| 403 | permission_error | Insufficient token scopes |
| 404 | not_found_error | Model not found, no embedding provider |
| 429 | rate_limit_error | Rate limit exceeded |
| 500 | server_error | Internal error during execution |
Unknown HTTP status codes fall back to "server_error".
Source: STATUS_TO_ERROR_TYPE and createOpenAIError() in packages/gateway/src/openai/openai-types.ts
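The status-to-type mapping and the error envelope can be restated in a few lines. This is a Python illustration of the table and format above, not the TypeScript implementation; the function name make_error is hypothetical.

```python
# Python mirror of the mapping table above, for illustration only.
STATUS_TO_ERROR_TYPE = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_error",
    404: "not_found_error",
    429: "rate_limit_error",
    500: "server_error",
}


def make_error(status, message, param=None):
    """Build an OpenAI-style error body for the given HTTP status."""
    return {
        "error": {
            "message": message,
            # Unlisted status codes fall back to "server_error".
            "type": STATUS_TO_ERROR_TYPE.get(status, "server_error"),
            "param": param,
            "code": None,  # always null in this API
        }
    }
```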

Configuration

The OpenAI-compatible API is enabled automatically when the gateway is running. Relevant configuration fields:
gateway:
  enabled: true          # Enable the gateway server (default: true)
  host: "127.0.0.1"     # Bind address (default: loopback only)
  port: 4766            # Listen port (default: 4766)
  tokens:               # Bearer tokens for authentication
    - id: "client-1"
      secret: "your-secret-at-least-32-characters"
      scopes: []        # Empty = all scopes
  rateLimit:
    windowMs: 60000     # Rate limit window (default: 60s)
    maxRequests: 100    # Max requests per window (default: 100)
  httpBodyLimitBytes: 1048576  # Max POST body (default: 1MB)
Source: GatewayConfigSchema in packages/core/src/config/schema-gateway.ts
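One way to produce a secret that meets the 32-character minimum is Python's secrets module. The sketch below builds a dictionary shaped like one gateway.tokens[] entry; the function name new_token_entry is hypothetical, and the result still has to be written into the config file by hand or by your own tooling.

```python
import secrets

MIN_SECRET_LENGTH = 32  # minimum stated in the Authentication section


def new_token_entry(token_id, scopes=()):
    """Build one gateway.tokens[] entry with a freshly generated secret."""
    # 32 random bytes encode to 43 URL-safe characters, above the minimum.
    secret = secrets.token_urlsafe(32)
    if len(secret) < MIN_SECRET_LENGTH:
        raise ValueError("generated secret is too short")
    # An empty scopes list means all scopes, per the config comment above.
    return {"id": token_id, "secret": secret, "scopes": list(scopes)}
```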

See also:
- HTTP Gateway: full gateway endpoint reference
- JSON-RPC: JSON-RPC method reference
- WebSocket: WebSocket streaming reference
- Rate Limiting: rate limiting configuration