What this is for: the single source of truth for everything you can put in ~/.comis/config.yaml.

Who it’s for: anyone tuning agent behavior, wiring channels, hardening the gateway, or composing layered configs across environments.

Comis uses a single config.yaml file (or multiple layered files) to configure every aspect of the system. The configuration contains 41 top-level sections with 54+ nested schemas defining agent behavior, channel adapters, security policies, gateway settings, and more. All schemas use Zod strict validation, so unknown keys are rejected at startup. Config files are specified via the COMIS_CONFIG_PATHS environment variable (colon-separated, like the shell PATH). See the Configuration Guide for a step-by-step setup walkthrough.

Config Loading

Comis loads configuration through a layered resolution process:
  1. Defaults — Every schema field has a sensible default. An empty config file produces a valid configuration.
  2. YAML files — Files listed in COMIS_CONFIG_PATHS are merged in order (later files override earlier ones).
  3. Environment overrides — Env vars override YAML values at runtime.
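
A minimal sketch of the layering described above, assuming two hypothetical files, /etc/comis/base.yaml and /etc/comis/prod.yaml, listed in that order in COMIS_CONFIG_PATHS. Only top-level scalars are overridden here; how nested maps are merged is not covered by this sketch.

```yaml
# COMIS_CONFIG_PATHS=/etc/comis/base.yaml:/etc/comis/prod.yaml   (hypothetical paths)
# The two YAML documents below, separated by "---", stand in for the two files.

# /etc/comis/base.yaml: loaded first
tenantId: my-project
logLevel: debug
---
# /etc/comis/prod.yaml: loaded second, so its logLevel overrides the first file's
logLevel: warn
```

With this order the resolved configuration would carry tenantId: my-project from the first file and logLevel: warn from the second, while every other field keeps its schema default.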

Special Directives

| Directive | Description | Example |
| --- | --- | --- |
| `$include` | Reference another YAML file for modular composition (max 10 levels deep) | `$include: ./channels.yaml` |
| `${VAR_NAME}` | Substitute an environment variable or stored secret via SecretManager | `token: "${TELEGRAM_BOT_TOKEN}"` |
| `$${VAR_NAME}` | Escape syntax; produces a literal `${VAR_NAME}` in the output | `example: "$${NOT_SUBSTITUTED}"` |
| `$VAR_NAME` | Bare reference; auto-corrected to `${VAR_NAME}` with a warning | `token: $MY_TOKEN` (corrected) |

Strict validation (z.strictObject) means any unrecognized key in your config file causes a startup error. Check key names carefully against this reference.
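
To make the difference between substitution and escaping concrete, here is a small fragment that uses both forms. The keys are ones documented elsewhere in this reference; the agent name value is purely illustrative.

```yaml
agents:
  default:
    # Escaped: per the table above, the loaded value is the literal text
    # "Bot ${ENV}" and nothing is substituted.
    name: "Bot $${ENV}"

gateway:
  tokens:
    - id: default
      # Substituted at load time from an environment variable or stored secret.
      secret: "${COMIS_GATEWAY_TOKEN}"
      scopes: ["*"]
```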

Minimal Example

# ~/.comis/config.yaml
tenantId: my-project
logLevel: info

agents:
  assistant:
    provider: anthropic
    model: claude-sonnet-4-5-20250929

channels:
  telegram:
    enabled: true
    token: "${TELEGRAM_BOT_TOKEN}"

gateway:
  enabled: true
  port: 4766
  tokens:
    - id: default
      secret: "${COMIS_GATEWAY_TOKEN}"
      scopes: ["*"]

Complete Annotated Example

A realistic single-tenant deployment running one Telegram-connected research assistant with persistent memory, gateway auth, scheduled heartbeat, and approval gates for sensitive actions. Copy this as a starting point and trim anything you don’t need — defaults take care of the rest.
# ~/.comis/config.yaml — annotated reference deployment
tenantId: research-lab            # Used to scope memory and sessions
logLevel: info                    # trace/debug/info/warn/error/fatal
dataDir: ""                       # Empty resolves to ~/.comis

# -- One agent: research assistant on Anthropic Sonnet 4.5 --------------------
agents:
  research-bot:
    name: "Research Bot"
    provider: anthropic
    model: claude-sonnet-4-5-20250929
    maxSteps: 50                  # Reasoning steps per execution
    cacheRetention: long          # Anthropic prompt cache retention: none | short | long
    budgets:
      perExecution: 1000000       # 1M tokens per single run
      perHour:      5000000
      perDay:       50000000
    rag:
      enabled: true               # Auto-retrieve relevant memory before LLM call
      maxResults: 8
      minScore: 0.15
    skills:
      toolPolicy:
        profile: full             # minimal | coding | messaging | supervisor | full
        deny: ["exec"]            # No shell exec for this agent
    session:
      resetPolicy:
        mode: hybrid              # daily + idle whichever fires first
        dailyResetHour: 4
        idleTimeoutMs: 14400000   # 4 hours
    scheduler:
      heartbeat:
        enabled: true             # Proactive check-ins
        intervalMs: 1800000       # 30 min

# -- One inbound channel ------------------------------------------------------
channels:
  telegram:
    enabled: true
    botToken: "${TELEGRAM_BOT_TOKEN}"
    allowFrom: ["123456789"]      # Only this Telegram user can reach the agent
    mediaProcessing:
      transcribeAudio: true
      analyzeImages:   true

# -- Memory: SQLite + OpenAI embeddings --------------------------------------
memory:
  dbPath: "memory.db"
  walMode: true
  embeddingModel: "text-embedding-3-small"
  embeddingDimensions: 1536
  retention:
    maxAgeDays: 365               # Keep one year of memory entries (0 = no age limit)

# -- Gateway: HTTP + WebSocket on loopback only ------------------------------
gateway:
  enabled: true
  host: "127.0.0.1"               # 0.0.0.0 only with TLS in front
  port: 4766
  tokens:
    - id: cli-default
      secret: "${COMIS_GATEWAY_TOKEN}"
      scopes: ["*"]
  rateLimit:
    windowMs: 60000
    maxRequests: 200

# -- Approvals: gate destructive actions -------------------------------------
approvals:
  enabled: true
  defaultPolicy: prompt           # prompt | allow | deny
  rules:
    - match: { tool: "exec", argMatches: ["rm ", "DROP ", "delete from"] }
      action: deny
    - match: { tool: "exec" }
      action: prompt

# -- Scheduler: cron + heartbeat ---------------------------------------------
scheduler:
  cron:
    enabled: true
    defaultTimezone: "America/New_York"
  heartbeat:
    enabled: true
    intervalMs: 600000            # 10 min global heartbeat coalescer
The values shown are real (not placeholders). Save it as ~/.comis/config.yaml, populate ~/.comis/.env with TELEGRAM_BOT_TOKEN and COMIS_GATEWAY_TOKEN, and comis daemon start will boot a working deployment.

Quick Reference

All 41 top-level configuration sections at a glance:
| Section | Type | Description |
| --- | --- | --- |
| `tenantId` | `string` | Tenant identifier for multi-tenancy |
| `logLevel` | `enum` | Global log level (trace/debug/info/warn/error/fatal) |
| `dataDir` | `string` | Base data directory for persistent storage |
| `agentDir` | `string` | SDK agent directory for persistent settings |
| `agents` | `Record<string, PerAgentConfig>` | Multi-agent configuration map |
| `channels` | `ChannelConfig` | Channel adapter configuration |
| `memory` | `MemoryConfig` | Memory system configuration |
| `security` | `SecurityConfig` | Security, audit, and secrets configuration |
| `routing` | `RoutingConfig` | Multi-agent routing dispatch |
| `daemon` | `DaemonConfig` | Daemon process, logging, and watchdog |
| `scheduler` | `SchedulerConfig` | Cron, heartbeat, and task automation |
| `gateway` | `GatewayConfig` | HTTPS server, tokens, TLS, rate limiting |
| `integrations` | `IntegrationsConfig` | External services (MCP, search, media) |
| `monitoring` | `MonitoringConfig` | System health monitoring |
| `observability` | `ObservabilityConfig` | Observability persistence (retention, snapshots) |
| `diagnostics` | `DiagnosticsConfig` | Trajectory JSONL recording, config-audit log path, cache-trace enable/disable knobs |
| `plugins` | `PluginsConfig` | Plugin system configuration |
| `queue` | `QueueConfig` | Command queue and concurrency control |
| `streaming` | `StreamingConfig` | Block streaming and delivery |
| `autoReplyEngine` | `AutoReplyEngineConfig` | Agent activation for inbound messages |
| `sendPolicy` | `SendPolicyConfig` | Outbound message gating rules |
| `embedding` | `EmbeddingConfig` | Embedding provider (local GGUF or OpenAI) |
| `envelope` | `EnvelopeConfig` | Message metadata enrichment |
| `browser` | `BrowserConfig` | Browser automation (CDP, headless Chrome) |
| `models` | `ModelsConfig` | Model catalog and alias configuration |
| `providers` | `ProvidersConfig` | LLM provider entries (API keys, endpoints) |
| `messages` | `MessagesConfig` | Messaging UX (splitting, typing, receipts) |
| `approvals` | `ApprovalsConfig` | Action approval workflow |
| `webhooks` | `WebhooksConfig` | Webhook subsystem (path routing, HMAC) |
| `lifecycleReactions` | `LifecycleReactionsConfig` | Processing phase emoji reactions |
| `responsePrefix` | `ResponsePrefixConfig` | Response prefix/suffix template |
| `deliveryQueue` | `DeliveryQueueConfig` | Crash-safe outbound delivery queue |
| `deliveryMirror` | `DeliveryMirrorConfig` | Session delivery mirroring and deduplication |
| `outputRetention` | `OutputRetentionConfig` | Output retention housekeeper configuration: per-class retention window in milliseconds |
| `deliveryTiming` | `DeliveryTimingConfig` | Inter-block delivery pacing |
| `coalescer` | `CoalescerConfig` | Block coalescer for streaming |
| `senderTrustDisplay` | `SenderTrustDisplayConfig` | Sender identity display in envelope |
| `documentation` | `DocumentationConfig` | Documentation links for system prompt |
| `telegramFileRefGuard` | `TelegramFileRefGuardConfig` | Telegram file reference guard |
| `tooling` | `ToolingConfig` | Tool-first capability layer: MCP/skill hints, install-detour mode |
| `executor` | `ExecutorConfig` | Credential-broker executor configuration (optional; omit unless using the broker) |

Core Settings

Scalar fields at the root of the configuration.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `tenantId` | `string` | `"default"` | Tenant identifier for SaaS multi-tenancy. Used to scope memory and sessions. |
| `logLevel` | `enum` | `"debug"` | Global log level. Values: trace, debug, info, warn, error, fatal. |
| `dataDir` | `string` | `""` | Base data directory for all persistent storage. An empty string resolves to ~/.comis. |
| `agentDir` | `string` | `"~/.pi/agent"` | SDK agent directory for persistent settings. |

tenantId: my-project
logLevel: info
dataDir: /opt/comis/data
agentDir: ~/.pi/agent

Agents

Multi-agent configuration map. Each key is an agent ID, and the value is a PerAgentConfig object that extends AgentConfig with skills, scheduler, session, concurrency, and broadcast settings.

Type: Record<string, PerAgentConfig>

Default: { default: PerAgentConfig.parse({}) } (one default agent with all defaults)

AgentConfig Fields

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `name` | `string` | `"Comis"` | Display name for the agent |
| `model` | `string` | `"default"` | LLM model identifier. "default" resolves via models.defaultModel. |
| `provider` | `string` | `"default"` | LLM provider. "default" resolves via models.defaultProvider. |
| `maxSteps` | `number` | `150` | Maximum reasoning steps per execution |
| `thinkingLevel` | `enum` | (unset) | SDK thinking level override: off, minimal, low, medium, high, xhigh |
| `maxTokens` | `number` | (unset) | SDK max tokens override |
| `temperature` | `number` | (unset) | SDK temperature override (0-2) |
| `cacheRetention` | `enum` | `"long"` | Anthropic prompt cache retention: none, short, long |
| `maxContextChars` | `number` | `100000` | Maximum total characters for context window (~25k tokens) |
| `maxToolResultChars` | `number` | `50000` | Maximum characters per tool result before truncation. Small/nano-class models default lower (small 12000, nano 8000) so a single large result can’t fill the window; set this explicitly (or use capabilityClassOverride: frontier) to keep the full 50000. |
| `preserveRecent` | `number` | `4` | Minimum recent messages to always preserve during compaction |
| `workspacePath` | `string` | (unset) | Path to agent workspace directory containing identity files |
| `reactionLevel` | `enum` | (unset) | Reaction frequency: minimal (1 per 5-10 exchanges) or extensive (react freely) |
| `language` | `string` | (unset) | Reply language for deterministic degraded replies (context-exhausted / output-starved notices). Accepts a BCP-47 tag ("he") or an English display name ("Hebrew"). When omitted, Comis auto-detects the reply language from the USER.md preferred language, then the inbound message script (Hebrew, Arabic, and Russian/Cyrillic only). Does not affect the live agent reply, which always follows the user’s language. |
| `enforceFinalTag` | `boolean` | `false` | When enabled, only content inside `<final>` blocks reaches users. Suppresses all content outside final tags on both streaming and non-streaming paths. |
| `fastMode` | `boolean` | `false` | Enables fast mode for the LLM provider (provider-specific behavior). |
| `storeCompletions` | `boolean` | `false` | When enabled, sends store: true to OpenAI-compatible providers for completion storage. |
| `oauthProfiles` | `Record<string, string>` | `{}` | Per-provider OAuth profile preference. Keys are provider ids; values are profile ids in `<provider>:<identity>` form (e.g., openai-codex:user@example.com). The daemon resolves at LLM-call time as: this map -> lastGood per provider -> first available profile in the store. See OAuth concepts -> Multi-account profiles. |

Multi-account example — two agents, two ChatGPT accounts:
# ~/.comis/config.yaml
security:
  storage: file

agents:
  default:
    provider: openai-codex
    model: gpt-5.1
    oauthProfiles:
      openai-codex: "openai-codex:user@example.com"

  work-agent:
    provider: openai-codex
    model: gpt-5.1-codex-max
    oauthProfiles:
      openai-codex: "openai-codex:work@company.com"

Budget (agents.*.budgets)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `perExecution` | `number` | `2000000` | Max tokens per single execution |
| `perHour` | `number` | `10000000` | Max tokens per hour (rolling window) |
| `perDay` | `number` | `100000000` | Max tokens per day (rolling window) |

Circuit Breaker (agents.*.circuitBreaker)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `failureThreshold` | `number` | `5` | Consecutive failures before opening circuit |
| `resetTimeoutMs` | `number` | `60000` | Milliseconds before attempting recovery |
| `halfOpenTimeoutMs` | `number` | `30000` | Milliseconds for half-open probe timeout |
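
As an illustration of how the three settings fit together, the fragment below tightens the breaker for one agent. The key names come from the table above; the agent id and the chosen values are assumptions made for the example.

```yaml
agents:
  flaky-provider-bot:            # hypothetical agent id
    circuitBreaker:
      failureThreshold: 3        # open the circuit after 3 consecutive failures (default 5)
      resetTimeoutMs: 120000     # wait 2 minutes before attempting recovery (default 60000)
      halfOpenTimeoutMs: 15000   # give the half-open probe 15 seconds (default 30000)
```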

Model Routes (agents.*.modelRoutes)

Extensible key-value map of task type to model identifier. Any string key maps to a model ID string.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `default` | `string` | (unset) | Default model for unrouted tasks (falls back to agent.model) |
| (any key) | `string` | | Custom route name to model ID |
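
A short sketch of a route map, reusing the provider and model identifiers from the multi-account example earlier in this section. The route name heavy is hypothetical, and writing route values as bare model IDs (rather than some other form) is an assumption.

```yaml
agents:
  default:
    provider: openai-codex
    model: gpt-5.1
    modelRoutes:
      default: gpt-5.1            # unrouted tasks; if omitted, agent.model is used
      heavy: gpt-5.1-codex-max    # hypothetical custom route name -> model ID
```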

Operation Model Overrides (agents.*.operationModels)

Override the model and timeout for specific internal operation types. Each entry is an OperationModelEntry with optional model (a string of the form "provider:modelId") and timeout (a number, in milliseconds) fields. The key must be spelled timeout; timeoutMs is rejected by the strict config parser. Omit any type to use the agent’s primary model.

| Operation type | When it fires | Recommended model |
| --- | --- | --- |
| `cron` | Scheduled cron task execution | |
| `heartbeat` | Periodic heartbeat check | |
| `subagent` | Sub-agent task delegation | |
| `compaction` | Context compaction summarization | Capable model or contextEngine.compaction.strongerSummarizerModel |
| `taskExtraction` | Task extraction from conversation | |
| `condensation` | Memory condensation | Capable model |
| `verification` | Pre-delivery critic (R4). Fires when agents.`<id>`.verification.enabled=true and the response is a completion-claiming response meeting minResponseChars. | Cheap model: "anthropic:claude-haiku-4-5-20251001" or "ollama:qwen3.6:27b" for local self-check. Omit to use the primary model. |
| `planning` | Pre-execution planner (R5, deferrable on M2). | Capable model for checklist generation. Omit to use the primary model. |

agents:
  default:
    operationModels:
      verification:
        model: "anthropic:claude-haiku-4-5-20251001"
        timeout: 30000            # milliseconds; the key is "timeout", not "timeoutMs"
      compaction:
        model: "anthropic:claude-haiku-4-5-20251001"

Model Failover (agents.*.modelFailover)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `fallbackModels` | `FallbackModel[]` | `[]` | Ordered fallback models (provider + modelId pairs) |
| `authProfiles` | `AuthProfile[]` | `[]` | Per-provider API key profiles for auth rotation |
| `allowedModels` | `string[]` | `[]` | Model allowlist (empty = allow all) |
| `maxAttempts` | `number` | `6` | Maximum total attempts across all models/keys |
| `cooldownInitialMs` | `number` | `60000` | Initial cooldown duration in ms |
| `cooldownMultiplier` | `number` | `5` | Exponential cooldown multiplier |
| `cooldownCapMs` | `number` | `3600000` | Maximum cooldown duration in ms (1 hour) |
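
The cooldown keys are easier to read with the arithmetic spelled out. The fragment below restates the defaults and annotates the resulting schedule, assuming each successive cooldown is the previous one times cooldownMultiplier, clamped to cooldownCapMs (the exact formula is not stated in this reference).

```yaml
agents:
  default:
    modelFailover:
      maxAttempts: 6              # total attempts across all models and keys
      cooldownInitialMs: 60000    # 1st cooldown: 60,000 ms (1 min)
      cooldownMultiplier: 5       # 2nd: 300,000 ms (5 min); 3rd: 1,500,000 ms (25 min)
      cooldownCapMs: 3600000      # 4th would be 7,500,000 ms, so it is clamped to 3,600,000 ms (1 hour)
```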

SDK Retry (agents.*.sdkRetry)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `true` | Enable SDK-native retry with exponential backoff |
| `maxRetries` | `number` | `5` | Maximum retry attempts for transient errors |
| `baseDelayMs` | `number` | `4000` | Base delay in ms before first retry (doubles each attempt) |
| `maxDelayMs` | `number` | `60000` | Maximum delay cap in ms between retries |
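
For a sense of what the defaults mean in practice, the fragment below annotates the retry delays they imply. It assumes plain doubling from baseDelayMs with no jitter, which this reference does not confirm.

```yaml
agents:
  default:
    sdkRetry:
      enabled: true
      maxRetries: 5
      baseDelayMs: 4000     # retry 1 waits 4,000 ms, then 8,000, 16,000, 32,000
      maxDelayMs: 60000     # retry 5 would wait 64,000 ms, so it is capped at 60,000 ms
```

Under those assumptions the worst-case total wait across all five retries is 4 + 8 + 16 + 32 + 60 = 120 seconds.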

Prompt Timeout (agents.*.promptTimeout)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `promptTimeoutMs` | `number` | `180000` | Stall budget for primary prompt calls (3 minutes). The deadline resets on activity: stream text/thinking deltas (throttled ~1/s) and tool completions. A turn dies only when no activity occurs for this long. Stall semantics apply to all providers. |
| `retryPromptTimeoutMs` | `number` | `60000` | Wall-clock timeout for retry and fallback prompt calls (1 minute). Used during auth rotation and model fallback attempts. This is a whole-turn timeout, not a stall budget: retry and fallback prompts do not reset on activity. |
| `stallCeilingMultiplier` | `number` | `10` | Makespan ceiling: a turn is aborted at promptTimeoutMs × stallCeilingMultiplier even while still streaming, which bounds runaway generations that would otherwise reset the stall budget forever. Valid range 1–100 (a value below 1 would fire the ceiling before the stall budget; the product is additionally capped at Node’s ~24.8-day timer limit). Runtime-tunable. |

See Resilience for how prompt timeouts fit into the full resilience stack.
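
As a worked example of the ceiling, here are the defaults written out with the product computed in the comments.

```yaml
agents:
  default:
    promptTimeout:
      promptTimeoutMs: 180000        # stall budget: 3 minutes without any activity
      retryPromptTimeoutMs: 60000    # wall-clock limit for retry/fallback prompts: 1 minute
      stallCeilingMultiplier: 10     # hard ceiling: 180000 * 10 = 1,800,000 ms (30 minutes)
```

With these values a turn that keeps streaming is still aborted after 30 minutes, while a silent turn is aborted after 3.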

RAG (agents.*.rag)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `true` | Enable automatic memory retrieval before LLM calls |
| `maxResults` | `number` | `5` | Maximum memory results to retrieve |
| `maxContextChars` | `number` | `4000` | Maximum characters of memory context injected |
| `minScore` | `number` | `0.1` | Minimum RRF score threshold (0-1) |
| `includeTrustLevels` | `string[]` | `["system", "learned"]` | Trust levels to include in retrieval |
| `baseFloor` | `number` (0–1) | `0` for frontier/mid; `0.15` for small/nano (capability-gated) | Minimum BASE relevance score (pre-boost) for memory injection. Boosts cannot resurrect a memory whose base score falls below this threshold. Gates on ScoreBreakdown.base (un-boosted cosine/RRF score), applied after scoreWithBreakdown(). Setting baseFloor: 0 is the unset sentinel, so the capability default applies (0.15 for small/nano). Any value greater than 0 is treated as explicit and wins over the capability default. |

Security (S6): A weaker ModelProfile cannot lower this below the operator-set value. FROZEN_TRUST_PATHS and MemoryWriteValidator remain enforced for all capabilityClasses regardless of this setting.
Capability-gated default (Phase 158): For small/nano agents, rag.baseFloor defaults to 0.15, which drops memories with a base relevance score below 0.15 (the Phase 153 poison mitigation). For frontier/mid, the default remains 0 (no filter). Setting rag.baseFloor: 0 explicitly is treated the same as “unset”, so the capability default applies. To remove the relevance floor on a small/nano agent, set providers.<id>.capabilityClass: "frontier" to override the capability class. To lower the floor to a near-zero value instead, set rag.baseFloor: 0.01; any value greater than 0 is treated as explicit and wins over the capability default.
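
The sentinel behavior is easy to get wrong, so here is a hedged sketch with two agents. The agent ids are hypothetical, and it assumes that an ollama-served qwen3.6:27b resolves to the small capability class while the Anthropic model resolves to frontier.

```yaml
agents:
  local-bot:                   # hypothetical agent id, assumed small-class model
    provider: ollama
    model: "qwen3.6:27b"
    rag:
      enabled: true
      baseFloor: 0.01          # explicit (> 0), so it replaces the 0.15 small/nano default
  cloud-bot:                   # hypothetical agent id, assumed frontier-class model
    provider: anthropic
    model: claude-sonnet-4-5-20250929
    rag:
      baseFloor: 0             # unset sentinel; the frontier/mid default of 0 applies
```

Writing baseFloor: 0 on local-bot would not remove its floor; the 0.15 capability default would still apply.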
Cross-encoder rerank (agents.*.rag.rerank): opt-in (default off)

A cross-encoder re-scores the top fusion candidates with the local reranker model (see memory.rerankerModel). Disabled by default; on timeout or unavailability it falls back to the fusion-ranked order, and the reranker GGUF is never downloaded while disabled.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `false` | Enable cross-encoder reranking of fused candidates (opt-in) |
| `maxCandidates` | `number` | `40` | Candidate cap bounding worst-case rerank latency (positive) |
| `minResults` | `number` | `1` | Skip reranking when fewer than this many candidates are present (nonnegative) |
| `timeoutMs` | `number` | `800` | Rerank wall-clock timeout in ms; on timeout fall back to fusion order (positive) |

Scoring boosts (agents.*.rag.scoring): all number, range 0-1

Multiplicative boosts applied to the reranked-or-fused score before the trust filter.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `recencyAlpha` | `number` | `0.2` | Recency boost weight (applied via createdAt) |
| `temporalAlpha` | `number` | `0.2` | Event-time proximity boost weight (applies when occurredAt is present, neutral when absent) |
| `proofAlpha` | `number` | `0.1` | Proof-count boost weight (neutral until proofCount exists) |
| `trustAlpha` | `number` | `0.1` | Trust-level boost weight and tie-break |

Entity associative lane (agents.*.rag.entityLane): opt-in (default off)

A one-hop entity-associative fusion lane. When disabled (the default), RRF fusion is unchanged.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `false` | Enable the one-hop entity-associative lane (opt-in) |
| `seedCount` | `number` | `5` | How many top search hits seed the entity self-join (positive) |
| `perEntityCap` | `number` | `200` | Max shared-entity neighbour rows the lane returns (positive) |
| `weight` | `number` | `1.0` | RRF weight for the entity lane (≥ 0) |

Enabling the opt-in recall features

Reranking, the entity lane, and consolidation all ship off. The reranker model and the recall trace live under the top-level memory and diagnostics keys (not under agents). A schema-valid block that turns them on:
agents:
  default:
    rag:
      enabled: true                 # always-on recall (default true)
      rerank:
        enabled: true               # opt-in (default false) -- cross-encoder re-scoring
        maxCandidates: 40
        timeoutMs: 800
      scoring:                      # multiplicative boosts (defaults shown)
        recencyAlpha: 0.2
        temporalAlpha: 0.2
        proofAlpha: 0.1
        trustAlpha: 0.1
      entityLane:
        enabled: true               # opt-in (default false) -- one-hop associative lane
    memoryConsolidation:
      enabled: true                 # opt-in (default false) -- LLM cron, a cost gate
      schedule: "30 3 * * *"
    contextEngine:
      version: dag                  # the default lossless LCD engine; set "pipeline" for the simpler engine

memory:
  rerankerModel: "hf:gpustack/bge-reranker-v2-m3-GGUF:bge-reranker-v2-m3-Q8_0.gguf"
  rerankerThreads: 4
  rerankerGpu: auto

diagnostics:
  recallTrace:
    enabled: true                   # opt-in (default false) -- full-sanitized, bounded JSONL

Bootstrap (agents.*.bootstrap)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `maxChars` | `number` | `20000` for frontier/mid; `3500` for small/nano (capability-gated) | Per-file character limit for workspace files injected into system prompt. For capabilityClass in {small, nano}, the effective default drops to 3_500 chars per file (SD6, Phase 159). Setting maxChars: 20000 explicitly is treated the same as “unset”, so the capability default of 3_500 still applies for small/nano. Use capabilityClassOverride: frontier or set an explicit value other than 20000 to override. |
| `promptMode` | `enum` | `"full"` | Verbosity: full, minimal (sub-agents), none (identity only) |
| `groupChatFiltering` | `boolean` | `true` | Exclude USER.md from bootstrap in group chats (privacy) |

Workspace (agents.*.workspace)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `profile` | `enum` | `"full"` | Workspace profile: full (all platform instructions, ~9K tokens) or specialist (minimal safety + workspace reference, ~800 tokens). Use specialist for purpose-built sub-agents. |
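
Putting the bootstrap and workspace settings together, a purpose-built sub-agent might be trimmed down as sketched below. The agent id and the maxChars value are illustrative assumptions; the keys are the ones documented in the two tables above.

```yaml
agents:
  helper:                        # hypothetical sub-agent id
    bootstrap:
      maxChars: 8000             # explicit value other than 20000, so it is not treated as "unset"
      promptMode: minimal        # reduced verbosity, intended for sub-agents
      groupChatFiltering: true   # keep USER.md out of group chats (the default)
    workspace:
      profile: specialist        # ~800-token instructions instead of ~9K
```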

Concurrency (agents.*.concurrency)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `maxConcurrentRuns` | `number` | `4` | Maximum concurrent agent runs |
| `maxQueuedPerSession` | `number` | `50` | Maximum queued messages per session before overflow |

Broadcast Groups (agents.*.broadcastGroups)

Array of broadcast group objects for simultaneous multi-channel message delivery.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | (required) | Unique group identifier |
| `name` | `string` | `""` | Human-readable group name |
| `targets` | `BroadcastTarget[]` | `[]` | Channel targets (channelType, channelId, chatId) |
| `enabled` | `boolean` | `true` | Whether this broadcast group is active |
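
A minimal sketch of one broadcast group with a single target. The target field names (channelType, channelId, chatId) are taken from the table above; the group id and all values are hypothetical placeholders.

```yaml
agents:
  default:
    broadcastGroups:
      - id: ops-alerts                   # required, must be unique
        name: "Ops alerts"
        enabled: true
        targets:
          - channelType: telegram
            channelId: "telegram-main"   # hypothetical value
            chatId: "123456789"          # hypothetical value
```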

Elevated Reply (agents.*.elevatedReply)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `false` | Enable trust-based model/prompt routing |
| `trustModelRoutes` | `Record<string, string>` | `{}` | Map of trust level to model route name |
| `trustPromptOverrides` | `Record<string, string>` | `{}` | Map of trust level to system prompt override |
| `defaultTrustLevel` | `string` | `"external"` | Default trust level for unknown senders |
| `senderTrustMap` | `Record<string, string>` | `{}` | Per-sender trust level overrides |
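
Because trustModelRoutes points at route names rather than model IDs, it pairs with modelRoutes. The sketch below shows one way the two could be wired together. The route name heavy, the sender id, and the use of system as a trust level here are assumptions for illustration.

```yaml
agents:
  default:
    provider: openai-codex
    model: gpt-5.1
    modelRoutes:
      heavy: gpt-5.1-codex-max       # hypothetical route name
    elevatedReply:
      enabled: true
      defaultTrustLevel: external    # applies to unknown senders (the default)
      trustModelRoutes:
        system: heavy                # trust level -> model route name
      senderTrustMap:
        "123456789": system          # hypothetical sender id -> trust level
```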

Tracing (agents.*.tracing)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `false` | Enable per-LLM-call JSONL trace files |
| `outputDir` | `string` | `"~/.comis/traces"` | Output directory for JSONL trace files |

Gemini Cache (agents.*.geminiCache)

Per-agent configuration for Gemini explicit CachedContent caching. When enabled, Comis creates server-side cached content on Google AI Studio for a guaranteed 90% discount on cached input tokens.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `false` | Enable Gemini explicit CachedContent caching. Only activates for Google AI Studio providers (not Vertex AI). |
| `maxActiveCaches` | `number` | `20` | Maximum active cached contents per agent. Must be a positive integer. Oldest entries are evicted (LRU) when this limit is reached. |

Gemini cache TTL is hardcoded at 3600 seconds (1 hour) and is not configurable per-agent. Caches are automatically refreshed when more than 50% of the TTL has elapsed. This setting is independent of the Anthropic cacheRetention field — they control different provider caching mechanisms.
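
For reference, a fragment that enables the Gemini cache alongside the unrelated Anthropic setting. The maxActiveCaches value is an illustrative choice, not a recommendation.

```yaml
agents:
  default:
    cacheRetention: long       # Anthropic prompt cache; independent of geminiCache
    geminiCache:
      enabled: true            # only takes effect on Google AI Studio providers, not Vertex AI
      maxActiveCaches: 10      # LRU eviction once 10 cached contents exist (default 20)
# The TTL is fixed at 3600 s, so a cache is refreshed once more than 1800 s (50%) has elapsed.
```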

Context Guard (agents.*.contextGuard)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `true` | Enable context window guard checks |
| `warnPercent` | `number` | `80` | Warn when context usage reaches this percent (0-100) |
| `blockPercent` | `number` | `95` | Block execution when context usage reaches this percent (0-100) |

Context Engine (agents.*.contextEngine)

Context engine configuration. Controls the system that manages what your agent sees each turn. The default is dag (the v2.12 lossless LCD engine). The pipeline value (the simpler sequential layered engine) is the first-class opt-in: set version: "pipeline" to use it. DAG does lossless verbatim assembly (full faithful history + a verbatim fresh tail of the most recent steps + transcript repair), zoomably compresses the oldest history under a token budget, and exposes the in-session ctx_* expansion tools. See Compaction for a user-friendly explanation.

Core fields

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | `boolean` | `true` | Master toggle for the context engine |
| `version` | `string` | `"dag"` | Operating mode: "dag" (default, the lossless LCD engine) or "pipeline" (opt-in, the sequential layered system) |

Shared fields (both modes)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `thinkingKeepTurns` | `number` | `10` | Recent assistant turns that retain thinking blocks (1-50) |
| `compactionModel` | `string` | `""` | Model used for summarization. Empty (the default) means runtime-resolved against the agent’s primary provider. Override with a specific cheaper or faster model if needed. |
| `evictionMinAge` | `number` | `15` | Minimum turn age before stale errors are evicted by the dead content evictor (3-50) |

Pipeline mode fields (version: "pipeline")
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `historyTurns` | `number` | `15` | Recent user turns to keep in context (3-100) |
| `historyTurnOverrides` | `Record<string, number>` | | Per-channel-type turn count overrides (e.g., { dm: 10, group: 5 }) |
| `observationKeepWindow` | `number` | `25` | Recent tool uses that retain full results (1-50) |
| `observationTriggerChars` | `number` | `120000` | Character threshold before observation masking activates (50K-1M) |
| `compactionCooldownTurns` | `number` | `5` | Turns to wait before re-triggering LLM compaction (1-50) |
| `compactionPrefixAnchorTurns` | `number` | `2` | User turns preserved at conversation head for cache prefix stability (0-10) |
| `outputEscalation.enabled` | `boolean` | `true` | Allow escalating output token budget when context is compacted |
| `outputEscalation.escalatedMaxTokens` | `number` | `32768` | Maximum output tokens after escalation (4096-128000) |
| `observationDeactivationChars` | `number` | `80000` | Character threshold to deactivate observation masking entirely (20K-500K) |
| `ephemeralKeepWindow` | `number` | `10` | Recent ephemeral tool results to keep unmasked (1-50) |

DAG mode fields (version: "dag")

freshTailTurns, the leaf/condense summarization fields, and budget-bounded eviction are active in the current release (lossless verbatim assembly, threshold-triggered leaf summarization, multi-tier condensation, and budget eviction). Only the on-demand ctx_* recall keys (maxExpandTokens, maxRecallsPerDay, recallTimeoutMs) remain reserved until the recall tools land in a later phase.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `freshTailTurns` | `number` | `8` | Recent steps (assistant + tool round-trips, not user-turns) always kept verbatim and never evicted (1-50) |
| `contextThreshold` | `number` | `0.75` | Budget utilization ratio that triggers the end-of-turn leaf-summarization pass and the condense pass’s hard-fanout pressure gate (0.1-0.95). The ratio is computed against the turn’s effective budget window (the minimum of the reconciled context window and the capability-class cap), not the model’s configured contextWindow, so capped or served-bound small models compact at the real window. |
| `leafMinFanout` | `number` | `8` | Minimum raw messages before creating a leaf summary (2-20) |
| `condensedMinFanout` | `number` | `4` | Minimum leaf summaries before creating a condensed summary (2-20) |
| `condensedMinFanoutHard` | `number` | `2` | Absolute minimum fanout for condensed summaries under pressure (2-10) |
| `incrementalMaxDepth` | `number` | `0` | Maximum DAG depth for incremental compaction. -1 disables the depth limit. (-1 to 10) |
| `leafChunkTokens` | `number` | `20000` | Maximum source tokens per leaf summary chunk (1K-100K). Clamped at runtime to the resolved summarizer model’s context window (the smaller of its configured window and the probed served window when the summarizer runs on the served-bound provider) minus the summary target, prompt-template overhead, and the threaded previous-summary size. As a result, a small compaction summarizer (e.g. an 8K operationModels.compaction model, or a served-bound primary) is never fed an over-window chunk. Oversized backlogs drain across multiple bounded passes. A single message larger than the clamped cap is replaced by a bounded deterministic extraction (no LLM call); its full content stays in the lossless message store. |
| `leafTargetTokens` | `number` | `1200` | Target token count for each leaf summary output (96-5K) |
| `condensedTargetTokens` | `number` | `2000` | Target token count for each condensed summary output (256-10K) |
| `maxExpandTokens` | `number` | `4000` | Maximum tokens a recall sub-agent can read per expansion (500-50K) |
| `maxRecallsPerDay` | `number` | `10` | Daily limit on recall sub-agent spawns per agent (1-100) |
| `recallTimeoutMs` | `number` | `120000` | Timeout for recall sub-agent execution in milliseconds (10K-600K) |
| `largeFileTokenThreshold` | `number` | `25000` | File token count above which content is stored as a large file reference (1K-200K) |
| `annotationKeepWindow` | `number` | `15` | Recent tool results protected from annotation replacement in DAG assembly (1-50) |
| `annotationTriggerChars` | `number` | `200000` | Character threshold before old tool results are annotated with placeholders (10K-1M) |
| `summaryModel` | `string` | | Optional model override for DAG summarization (falls back to compactionModel) |
| `summaryProvider` | `string` | | Optional provider override for DAG summarization |

DAG robustness / spend / deferred-compaction fields (version: "dag"), all active in the current release:

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `deferCompaction` | `boolean` | `true` | Run the afterTurn leaf + condense passes in the background on the per-conversation single-flight serializer (never blocking the turn). false runs them inline at end-of-turn (deterministic, for tests) |
| `summarizerSpend.maxTokensPerTenantPerHour` | `number` | `500000` | Per-tenant rolling-hour ceiling on summarizer (input+output) tokens; over the cap the summarizer is bypassed → truncation-only assembly (no LLM call, no turn failure). 0 disables the hourly cap (min 0) |
| `summarizerSpend.maxTokensPerTenantPerDay` | `number` | `5000000` | Per-tenant rolling-day summarizer token ceiling. 0 disables the daily cap (min 0) |
| `summarizerBreaker.failureThreshold` | `number` | `5` | Consecutive summarizer failures before the breaker opens → truncation-only assembly (min 1) |
| `summarizerBreaker.resetTimeoutMs` | `number` | `60000` | How long the summarizer breaker stays open before a half-open trial, in milliseconds |
| `summarizerBreaker.halfOpenTimeoutMs` | `number` | `30000` | Half-open trial window for the summarizer breaker, in milliseconds |

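
The dotted names in the table read most naturally as nested maps in YAML. The fragment below is a sketch under that assumption, with an illustrative hourly cap; the breaker values are the defaults.

```yaml
agents:
  default:
    contextEngine:
      version: dag
      deferCompaction: true                 # run leaf + condense passes in the background
      summarizerSpend:
        maxTokensPerTenantPerHour: 250000   # illustrative: half of the 500000 default
        maxTokensPerTenantPerDay: 0         # 0 disables the daily cap
      summarizerBreaker:
        failureThreshold: 5                 # open after 5 consecutive summarizer failures
        resetTimeoutMs: 60000               # stay open 60 s before a half-open trial
        halfOpenTimeoutMs: 30000            # half-open trial window: 30 s
```
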
DAG mode fields are validated by Zod even when version is "pipeline". This means you can pre-configure DAG settings before switching modes — invalid values will be caught at startup regardless of the active mode.
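
To illustrate the note above, here is a pipeline-mode agent that already carries DAG settings. The values are within the documented ranges; an out-of-range DAG value (for example freshTailTurns: 0) would be rejected at startup even though the pipeline engine is the active one.

```yaml
agents:
  default:
    contextEngine:
      enabled: true
      version: pipeline            # opt-in sequential engine
      historyTurns: 20             # pipeline field (3-100)
      observationKeepWindow: 25    # pipeline field (1-50)
      # DAG fields: unused while version is "pipeline", but still validated.
      freshTailTurns: 8            # 1-50
      contextThreshold: 0.75       # 0.1-0.95
      leafMinFanout: 8             # 2-20
```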
The DAG cross-agent isolation adds an agent_id column to the full-text search index, created once on a fresh database (no migration — sessions start fresh on the LCD engine). A pre-existing development ~/.comis database created before this release may need to be wiped to pick up the isolated index; a fresh install needs nothing. See Compaction.
Engine scope: served-window honesty and the viable floor (the recorded pipeline-parity verdict).

The turn-time pre-flight fit check, output-headroom enforcement, and the served/cap context_exhausted provenance (the exhaustion text that names OLLAMA_CONTEXT_LENGTH / PARAMETER num_ctx / contextEngine.budget.effectiveContextCapSmall; see the served-window section) apply to the default version: "dag" engine only.

The "pipeline" engine’s fit guard is its budget-aware compaction trigger: it compacts when the estimated context exceeds 85% of the budget window, computed against the uncapped configured contextWindow (no served reconcile, no capability-class cap). It is backed by reactive provider-side context_too_long classification when the provider rejects an oversized request. (The trigger behavior is test-pinned.) Pipeline compaction likewise summarizes only the longest oldest-first span that fits the summarizer’s window (reserving a summarizer-sized output allowance, at most a quarter of that window, for the summary itself) and keeps the un-summarized remainder in context (never dropped). When even the oldest message alone exceeds that budget, that single message is escalated through the compaction fallback ladder (worst case a bounded count-only note), so every evaluation shrinks the backlog and the 85% trigger re-fires on later turns until it drains.

Recorded decision: the boot viable-floor WARN (the minViable equation) is deliberately engine-agnostic. It fires for dag and pipeline agents alike, because the minViable arithmetic (bootstrap + tool schemas + output headroom + fresh-tail reserve + safety margin vs the effective window) holds regardless of engine. What differs per engine is which turn-time guard backs it up: dag uses the pre-flight with knob-named exhaustion; pipeline uses the 85% compaction trigger plus reactive classification. An operator running a pipeline agent should read the boot WARN as applying to them, while the per-turn preflight surfaces do not.

Capacity Cap (agents.*.contextEngine.budget)

Prevents 256K context-window models from over-provisioning when running a small executive (e.g., qwen3.6:27b). Applied in computeTokenBudgetForProfile before history budget computation. frontier and mid classes always receive the full contextWindow.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.contextEngine.budget.effectiveContextCapSmall | number (non-negative int) | 32000 | Maximum effective context tokens for capabilityClass="small". 0 = no cap (use raw contextWindow). Applied to prevent 256K overfill degrading a 27–35B executive. |
| agents.<id>.contextEngine.budget.effectiveContextCapNano | number (non-negative int) | 16000 | Maximum effective context tokens for capabilityClass="nano". 0 = no cap. frontier and mid classes always receive the full contextWindow. |
| agents.<id>.contextEngine.budget.minVisibleOutputTokens | number (int, 256–8192) | 768 | Minimum visible output tokens guaranteed on every LLM dispatch: the non-reasoning floor (answer or tool-call body) that must remain after the model’s thinking block. The total output headroom is thinkingReserve(reasoningStyle, thinkingLevel) + minVisibleOutputTokens. Raising this value increases the safety margin but reduces available history tokens. Applies to all capability classes; the thinking reserve on top of this floor varies by thinkingLevel (high adds 2,048 tokens; xhigh adds 4,096 tokens; models with no thinking block add 0). |
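A minimal sketch of how these keys might look in config.yaml. The agent id assistant is reused from this page's examples and is assumed here to run a small-class local model; the values are illustrative, and only the key names, ranges, and reserve figures come from the table above.

```yaml
# Illustrative values only; key names and ranges are from the table above.
agents:
  assistant:
    contextEngine:
      budget:
        effectiveContextCapSmall: 24000   # default 32000; 0 would disable the cap
        effectiveContextCapNano: 16000    # default, repeated for clarity
        minVisibleOutputTokens: 1024      # allowed 256-8192, default 768
        # Total output headroom at thinkingLevel "high", per the table:
        #   thinking reserve 2048 + minVisibleOutputTokens 1024 = 3072 tokens
```

Raising minVisibleOutputTokens trades history room for answer room, so a lower cap combined with a higher floor leaves noticeably less space for conversation history.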

Thinking-Effort Governor (agents.*.thinking)

Controls whether the thinking-effort governor may automatically adjust the active thinkingLevel when the remaining context window after eviction is tight.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.thinking.downshiftOnTightWindow | boolean | true | When true, the thinking-effort governor may automatically lower thinkingLevel (high → medium → low) for a dispatch when the remaining context room after eviction cannot cover thinkingReserve(thinkingLevel) + minVisibleOutputTokens. This prevents the model’s thinking block from consuming the entire output budget and silently truncating the answer or tool call. For frontier and mid-capability models the governor is always a no-op regardless of this setting (their effective windows are large enough that the threshold is never reached). Set to false to disable down-shifting and always preserve the configured thinkingLevel (may result in context_exhausted degradation on tight windows instead of graceful down-shift). |
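For illustration, the fragment below opts one agent out of automatic down-shifting. The agent id assistant is borrowed from this page's examples; nothing else is assumed beyond the documented key.

```yaml
agents:
  assistant:
    thinking:
      # Default is true. With false, the configured thinkingLevel is always
      # preserved, at the risk of context_exhausted on tight windows.
      downshiftOnTightWindow: false
```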

Compaction Routing (agents.*.contextEngine.compaction)

Controls how the context engine handles LLM summarization for low-capability models. Applied in both pipeline llm-compaction and DAG leaf-summarizer layers.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.contextEngine.compaction.preferEvictionByCapability | boolean | true | When true: small/nano capabilityClass → eviction-first (or strongerSummarizerModel if set) instead of same-model LLM summarization. Prevents degraded summaries from a weak model. A context:compaction_routed event fires when routing occurs. |
| agents.<id>.contextEngine.compaction.strongerSummarizerModel | string | "" | Optional "provider:modelId" for a stronger summarizer when small/nano models are detected. Empty string = pure eviction/deterministic fallback. Example: "anthropic:claude-haiku-4-5-20250929". A keyless local provider (Ollama / LM Studio) is also valid (e.g. "qwen36-local:qwen3.6:35b"); no API key is required. |
Security (S4): Eviction never drops security-relevant context (sender-trust, safety reinforcement, untrusted-content markers, canary token). The security-context-pinner enforces this fail-closed.
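A short sketch of routing summarization to a stronger model, assuming the agent id assistant from this page's examples. The summarizer value is the example string given in the table, not a recommendation.

```yaml
agents:
  assistant:
    contextEngine:
      compaction:
        preferEvictionByCapability: true   # default
        # "provider:modelId" format; leave as "" for pure eviction fallback.
        strongerSummarizerModel: "anthropic:claude-haiku-4-5-20250929"
```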

Compact Prompt (agents.*.contextEngine.compactPrompt)

Controls the compact-secure promptMode for small/nano models. This mode assembles the system prompt to a bounded target while always retaining the full safety core, sender-trust, and config-secret sections.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.contextEngine.compactPrompt.enabled | boolean | true | Enable compact-secure promptMode for small/nano capabilityClass. When true, retains the FULL safety core, sender-trust, and config-secret sections and never falls back to minimal mode’s empty safety. frontier/mid agents are unaffected. |
| agents.<id>.contextEngine.compactPrompt.targetTokens | number (500–8000) | 3000 | Soft token target for the compact prompt (~chars/3.5). At 3000 ≈ 10,500 chars. |
Security warning (S1): Setting enabled=false does not make the prompt smaller — it restores the full-size full promptMode. The compact prompt is safe for all deployments because it retains the security core. It is not safe to use minimal promptMode (which drops the safety block) for any security-sensitive deployment.
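As a rough example, a tighter prompt target could be set as follows. The agent id and the value 2000 are illustrative; the character estimate uses the ~3.5 chars-per-token ratio stated in the table.

```yaml
agents:
  assistant:
    contextEngine:
      compactPrompt:
        enabled: true        # default; false restores the full-size prompt
        targetTokens: 2000   # allowed 500-8000; 2000 * 3.5 = about 7,000 chars
```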

Relevance Policy (agents.*.contextEngine.relevance)

Controls whether within-conversation history is assembled relevance-first (the margin arbiter allocates the contended history budget across tiers by fused rank, with the fresh-tail and security-pinned floors guaranteed) or recency-first (the existing newest-kept eviction). The decision is capability-gated: small/nano models on a non-caching provider default relevance-first (reordering is free when there is no prompt cache to break); frontier/mid and any prompt-caching model default recency-first and are byte-identical to prior releases (the arbiter does not run for them). Precedence: explicit per-agent config > capability default > off.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.contextEngine.relevance.firstByDefault | boolean (optional) | unset → capability default (small/nano + non-caching → relevance-first; frontier/mid + caching → recency-first) | Force the relevance policy. true runs the margin arbiter at the eviction seam (relevance-first); false keeps recency-first. The field is optional with no default; omit it to let the capability gate decide. An explicit value (either direction) always wins. Setting true on a frontier/mid agent opts that agent into relevance-first; setting false on a non-caching small/nano agent forces recency-first. |
Capability-gated default (Phase 173): The small/nano relevance-first default is gated on supportsPromptCache=false — a non-caching local model (typical Ollama) reorders history for free, while a caching model stays recency-first below the cache fence (reordering would break the prefix cache). frontier/mid are byte-identical by default (the arbiter never runs for them). The small/nano default-on flip is measurement-gated (validated by the outcome harness before being relied upon in production); the mechanism ships behind this flag. Precedence: explicit per-agent config > capability default > off.
Security (RETR-05): The arbiter never demotes security-relevant history (canary token, untrusted-content delimiters, safety reinforcement, sender-trust markers) — those items are unconditional floors, excluded from relevance candidacy and always kept, exactly as the recency path’s pre-flight already protects them. A content-free context:arbitrated event (per-tier kept counts; the discretionary pool offered and consumed plus the unconditional floor-token weight; the kept LTM/KG ids; and a relevanceFirst boolean) fires only on the relevance-first path; it carries no message, memory, or query content.
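The precedence rule can be illustrated with an explicit override. This sketch assumes the agent id assistant and forces recency-first regardless of capability class.

```yaml
agents:
  assistant:
    contextEngine:
      relevance:
        # Explicit value wins over the capability default.
        # Omit the key entirely to let the capability gate decide.
        firstByDefault: false
```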

Memory Consolidation (agents.*.memoryConsolidation)

Periodic LLM-driven consolidation of similar memories into observations. Opt-in (default off) because each run spends model tokens — it is a cost gate. When enabled, a scheduled cron clusters near-duplicate memories and folds each cluster into a single observation; external-trust memories are excluded by default.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable periodic consolidation for this agent (opt-in cost gate) |
| schedule | string | "30 3 * * *" | Cron schedule for consolidation runs (daily at 03:30 UTC by default) |
| similarityThreshold | number | 0.82 | Cluster-neighbour cosine threshold for single-link clustering (0-1) |
| dedupThreshold | number | 0.9 | Content-similarity threshold for the deterministic dedup pre-check (0-1) |
| maxCandidatesPerRun | number | 200 | Maximum raw candidates fetched per run (positive) |
| maxClusterSize | number | 12 | Maximum candidates folded into one observation (positive) |
| maxClustersPerRun | number | 25 | Maximum clusters consolidated per run (positive) |
| maxConsolidationTokens | number | 1024 | Maximum LLM response tokens for one merge call (positive) |
| consolidateExternal | boolean | false | Include external-trust memories in consolidation (excluded by default) |
| autoTags | string[] | [] | Extra tags applied to every created observation |
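A possible opt-in configuration, sketched with illustrative values. The agent id, the weekly schedule, and the tag name are assumptions; the upper bound in the comment additionally assumes one merge call per cluster, which the table does not state outright.

```yaml
agents:
  assistant:
    memoryConsolidation:
      enabled: true                 # opt-in: each run spends model tokens
      schedule: "0 4 * * 0"         # five-field cron: Sundays at 04:00
      similarityThreshold: 0.85     # stricter than the 0.82 default
      maxClustersPerRun: 10
      maxConsolidationTokens: 1024
      # Assuming one merge call per cluster, a run is bounded by roughly
      # 10 * 1024 = 10,240 response tokens.
      autoTags: ["consolidated"]    # hypothetical tag name
```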

Source Gate (agents.*.sourceGate)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxResponseBytes | number | 2000000 | Default byte cap for HTTP responses |
| stripHiddenHtml | boolean | true | Whether to strip hidden HTML before extraction |

Goal Anchor (agents.*.goalAnchor)

Injects the current execution objective and uncompleted step checklist at the context tail every turn. Helps weak models stay on task across multi-turn executions. Default-ON for scaffolded (small/nano) tiers; off by default for frontier/mid, but injected when enabled: true is set explicitly. Precedence: explicit per-agent config > capability default > off.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.goalAnchor.enabled | boolean | automatic for small/nano (no config needed); false for frontier/mid | Enable GoalAnchor tail injection. Default-ON for small/nano (no config needed); off for frontier/mid unless enabled: true is set explicitly (which injects on frontier/mid too). When effective, the execution objective and uncompleted step checklist are tail-appended each turn. Precedence: explicit per-agent config > capability default > off. |
| agents.<id>.goalAnchor.maxChars | number (100–2000) | 500 | Maximum characters for the injected GoalAnchor block. ~5–10 steps at ~50 chars/step. |
Capability-gated default (Phase 158): For capabilityClass in {small, nano} (e.g. any Ollama qwen3.6 deployment), goalAnchor is default-ON — no enabled: true required. Set agents.<id>.goalAnchor.enabled: false explicitly to disable it on a small/nano agent. For frontier/mid agents, behavior is unchanged (default off). Precedence: explicit per-agent config > capability default > off.
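To make the override concrete, here is a sketch that forces the anchor on and enlarges its budget. The agent id and the 800-character value are illustrative.

```yaml
agents:
  assistant:
    goalAnchor:
      enabled: true    # explicit value wins over the capability default
      maxChars: 800    # allowed 100-2000, default 500; about 16 steps at ~50 chars/step
```

Setting enabled: false instead would disable injection even on a small/nano agent, since explicit per-agent config takes precedence.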

Verification Critic (agents.*.verification)

A pre-delivery critic that checks the terminal response against the GoalAnchor checklist before delivery. Unmet requirements redirect the executor; exhausted retries deliver an honest unmet-list. Meaningful only for scaffoldLevel=max (small/nano) agents.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.verification.enabled | boolean | cost-gated automatic for small/nano when a distinct cheap critic is configured; false otherwise | Enable the pre-delivery verification critic. Fires only when a completion-claiming response meets minResponseChars. Defaults to false (opt-in) for frontier/mid, and for small/nano when no distinct cheap critic model is configured. |
| agents.<id>.verification.minResponseChars | number (50–2000) | 200 | Minimum response length in characters before the critic is invoked. Prevents firing on short acks, clarifying questions, and non-completion replies. |
Security (S2): The critic treats the output-under-review as untrusted (wrapExternalContent), inherits the safety core, embeds the canary, fails closed (uncertain → not-verified, never auto-approve), and re-validates any implied tool calls through the same exec gates. Use agents.<id>.operationModels.verification to run the critic on a cheaper or faster model.
Capability-gated default + cost-gate (Phase 158): For capabilityClass in {small, nano}, verification is default-ON — but only when agents.<id>.operationModels.verification resolves to a distinct cheaper model (i.e. operationModels.verification.model is explicitly configured to a different, faster model). If no distinct critic model is configured, the default stays OFF — the critic never silently doubles local-CPU inference latency. Set agents.<id>.verification.enabled: false explicitly to force-off the critic on a small/nano agent even when a cheap critic is configured. Set agents.<id>.verification.enabled: true to force-on the critic regardless of class (including frontier/mid, if a critic model is configured). Precedence: explicit per-agent config > capability default > off.
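The cost gate described above might be satisfied with a fragment along these lines. This is a sketch: the agent id is reused from this page's examples, the model value is a placeholder because the expected identifier format is not shown here, and only the key paths come from the text.

```yaml
agents:
  assistant:
    operationModels:
      verification:
        # Placeholder: substitute a distinct, cheaper model than the executor.
        model: "<cheaper-model-id>"
    verification:
      # enabled is left unset so the capability default applies
      # (ON for small/nano once a distinct critic model is configured).
      minResponseChars: 300   # allowed 50-2000, default 200
```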

Honesty Guardrail (agents.*.honesty)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agents.<id>.honesty.maxCriticRetries | number (0–5) | 2 | Maximum critic retry redirects before delivering an honest unmet-list. After this many not-verified verdicts, the executor delivers an honest unmet-list instead of an unqualified “done”. Prevents infinite re-prompt loops. |

Skills (agents.*.skills)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| discoveryPaths | string[] | ["./skills"] | Directories to scan for SKILL.md files. For named agents, Comis automatically prepends the agent’s workspace skills directory (~/.comis/workspace-{agentId}/skills/) at startup, so you do not need to include it here. |
| watchEnabled | boolean | true | Enable file watching for automatic skill reload |
| watchDebounceMs | number | 400 | Debounce interval in ms (100-5000) |
Built-in Tools (agents.*.skills.builtinTools)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| read | boolean | true | Read file contents |
| write | boolean | true | Write or overwrite files |
| edit | boolean | true | Surgical search-and-replace on files |
| grep | boolean | true | Regex search across files (requires rg) |
| find | boolean | true | Find files by glob pattern (requires fd) |
| ls | boolean | true | List directory contents |
| exec | boolean | true | Shell command execution |
| process | boolean | true | Background process management |
| webSearch | boolean | true | Web search API integration |
| webFetch | boolean | true | URL content fetching |
| browser | boolean | true | Headless browser control |
Tool Policy (agents.*.skills.toolPolicy)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| profile | enum | "full" | Baseline tool set: minimal, coding, messaging, supervisor, full |
| allow | string[] | [] | Additional tools to allow beyond the profile |
| deny | string[] | [] | Tools to deny even if in the profile |
Prompt Skills (agents.*.skills.promptSkills)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxBodyLength | number | 20000 | Maximum skill body length in characters |
| enableDynamicContext | boolean | false | Enable shell command execution in skill bodies |
| maxAutoInject | number | 3 | Maximum prompt skills auto-injected per request (0-20) |
| allowedSkills | string[] | [] | Skill names allowed (empty = allow all) |
| deniedSkills | string[] | [] | Skill names denied (applied after allowedSkills) |
Runtime Eligibility (agents.*.skills.runtimeEligibility)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable runtime eligibility filtering based on OS, binary, and env var prerequisites |
Content Scanning (agents.*.skills.contentScanning)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable content scanning at skill load time |
| blockOnCritical | boolean | true | Block skill loading when CRITICAL findings are present |
Exec Sandbox (agents.*.skills.execSandbox)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | enum | "always" | OS-level sandbox for exec commands: "always" (sandbox active, graceful fallback if binary unavailable) or "never" (sandbox disabled) |
| readOnlyAllowPaths | string[] | [] | Additional filesystem paths exposed read-only inside the sandbox (e.g., /opt/data) |
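The skills subsections compose under a single skills key. The sketch below shows one plausible hardening setup; the agent id and values are illustrative, and using a built-in tool name such as exec in deny is an assumption about how tool names are matched.

```yaml
agents:
  assistant:
    skills:
      builtinTools:
        browser: false          # turn off headless browser control
        webSearch: false
      toolPolicy:
        profile: coding
        deny: ["exec"]          # assumed to accept built-in tool names
      promptSkills:
        maxAutoInject: 2        # allowed 0-20, default 3
      execSandbox:
        enabled: "always"       # default
        readOnlyAllowPaths: ["/opt/data"]
```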

Secrets (agents.*.secrets)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| allow | string[] | [] | Glob patterns for allowed secret names. Empty = unrestricted access. |

Session (agents.*.session)

Optional session configuration containing reset policy, DM scope, pruning, and compaction.

Reset Policy (agents.*.session.resetPolicy)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| mode | enum | "none" | Reset mode: daily, idle, hybrid, none |
| dailyResetHour | number | 4 | Hour of day for daily reset (0-23) |
| dailyResetTimezone | string | "" | IANA timezone (empty = system local) |
| idleTimeoutMs | number | 14400000 | Idle timeout in ms (default 4 hours) |
| sweepIntervalMs | number | 300000 | How often to check sessions (default 5 min) |
| resetTriggers | string[] | [] | Phrases that trigger immediate session reset |
| perType.dm | ResetPolicyOverride | (unset) | DM-specific override (mode, dailyResetHour, timezone, idleTimeoutMs) |
| perType.group | ResetPolicyOverride | (unset) | Group-specific override |
| perType.thread | ResetPolicyOverride | (unset) | Thread-specific override |
DM Scope (agents.*.session.dmScope)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| mode | enum | "per-channel-peer" | Isolation: main, per-peer, per-channel-peer, per-account-channel-peer |
| threadIsolation | boolean | false | Append :thread:<threadId> to session keys |
Pruning (agents.*.session.pruning)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable session pruning of oversized tool results |
| softTrimThresholdChars | number | 8000 | Chars above which tool results are soft-trimmed |
| hardClearThresholdChars | number | 30000 | Chars above which tool results are hard-cleared |
| preserveHeadChars | number | 500 | Characters to preserve at the start of a soft-trimmed result |
| preserveTailChars | number | 500 | Characters to preserve at the end of a soft-trimmed result |
| pruneableTools | string[] | [] | Tools eligible for pruning (empty = all) |
| protectedTools | string[] | [] | Tools never pruned (takes precedence) |
| protectImageBlocks | boolean | true | Protect tool results containing image content blocks |
| preserveRecentCount | number | 6 | Recent messages exempt from pruning |
Compaction (agents.*.session.compaction)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| softThresholdRatio | number | 0.75 | Context fraction at which soft flush triggers (0-1) |
| hardThresholdRatio | number | 0.90 | Context fraction at which hard compaction triggers (0-1) |
| flushModel | string | (unset) | Model for memory extraction during flush |
| chunkMaxChars | number | 50000 | Max characters per summarization chunk |
| chunkOverlapMessages | number | 2 | Overlap messages between chunks |
| chunkMergeSummaries | boolean | true | Whether to merge chunk summaries via LLM |
| reserveTokens | number | 16384 | Tokens reserved for summary during auto-compaction |
| keepRecentTokens | number | 32768 | Recent message tokens to keep after auto-compaction |
| postCompactionSections | string[] | ["Session Startup", "Red Lines"] | AGENTS.md sections to re-inject after compaction (from read-only AGENTS.md) |
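The session subsections nest under one session key. A sketch with illustrative values follows; the agent id is borrowed from this page's examples, and listing read under protectedTools assumes built-in tool names are accepted there.

```yaml
agents:
  assistant:
    session:
      dmScope:
        mode: per-peer
        threadIsolation: true         # appends :thread:<threadId> to session keys
      pruning:
        softTrimThresholdChars: 6000  # default 8000
        hardClearThresholdChars: 20000
        protectedTools: ["read"]      # assumed tool-name format
      compaction:
        softThresholdRatio: 0.7       # soft flush at 70% of context
        hardThresholdRatio: 0.85      # hard compaction at 85%
```

Keeping softThresholdRatio below hardThresholdRatio preserves the soft-then-hard ordering implied by the two descriptions.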

Scheduler (agents.*.scheduler)

Cron (agents.*.scheduler.cron)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable cron job scheduling |
| maxConcurrentRuns | number | 3 | Maximum concurrent cron job runs |
| defaultTimezone | string | "" | Default timezone (empty = UTC) |
| maxJobs | number | 100 | Maximum cron jobs (0 = unlimited) |
Heartbeat (agents.*.scheduler.heartbeat)
All fields are optional and inherit from the global scheduler.heartbeat when omitted.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | (inherit) | Override heartbeat enabled state |
| intervalMs | number | (inherit) | Override heartbeat interval in ms |
| showOk | boolean | (inherit) | Override show OK status |
| showAlerts | boolean | (inherit) | Override show alerts |
| target | HeartbeatTarget | (unset) | Delivery target channel (channelType, channelId, chatId, isDm) |
| prompt | string | (unset) | Custom heartbeat prompt |
| model | string | (unset) | Model override for heartbeat LLM calls |
| session | string | (unset) | Session key for heartbeat isolation |
| allowDm | boolean | (inherit) | Allow heartbeat in DM conversations |
| lightContext | boolean | (unset) | Include ONLY HEARTBEAT.md in bootstrap context |
| ackMaxChars | number | (unset) | Max chars for soft acknowledgment threshold |
| responsePrefix | string | (unset) | Prefix to strip from LLM responses before delivery |
| skipHeartbeatOnlyDelivery | boolean | (unset) | Suppress delivery of HEARTBEAT_OK-only responses |
| alertThreshold | number | (inherit) | Consecutive failure threshold for alerting |
| alertCooldownMs | number | (inherit) | Alert cooldown period in ms |
| staleMs | number | (inherit) | Stuck detection timeout in ms |
agents:
  assistant:
    name: "My Assistant"
    provider: anthropic
    model: claude-sonnet-4-5-20250929
    maxSteps: 30
    budgets:
      perExecution: 1000000
      perHour: 5000000
    rag:
      enabled: true
      maxResults: 10
    skills:
      toolPolicy:
        profile: coding
        allow: ["memory_search"]
    session:
      resetPolicy:
        mode: hybrid
        idleTimeoutMs: 7200000
See Agent Identity and Agent Lifecycle for how these settings affect agent behavior.

Channels

Channel adapter configuration. Each key is a channel platform name.
Type: ChannelConfig (strict object with per-platform entries)

Base Channel Entry (shared by all channels)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Whether this channel is active |
| apiKey | string or SecretRef | (unset) | API key for the channel service |
| botToken | string or SecretRef | (unset) | Bot token for the channel service |
| webhookUrl | string (URL) | (unset) | Webhook URL for receiving events |
| allowFrom | string[] | [] | Allowed sender IDs (empty = allow all) |
| mediaProcessing | MediaProcessing | (all true) | Per-channel media processing overrides |
Media Processing (channels.*.mediaProcessing)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| transcribeAudio | boolean | true | Enable voice transcription (STT) |
| analyzeImages | boolean | true | Enable image analysis (Vision) |
| describeVideos | boolean | true | Enable video description |
| extractDocuments | boolean | true | Enable document text extraction |
| understandLinks | boolean | true | Enable link content fetching |

Ack Reactions (channels.*.ackReaction — global schema)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Send ack reaction when processing starts |
| emoji | string | "eyes" | Emoji to react with (Unicode or platform-specific) |

Platform-Specific Fields

Slack (channels.slack)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| appToken | string or SecretRef | (unset) | App-level token for Socket Mode (xapp-...) |
| signingSecret | string or SecretRef | (unset) | Signing secret for HTTP request verification |
| mode | enum | (unset) | Connection mode: socket or http |
WhatsApp (channels.whatsapp)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| authDir | string | (unset) | Directory for multi-device auth state files |
| printQR | boolean | (unset) | Print QR code to terminal for pairing |
Signal (channels.signal)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| baseUrl | string | "http://127.0.0.1:8080" | signal-cli REST API base URL |
| account | string | (unset) | Phone number registered with Signal |
| cliPath | string | (unset) | Path to signal-cli binary for auto-spawn |
iMessage (channels.imessage)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| binaryPath | string | (unset) | Path to imsg binary |
| account | string | (unset) | Apple ID for iMessage account |
LINE (channels.line)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| channelSecret | string or SecretRef | (unset) | Channel secret for webhook signature verification |
| webhookPath | string | "/webhooks/line" | Webhook path for LINE events |
IRC (channels.irc)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| host | string | (unset) | IRC server hostname |
| port | number | (unset) | IRC server port |
| nick | string | (unset) | Bot nickname |
| tls | boolean | true | Whether to use TLS |
| channels | string[] | (unset) | Channels to auto-join on connect |
| nickservPassword | string or SecretRef | (unset) | NickServ password for identification |
channels:
  telegram:
    enabled: true
    botToken: "${TELEGRAM_BOT_TOKEN}"
    allowFrom: ["123456789"]
    mediaProcessing:
      transcribeAudio: true
      analyzeImages: true
  discord:
    enabled: true
    botToken: "${DISCORD_BOT_TOKEN}"
  slack:
    enabled: true
    botToken: "${SLACK_BOT_TOKEN}"
    appToken: "${SLACK_APP_TOKEN}"
    mode: socket
See the Channels section for platform-specific setup guides.
SQLite-backed memory system configuration, set under the top-level memory key.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| dbPath | string | "memory.db" | Path to SQLite database file (relative to dataDir) |
| walMode | boolean | true | Enable WAL mode for concurrent reads |
| embeddingModel | string | "text-embedding-3-small" | Embedding model identifier |
| embeddingDimensions | number | 1536 | Embedding vector dimensions |
Reranker model (memory.reranker*)
The cross-encoder model used when agents.*.rag.rerank.enabled is true. The hf: URI auto-downloads on first enable; nothing is downloaded while reranking is off. This is a distinct model from the bi-encoder embedder (see the embedding settings). Do not conflate the two.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| rerankerModel | string | "hf:gpustack/bge-reranker-v2-m3-GGUF:bge-reranker-v2-m3-Q8_0.gguf" | Reranker GGUF model URI (HuggingFace hf: ref or local path) |
| rerankerModelsDir | string | "models" | Directory (relative to dataDir) to store/resolve the reranker GGUF |
| rerankerGpu | enum | "auto" | GPU acceleration: auto, metal, cuda, vulkan, false |
| rerankerThreads | number | 4 | Thread count for the reranker ranking context (positive) |
Compaction (memory.compaction)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Whether automatic compaction is enabled |
| threshold | number | 1000 | Minimum entries before compaction triggers |
| targetSize | number | 500 | Maximum entries after compaction |
Retention (memory.retention)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxAgeDays | number | 0 | Maximum age in days (0 = no limit) |
See Memory and Search for how these settings affect agent memory behavior.
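Putting the memory tables together, a fragment could look like the following. This is a sketch with illustrative values; it assumes these keys live under a top-level memory key, as the memory.compaction and memory.retention paths suggest.

```yaml
memory:
  dbPath: "memory.db"      # relative to dataDir
  walMode: true
  rerankerThreads: 2       # default 4; only used when reranking is enabled
  compaction:
    threshold: 2000        # compaction starts once entries reach this count
    targetSize: 1000       # entries kept after compaction
  retention:
    maxAgeDays: 90         # 0 would mean no age limit
```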
Embedding provider configuration for vector search, set under the top-level embedding key. Supports local GGUF models via node-llama-cpp or remote OpenAI.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable embedding generation. When false, only FTS5 search is used. |
| provider | enum | "auto" | Provider preference: auto (tries local then remote), local, openai |
| autoReindex | boolean | true | Auto-reindex when provider model changes |
| multilingual | boolean | (unset) | Advisory: declare the embedder multilingual for the comis fleet model-health line. Omitted: inferred from the model id (bge-m3 / multilingual-e5 / LaBSE / E5 read as multilingual; otherwise unknown). Does not gate search: the FTS5 trigram floor carries recall regardless. See Multilingual. |
Local (embedding.local)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| modelUri | string | "hf:nomic-ai/nomic-embed-text-v1.5-GGUF:nomic-embed-text-v1.5.Q8_0.gguf" | HuggingFace model URI or path to local GGUF file |
| modelsDir | string | "models" | Directory to store downloaded models |
| gpu | enum | "auto" | GPU acceleration: auto, metal, cuda, vulkan, false |
| contextSize | number | 2048 | Context size for embedding model (tokens). nomic-embed-text-v1.5 trains on 2048; extending to 8192 requires YaRN RoPE scaling not available in node-llama-cpp. |
OpenAI (embedding.openai)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | "text-embedding-3-small" | OpenAI embedding model |
| dimensions | number | 1536 | Vector dimensions (must match model output) |
Cache (embedding.cache)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxEntries | number | 10000 | Maximum cached embeddings in L1 in-memory cache (0 = disabled) |
| persistent | boolean | false | Enable persistent L2 SQLite cache |
| persistentMaxEntries | number | 50000 | Maximum entries in L2 persistent cache |
| ttlMs | number | (none) | TTL in milliseconds for cache entries. When unset, LRU eviction only. |
| pruneIntervalMs | number | 300000 | Prune check interval in milliseconds (5 min) |
Batch (embedding.batch)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| batchSize | number | 100 | Texts per batch call |
| indexOnStartup | boolean | true | Index unembedded memories on startup |
See Embeddings for a guide on choosing between local and remote embedding providers.
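As an illustration of the embedding tables, the fragment below pins the local provider and turns on the persistent cache. Values are illustrative, and the top-level embedding key is inferred from the embedding.local and embedding.cache paths.

```yaml
embedding:
  enabled: true
  provider: local            # skip the auto local-then-remote probing
  local:
    contextSize: 2048        # default; matches the model's training length
  cache:
    maxEntries: 10000        # L1 in-memory cache
    persistent: true         # enable the L2 SQLite cache
    persistentMaxEntries: 50000
    ttlMs: 86400000          # 24 hours; omit for LRU eviction only
  batch:
    batchSize: 50
    indexOnStartup: true
```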

Gateway & API

Hono HTTPS server for JSON-RPC, WebSocket, and REST API access.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable the gateway server |
| host | string | "127.0.0.1" | Host to bind (use "0.0.0.0" for external access) |
| port | number | 4766 | Port to listen on (1-65535) |
| maxBatchSize | number | 50 | Maximum JSON-RPC batch size |
| wsHeartbeatMs | number | 30000 | WebSocket heartbeat interval in ms |
| corsOrigins | string[] | [] | CORS allowed origins (empty = same-origin only) |
| allowInsecureHttp | boolean | false | Suppress insecure-HTTP warning |
| trustedProxies | string[] | [] | Trusted proxy IPs for X-Forwarded-For |
| httpBodyLimitBytes | number | 1048576 | Max HTTP request body size (1MB) |
| wsMaxMessageBytes | number | 1048576 | Max WebSocket message size (1MB) |
TLS (gateway.tls) — optional, enables mTLS when provided
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| certPath | string | (required) | Path to server TLS certificate (PEM) |
| keyPath | string | (required) | Path to server TLS private key (PEM) |
| caPath | string | (required) | Path to CA certificate for client verification (PEM) |
| requireClientCert | boolean | true | Require client certificates for mTLS |
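A minimal mTLS sketch based on the table above. The file paths are hypothetical; all three path keys are marked required once the tls block is present.

```yaml
gateway:
  tls:
    # Hypothetical paths; point these at your own PEM files.
    certPath: "/etc/comis/tls/server.crt"
    keyPath: "/etc/comis/tls/server.key"
    caPath: "/etc/comis/tls/ca.crt"
    requireClientCert: true   # default; clients must present a certificate
```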
Tokens (gateway.tokens) — array of bearer token entries
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| id | string | (required) | Unique token identifier |
| secret | string or SecretRef | (unset) | Secret value (min 32 chars; auto-generated if omitted) |
| scopes | string[] | [] | Allowed scopes (e.g., ["rpc", "ws", "admin"]) |
Rate Limit (gateway.rateLimit)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| windowMs | number | 60000 | Time window in ms |
| maxRequests | number | 100 | Maximum requests per window |
WebSocket Message Rate Limit (gateway.wsMessageRateLimit)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxMessages | number | 60 | Maximum messages per window |
| windowMs | number | 60000 | Time window in ms |
Web Dashboard (gateway.web)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Mount the @comis/web SPA at /app/* and the REST/SSE API at /api/* (sharing gateway host/port/auth). When false, the daemon skips /app/*, /api, SSE, and the //app/ redirect. |
gateway:
  enabled: true
  host: "0.0.0.0"
  port: 4766
  tokens:
    - id: admin
      secret: "${COMIS_GATEWAY_TOKEN}"
      scopes: ["*"]
  rateLimit:
    windowMs: 60000
    maxRequests: 200
See HTTP Gateway and WebSocket for API usage details.
Webhook subsystem for receiving external events (GitHub, Gmail, custom services), set under the top-level webhooks key.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable the webhook subsystem |
| path | string | "/hooks" | Base path for webhook endpoints |
| token | string or SecretRef | (unset) | Bearer token for authentication (min 32 chars) |
| maxBodyBytes | number | 262144 | Max request body size (256KB) |
| presets | string[] | [] | Preset mapping names (e.g., ["gmail", "github"]) |
| mappings | WebhookMapping[] | [] | Custom webhook mappings |
Webhook Mapping (webhooks.mappings[])
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| id | string | (unset) | Unique mapping identifier |
| match.path | string | (unset) | URL path to match |
| match.source | string | (unset) | Source identifier to match |
| action | enum | "agent" | Action: wake (trigger heartbeat) or agent (invoke agent) |
| wakeMode | enum | "now" | Wake timing: now or next-heartbeat |
| name | string | (unset) | Human-readable mapping name |
| agentId | string | (unset) | Target agent ID |
| sessionKey | string | (unset) | Session key template (supports {{expr}}) |
| messageTemplate | string | (unset) | Message template (supports {{expr}}) |
| deliver | boolean | (unset) | Whether to deliver to a channel |
| channel | string | (unset) | Target channel for delivery |
| to | string | (unset) | Target recipient for delivery |
| model | string | (unset) | Model override for agent execution |
| timeoutSeconds | number | (unset) | Timeout in seconds for agent execution |
See Webhooks for webhook configuration patterns and payload formats.
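A sketch of one custom mapping, under stated assumptions: the environment variable name, mapping id, and match path are hypothetical, match.path and match.source are assumed to nest under a match object, and the template is a plain string because the {{expr}} syntax is not described on this page.

```yaml
webhooks:
  enabled: true
  path: "/hooks"
  token: "${COMIS_WEBHOOK_TOKEN}"   # hypothetical variable; min 32 chars
  presets: ["github"]
  mappings:
    - id: deploy-notify             # hypothetical mapping id
      match:
        path: "/deploy"             # assumed to be matched under the base path
      action: agent                 # invoke an agent rather than wake the heartbeat
      agentId: assistant
      messageTemplate: "Deploy event received"
      timeoutSeconds: 120
```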

Routing & Sessions

Multi-agent routing dispatch. Bindings are evaluated in order (first match wins).
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| defaultAgentId | string | "default" | Agent ID when no binding matches |
| bindings | RoutingBinding[] | [] | Ordered list of routing bindings |
Routing Binding (routing.bindings[])
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| channelType | string | (unset) | Channel type to match (e.g., "telegram") |
| channelId | string | (unset) | Channel identifier to match |
| peerId | string | (unset) | Peer (user) identifier to match |
| guildId | string | (unset) | Guild (server/group) identifier to match |
| agentId | string | (required) | Agent ID to route to |
routing:
  defaultAgentId: general
  bindings:
    - channelType: telegram
      peerId: "123456789"
      agentId: personal
    - channelType: discord
      guildId: "987654321"
      agentId: community
See Routing for detailed routing patterns and specificity rules.
Command queue for session serialization and concurrency control, set under the top-level queue key.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable the command queue |
| maxConcurrentSessions | number | 10 | Max concurrent agent executions globally |
| cleanupIdleMs | number | 600000 | Idle lane garbage collection interval (10 min) |
| defaultMode | enum | "steer+followup" | Default queue mode: followup, collect, steer, steer+followup |
| defaultDebounceMs | number | 0 | Default debounce delay in ms |
Default Overflow (queue.defaultOverflow)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxDepth | number | 20 | Max queued messages per session |
| policy | enum | "drop-new" | Overflow policy: drop-old, drop-new, summarize |
Per-Channel Override (queue.perChannel.)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| mode | enum | "steer+followup" | Queue mode for this channel |
| overflow | OverflowConfig | (inherit defaults) | Overflow settings |
| debounceMs | number | 0 | Debounce delay in ms |
Debounce Buffer (queue.debounce)
Ingress-layer message coalescing before queue entry.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| windowMs | number | 0 | Debounce window in ms (0 = disabled) |
| maxBufferedMessages | number | 10 | Max messages to buffer per session |
| firstMessageImmediate | boolean | true | First message triggers immediately |
Follow-up (queue.followup)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxFollowupRuns | number | 3 | Max follow-up runs in a single chain |
| followupOnCompaction | boolean | true | Trigger follow-up on compaction flush |
See Queue for queue behavior and session serialization details.
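The queue tables can be combined roughly as follows. Values are illustrative; the per-channel override is left out because its key format is not fully shown on this page.

```yaml
queue:
  maxConcurrentSessions: 5
  defaultMode: "steer+followup"   # default
  defaultOverflow:
    maxDepth: 20
    policy: drop-old              # default is drop-new
  debounce:
    windowMs: 800                 # 0 disables ingress coalescing
    maxBufferedMessages: 5
    firstMessageImmediate: true
  followup:
    maxFollowupRuns: 2
```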

Security

Security configuration for log redaction, audit logging, permissions, action confirmation, agent-to-agent messaging, and encrypted secrets.
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| logRedaction | boolean | true | Enable structured log redaction of sensitive fields |
| auditLog | boolean | true | Enable audit event logging |
Permission (security.permission)
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enableNodePermissions | boolean | false | Enable Node.js --permission flag enforcement |
| allowedFsPaths | string[] | [] | Allowed filesystem read/write paths |
| allowedNetHosts | string[] | [] | Allowed network hosts for outbound connections |
Action Confirmation (security.actionConfirmation)
KeyTypeDefaultDescription
requireForDestructivebooleantrueRequire confirmation for destructive actions
requireForSensitivebooleanfalseRequire confirmation for sensitive actions
autoApprovestring[][]Actions that bypass confirmation
Agent-to-Agent (security.agentToAgent)
KeyTypeDefaultDescription
enabledbooleantrueEnable cross-agent session messaging
maxPingPongTurnsnumber3Max reply-back loop turns (0-5)
allowAgentsstring[][]Allowed agent IDs for sub-agents (empty = all)
subAgentRetentionMsnumber3600000Retention for completed sub-agent sessions (1 hour)
waitTimeoutMsnumber60000Default timeout for wait mode (60 seconds)
subAgentMaxStepsnumber50Default max steps for sub-agent execution
subAgentToolGroupsenum[]["coding"]Default tool profile groups: minimal, coding, messaging, supervisor, full
subAgentMcpToolsenum"inherit"MCP tool inheritance: inherit or none
Subagent Context (security.agentToAgent.subagentContext)

Controls how sub-agent sessions receive context, condense results, and manage their lifecycle. All fields have sensible defaults, so you only need to configure the values you want to change. See Subagent Context Lifecycle for a full explanation.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxSpawnDepth | number | 3 | Maximum spawn chain depth (1-10). A depth of 3 means parent -> child -> grandchild. |
| maxChildrenPerAgent | number | 5 | Maximum concurrent active children per parent agent (1-20). Graph pipeline nodes bypass this limit. |
| maxResultTokens | number | 4000 | Token threshold for condensation (100-100,000). Results under this pass through unchanged. |
| resultRetentionMs | number | 86400000 | How long full result files are kept on disk before auto-sweep (default 24 hours). |
| condensationStrategy | enum | "auto" | When to condense: auto (based on token count), always (force condensation), never (always passthrough). |
| includeParentHistory | enum | "none" | Parent context mode: none (no parent context), summary (condensed parent conversation summary). |
| objectiveReinforcement | boolean | true | Inject the sub-agent's objective after compaction so it survives context trimming. |
| artifactPassthrough | boolean | true | Pass artifact file references from the spawn call to the sub-agent's context. |
| autoCompactThreshold | number | 0.95 | Context fill ratio (0.5-1.0) for triggering auto-compaction. This field is present in the schema but its runtime effect on the context engine compaction trigger is being refined in a future release. |
| maxRunTimeoutMs | number | 600000 | Maximum wall-clock time for a sub-agent run before watchdog force-fail (10 minutes). Hard ceiling regardless of step count. |
| perStepTimeoutMs | number | 60000 | Per-step time budget for dynamic watchdog calculation (1 minute). Dynamic timeout = min(max_steps × perStepTimeoutMs, maxRunTimeoutMs). |
| errorPreservation | boolean | true | Preserve error details in condensed results instead of summarizing them away. |
| narrativeCasting | boolean | true | Format sub-agent results with tagged prefixes and metadata for the parent agent. |
| resultTagPrefix | string | "Subagent Result" | Tag prefix used in narrative casting (1-100 characters). Appears as [{prefix}: {label}]. |
| parentSummaryMaxTokens | number | 1000 | Token limit for the parent context summary when includeParentHistory is "summary" (100-10,000). |

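
The watchdog formula is easier to follow with concrete numbers. The sketch below is a hypothetical tightening of the agent-to-agent defaults; it assumes that max_steps in the formula corresponds to subAgentMaxSteps when the spawn call does not set its own step limit, which this reference does not state explicitly.

```yaml
# Illustrative sketch only. Keys come from the agentToAgent and
# subagentContext tables; the values are example choices.
security:
  agentToAgent:
    enabled: true
    maxPingPongTurns: 2
    subAgentMaxSteps: 8            # assumed to feed max_steps in the watchdog formula
    subAgentToolGroups: [minimal]
    subAgentMcpTools: none
    subagentContext:
      maxSpawnDepth: 2             # parent -> child only
      condensationStrategy: auto
      maxResultTokens: 4000
      includeParentHistory: summary
      parentSummaryMaxTokens: 1000
      perStepTimeoutMs: 60000
      maxRunTimeoutMs: 600000
      # Dynamic timeout = min(8 x 60000, 600000) = 480000 ms (8 minutes).
      # With the default of 50 steps the product is 3000000 ms,
      # so the 600000 ms ceiling would apply instead.
```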
Storage Mode (security.storage)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| storage | "encrypted" \| "file" \| "env" | "encrypted" | Credential storage mode (security.storage) for all three stores (secrets, OAuth profiles, MCP tokens). |

Three modes are supported:
  • storage: "encrypted" (default, secure-by-default): AES-256-GCM-encrypted rows in ~/.comis/secrets.db (SQLite). Requires SECRETS_MASTER_KEY to be set (generated automatically on first boot). Defends against disk/backup theft at the cost of making SECRETS_MASTER_KEY the crown jewel.
  • storage: "file" (plaintext opt-in trade-off): structured JSON at ~/.comis/secrets.json, ~/.comis/auth-profiles.json, and ~/.comis/mcp-tokens/ with mode 0600 (user-only read) in a 0700 directory. Defends against other local users reading secrets; does not defend against root, disk/backup theft, or process-memory inspection. No SECRETS_MASTER_KEY required. Hot-reload: comis auth login writes are picked up without a daemon restart.
  • storage: "env" (read-only posture): snapshots .env/process.env into the SecretManager at boot and scrubs sensitive names from process.env. Runtime writes (env.set, secrets.set, comis auth login) are rejected with an actionable error. Use for read-only deployments where credentials are injected via environment variables.
security.storage is runtime-immutable — changing it requires editing config.yaml and restarting the daemon. The config schema is z.strictObject, so unknown keys are rejected at boot. Back up ~/.comis/config.yaml before editing.
Mode mismatch detection: If you switch modes while credentials remain in the inactive backend, the daemon emits a boot WARN naming the stranded store and the manual migration step. Cross-mode migration tooling is planned for a future release.
~/.comis/config.yaml
security:
  storage: encrypted  # default — AES-256-GCM, requires SECRETS_MASTER_KEY
  # storage: file     # plaintext opt-in — 0600 JSON files, no SECRETS_MASTER_KEY
  # storage: env      # read-only — snapshots process.env, rejects runtime writes
See Security for a comprehensive overview of the security model.
Action approval workflow. Rules are evaluated in order (first match wins).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable the approval workflow |
| defaultMode | enum | "auto" | Default mode for unmatched actions: auto, require, deny |
| rules | ApprovalRule[] | [] | Ordered approval rules |
| defaultTimeoutMs | number | 300000 | Approval request timeout in ms (5 min) |
| denialCacheTtlMs | number | 60000 | How long denied actions are cached before re-prompting (ms). Set to 0 to disable denial caching. |
| batchApprovalTtlMs | number | 30000 | How long approved actions are cached for automatic re-approval in batch operations (ms). Set to 0 to disable batch approval caching. |

Batch approval caching (batchApprovalTtlMs) allows sequential identical tool calls to auto-approve within the TTL window, reducing approval fatigue during batch operations. The cache persists across daemon restarts.

Approval Rule (approvals.rules[])

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| actionPattern | string | (required) | Pattern matching action types |
| mode | enum | "auto" | Approval mode: auto, require, deny |
| timeoutMs | number | 300000 | Timeout for human approval (0 = no timeout) |
| minTrustLevel | enum | "verified" | Trust level for auto-approve: untrusted, basic, verified, admin |

See Approvals for approval workflow configuration patterns.
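Since rules are evaluated in order and the first match wins, a narrow rule has to appear before any broader rule that would also match. The sketch below shows that ordering. The action type names (file.delete, file.*, exec.*) and the glob-style pattern syntax are hypothetical, because this reference does not specify the actionPattern grammar.

```yaml
# Illustrative sketch only. Keys come from the approvals tables;
# the actionPattern values are hypothetical.
approvals:
  enabled: true
  defaultMode: require            # anything unmatched waits for a human
  defaultTimeoutMs: 300000
  denialCacheTtlMs: 60000
  batchApprovalTtlMs: 30000
  rules:
    - actionPattern: "file.delete"   # narrow rule first, so it wins over file.*
      mode: deny
    - actionPattern: "file.*"
      mode: auto
      minTrustLevel: admin
    - actionPattern: "exec.*"
      mode: require
      timeoutMs: 120000
```

If the first two rules were swapped, file.delete would match file.* first and the deny rule would never be reached.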

Daemon & Scheduler

Daemon process configuration for watchdog, shutdown, metrics, logging, and config change webhooks.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| watchdogIntervalMs | number | 30000 | Watchdog interval in ms (0 to disable) |
| shutdownTimeoutMs | number | 30000 | Graceful shutdown timeout in ms |
| metricsIntervalMs | number | 30000 | Process metrics collection interval in ms |
| eventLoopDelayThresholdMs | number | 500 | Event loop delay threshold; the watchdog is skipped if exceeded |
| logLevels | Record<string, LogLevel> | {} | Per-module log level overrides |

Logging (daemon.logging)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| filePath | string | "~/.comis/logs/daemon.log" | Path to the active log file |
| maxSize | string | "10m" | Max file size before rotation (k/m/g suffixes) |
| maxFiles | number | 5 | Number of rotated files to keep (0-100) |
| compress | boolean | false | Compress rotated files |

Tracing Defaults (daemon.logging.tracing)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| outputDir | string | "~/.comis/traces" | Default output directory for JSONL traces |
| maxSize | string | "5m" | Max trace file size before rotation (k/m/g suffixes) |
| maxFiles | number | 3 | Rotated trace files to keep per session (0-100) |

Config Webhook (daemon.configWebhook)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| url | string (URL) | (unset) | Webhook URL for config change notifications |
| timeoutMs | number | 5000 | Delivery timeout in ms |
| secret | string \| SecretRef | (unset) | Shared secret for HMAC-SHA256 signature |

See Daemon for daemon management and Logging for log configuration details.
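A possible daemon block with larger log rotation and a signed config webhook might look like the sketch below. The module name under logLevels, the debug level name, the webhook URL, and the CONFIG_WEBHOOK_SECRET variable are all placeholders chosen for illustration; only the key names come from the tables.

```yaml
# Illustrative sketch only. Keys come from the daemon tables;
# values marked hypothetical are not taken from this reference.
daemon:
  watchdogIntervalMs: 30000
  shutdownTimeoutMs: 30000
  logLevels:
    gateway: debug                 # hypothetical module name and level
  logging:
    filePath: "~/.comis/logs/daemon.log"
    maxSize: "50m"
    maxFiles: 10
    compress: true
    tracing:
      outputDir: "~/.comis/traces"
      maxSize: "5m"
      maxFiles: 3
  configWebhook:
    url: "https://ops.example.com/hooks/comis-config"   # hypothetical endpoint
    timeoutMs: 5000
    secret: "${CONFIG_WEBHOOK_SECRET}"                  # hypothetical secret name
```

The secret uses the ${VAR_NAME} substitution directive, so the shared HMAC key stays out of the YAML file itself.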
Proactive automation configuration for cron scheduling, heartbeat monitoring, quiet hours, execution safety, and task extraction.

Cron (scheduler.cron)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable cron job scheduling |
| storeDir | string | "./data/scheduler" | Directory for cron job state persistence |
| maxConcurrentRuns | number | 3 | Max concurrent cron job runs |
| defaultTimezone | string | "" | Default timezone (empty = UTC) |
| maxJobs | number | 100 | Max cron jobs (0 = unlimited) |

Heartbeat (scheduler.heartbeat)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable periodic heartbeat checks |
| intervalMs | number | 300000 | Heartbeat interval in ms (5 min) |
| showOk | boolean | false | Show OK status in heartbeat output |
| showAlerts | boolean | true | Show alerts in heartbeat output |
| alertThreshold | number | 2 | Consecutive failures before alerting |
| alertCooldownMs | number | 300000 | Minimum ms between alerts for same source (5 min) |
| staleMs | number | 120000 | Max ms before stuck detection (2 min) |

Quiet Hours (scheduler.quietHours)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable quiet hours |
| start | string | "22:00" | Start time (HH:MM format) |
| end | string | "07:00" | End time (HH:MM format) |
| timezone | string | "" | Timezone (empty = system local) |
| criticalBypass | boolean | true | Allow critical items to bypass quiet hours |

Execution (scheduler.execution)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| lockDir | string | "./data/scheduler/locks" | Directory for execution lock files |
| staleMs | number | 600000 | Lock stale timeout in ms (10 min) |
| updateMs | number | 30000 | Lock update interval in ms |
| logDir | string | "./data/scheduler/logs" | Directory for execution log files |
| maxLogBytes | number | 2000000 | Max log file size in bytes |
| keepLines | number | 2000 | Max lines in ring-buffer log |

Tasks (scheduler.tasks)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable task extraction from conversations |
| confidenceThreshold | number | 0.8 | Minimum confidence threshold (0-1) |
| storeDir | string | "./data/scheduler/tasks" | Directory for task state persistence |

See Scheduler for scheduling configuration and Monitoring for heartbeat setup.
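One detail in the scheduler tables is easy to miss: an empty scheduler.cron.defaultTimezone means UTC, while an empty scheduler.quietHours.timezone means the system's local zone. The sketch below sets both explicitly to avoid that mismatch. It assumes IANA zone names are accepted, which the tables do not state.

```yaml
# Illustrative sketch only. Keys come from the scheduler tables;
# the IANA timezone value is an assumption about the accepted format.
scheduler:
  cron:
    enabled: true
    maxConcurrentRuns: 3
    defaultTimezone: "Europe/Berlin"
  heartbeat:
    intervalMs: 300000
    alertThreshold: 2
    alertCooldownMs: 300000
  quietHours:
    enabled: true
    start: "22:00"     # quoted so a YAML 1.1 parser cannot read it as a base-60 integer
    end: "07:00"
    timezone: "Europe/Berlin"
    criticalBypass: true
  tasks:
    enabled: true
    confidenceThreshold: 0.8
```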
System health monitoring with sub-monitors for disk, resources, systemd, security updates, and git repos.

Disk (monitoring.disk)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable disk space monitoring |
| paths | string[] | ["/"] | Filesystem paths to monitor |
| thresholdPercent | number | 90 | Alert threshold percentage (0-100) |

Resources (monitoring.resources)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable CPU/memory monitoring |
| cpuThresholdPercent | number | 85 | CPU alert threshold (0-100) |
| memoryThresholdPercent | number | 90 | Memory alert threshold (0-100) |

Systemd (monitoring.systemd)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable systemd service monitoring |
| services | string[] | [] | Specific services to monitor (empty = all failed) |

Security Updates (monitoring.securityUpdates)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable security update monitoring |
| securityOnly | boolean | true | Only check security updates (not all) |

Git (monitoring.git)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable git repository monitoring |
| repositories | string[] | [] | Absolute paths to git repositories |
| checkRemote | boolean | true | Check remote for unpushed commits |

See Monitoring for monitoring setup and alert configuration.
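A monitoring block that watches an extra mount point, a single systemd unit, and one repository could be sketched as follows. The unit name and the repository path are hypothetical placeholders; git monitoring is off by default, so it has to be enabled explicitly.

```yaml
# Illustrative sketch only. Keys come from the monitoring tables;
# the service name and paths are hypothetical.
monitoring:
  disk:
    enabled: true
    paths: ["/", "/var/lib"]
    thresholdPercent: 85
  resources:
    cpuThresholdPercent: 85
    memoryThresholdPercent: 90
  systemd:
    enabled: true
    services: ["comis.service"]              # hypothetical unit; empty list = all failed units
  securityUpdates:
    enabled: true
    securityOnly: true
  git:
    enabled: true                            # defaults to false
    repositories: ["/srv/repos/my-project"]  # hypothetical absolute path
    checkRemote: true
```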

Integrations

External service integrations for search, MCP servers, media processing, and auto-reply rules.

Brave Search (integrations.braveSearch)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| apiKey | string \| SecretRef | (unset) | Brave Search API key (disabled without it) |
| maxResultsDefault | number | 5 | Default number of results |
| cacheTtlMs | number | 3600000 | Cache TTL in ms (1 hour) |
| rateLimitRps | number | 1 | Rate limit in requests per second |

MCP (integrations.mcp)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| callToolTimeoutMs | number | 120000 | Default timeout for MCP tool calls in ms |
| servers | McpServerEntry[] | [] | List of MCP servers |

MCP Server Entry (integrations.mcp.servers[])

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | (required) | Unique server name |
| transport | enum | (required) | Transport: stdio or sse |
| command | string | (unset) | Command for stdio transport |
| args | string[] | (unset) | Arguments for stdio command |
| url | string (URL) | (unset) | URL for SSE transport |
| env | Record<string, string> | (unset) | Environment variables for stdio process |
| enabled | boolean | true | Whether this server is enabled |

Media (integrations.media) contains nested sub-schemas for all media processing:

Transcription (integrations.media.transcription)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| provider | enum | "openai" | STT provider: openai, groq, deepgram |
| model | string | (unset) | Provider-specific model ID |
| maxFileSizeMb | number | 25 | Max file size in MB |
| timeoutMs | number | 60000 | API request timeout in ms |
| language | string | (unset) | BCP-47 language hint (auto-detect if omitted) |
| autoTranscribe | boolean | true | Auto-transcribe voice messages in pipeline |
| preflight | boolean | true | Enable preflight STT for mention detection |
| fallbackProviders | enum[] | [] | Ordered fallback providers |

TTS (integrations.media.tts)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| provider | enum | "openai" | TTS provider: openai, elevenlabs, edge |
| voice | string | "alloy" | Voice identifier |
| format | string | "opus" | Output audio format |
| model | string | (unset) | Provider-specific model ID |
| autoMode | enum | "off" | Auto mode: off, always, inbound, tagged |
| maxTextLength | number | 4096 | Max text length for synthesis |
| tagPattern | string | "\\[\\[tts(?::.*?)?\\]\\]" | Regex pattern for TTS-tagged responses |

TTS Output Formats (integrations.media.tts.outputFormats)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| telegram | string | "opus" | Telegram format |
| discord | string | "mp3" | Discord format |
| whatsapp | string | "mp3" | WhatsApp format |
| slack | string | "mp3" | Slack format |
| default | string | "mp3" | Default format |

ElevenLabs Settings (integrations.media.tts.elevenlabsSettings), optional

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| stability | number | (unset) | Voice stability (0-1) |
| similarityBoost | number | (unset) | Similarity boost (0-1) |
| style | number | (unset) | Style exaggeration (0-1) |
| useSpeakerBoost | boolean | (unset) | Enable speaker boost |
| speed | number | (unset) | Playback speed multiplier |
| seed | number | (unset) | Random seed for reproducible output |
| applyTextNormalization | enum | "auto" | Text normalization: auto, on, off |

Image Analysis (integrations.media.imageAnalysis)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxFileSizeMb | number | 20 | Max image file size in MB |

Vision (integrations.media.vision)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable vision analysis |
| providers | enum[] | ["openai", "anthropic", "google"] | Ordered vision providers |
| defaultProvider | string | (unset) | Preferred default provider |
| videoMaxBase64Bytes | number | 70000000 | Max base64 video size (70MB) |
| videoMaxRawBytes | number | 50000000 | Max raw video file size (50MB) |
| videoTimeoutMs | number | 120000 | Video description timeout in ms |
| videoMaxDescriptionChars | number | 500 | Max chars for video description |
| imageMaxFileSizeMb | number | 20 | Max image file size in MB |
| scopeRules | VisionScopeRule[] | [] | Scope rules (first match wins) |
| defaultScopeAction | enum | "allow" | Default when no rule matches: allow or deny |

Vision Scope Rule (integrations.media.vision.scopeRules[])

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| channel | string | (unset) | Channel type to match |
| chatType | string | (unset) | Chat type to match |
| keyPrefix | string | (unset) | Session key prefix to match |
| action | enum | (required) | Action: allow or deny |

Link Understanding (integrations.media.linkUnderstanding)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable automatic link understanding |
| maxLinks | number | 3 | Max links to process per message |
| fetchTimeoutMs | number | 10000 | Timeout for each URL fetch in ms |
| maxContentChars | number | 5000 | Max extracted content per link |
| userAgentString | string | "Comis/1.0 (Link Understanding)" | User-Agent for outbound requests |

Media Infrastructure (integrations.media.infrastructure)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxRemoteFetchBytes | number | 26214400 | Max remote media fetch size (25MB) |
| concurrencyLimit | number | 3 | Max concurrent media operations |
| tempFileTtlMs | number | 1800000 | Temp file TTL in ms (30 min) |
| tempCleanupIntervalMs | number | 300000 | Cleanup interval in ms (5 min) |

Document Extraction (integrations.media.documentExtraction)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable document extraction |
| allowedMimes | string[] | (13 MIME types) | Allowed MIME types for extraction |
| maxBytes | number | 10485760 | Max file size (10MB) |
| maxChars | number | 200000 | Max extracted text chars |
| maxTotalChars | number | 500000 | Max total chars across all attachments |
| maxPages | number | 20 | Max pages from paginated documents |
| timeoutMs | number | 30000 | Extraction timeout in ms |
| pdfImageFallback | boolean | false | Use OCR for PDFs with little text |
| pdfImageFallbackThreshold | number | 50 | Min chars per page to trigger fallback (0 = always) |

Media Persistence (integrations.media.persistence)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable automatic media file persistence |
| maxStorageMb | number | 1024 | Soft limit for workspace media storage (1GB) |
| maxFileBytes | number | 52428800 | Max individual file size (50MB) |

Auto-Reply (integrations.autoReply)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable pattern-based auto-reply rules |
| rules | AutoReplyRule[] | [] | List of auto-reply rules |

Auto-Reply Rule (integrations.autoReply.rules[])

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| id | string | (required) | Unique rule identifier |
| pattern | string | (required) | Regex pattern to match messages |
| template | string | (required) | Response template (supports {{match}}) |
| channels | string[] | (unset) | Channel filter (omit for all channels) |
| priority | number | 0 | Rule ordering priority (higher = first) |

See MCP for MCP server configuration and Voice for STT/TTS setup.
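To show how the integration sub-schemas fit together, the sketch below wires one stdio and one SSE MCP server, tagged TTS, and a single auto-reply rule. The server names, the command and script path, the SSE URL, and the BRAVE_SEARCH_API_KEY variable are hypothetical; only the key names and enum values are taken from the tables.

```yaml
# Illustrative sketch only. Keys come from the integrations tables;
# server names, commands, URLs, and secret names are hypothetical.
integrations:
  braveSearch:
    apiKey: "${BRAVE_SEARCH_API_KEY}"
    maxResultsDefault: 5
  mcp:
    callToolTimeoutMs: 120000
    servers:
      - name: local-files
        transport: stdio              # stdio servers use command/args/env
        command: node
        args: ["./mcp/files-server.js"]
        env:
          LOG_LEVEL: info
      - name: remote-search
        transport: sse                # SSE servers use url instead
        url: "https://mcp.example.com/sse"
  media:
    tts:
      provider: openai
      voice: alloy
      autoMode: tagged                # synthesize only responses matching tagPattern
      outputFormats:
        telegram: opus
        default: mp3
  autoReply:
    enabled: true
    rules:
      - id: ping
        pattern: "^ping$"
        template: "pong"
        channels: [telegram]
        priority: 10                  # higher priority is evaluated first
```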
Browser automation via Chrome DevTools Protocol (CDP).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable browser automation |
| chromePath | string | (unset) | Path to Chrome/Chromium binary (auto-detected) |
| cdpPort | number | 9222 | CDP debug port (1-65535) |
| defaultProfile | string | "default" | Named browser profile directory |
| headless | boolean | true | Run in headless mode |
| noSandbox | boolean | false | Disable Chrome sandbox (security-sensitive) |
| screenshotMaxSide | number | 2000 | Max screenshot dimension in pixels |
| screenshotQuality | number | 80 | JPEG quality (1-100) |
| snapshotMaxChars | number | 120000 | Max chars for DOM snapshot |
| timeoutMs | number | 30000 | Page load timeout in ms |
| baseCdpPort | number | 18800 | Base CDP port for profile allocation |
| maxProfiles | number | 10 | Max concurrent named profiles (1-50) |
| profilesDir | string | (unset) | Override directory for profile data |
| downloadsDir | string | (unset) | Directory for tracked downloads |
| downloadTimeoutMs | number | 120000 | Max download wait time in ms |

Viewport (browser.viewport)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| width | number | 1280 | Viewport width in pixels |
| height | number | 720 | Viewport height in pixels |

See Browser for browser tool usage and configuration.
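Browser automation is disabled by default, so a minimal opt-in might look like the sketch below. The downloads directory is a hypothetical path; the remaining values restate or slightly tighten the documented defaults.

```yaml
# Illustrative sketch only. Keys come from the browser tables;
# the downloadsDir path is hypothetical.
browser:
  enabled: true
  headless: true
  noSandbox: false                 # keep the Chrome sandbox on unless the host forbids it
  cdpPort: 9222
  maxProfiles: 5
  screenshotMaxSide: 1600
  screenshotQuality: 80
  timeoutMs: 30000
  downloadsDir: "/var/lib/comis/downloads"
  downloadTimeoutMs: 120000
  viewport:
    width: 1280
    height: 720
```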
Plugin system configuration.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Global plugin system toggle |
| plugins | Record<string, PluginEntry> | {} | Per-plugin configuration keyed by plugin ID |

Plugin Entry (plugins.plugins.)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Whether this plugin is active |
| priority | number | 0 | Hook execution priority (-100 to 100, higher runs first) |
| config | Record<string, unknown> | {} | Plugin-specific configuration (opaque) |

See Plugins for the plugin system and hook reference.
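The nested plugins.plugins record can look odd at first: the outer key is the section, the inner key is the record of entries. The sketch below uses two hypothetical plugin IDs and a hypothetical plugin-specific config key to show the shape.

```yaml
# Illustrative sketch only. The plugin IDs and the contents of
# "config" are hypothetical; config is opaque to the core schema.
plugins:
  enabled: true
  plugins:
    audit-trail:                   # hypothetical plugin ID
      enabled: true
      priority: 50                 # runs before plugins with lower priority
      config:
        destination: "/var/log/comis/audit.jsonl"
    experimental-formatter:        # hypothetical plugin ID
      enabled: false               # installed but inactive
```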

tooling

The tool-first capability layer. Operator-only — agents cannot self-configure capability routing or detour policy. The entire tooling tree is added to IMMUTABLE_CONFIG_PREFIXES and rejected by config.patch from agent-callable surfaces.
Restart required. Changes to tooling.capabilityIndex.enabled and tooling.installDetours.mode apply only after the daemon restarts. The capabilityIndex.enabled toggle selects between two cached system-prompt shapes (one-line residual vs flat tool dump); in-process reload is not supported. Operator config edits go through the standard config.patch / config.apply path which validates → writes → 200 ms delayed SIGUSR1 → process restart.
Top-level fields:
  • capabilityClusters: cluster definitions and builtin tool→cluster assignments. Operator-defined clusters merge key-by-key with the three reserved IDs (external-integrations, prompt-skills, other-tools); operator values win per key.
  • mcp.capabilityHints: operator hints for connected MCP servers (record keyed by server name; each entry is { cluster, description, replacesPackages }).
  • skills.capabilityHints: operator hints for prompt skills (record keyed by skill name or skill key; each entry is { cluster, description?, replacesPackages }).
  • capabilityIndex.enabled: boolean toggle for the per-turn ## Capabilities block (default true).
  • installDetours.mode: "observe" / "advise" / "soft-stop" (default "advise"). Controls how the install-detour validator acts when an exec command would pip install / npm install a package that overlaps with an already-connected MCP server or skill.
Operator typos in any cluster reference (under capabilityClusters.builtinAssignments[*], mcp.capabilityHints[*].cluster, or skills.capabilityHints[*].cluster) emit a Pino WARN at daemon startup with errorKind: "config", an operator-actionable hint, and the offending { configPath, unresolvedClusterId } payload — the daemon does NOT crash; unresolved cluster references fall back to external-integrations (for MCP hints) or prompt-skills (for skill hints). Check pm2 logs comis or journalctl -u comis after restart to surface typos.
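A small tooling block with one MCP capability hint might be sketched as follows. The server name github is hypothetical, and replacesPackages is assumed to be a list of package names; the reference names the field but does not describe its type. The cluster uses one of the three reserved IDs, so it cannot trigger the unresolved-cluster warning.

```yaml
# Illustrative sketch only. Field names come from the list above;
# the server name and the list shape of replacesPackages are assumptions.
tooling:
  capabilityIndex:
    enabled: true
  installDetours:
    mode: advise
  mcp:
    capabilityHints:
      github:                                   # hypothetical MCP server name
        cluster: external-integrations          # reserved cluster ID
        description: "Issues, pull requests, and repository search"
        replacesPackages: ["PyGithub", "@octokit/rest"]
```

Changes to capabilityIndex.enabled and installDetours.mode take effect only after the daemon restarts, as noted above.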

Streaming & Messaging

Block-based response delivery across all channels.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Global enable/disable for block streaming |
| defaultChunkMode | enum | "paragraph" | Default chunk mode: paragraph, newline, sentence, length |
| defaultTypingMode | enum | "thinking" | Default typing mode: never, instant, thinking, message |
| defaultTypingRefreshMs | number | 6000 | Typing indicator refresh interval in ms |
| defaultTableMode | enum | "code" | Table conversion mode: code, bullets, off |
| defaultUseMarkdownIR | boolean | true | Enable Markdown IR pipeline for format-aware chunking |

Per-Channel Override (streaming.perChannel.)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable streaming for this channel |
| chunkMode | enum | "paragraph" | Chunk mode |
| chunkMaxChars | number | (unset) | Max chars per block (falls back to platform limit) |
| chunkMinChars | number | 100 | Min chars before allowing split |
| typingMode | enum | "thinking" | Typing indicator mode |
| typingRefreshMs | number | 6000 | Typing refresh interval in ms. Per-platform defaults are applied automatically (Telegram 4s, Discord 8s, etc.); this field overrides the automatic default |
| typingCircuitBreakerThreshold | number | 3 | Consecutive typing failures before circuit breaker permanently stops indicator |
| typingTtlMs | number | 60000 | Maximum typing indicator duration in ms before auto-stop (refreshes on content signals) |
| useMarkdownIR | boolean | true | Use Markdown IR pipeline |
| tableMode | enum | "code" | Table conversion mode |

Each per-channel entry also includes nested deliveryTiming and coalescer overrides with the same schema as the top-level versions.
See Delivery for block streaming behavior and platform-specific delivery.
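A per-channel streaming override could be sketched like this. It assumes streaming.perChannel entries are keyed by channel name; the chunk sizes are example values, not platform limits taken from this reference.

```yaml
# Illustrative sketch only. Keys come from the streaming tables;
# the channel entry names and numeric values are assumptions.
streaming:
  enabled: true
  defaultChunkMode: paragraph
  defaultTypingMode: thinking
  defaultTableMode: code
  perChannel:
    telegram:                  # assumption: keyed by channel name
      chunkMode: sentence
      chunkMinChars: 100
      typingRefreshMs: 4000    # overrides the automatic per-platform default
    discord:
      tableMode: bullets
      chunkMaxChars: 1800      # example value; unset falls back to the platform limit
      typingCircuitBreakerThreshold: 3
```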
Controls whether the agent pipeline activates for inbound messages. Separate from the pattern-based auto-reply rules in integrations.autoReply.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable the auto-reply engine |
| groupActivation | enum | "mention-gated" | Group chat mode: always, mention-gated, custom |
| customPatterns | string[] | [] | Custom regex patterns for custom mode |
| historyInjection | boolean | true | Inject non-trigger group messages as context |
| maxHistoryInjections | number | 50 | Max history-injected messages per session |
| maxGroupHistoryMessages | number | 20 | Max group history messages stored per session |

Outbound message gating rules. Rules evaluated in order; first match wins.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable send policy enforcement |
| defaultAction | enum | "allow" | Default action: allow or deny |
| rules | SendPolicyRule[] | [] | Ordered list of rules |

Send Policy Rule (sendPolicy.rules[])

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| channelId | string | (unset) | Channel ID to match |
| chatType | string | (unset) | Chat type: dm, group, thread, channel, forum |
| channelType | string | (unset) | Channel type: telegram, discord, slack, whatsapp |
| action | enum | "allow" | Action: allow or deny |
| description | string | (unset) | Human-readable description |

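
A deny-by-default send policy is one way to use the first-match-wins ordering: list the allowed destinations and let everything else fall through to defaultAction. The sketch below assumes a rule matches only when all of its specified fields match the outbound message, which this reference does not spell out.

```yaml
# Illustrative sketch only. Keys and enum values come from the
# sendPolicy tables; the all-fields-must-match semantics is an assumption.
sendPolicy:
  enabled: true
  defaultAction: deny              # anything not matched below is blocked
  rules:
    - channelType: telegram
      chatType: dm
      action: allow
      description: "Allow Telegram direct messages"
    - channelType: slack
      chatType: thread
      action: allow
      description: "Allow Slack thread replies"
```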
Message envelope enrichment for inbound messages before they reach the LLM.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| timezoneMode | string | "utc" | Timezone: utc, local, or IANA timezone string |
| timeFormat | enum | "12h" | Time display: 12h or 24h |
| showElapsed | boolean | true | Show elapsed time since previous message |
| showProvider | boolean | true | Show platform prefix (e.g., [telegram]) |
| elapsedMaxMs | number | 86400000 | Max elapsed time to display (24 hours) |

Messaging UX configuration for outbound message formatting.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| maxOutboundLength | number | 0 | Max outbound message length (0 = no limit) |
| splitLongMessages | boolean | true | Split long messages into parts |
| splitMaxChars | number | 4000 | Character limit per split part |
| splitSeparator | string | "\n\n" | Separator between split parts |
| showTypingIndicator | boolean | true | Show typing indicator during processing |
| systemMessagePrefix | string | "[System] " | Prefix for system messages |
| readReceipts | boolean | false | Enable read receipts |

Model catalog and alias configuration for model discovery and friendly names.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| scanOnStartup | boolean | false | Enable automatic model scanning |
| scanTimeoutMs | number | 30000 | Scan timeout in ms |
| aliases | ModelAlias[] | [] | Friendly model aliases |
| defaultModel | string | "" | Default model ID (falls back to claude-sonnet-4-5-20250929) |
| defaultProvider | string | "" | Default provider (falls back to anthropic) |

Model Alias (models.aliases[])

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| alias | string | (required) | Short alias name |
| provider | string | (required) | Provider identifier |
| modelId | string | (required) | Full model identifier at the provider |

models:
  defaultModel: claude-sonnet-4-5-20250929
  defaultProvider: anthropic
  aliases:
    - alias: gpt4
      provider: openai
      modelId: gpt-4o
LLM provider configuration. API keys are referenced by SecretManager key name, never stored in plaintext.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| entries | Record<string, ProviderEntry> | {} | Named provider configurations |

Provider Entry (providers.entries.)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| type | string | (required) | Provider type (e.g., "anthropic", "openai", "ollama") |
| name | string | "" | Display name |
| baseUrl | string | "" | API base URL override |
| apiKeyName | string | "" | SecretManager key name for API key |
| enabled | boolean | true | Whether this provider is enabled |
| timeoutMs | number | 120000 | Config-echo only; NOT enforced on completion calls. The completion deadline is agents.<id>.promptTimeout.promptTimeoutMs (stall budget). Setting a non-default value emits a one-time boot WARN naming the real knob. |
| maxRetries | number | 2 | Max retries for transient errors |
| headers | Record<string, string> | {} | Custom headers for API requests |
| capabilities | ProviderCapabilities | (auto-detected) | Provider-level behavioral overrides. Usually auto-detected from provider type; manual config overrides auto-detection. See sub-schema below. |
| models | UserModel[] | [] | User-defined model entries for this provider. Allows registering custom/fine-tuned models with capability metadata. See sub-schema below. |

ProviderCapabilities (providers.entries..capabilities)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| providerFamily | enum | "default" | Provider family for response handling: default, openai, anthropic, google |
| dropThinkingBlockModelHints | string[] | [] | Model ID substrings that trigger thinking block suppression |
| transcriptToolCallIdMode | enum | "default" | Tool call ID format: default (pass-through) or strict9 (truncate to 9 chars for providers with ID length limits) |
| transcriptToolCallIdModelHints | string[] | [] | Model ID substrings that trigger strict9 tool call ID mode |
| supportsVision | boolean | false | When true, image attachments are forwarded to the model. When false (or unset), images are warn-dropped with a log entry. Set true for qwen3.6:27b/35b GGUF variants that support image input. |
| supportsPromptCache | boolean | (auto from providerFamily) | Whether models on this provider support prompt caching. Auto-detected from providerFamily: true for anthropic/google, false for default/openai. Set explicitly to override. When false, prompt assembly emits a single block with no cache_control split overhead. |
| supportsStructuredOutput | boolean | false | When true, tool-call repair uses constrained decoding (e.g., Ollama's /api/generate format param) for near-miss tool JSON. When false, lenient parse-and-repair is used. |
| capabilityClass | "frontier" \| "mid" \| "small" \| "nano" | (resolver heuristic) | Explicit capability-class override for all models on this provider. Overrides the default resolver heuristic (which keys off model context-window and provider-family). Forces scaffoldLevel (GoalAnchor, critic) and securityLevel (lockdown intensity). Use "small" for an Ollama provider running qwen3.6. Setting "frontier" for a weak model silences scaffolding; this is not recommended for production. |
| probeServedWindow | boolean | true for type: "ollama" | Probe the Ollama served num_ctx at daemon boot and reconcile with the configured contextWindow. When unset (or true), Comis queries GET /api/ps and POST /api/show on start and uses the smaller of the served window vs. configured. Set to false to skip the probe if Ollama is offline at daemon start. |

v2.15 capability-gated defaults: The v2.14 reliability scaffold (GoalAnchor, relevance floor, cost-gated critic) is now default-ON for small/nano agents, with no per-agent opt-in required. Phase 159 adds two capacity defaults for small/nano: bootstrap.maxChars drops to 3,500 chars per file (SD6), and the active-tool ceiling is set to 24 tools (SD7; overflow tools stay reachable via discover_tools). Phase 160 adds a total bootstrap budget (5,000 chars sum-cap for small/nano), capability-gated graph concurrency (small/nano→2, frontier/mid→4), and a corrected bootstrap-budget warn threshold. frontier/mid behavior is byte-identical to v2.14 (the non-regression guarantee). The security guarantee holds: a weaker model class cannot lower the platform's security posture; the scaffolding defaults reinforce it.
UserModel (providers.entries..models[])

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| id | string | (required) | Model identifier (e.g., "my-finetuned-model") |
| name | string | (unset) | Display name for the model |
| reasoning | boolean | false | Whether this model supports extended thinking |
| contextWindow | number | (unset) | Context window size in tokens |
| maxTokens | number | (unset) | Maximum output tokens |
| input | string[] | ["text"] | Supported input modalities: "text", "image" |
| cost | ModelCost | (unset) | Token cost rates for budget tracking. See sub-schema below. |
| comisCompat | ModelCompatConfig | (unset) | Comis-specific compatibility flags. See sub-schema below. |
| sdkCompat | Record<string, unknown> | (unset) | Pass-through SDK compatibility overrides (provider-specific) |

ModelCompatConfig (providers.entries..models[].comisCompat)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| supportsTools | boolean | (unset) | Whether this model supports tool/function calling |
| toolSchemaProfile | enum | (unset) | Tool schema normalization profile: "default", "xai" (strips constraint keywords xAI rejects), or "gbnf" (GBNF-safe structural transforms for llama.cpp-family local providers: collapses nullable anyOf/oneOf and ["T","null"] type arrays, injects properties: {} on free-form objects and a type on typeless nodes; removal/relaxation only, pattern/format are kept) |
| toolCallArgumentsEncoding | enum | (unset) | How tool call arguments are encoded: "json" or "html-entities" (auto-decoded) |
| nativeWebSearchTool | boolean | (unset) | Whether this model uses native web search (filters out Comis web-fetch tool) |

GBNF auto-detection: providers with type: "ollama" default their models to the gbnf profile automatically. An explicit toolSchemaProfile value always wins for gbnf (set "default" to opt a model out while debugging), unlike xai, whose auto-detected flags are non-negotiable API requirements and override user config.
Explicit opt-in: LM Studio, llama-server (llama.cpp), and vLLM endpoints have no provider type that auto-enables the profile (only type: "ollama" does), so they opt in per model via comisCompat.toolSchemaProfile: "gbnf" (zero new config keys; this is the existing comisCompat surface).
Reactive repair: if a provider still rejects a schema at grammar-compile time, Comis classifies the 400 (tool_schema_unsupported), strips pattern/format from the offending tools, and retries exactly once per session before failing honestly; comis explain names the offending tool. A once-per-boot INFO line summarizes which tools were transformed (names and keyword counts only, never schema bodies).
For the served-context-window side of local-provider setup, see Local model context window.
ModelCost (providers.entries..models[].cost)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| input | number | (unset) | Cost per input token (for budget tracking) |
| output | number | (unset) | Cost per output token |
| cacheRead | number | (unset) | Cost per cache-read token (Anthropic prompt caching) |
| cacheWrite | number | (unset) | Cost per cache-write token |

providers:
  entries:
    ollama:
      # type: "ollama" auto-enables comisCompat.toolSchemaProfile: "gbnf" for its models
      type: ollama
      name: "Local Ollama"
      baseUrl: "http://localhost:11434"
      enabled: true
    lmstudio:
      type: lm-studio
      name: "LM Studio"
      baseUrl: "http://localhost:1234/v1"
      enabled: true
      models:
        - id: qwen3.6-35b
          comisCompat:
            toolSchemaProfile: gbnf   # explicit opt-in — LM Studio / llama.cpp / vLLM have no auto-detected type
    xai:
      type: xai
      apiKeyName: XAI_API_KEY
      capabilities:
        transcriptToolCallIdMode: strict9
      models:
        - id: grok-3
          reasoning: true
          comisCompat:
            toolSchemaProfile: xai
            toolCallArgumentsEncoding: html-entities
The following config.yaml snippet shows the recommended settings for a secure, scaffolded qwen3.6 local deployment. All security-relevant keys from Phases 151–155 are shown with their recommended values.
providers:
  entries:
    qwen36-local:
      baseUrl: "http://localhost:11434/v1"
      # No apiKeyName — keyless Ollama
      capabilities:
        capabilityClass: small          # Forces scaffoldLevel=max + securityLevel=locked
        supportsVision: true            # For 27b/35b GGUF; false for MLX variants
        supportsStructuredOutput: true  # Enables constrained-decode tool-call repair

models:
  defaultModel: "qwen36-local:qwen3.6:35b"

agents:
  default:
    contextEngine:
      budget:
        effectiveContextCapSmall: 32000  # Cap effective history at 32K (prevents 256K overfill)
      compaction:
        preferEvictionByCapability: true  # Evict rather than summarize with the small model
      compactPrompt:
        enabled: true        # Compact-secure prompt (retains full safety core)
        targetTokens: 3000
    goalAnchor:
      # enabled: true is the capability default for small/nano — omit to rely on capability default
      enabled: true          # Explicit: tail-inject objective checklist each turn (scaffoldLevel=max)
      maxChars: 500
    verification:
      # enabled: true is the capability default for small/nano when operationModels.verification is set
      enabled: true          # Explicit force-on: pre-delivery critic (honest unmet-list on failure)
      minResponseChars: 200
    honesty:
      maxCriticRetries: 2
    rag:
      baseFloor: 0.3         # Memory relevance floor; capability default is 0.15 for small/nano (0 = sentinel)

Capability-gated capacity defaults (Phase 159/160 — D2/D3)

These defaults fire only for capabilityClass in {small, nano} (e.g., any Ollama qwen3.6 deployment). The 2026-06-08 re-verification measured ~32K input tokens per qwen3.6 turn — 98 active tool schemas and a 14.4K-char bootstrap file drove a 495% bootstrap warning. Phase 159 adds two capability-gated defaults to address this without changing frontier/mid behavior. Phase 160 adds a total bootstrap budget, capability-gated graph concurrency, and a corrected bootstrap-budget warn threshold.

bootstrap.maxChars capability default (SD6)

For capabilityClass in {small, nano}, bootstrap.maxChars defaults to 3_500 chars per file (down from the schema baseline of 20_000). For frontier/mid, the value is unchanged (20_000 per file — byte-identical to v2.14). Key properties:
  • Per-file limit. Each workspace file is truncated individually. At 3_500 chars, AGENTS.md (the largest file) is preserved head 70% + tail 20%; smaller identity files (SOUL/IDENTITY/USER/ROLE/TOOLS/HEARTBEAT/BOOT) fit entirely within the limit.
  • Precedence: explicit agents.<id>.bootstrap.maxChars > capability default (3_500 for small/nano) > 20_000 (frontier/mid baseline).
  • Sentinel: 20_000 is treated as “unset”. Setting bootstrap.maxChars: 20000 explicitly has the same effect as omitting the key — the capability default of 3_500 applies for small/nano. This is because 20_000 is the schema default, so the runtime cannot distinguish “operator chose 20_000” from “no override”. To force the 20_000 limit on a small/nano agent, set agents.<id>.capabilityClassOverride: frontier instead.
If security-critical content lives in the middle of AGENTS.md (between the first 70% and last 20%), it may be truncated for small/nano models. Place critical rules in the first ~2,450 chars or last ~700 chars of AGENTS.md for reliable injection. See AGENTS.md placement guidance for the recommended section order.
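The head 70% + tail 20% rule above can be sketched in Python to check the ~2,450 / ~700 figures. This is a minimal illustration, not Comis's implementation: the function name `truncate_head_tail` and the truncation marker are hypothetical, and the source does not say what fills the remaining 10% of the budget.

```python
def truncate_head_tail(text: str, max_chars: int,
                       head_pct: int = 70, tail_pct: int = 20) -> str:
    """Keep the first head_pct% and last tail_pct% of the budget; drop the middle."""
    if len(text) <= max_chars:
        return text  # small identity files pass through untouched
    head_len = max_chars * head_pct // 100   # 2450 for max_chars=3500
    tail_len = max_chars * tail_pct // 100   # 700 for max_chars=3500
    marker = "\n[... truncated ...]\n"       # hypothetical marker, an assumption
    return text[:head_len] + marker + text[-tail_len:]


# A 10,000-char file: only the first 2,450 and last 700 chars survive.
agents_md = "H" * 3000 + "M" * 4000 + "T" * 3000
kept = truncate_head_tail(agents_md, 3500)
```

Anything in the "M" region is dropped, which is why rules that must always reach the model belong at the start or end of AGENTS.md.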
To override the capability default for a single agent:
agents:
  my-agent:
    bootstrap:
      maxChars: 8000   # override the small/nano default of 3_500

Active-tool ceiling (SD7)

For capabilityClass: small, at most 24 tool schemas are active in each prompt request. The cold long-tail (tools not in the core set and not recently used) is deferred.
No capability is removed. All deferred tools remain fully callable via the discover_tools mechanism. The model can search for and invoke any deferred tool in the same turn. This ceiling is a prompt-size control, not a security control — it does not restrict which tools the agent is authorized to use.
The ceiling applies only to capabilityClass: small. Key properties:
  • CORE_TOOLS are never deferred. The following tools are always kept active regardless of the ceiling: read, edit, write, grep, find, ls, apply_patch, exec, process, message, memory_search, memory_store, memory_get, web_search, web_fetch.
  • Recently-used tools are never deferred. Any tool the agent invoked in recent turns is preserved in the active set.
  • nano class is not affected. Nano already uses aggressive CORE_TOOLS-only deferral; the 24-tool ceiling does not change its behavior.
  • frontier/mid classes are not affected. No ceiling applies — behavior is unchanged from v2.14.
  • Savings: 15 CORE_TOOLS + 9 discretionary slots at 24 active tools vs. the previous 40 saves approximately 750–1,250 tokens per turn (~300 chars per tool schema average).
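The selection rules above (core tools and recently used tools are always kept, the cold long-tail is deferred) can be sketched as follows. This is an illustrative model only: `select_active_tools` is a hypothetical name, and filling the leftover slots in registration order is an assumption, since the source does not say how discretionary slots are ranked.

```python
# The 15 tools this section lists as never deferred.
CORE_TOOLS = (
    "read", "edit", "write", "grep", "find", "ls", "apply_patch", "exec",
    "process", "message", "memory_search", "memory_store", "memory_get",
    "web_search", "web_fetch",
)


def select_active_tools(all_tools, recently_used, ceiling=24):
    """Split tool names into (active, deferred) under an active-tool ceiling."""
    core = set(CORE_TOOLS)
    recent = set(recently_used)
    # Core and recently used tools are always kept.
    protected = [t for t in all_tools if t in core or t in recent]
    rest = [t for t in all_tools if t not in core and t not in recent]
    # Assumption: leftover slots are filled in registration order.
    free_slots = max(0, ceiling - len(protected))
    return protected + rest[:free_slots], rest[free_slots:]


# 98 registered tools, one of the extras used recently.
tools = list(CORE_TOOLS) + [f"extra_{i}" for i in range(83)]
active, deferred = select_active_tools(tools, {"extra_80"})
```

The deferred list is not a deny-list: per the text above, those tools stay callable through discover_tools.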

Total bootstrap budget (Phase 160 — F2)

For capabilityClass in {small, nano}, the total bootstrap budget caps the sum of all bootstrap file content at 5,000 chars, applied as a second pass after the per-file 3_500-char cap (SD6). Key properties:
  • Sum-cap, not per-file. If all workspace files together would exceed 5,000 chars after per-file truncation, each file is proportionally scaled down to fit the total budget.
  • Per-file floor. Each file retains at least 300 chars, regardless of how many files compete for the budget. No file is silenced entirely.
  • Proportional truncation. Each file’s allocation is (file_chars / total_chars) * totalMaxChars, floored at 300. Content is taken from the beginning of the file (direct slice — not head+tail; the per-file SD6 pass already applied head+tail truncation).
  • frontier/mid unaffected. No total cap is applied; behavior is byte-identical to v2.14.
  • Config override. Explicit agents.<id>.bootstrap.maxChars always takes precedence over both the per-file and total-budget capability defaults.
Typical result: with the default workspace files (AGENTS.md, SOUL, IDENTITY, USER, ROLE, TOOLS, HEARTBEAT, BOOT), the total bootstrap fits within 5,000 chars after per-file truncation, so the proportional pass is a safety net, not the primary reducer.

The total budget has no dedicated config key. To change the bootstrap limits for a single agent, override the per-file limit:
agents:
  my-agent:
    bootstrap:
      maxChars: 8000   # overrides the small/nano per-file default (3_500)
      # Note: there is no explicit totalMaxChars config key — the total budget is
      # capability-derived (5_000 for small/nano). Override the per-file limit instead,
      # or use capabilityClassOverride: frontier to remove both caps entirely.
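The proportional second pass described in this section can be sketched as below, using the stated allocation (file_chars / total_chars) * totalMaxChars with a 300-char floor and a direct slice from the start of each file. The function name `apply_total_budget` is hypothetical, and integer rounding is an assumption. Because of the floor, the result can slightly exceed the nominal budget when many small files compete.

```python
def apply_total_budget(files: dict, total_max: int = 5000, floor: int = 300) -> dict:
    """Scale per-file content down proportionally when the sum exceeds total_max."""
    total = sum(len(content) for content in files.values())
    if total <= total_max:
        return dict(files)  # the safety net does nothing in the typical case
    scaled = {}
    for name, content in files.items():
        share = len(content) * total_max // total  # proportional allocation
        # Direct slice from the beginning; never below the per-file floor.
        scaled[name] = content[:max(floor, share)]
    return scaled


# Files as they might look after the per-file 3_500-char pass.
workspace = {
    "AGENTS.md": "a" * 3500,
    "SOUL.md": "s" * 3500,
    "TOOLS.md": "t" * 3000,
    "BOOT.md": "b" * 200,
}
budgeted = apply_total_budget(workspace)
```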

Capability-gated graph concurrency (Phase 160 — F3)

For capabilityClass in {small, nano}, the graph coordinator’s maxConcurrency defaults to 2 (down from 4). For frontier/mid, the default remains 4 (byte-identical to v2.14).

Why lower concurrency for small/nano? Local inference on Ollama (or any local GPU-bound runtime) serializes model loads in practice. With 4 concurrent sub-agents, all four issue simultaneous inference requests; the GPU queues them, producing peak saturation that can cause 50–80% of sub-agents to time out on longer prompts. Lowering the default to 2 staggers the load while keeping two in-flight at all times.

To override the default for your deployment:
security:
  agentToAgent:
    graphMaxConcurrency: 4   # restore the frontier default, or set any value >= 1
The operator override always takes precedence over the capability-class default.

Prompt timeout for local Ollama. promptTimeoutMs is a stall budget: the deadline resets on stream activity (text/thinking deltas, throttled ~1/s) and tool completions, so it only needs to cover the longest silent gap. In practice that gap is the prefill before the first token, which on a loaded local GPU (e.g., qwen3.6 with multiple concurrent agents) can exceed the 180,000 ms default. Once the model streams, activity keeps the turn alive; the makespan ceiling (promptTimeoutMs × stallCeilingMultiplier, default ×10) still bounds the total turn even while streaming. For local deployments, raise the stall budget to cover slow prefill:
agents:
  my-agent:
    promptTimeout:
      promptTimeoutMs: 300000   # 5 minutes — covers slow local prefill for qwen3.6 under load
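The stall-budget semantics described above (a deadline that resets on activity, bounded by a hard makespan ceiling) can be modelled in a few lines. This is a sketch under stated assumptions: the `StallBudget` class and its method names are hypothetical, and timestamps are passed in explicitly rather than read from a clock.

```python
class StallBudget:
    """Deadline that resets on activity, capped by a total-turn ceiling."""

    def __init__(self, prompt_timeout_ms=180_000, stall_ceiling_multiplier=10, start_ms=0):
        self.timeout_ms = prompt_timeout_ms
        self.stall_deadline = start_ms + prompt_timeout_ms
        # Makespan ceiling: promptTimeoutMs x stallCeilingMultiplier from turn start.
        self.hard_deadline = start_ms + prompt_timeout_ms * stall_ceiling_multiplier

    def on_activity(self, now_ms):
        """Stream delta or tool completion: push the stall deadline forward."""
        self.stall_deadline = now_ms + self.timeout_ms

    def expired(self, now_ms):
        return now_ms >= min(self.stall_deadline, self.hard_deadline)


# Silent prefill: a 300 s budget survives a 299.999 s gap but not a 300 s one.
silent = StallBudget(prompt_timeout_ms=300_000)

# Steady streaming keeps the turn alive until the 10x ceiling (3,000 s).
streaming = StallBudget(prompt_timeout_ms=300_000)
alive_while_streaming = True
for t in range(100_000, 3_000_000, 100_000):
    streaming.on_activity(t)
    alive_while_streaming = alive_while_streaming and not streaming.expired(t)
```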
A served context window smaller than configured makes prefill behavior harder to predict; see Local model context window.

Fallback-model recommendation. For deep multi-agent pipelines on local hardware, configure a models.failoverModel pointing to a lighter local model (e.g., a faster quantized variant) that can handle sub-agent calls when the primary model is under heavy load. This provides graceful degradation rather than timeout cascades.

Bootstrap-budget warn re-base (Phase 160 — F4)

The Bootstrap content exceeds budget threshold WARN fires when bootstrap files exceed 40% of the estimated total prompt (system prompt chars + tool schema definition chars).

Old behavior (pre-Phase 160): The threshold was 85% of the system prompt character count alone. With the compact-secure prompt (~2,800 chars), this meant bootstrap content over ~2,380 chars triggered the warning, which fired on every small-model turn even with a normal workspace (false-alarm rate: ~100%).

New behavior: The denominator is systemPromptChars + toolDefOverheadChars (the same formula used by executor-tool-assembly.ts for the context-budget breakdown). With a compact-secure system prompt (~2,800 chars) and 24 active tool schemas (~12,000 chars overhead), the denominator is ~14,800 chars. With the Phase 160 F2 total bootstrap budget of 5,000 chars, the ratio is ~34%, below the 40% threshold, so no warning fires under normal conditions.

When does the warn still fire? If an operator adds large custom workspace files that push the total bootstrap above 5,920 chars (40% of ~14,800), the warn fires correctly, signaling that bootstrap content is crowding out the system prompt and tool schemas in a meaningful way.
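The threshold arithmetic above is easy to re-check with a short Python sketch. The function names are hypothetical; the inputs are the approximate figures quoted in this section, not measured values.

```python
WARN_THRESHOLD_PCT = 40  # post-Phase 160 threshold


def bootstrap_share_pct(bootstrap_chars, system_prompt_chars, tool_def_overhead_chars):
    """Bootstrap size as a percentage of system prompt + tool schema overhead."""
    return 100 * bootstrap_chars / (system_prompt_chars + tool_def_overhead_chars)


def should_warn(bootstrap_chars, system_prompt_chars, tool_def_overhead_chars):
    share = bootstrap_share_pct(bootstrap_chars, system_prompt_chars, tool_def_overhead_chars)
    return share > WARN_THRESHOLD_PCT


# Figures from this section: ~2,800-char prompt, ~12,000 chars of tool schemas.
share_at_budget = bootstrap_share_pct(5000, 2800, 12000)  # about 33.8
old_threshold_chars = 2800 * 85 // 100                     # pre-Phase 160: 2380
```

At the 5,000-char total budget the share stays under 40%, and the warning only starts once bootstrap content grows past 5,920 chars.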

Summary

| Class | bootstrap.maxChars (per-file) | Bootstrap total budget | Active-tool ceiling | Graph maxConcurrency |
| --- | --- | --- | --- | --- |
| frontier / mid | 20_000 (unchanged) | No total cap | No ceiling | 4 |
| small | 3_500 (capability default) | 5_000 chars sum-cap | 24 tools active | 2 |
| nano | 3_500 (capability default) | 5_000 chars sum-cap | CORE_TOOLS-only (existing) | 2 |
All capability defaults follow the same precedence: explicit per-agent config > capability default. The D2/D3 capacity defaults (bootstrap budget, tool ceiling, graph concurrency) are additive: all default-on behaviors activate together for small/nano agents with no per-agent config required.
Relationship to Phase 158 D1 (reliability defaults): The GoalAnchor, rag.baseFloor, and verification critic defaults from Phase 158 follow the same precedence model: explicit per-agent config > capability default > off.

General vs. Coding-Tuned Model Guidance

For agent reliability, prefer general-purpose qwen3.6 variants over coding-tuned models:
  • General models (e.g., qwen3.6:35b, qwen3.6:27b) are the right choice for agentic tasks. They handle multi-turn conversations, multi-constraint instructions, and tool use reliably.
  • Coding-tuned models can exhibit goal fixation: continuing to pursue a sub-task at the expense of the original objective. The Comis scaffold (GoalAnchor, verification critic, compact-secure prompt) is designed to detect and correct this behavior, but general models avoid the problem in the first place. The v2.14 small-model milestone originated from a snake game incident involving a coding-tuned model; general-purpose qwen3.6 reproduces this scenario far less often.
  • The Comis scaffold is designed for general models. It is not a substitute for choosing the right model class.
For local model runtime selection (MLX vs. GGUF), see the Environment Variables reference.

Local model context window

For Ollama providers, the served context window (num_ctx) may differ from the configured contextWindow. By default, Comis probes GET /api/ps and POST /api/show at daemon boot and reconciles:
effectiveWindow = min(configured contextWindow, served num_ctx, capability class cap)
The reconciled value is used for context budgeting and the post-turn context-window guard, so the agent plans against the model’s actual KV-cache limit, not a stale declaration. For model recommendations with measured receipts and the full local-deployment knob map, see the Local models playbook.

Probed values are sanitized before use: fractional context_length values are floored to integers, and implausibly small ones (below 512 tokens, e.g. a typo’d Modelfile PARAMETER num_ctx) are rejected as bogus, falling back to the configured window via the probe’s normal fail-open path.

Changing the served window: The probe is read-only; it discovers what Ollama has loaded. To serve a larger context (up to the model’s native maximum), set it on the Ollama side:
  • Environment variable: OLLAMA_CONTEXT_LENGTH=65536 (before starting Ollama)
  • Modelfile parameter: PARAMETER num_ctx 65536
  • Ollama.app: Settings → Models → Context Length
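The reconciliation rule from this section, effectiveWindow = min(configured contextWindow, served num_ctx, capability class cap), together with the probe sanitization, can be sketched as follows. The function names are hypothetical. Treating a missing probe result as "no served limit" mirrors the fail-open behavior described above, and treating a cap of 0 as uncapped follows the "0 = uncapped" wording used for effectiveContextCapSmall; both are modelling assumptions.

```python
import math
from typing import Optional

MIN_PLAUSIBLE_CTX = 512  # probed values below this are rejected as bogus


def sanitize_probed(value: Optional[float]) -> Optional[int]:
    """Floor fractional values; reject missing or implausibly small ones."""
    if value is None:
        return None
    floored = math.floor(value)
    return floored if floored >= MIN_PLAUSIBLE_CTX else None


def effective_window(configured: int, served: Optional[float] = None,
                     class_cap: Optional[int] = None) -> int:
    candidates = [configured]
    probed = sanitize_probed(served)
    if probed is not None:   # fail open: a rejected probe imposes no limit
        candidates.append(probed)
    if class_cap:            # None or 0 means uncapped (assumption)
        candidates.append(class_cap)
    return min(candidates)
```

For example, a model configured at 131072 tokens but served at 8192 reconciles to 8192, while the same model served at 50000 with a 32000 class cap reconciles to 32000.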
VRAM / KV-cache caveat: A larger num_ctx increases GPU memory usage proportionally (the KV-cache scales linearly with context length). A 35B model at 256K context can OOM or cause heavy memory thrashing on consumer hardware. Raise num_ctx only as far as your VRAM allows; start conservatively (e.g., 32768 or 65536) and measure.

To suppress the boot probe for a provider (e.g., Ollama is offline at daemon start):
providers:
  entries:
    my-ollama:
      type: ollama
      capabilities:
        probeServedWindow: false
Boot WARN (served below configured): When the probe discovers a served window smaller than the configured contextWindow, Comis logs exactly one WARN per provider per boot, "Ollama served context window below configured", naming both numbers and the probed model. The probe checks one model per provider (defaultModel, else the first models[] entry); per-model probing is a known limitation. The hint names the fixes: OLLAMA_CONTEXT_LENGTH=131072 ollama serve (substituting your configured window), or Modelfile PARAMETER num_ctx 131072 (see the VRAM caveat above), and the opt-out (providers.entries.<id>.capabilities.probeServedWindow: false). Healthy boots stay silent: served at or above configured, equal windows, non-Ollama providers, and providers the probe skipped all log nothing.

Exhaustion provenance (the served bind names its knobs): When a turn exhausts a served-bound window, the context_exhausted error text carries the suffix (model contextWindow 131072 but Ollama serves only 8192 — fix: OLLAMA_CONTEXT_LENGTH=131072 ollama serve, or Modelfile 'PARAMETER num_ctx 131072'), which names both Ollama knobs plus the true configured window. In logs and comis explain, rawContextWindowTokens reports the configured window with windowCapSource: "served" (previously the served value masqueraded as the model’s declared window with source "none").

When both the served window and a capability-class cap clamp, the message names the full chain: (model contextWindow 131072, Ollama serves 50000, capped to 32000 by contextEngine.budget.effectiveContextCapSmall — raise it (0 = uncapped) or reduce active tool schemas).
The cap wording is branched by the lever that actually binds. The contextEngine.budget.effectiveContextCapSmall/Nano form above appears only when the budget-side cap genuinely clamped (raising that key works). When the window was instead capped upstream by an operator providers.entries.<id>.capabilities.capabilityClass pin (the executor’s per-class default, small 32000 / nano 16000, which never reads the budget keys), the suffix reads "capped to 32000 by providers.entries.<id>.capabilities.capabilityClass — pin a higher class (or remove the pin) or reduce active tool schemas" and windowCapSource reports "capabilityClass". On that branch, raising contextEngine.budget.effectiveContextCapSmall (or setting it to 0) changes nothing, so the error and comis explain name the pin instead of that dead knob.

The reconcile line, "Context window reconciled (served or capability cap bound)", with source / effectiveWindow / configured / served / capabilityCap fields, logs at INFO once per session (the first reconciled turn; a session reset grants a fresh INFO) and at DEBUG per turn. A session whose configured window simply wins logs no reconcile line at any level (nothing was reconciled).

The served window is provider-scoped: it clamps only executions that resolve to the provider it was probed from. A per-execution model override to another provider (a graph node’s model: anthropic:... on an Ollama-primary agent, a subagent spawn) keeps that model’s full window and never gets "Ollama serves only N" attribution.
Boot viable floor (minViable): At boot, per agent, Comis computes minViable = bootstrapTotalTokens + toolSchemaTokens + outputHeadroomFloor + freshTailReserve + safetyMargin. Each term is single-sourced from its turn-time preflight home (the scaffold bootstrap budget, the tool-schema overhead estimate, the output headroom at the post-downshift minimum thinking level, the per-class preamble reserve, and the token-budget safety margin). Comis WARNs when the effective window cannot fit even that floor: "Boot viable-floor check: effective window below minViable — agent will degrade on real turns (WARN-only, boot continues)".

The hint spells the full equation with every term’s value, e.g. minViable = bootstrapTotalTokens(1429) + toolSchemaTokens(9714) + outputHeadroomFloor(1792) + freshTailReserve(2000) + safetyMargin(2048) = 16983 exceeds effectiveWindow 8192 [source: served], followed by the knob for the binding window source:
  • served: the two Ollama knobs above.
  • capability: pin a higher class (or remove the pin) via providers.entries.<id>.capabilities.capabilityClass (the contextEngine.budget.* caps cannot raise this bind).
  • configured: providers.entries.<id>.models[].contextWindow.

When tool schemas dominate the floor, the hint adds the active-tool-ceiling lever: pin capabilityClass (the small class defers to a 24-tool active ceiling via discover_tools) or disable unused MCP servers / builtin tool groups.

WARN-only: Comis never refuses to boot below the floor (the adapt-down posture). Because the boot floor and the turn-time preflight share one source module and one tool corpus (the boot toolSchemaTokens term measures the same converted tool definitions the turn actually ships, lean descriptions plus guidelines, not the raw factory descriptions), the same numbers re-appear in any later turn-time context_exhausted for that agent.
The boot viable-floor WARN is engine-AGNOSTIC — it fires for "dag" and "pipeline" agents alike (the minViable arithmetic holds regardless of engine); the per-turn preflight surfaces are dag-only — see the engine-scope note in the Context Engine section. Fleet-wide, under-served providers surface as the config_posture:served_below_configured finding in comis fleet and the obs.fleet.health RPC (see the JSON-RPC reference).
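The minViable sum and the hint equation quoted in this section can be reproduced with a small Python sketch. The helper name `viable_floor_hint` is hypothetical; the term values are the example figures from the hint text, not defaults.

```python
def viable_floor_hint(terms: dict, effective_window: int, source: str):
    """Return the hint string when the window cannot fit the floor, else None."""
    min_viable = sum(terms.values())
    if effective_window >= min_viable:
        return None  # healthy: nothing to warn about
    equation = " + ".join(f"{name}({value})" for name, value in terms.items())
    return (f"minViable = {equation} = {min_viable} "
            f"exceeds effectiveWindow {effective_window} [source: {source}]")


# Example term values taken from the hint quoted in this section.
example_terms = {
    "bootstrapTotalTokens": 1429,
    "toolSchemaTokens": 9714,
    "outputHeadroomFloor": 1792,
    "freshTailReserve": 2000,
    "safetyMargin": 2048,
}
hint = viable_floor_hint(example_terms, 8192, "served")
```

In this example toolSchemaTokens accounts for more than half of the floor, which is the case where the hint points at the active-tool ceiling.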

UX Features

Processing phase emoji reactions. When enabled, the agent reacts to messages with emoji reflecting current phase (thinking, tool use, generating, done, error).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable lifecycle reactions globally |
| emojiTier | enum | "unicode" | Emoji set: unicode, platform, custom |

Timing (lifecycleReactions.timing)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| debounceMs | number | 700 | Debounce before committing a phase transition |
| holdDoneMs | number | 3000 | How long to hold done emoji |
| holdErrorMs | number | 5000 | How long to hold error emoji |
| stallSoftMs | number | 15000 | Soft stall warning threshold in ms |
| stallHardMs | number | 30000 | Hard stall warning threshold in ms |

Per-Channel (lifecycleReactions.perChannel.)

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | (unset) | Override enabled state |
| emojiTier | enum | (unset) | Override emoji tier |
Response prefix/suffix template injected into agent replies.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| template | string | "" | Template string (empty = disabled). Supports variables like {agent.emoji}, {model\|short}. |
| position | enum | "prepend" | Insert position: prepend or append |

Inter-block delivery pacing to simulate natural typing rhythm.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| mode | enum | "natural" | Mode: off, natural, custom, adaptive |
| minMs | number | 800 | Minimum delay in ms between blocks |
| maxMs | number | 2500 | Maximum delay in ms between blocks |
| jitterMs | number | 200 | Random jitter in ms |
| firstBlockDelayMs | number | 0 | Extra delay before first block |

Block coalescer that accumulates small streaming blocks before delivery.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| minChars | number | 0 | Blocks below this are always coalesced |
| maxChars | number | 500 | Flush threshold |
| idleMs | number | 1500 | Idle timeout in ms before flushing |
| codeBlockPolicy | enum | "standalone" | Code block handling: standalone or coalesce |
| adaptiveIdle | boolean | false | Adapt timeout to accumulated block length |
Controls how sender identity is surfaced to the LLM in the message envelope.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Include sender identity in envelope |
| displayMode | enum | "hash" | Mode: raw (platform ID), hash (HMAC prefix), alias (operator name) |
| hashPrefix | number | 8 | Hex characters from HMAC digest (4-16) |
| hashSecretRef | string | "" | SecretManager key for HMAC secret |
| aliases | Record<string, string> | {} | Sender ID to alias mapping |
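The hash display mode (an HMAC prefix of hashPrefix hex characters) can be illustrated with Python's standard hmac module. This is a sketch, not Comis's implementation: the function name `sender_hash` is hypothetical, and SHA-256 is an assumption, since the table only says "HMAC digest" without naming the hash function.

```python
import hashlib
import hmac


def sender_hash(sender_id: str, secret: bytes, hash_prefix: int = 8) -> str:
    """Return the first hash_prefix hex chars of an HMAC over the sender ID."""
    if not 4 <= hash_prefix <= 16:
        raise ValueError("hashPrefix must be between 4 and 16")
    # Assumption: SHA-256; the reference does not name the digest algorithm.
    digest = hmac.new(secret, sender_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:hash_prefix]


# The same sender and secret always map to the same short, stable label.
label = sender_hash("telegram:12345", b"example-secret")
```

A keyed hash gives the model a stable per-sender label without exposing the raw platform ID, and rotating the secret changes every label at once.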
Documentation links injected into the system prompt so the agent can reference them.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable documentation link injection |
| localDocsPath | string | "" | Filesystem path to local docs |
| publicDocsUrl | string | "" | Public documentation URL |
| sourceUrl | string | "" | Source code repository URL |
| communityUrl | string | "" | Community or support URL |
| skillsMarketplaceUrl | string | "" | Skills marketplace URL |
| mcpRegistryUrl | string | "" | MCP server registry URL |
| customLinks | DocumentationLink[] | [] | Additional custom links (label + url) |

Detects hallucinated file paths in responses destined for Telegram (where file:// links are meaningless).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable file reference guard |
| additionalExtensions | string[] | [] | Extra file extensions to detect |
| excludedExtensions | string[] | [] | Extensions to exclude from detection |
Crash-safe outbound delivery queue. Messages are persisted to SQLite before delivery attempts, surviving daemon restarts.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable the delivery queue. When false, messages bypass persistence. |
| maxQueueDepth | number | 10000 | Maximum entries allowed in the queue. Enqueue rejects when full. |
| defaultMaxAttempts | number | 5 | Maximum delivery attempts before marking as failed. |
| defaultExpireMs | number | 3600000 | Time-to-live in ms before an entry expires (1 hour). |
| drainOnStartup | boolean | true | Drain pending entries on daemon startup (crash recovery). |
| drainBudgetMs | number | 60000 | Maximum time in ms for startup drain before continuing. |
| pruneIntervalMs | number | 300000 | Interval in ms between prune sweeps for expired entries. |

deliveryQueue:
  enabled: true
  maxQueueDepth: 10000
  defaultMaxAttempts: 5
  defaultExpireMs: 3600000
  drainOnStartup: true
  drainBudgetMs: 60000
  pruneIntervalMs: 300000
Session delivery mirroring for deduplication. Records delivered messages so they can be injected into agent context, preventing repeated deliveries.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable the delivery mirror. When false, no entries are recorded. |
| retentionMs | number | 86400000 | Maximum age in ms before mirror entries are pruned (24 hours). |
| pruneIntervalMs | number | 300000 | Interval in ms between prune sweeps (5 minutes). |
| maxEntriesPerInjection | number | 10 | Maximum mirror entries injected per prompt turn. |
| maxCharsPerInjection | number | 4000 | Maximum total characters of mirror text injected per turn. |

deliveryMirror:
  enabled: true
  retentionMs: 86400000
  pruneIntervalMs: 300000
  maxEntriesPerInjection: 10
  maxCharsPerInjection: 4000
Observability persistence layer. Stores channel health snapshots, execution metrics, and system telemetry in SQLite for historical analysis and dashboards.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| persistence.enabled | boolean | true | Enable observability persistence. |
| persistence.retentionDays | number | 30 | Days to retain data before pruning (1-365). |
| persistence.snapshotIntervalMs | number | 300000 | Interval in ms between channel health snapshots (min 60000). |

observability:
  persistence:
    enabled: true
    retentionDays: 30
    snapshotIntervalMs: 300000

Diagnostics

Per-recall ranking trace written as bounded JSONL for “why did recall pick X?” debugging. Opt-in (default off): unlike its cacheTrace sibling (which defaults true), the recall trace is only captured during a focused debug session.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable the recall-trace writer (opt-in). Also honors the COMIS_DISABLE_RECALL_TRACE env hard-off. |
| filePath | string | (optional) | Full path override. When unset, resolves to ~/.comis/logs/recall-trace.jsonl (tilde-prefix supported). |
| maxFileBytes | number | 52428800 | Per-file byte cap (50 MB; positive). |

The recall trace has no raw-content opt-in — there is intentionally no includeMessages / includeSystem / includePrompt slot (unlike cacheTrace). Every payload is full-sanitized (bound, then sanitize, then redact) before it touches disk. There is no way to disable that sanitization.
diagnostics:
  recallTrace:
    enabled: true                   # opt-in (default false) -- full-sanitized, bounded JSONL

Credential Broker Bindings (executor.broker)

The executor.broker block is wired into the daemon (AppConfigSchema → setupBroker): adding it to config.yaml starts the broker at boot, with a TCP listener plus a 0600 unix socket at ~/.comis/broker.sock.
~/.comis/config.yaml
# executor.broker.bindings — provider-agnostic; presets are optional sugar
# The broker starts at daemon boot whenever an executor.broker block is present.
executor:
  broker:
    bindings:
      # Option A: built-in preset — Anthropic (header injection)
      - preset: anthropic
        secretRef: ANTHROPIC_EXECUTOR_KEY

      # Option B: built-in preset — Finnhub (query param injection)
      - preset: finnhub
        secretRef: FINNHUB_API_KEY

      # Option C: custom binding — any host, no preset required
      # A binding with no 'inject' defaults to Authorization: Bearer
      - hostRules:
          - pattern: { kind: exact, host: my-internal-api.example.com }
            inject: []    # defaults to Authorization: Bearer
        secretRef: INTERNAL_API_TOKEN

      # Option D: custom binding with explicit header injection
      - hostRules:
          - pattern: { kind: suffix, suffix: .amazonaws.com }
            inject:
              - kind: setHeader
                name: x-amz-security-token
                format: raw
        secretRef: AWS_SESSION_TOKEN
Credential Broker Bindings (executor.broker.bindings[*])

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| preset | string | one of preset/hostRules | Built-in preset ID (anthropic, finnhub); expands to the preset’s host rules |
| hostRules | HostRule[] | one of preset/hostRules | Custom host rules (provider-agnostic) |
| secretRef | string | yes | SecretManager key resolved per-request; never cached to disk |
| credentialRefs | Record<string, string> | no | Extra refs for multi-field finalizers |

HostRule fields (hostRules[*]):

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| pattern | { kind: "exact", host: string } \| { kind: "suffix", suffix: string } | yes | Host match pattern |
| inject | InjectionRule[] | no | Injection rules; empty array defaults to Authorization: Bearer |
| pathPrefix | string | no | Restrict to requests with this path prefix |
| pathPolicy | string | no | Glob pattern for allowed paths |
| finalizer | object | no | Post-injection body transform (e.g., awsSigV4; no-op in this release) |

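To make the two pattern kinds concrete, here is a minimal Python sketch of how exact and suffix host patterns could be evaluated. The function `host_matches` is hypothetical, and the case-insensitive comparison and plain string-suffix semantics are assumptions; the broker's actual matching rules are not specified in this reference.

```python
def host_matches(pattern: dict, host: str) -> bool:
    """Evaluate a HostRule pattern of kind 'exact' or 'suffix' against a host."""
    host = host.lower()  # assumption: hostnames compare case-insensitively
    if pattern["kind"] == "exact":
        return host == pattern["host"].lower()
    if pattern["kind"] == "suffix":
        return host.endswith(pattern["suffix"].lower())
    raise ValueError(f"unknown pattern kind: {pattern['kind']!r}")


# Patterns taken from Options C and D in the binding example.
exact = {"kind": "exact", "host": "my-internal-api.example.com"}
suffix = {"kind": "suffix", "suffix": ".amazonaws.com"}
```

Keeping the leading dot in a suffix pattern such as .amazonaws.com matters in this sketch: without it, a lookalike host like notamazonaws.com would also match.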
For full configuration details and examples, see Credential Broker →.

Environment Variables

Environment variable reference

Hot Reload

Configuration hot reload behavior

Secret Manager

Encrypted secret storage and ${VAR_NAME} substitution

Configuration Guide

Step-by-step configuration walkthrough