When messages arrive faster than your agent can respond, Comis queues them up and processes them in order. If the queue gets too full, overflow rules kick in automatically.

How the queue works

Messages are processed one at a time per conversation: your agent finishes responding to one message before moving on to the next in the same conversation, so messages within a conversation are always handled in arrival order. Different conversations can run in parallel, up to maxConcurrentSessions (default 10). That limit is enforced by a single global concurrency gate shared by all conversations.

Debouncing

Sometimes users send several short messages in quick succession instead of one longer message. The debounce feature waits a short time after the last message before processing, so your agent sees the complete thought instead of responding to each fragment separately. By default, debouncing is disabled (windowMs: 0). If your users often send rapid-fire messages, set a short debounce window (like 1000-2000 milliseconds) to combine them automatically.
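The debounce window can be pictured as a small buffer that releases its contents only after a quiet period. The sketch below is a minimal illustration under stated assumptions, not the Comis implementation: the DebounceBuffer class, its push and poll methods, and the explicit nowMs clock argument are hypothetical names chosen for the example, and joining fragments with newlines is also an assumption.

```typescript
// Hypothetical sketch of a debounce buffer (not the actual Comis code).
// Time is passed in explicitly so the behavior is easy to follow.
class DebounceBuffer {
  private parts: string[] = [];
  private lastArrivalMs = 0;
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Record a fragment and the time it arrived; this restarts the window.
  push(text: string, nowMs: number): void {
    this.parts.push(text);
    this.lastArrivalMs = nowMs;
  }

  // Return the combined message once windowMs has passed since the last
  // fragment; otherwise return null. With windowMs = 0 nothing is held back.
  poll(nowMs: number): string | null {
    if (this.parts.length === 0) return null;
    if (nowMs - this.lastArrivalMs < this.windowMs) return null;
    const combined = this.parts.join("\n"); // assumption: newline-joined
    this.parts = [];
    return combined;
  }
}

const buffer = new DebounceBuffer(1500);
buffer.push("hey", 0);
buffer.push("quick question", 400);
buffer.push("how do I reset my password?", 900);

const tooEarly = buffer.poll(2000); // only 1100 ms since the last fragment
const ready = buffer.poll(2400);    // 1500 ms of quiet has passed
```

Here tooEarly is null because the window is still open, while ready holds all three fragments as one message. A longer window combines more fragments but delays every reply by at least that long, which is why a short window is the usual starting point.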

Queue modes

When a new message arrives while your agent is still processing a previous one, the queue mode determines what happens:
  • steer+followup (default): The agent sees the new message immediately and adjusts its current response. If the agent finishes before the new message needs a separate response, it can do a follow-up action. This is the most responsive mode.
  • steer: The agent sees new messages that arrive mid-response and can adjust, but does not perform follow-up actions afterward.
  • followup: New messages wait in the queue. After the agent finishes its current response, it processes the next message as a separate follow-up.
  • collect: Messages that arrive close together are batched into a single request. Useful when users tend to send multiple short messages in quick succession instead of one longer message.

When the queue is full

If too many messages pile up for a single conversation, the overflow policy determines what happens:
  • drop-new (default): The newest message is dropped and the sender gets a notification that the agent is busy. Existing queued messages are preserved.
  • drop-old: The oldest queued message is removed to make room for the new one. Good when the latest message is more important than earlier ones.
  • summarize: Older queued messages are summarized into one combined message, making room for the new one without losing the gist of what was said.
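To make the three policies concrete, here is a minimal sketch of what each one does to a full queue. It rests on several assumptions: enqueueWithOverflow, OverflowResult, and the placeholder summarize function are hypothetical names, the summarize policy is assumed to fold every queued message into one entry, and maxDepth is assumed to be at least 2. The real Comis behavior may differ in these details.

```typescript
// Hypothetical sketch of the overflow policies (not the actual Comis code).
type OverflowPolicy = "drop-new" | "drop-old" | "summarize";

interface OverflowResult {
  queue: string[];   // queue contents after handling the incoming message
  dropped: string[]; // messages shed outright (always empty for summarize)
}

// Placeholder summarizer (assumption): a real one would condense the text.
function summarize(messages: string[]): string {
  return `[summary of ${messages.length} messages] ` + messages.join(" / ");
}

function enqueueWithOverflow(
  queue: string[],
  incoming: string,
  maxDepth: number, // assumed to be >= 2
  policy: OverflowPolicy,
): OverflowResult {
  // Below the limit, no policy applies: just append.
  if (queue.length < maxDepth) {
    return { queue: [...queue, incoming], dropped: [] };
  }
  switch (policy) {
    case "drop-new": // keep what is queued, shed the newcomer
      return { queue: [...queue], dropped: [incoming] };
    case "drop-old": // shed the oldest entry to make room
      return { queue: [...queue.slice(1), incoming], dropped: queue.slice(0, 1) };
    case "summarize": // fold queued messages into one entry, then append
      return { queue: [summarize(queue), incoming], dropped: [] };
  }
}

const full = ["a", "b", "c"];
const dropNew = enqueueWithOverflow(full, "d", 3, "drop-new");
const dropOld = enqueueWithOverflow(full, "d", 3, "drop-old");
const summarized = enqueueWithOverflow(full, "d", 3, "summarize");
```

With a full queue of a, b, c and an incoming d, drop-new leaves the queue untouched, drop-old yields b, c, d, and summarize yields one summary entry followed by d.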

Configuration

| Option | Type | Default | What it does |
| --- | --- | --- | --- |
| queue.enabled | boolean | true | Enable message queuing |
| queue.maxConcurrentSessions | number | 10 | Maximum conversations processed at the same time |
| queue.defaultMode | string | "steer+followup" | How new messages are handled during active processing |
| queue.defaultOverflow.maxDepth | number | 20 | Maximum queued messages per conversation before overflow |
| queue.defaultOverflow.policy | string | "drop-new" | What to do when the queue is full |
| queue.debounce.windowMs | number | 0 | Debounce window in milliseconds (0 = disabled) |
~/.comis/config.yaml
queue:
  enabled: true
  maxConcurrentSessions: 10
  defaultMode: "steer+followup"
  defaultOverflow:
    maxDepth: 20
    policy: "drop-new"
  debounce:
    windowMs: 0
Most users do not need to change queue settings. The defaults handle common chat patterns well. Adjust only if your agent is overwhelmed by high message volume or if you want different behavior for batched messages.
For the full list of queue configuration options including per-channel overrides, see Configuration Reference.
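For readers who think in types, the options above can be written as a TypeScript shape with the documented defaults. This is an illustrative sketch only: QueueConfig, QUEUE_DEFAULTS, QueueOverrides, and resolveQueueConfig are hypothetical names, and the merge strategy (overrides layered over defaults, one nesting level deep) is an assumption rather than a description of how Comis loads its config.

```typescript
// Hypothetical typing of the queue options; values mirror the documented defaults.
interface QueueConfig {
  enabled: boolean;
  maxConcurrentSessions: number;
  defaultMode: "steer+followup" | "steer" | "followup" | "collect";
  defaultOverflow: {
    maxDepth: number;
    policy: "drop-new" | "drop-old" | "summarize";
  };
  debounce: { windowMs: number };
}

const QUEUE_DEFAULTS: QueueConfig = {
  enabled: true,
  maxConcurrentSessions: 10,
  defaultMode: "steer+followup",
  defaultOverflow: { maxDepth: 20, policy: "drop-new" },
  debounce: { windowMs: 0 }, // 0 = debouncing disabled
};

// Every field optional, including the nested ones.
type QueueOverrides = Partial<Omit<QueueConfig, "defaultOverflow" | "debounce">> & {
  defaultOverflow?: Partial<QueueConfig["defaultOverflow"]>;
  debounce?: Partial<QueueConfig["debounce"]>;
};

// Assumed merge: user overrides win, anything unset falls back to the default.
function resolveQueueConfig(overrides: QueueOverrides = {}): QueueConfig {
  return {
    ...QUEUE_DEFAULTS,
    ...overrides,
    defaultOverflow: { ...QUEUE_DEFAULTS.defaultOverflow, ...overrides.defaultOverflow },
    debounce: { ...QUEUE_DEFAULTS.debounce, ...overrides.debounce },
  };
}

// Example: enable a 1.5 s debounce window and keep everything else at its default.
const tuned = resolveQueueConfig({ debounce: { windowMs: 1500 } });
```

The point of the example is that a typical adjustment touches one field. Setting only debounce.windowMs leaves the mode, depth, and overflow policy at their defaults.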

How it works (developer detail)

The queue is a per-session lane on top of a global concurrency gate.
  • Per-session ordering — Each session has its own FIFO lane. Within a lane, messages are processed strictly in arrival order so the agent never responds to message 3 before message 2.
  • Global concurrency — Up to maxConcurrentSessions lanes can be draining at once. Additional lanes wait for a slot.
  • Backpressure — When a lane reaches maxDepth, the configured overflow policy (drop-new, drop-old, or summarize) decides which message to shed. Drops fire a queue:overflow event so observability dashboards can surface them.
  • Debouncing — The optional debounce buffer coalesces a burst of short messages into one envelope before the lane sees it.
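The interaction between per-session lanes and the global gate can be sketched as a small synchronous scheduler. Everything here is an assumption made for illustration: LaneScheduler, Job, and the enqueue, next, and complete methods are hypothetical names and do not reflect the contents of command-queue.ts or lane.ts. The sketch also ignores overflow, debouncing, and fairness between lanes.

```typescript
// Hypothetical sketch of per-session FIFO lanes behind a global concurrency gate.
interface Job {
  sessionId: string;
  message: string;
}

class LaneScheduler {
  private lanes = new Map<string, string[]>(); // one FIFO lane per session
  private active = new Set<string>();          // sessions currently draining
  private maxConcurrentSessions: number;

  constructor(maxConcurrentSessions: number) {
    this.maxConcurrentSessions = maxConcurrentSessions;
  }

  enqueue(sessionId: string, message: string): void {
    const lane = this.lanes.get(sessionId) ?? [];
    lane.push(message);
    this.lanes.set(sessionId, lane);
  }

  // Return the next runnable job, or null if the gate is full or every
  // lane with pending work is already busy (concurrency = 1 per lane).
  next(): Job | null {
    if (this.active.size >= this.maxConcurrentSessions) return null;
    for (const [sessionId, lane] of Array.from(this.lanes)) {
      if (this.active.has(sessionId)) continue;
      const message = lane.shift(); // oldest message first
      if (message === undefined) continue;
      this.active.add(sessionId);
      return { sessionId, message };
    }
    return null;
  }

  // Mark a session's current message as finished, freeing its slot.
  complete(sessionId: string): void {
    this.active.delete(sessionId);
    if (this.lanes.get(sessionId)?.length === 0) this.lanes.delete(sessionId);
  }
}

const scheduler = new LaneScheduler(2);
scheduler.enqueue("alice", "a1");
scheduler.enqueue("alice", "a2");
scheduler.enqueue("bob", "b1");
scheduler.enqueue("carol", "c1");

const first = scheduler.next();  // alice takes a slot with a1
const second = scheduler.next(); // bob takes the second slot with b1
const third = scheduler.next();  // gate is full, so nothing is runnable
scheduler.complete("alice");
const fourth = scheduler.next(); // alice's lane continues with a2
```

Note that a2 never starts while a1 is in flight, even when a slot is free, and that carol waits until one of the two slots opens up.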
The implementation lives in packages/orchestrator/src/queue/. Key files:
| File | Purpose |
| --- | --- |
| command-queue.ts | The session-lane queue itself |
| lane.ts | Per-session lane state (concurrency=1 serialized execution) |
| overflow.ts | The three overflow policies |
| debounce-buffer.ts | Pre-queue coalescing of rapid messages |
| coalescer.ts | The summarize overflow strategy |
