Monitoring - Comis

Comis continuously monitors the health of your system and alerts you if something needs attention. Think of it as a health dashboard that watches the important vital signs — disk space, CPU usage, memory, system services, and more — so you do not have to. There are three ways to look at “is Comis healthy right now?”:

The GET /health HTTP endpoint — a one-shot readiness probe used by Docker, systemd, and curl-based scripts.
The comis health CLI command — fail/warn findings only, with an exit code suitable for shell pipelines.
The heartbeat monitoring system — continuous background checks that fire alerts to your agent (and through it to your chat channels) when thresholds are crossed.

The /health endpoint and the comis health CLI both answer “is the daemon up right now?”. The heartbeat system answers “what has been wrong over the last few minutes?”. After two short sections on the endpoint and the CLI, the rest of this page is about the heartbeat.

The `/health` HTTP endpoint

The gateway exposes a readiness probe at /health:

curl http://localhost:4766/health
# {"status":"ok","timestamp":"2026-04-25T10:00:00.000Z"}

It is unauthenticated so that orchestrators can call it, and it returns 200 with status: "ok" once the daemon has finished its bootstrap sequence. Both the Docker HEALTHCHECK and the comis daemon start wait-loop check this endpoint.
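
A wait-loop of the kind described above can be sketched in a few lines. This is a minimal illustration, not the code Comis itself uses: the function names, the attempt count, and the one-second delay are assumptions chosen for the example. Only the URL and the status: "ok" field come from this page.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_health(url: str, timeout: float = 2.0) -> Optional[dict]:
    """Return the parsed JSON body of a 200 response, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return None
            body = json.loads(resp.read().decode("utf-8"))
            return body if isinstance(body, dict) else None
    except (OSError, ValueError):
        # OSError covers connection errors and HTTP errors; ValueError covers bad JSON.
        return None


def wait_until_ready(
    url: str = "http://localhost:4766/health",
    attempts: int = 30,            # hypothetical default
    delay_seconds: float = 1.0,    # hypothetical default
    fetch: Callable[[str], Optional[dict]] = fetch_health,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the endpoint until it reports status "ok" or the attempts run out."""
    for attempt in range(attempts):
        body = fetch(url)
        if body is not None and body.get("status") == "ok":
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

Calling wait_until_ready() with no arguments polls the default gateway address. The fetch and sleep parameters are injectable so the loop can be exercised without a running daemon.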

The `comis health` CLI

The CLI runs the same checks as comis doctor but only prints fail/warn findings, exiting with status 1 if anything failed:

node packages/cli/dist/cli.js health
node packages/cli/dist/cli.js health --all          # also include passing checks
node packages/cli/dist/cli.js health --format json  # machine-readable output

Use this in CI/CD or in shell scripts that need to gate on Comis being healthy.
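
If the gate lives in a Python script rather than a shell pipeline, the same idea looks roughly like the sketch below. It relies only on the exit code and assumes that the command exits with status 0 when nothing failed, which this page implies but does not state. The helper name comis_is_healthy is hypothetical.

```python
import subprocess
from typing import Sequence

# The command shown in the examples above; adjust the path for your install.
HEALTH_COMMAND = ("node", "packages/cli/dist/cli.js", "health")


def comis_is_healthy(command: Sequence[str] = HEALTH_COMMAND) -> bool:
    """Run the health command and report whether it exited with status 0."""
    try:
        result = subprocess.run(list(command), capture_output=True, text=True, check=False)
    except OSError:
        # The executable is missing or cannot be started: treat as unhealthy.
        return False
    return result.returncode == 0
```

A deployment script could call comis_is_healthy() and abort when it returns False. The JSON output is deliberately not parsed here because its schema is not documented on this page.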

What Gets Monitored

Comis checks five different health sources via the heartbeat. Each source can be enabled or disabled independently.

Disk Space

Watches the disk usage on configured paths and alerts when usage exceeds the threshold.

Why it matters: If the disk fills up, Comis cannot write logs, save memories, or store agent data. Getting an early warning gives you time to free up space.

Default behavior: Monitors / and alerts at 90% usage.

# config.yaml
monitoring:
  disk:
    enabled: true
    paths: ["/"]
    thresholdPercent: 90
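
To make the threshold concrete, here is a rough sketch of what a check like this computes. It is an illustration under assumptions, not Comis's implementation: the names DiskFinding and check_disks are hypothetical, and usage is taken as used bytes over total bytes, which can differ slightly from what df reports.

```python
import shutil
from typing import Iterable, List, NamedTuple


class DiskFinding(NamedTuple):
    path: str
    used_percent: float
    over_threshold: bool


def used_percent(total: int, used: int) -> float:
    """Percentage of the filesystem in use; 0.0 for an empty or unknown total."""
    return 0.0 if total <= 0 else used * 100.0 / total


def check_disks(
    paths: Iterable[str] = ("/",),
    threshold_percent: float = 90.0,
) -> List[DiskFinding]:
    """Measure each path and flag those strictly above the threshold."""
    findings = []
    for path in paths:
        usage = shutil.disk_usage(path)
        percent = used_percent(usage.total, usage.used)
        findings.append(DiskFinding(path, percent, percent > threshold_percent))
    return findings
```

With the defaults, a filesystem at exactly 90% does not trigger, because the documented condition is that usage exceeds the threshold.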

System Resources

Monitors CPU and memory usage and alerts when utilization is consistently high.

Why it matters: High CPU or memory usage can slow down agent responses, causing the event loop to lag and degrading user-facing latency.

Default behavior: Alerts at 85% CPU and 90% memory usage.

# config.yaml
monitoring:
  resources:
    enabled: true
    cpuThresholdPercent: 85
    memoryThresholdPercent: 90
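
As an illustration of the memory half of this check, the sketch below derives a usage percentage from Linux's /proc/meminfo and compares it to the threshold. How Comis actually samples memory is not documented here, so treat the formula (total minus available, over total) and all function names as assumptions. CPU sampling is left out.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse lines such as 'MemTotal:  16384000 kB' into a name -> kB mapping."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_used_percent(meminfo: Dict[str, int]) -> float:
    """Share of memory that is not available for new allocations."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return (total - available) * 100.0 / total


def memory_over_threshold(meminfo_text: str, threshold_percent: float = 90.0) -> bool:
    """True when usage is strictly above the configured threshold."""
    return memory_used_percent(parse_meminfo(meminfo_text)) > threshold_percent
```

On a Linux host the input would come from reading /proc/meminfo. The parser takes text rather than a file path so it can be tried with sample data.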

systemd Services (Linux only)

Checks for failed system services on Linux servers running systemd.

Why it matters: A failed dependency (like a database or reverse proxy) can affect Comis even if the daemon itself is running fine.

Default behavior: Checks all system services for failures. Set services to a specific list to limit which services are checked.

# config.yaml
monitoring:
  systemd:
    enabled: true
    services: []     # Empty = check all failed services

systemd monitoring only works on Linux. On macOS, these checks are skipped automatically.
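
The behavior of the services list can be sketched as follows. This is a hypothetical reimplementation for illustration: it assumes failed units are discovered with systemctl list-units --state=failed and that entries in services are full unit names such as nginx.service. Comis may match names differently.

```python
import platform
import subprocess
from typing import List, Sequence


def parse_failed_units(systemctl_output: str) -> List[str]:
    """Take the first column (the unit name) from --plain --no-legend output."""
    units = []
    for line in systemctl_output.splitlines():
        fields = line.split()
        if fields:
            units.append(fields[0])
    return units


def filter_units(failed: Sequence[str], services: Sequence[str]) -> List[str]:
    """An empty services list means every failed unit is reported."""
    if not services:
        return list(failed)
    wanted = set(services)
    return [unit for unit in failed if unit in wanted]


def failed_services(services: Sequence[str] = ()) -> List[str]:
    """Return failed units, or an empty list where systemd is unavailable."""
    if platform.system() != "Linux":
        return []  # mirrors the documented behavior of skipping the check
    try:
        result = subprocess.run(
            ["systemctl", "list-units", "--state=failed",
             "--plain", "--no-legend", "--no-pager"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return []  # systemctl is not installed, for example in a container
    return filter_units(parse_failed_units(result.stdout), services)
```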

Security Updates

Checks whether security patches are available for your operating system.

Why it matters: Unpatched systems are vulnerable. This check gives your agent visibility into pending security updates so it can alert you.

Default behavior: Checks for security updates only (not all package updates).

# config.yaml
monitoring:
  securityUpdates:
    enabled: true
    securityOnly: true

Git Repositories

Optionally monitors git repositories for uncommitted changes or unpushed commits.

Why it matters: Useful for development environments where you want your agent to remind you about uncommitted work.

Default behavior: Disabled by default. Enable and add repository paths to activate.

# config.yaml
monitoring:
  git:
    enabled: false
    repositories: []
    checkRemote: true
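
For a sense of what these two conditions mean in git terms, the sketch below uses git status --porcelain for uncommitted changes and git rev-list --count @{upstream}..HEAD for unpushed commits. It is an assumption about how such a check could work, not a description of Comis internals. It compares against the last known upstream ref and does not fetch. All function names are hypothetical.

```python
import subprocess
from typing import List


def changed_paths(porcelain_output: str) -> List[str]:
    """Each porcelain v1 line is two status characters, a space, then the path."""
    return [line[3:] for line in porcelain_output.splitlines() if len(line) > 3]


def parse_ahead_count(rev_list_output: str) -> int:
    """Number of local commits missing from upstream; 0 if the output is not a number."""
    text = rev_list_output.strip()
    return int(text) if text.isdigit() else 0


def _git(repo: str, *args: str) -> subprocess.CompletedProcess:
    return subprocess.run(["git", "-C", repo, *args],
                          capture_output=True, text=True, check=False)


def repo_findings(repo: str, check_remote: bool = True) -> List[str]:
    """Return human-readable findings for one repository path."""
    try:
        status = _git(repo, "status", "--porcelain")
        if status.returncode != 0:
            return [f"{repo}: git status failed"]
        findings = []
        changed = changed_paths(status.stdout)
        if changed:
            findings.append(f"{repo}: {len(changed)} uncommitted change(s)")
        if check_remote:
            ahead = _git(repo, "rev-list", "--count", "@{upstream}..HEAD")
            # A non-zero exit usually means no upstream branch is configured.
            count = parse_ahead_count(ahead.stdout) if ahead.returncode == 0 else 0
            if count:
                findings.append(f"{repo}: {count} unpushed commit(s)")
        return findings
    except OSError:
        return [f"{repo}: could not run git"]
```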

How Alerts Work

Monitoring runs on the heartbeat schedule — every 5 minutes by default. When a threshold is exceeded, the agent receives an alert and can notify you through your connected chat channels. Alerts have built-in protection against repeated notifications:

Alert threshold: A source must fail multiple consecutive checks (default: 2) before triggering an alert
Alert cooldown: After alerting, the same source will not alert again for 5 minutes
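
The two safeguards above can be modeled as a small per-source state machine. The sketch below is illustrative only: the class name AlertGate and its fields are hypothetical, and it assumes that a passing check resets the consecutive-failure counter, which the word "consecutive" suggests but this page does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertGate:
    """Tracks one monitoring source; names are illustrative, not Comis's API."""
    alert_threshold: int = 2          # consecutive failures required (documented default)
    cooldown_seconds: float = 300.0   # 5 minutes between alerts for the same source
    consecutive_failures: int = 0
    last_alert_at: Optional[float] = None

    def record(self, failed: bool, now: float) -> bool:
        """Record one check result; return True if an alert should fire now."""
        if not failed:
            self.consecutive_failures = 0  # assumed: recovery resets the streak
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures < self.alert_threshold:
            return False
        in_cooldown = (
            self.last_alert_at is not None
            and now - self.last_alert_at < self.cooldown_seconds
        )
        if in_cooldown:
            return False
        self.last_alert_at = now
        return True
```

With a check every 5 minutes, a source that starts failing is silent on the first failed check and alerts on the second. It can alert again once the cooldown has elapsed if it is still failing.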

Monitoring integrates with the heartbeat system. See the Scheduler page for how to configure check intervals and quiet hours.

Full Configuration Reference

# config.yaml
monitoring:
  disk:
    enabled: true           # Watch disk space
    paths: ["/"]            # Paths to monitor
    thresholdPercent: 90    # Alert above this percentage

  resources:
    enabled: true           # Watch CPU and memory
    cpuThresholdPercent: 85
    memoryThresholdPercent: 90

  systemd:
    enabled: true           # Check for failed services (Linux)
    services: []            # Empty = check all

  securityUpdates:
    enabled: true           # Check for security patches
    securityOnly: true      # Only security updates, not all

  git:
    enabled: false          # Watch git repos (off by default)
    repositories: []        # Absolute paths to repos
    checkRemote: true       # Check for unpushed commits

Related Pages

Daemon

Daemon lifecycle and recovery

Scheduler

Heartbeat intervals and quiet hours

Observability

Token tracking and performance metrics

Troubleshooting

Solutions to common issues