Complete technical reference for all security mechanisms, thresholds, patterns, and configuration
This page is the exhaustive technical reference for every security mechanism in
Comis, organized by processing stage. Each section documents the exact rules,
thresholds, pattern weights, and default values as implemented in the source
code.

For a user-friendly overview of how these layers work together, see
Defense in Depth.
Operator config, the daemon process, secrets in the encrypted store, the action registry, the gateway token store
Authoritative
Semi-trusted
The LLM (Anthropic / OpenAI / etc.)
Verified outputs only
Adversarial
User messages on chat channels, fetched web pages, emails, MCP tool outputs, RAG memory entries fetched at runtime
Treated as untrusted by default
Every external input is wrapped, scanned, or filtered before it reaches the
prompt. Every model output is scanned before it reaches the user. Every tool
call is classified before it executes. The architecture never assumes any
single check will catch everything.
The input layer validates and scans incoming messages before they reach the
agent. It runs four mechanisms in sequence: structural validation, jailbreak
detection, injection rate limiting, and external content wrapping.
The Input Guard performs semantic jailbreak detection using weighted compound
phrase patterns, typoglycemia detection, and code block exclusion. It scores
input text on a 0.0-1.0 scale.
Each category groups related regex patterns. If any pattern in a category
matches, the category weight is added once (boolean per category — multiple
matches within the same category do not multiply the weight).
| Category | Weight | Patterns Detected |
|---|---|---|
| ignore_instructions | 0.6 | “Ignore previous instructions”, “ignore all instructions” |
| disregard_previous | 0.5 | “Disregard previous”, “disregard your instructions” |
| forget_instructions | 0.5 | “Forget everything”, “forget your instructions” |
| role_assumption | 0.4 | “You are now [role]”, “you are now a/an” |
| new_instructions | 0.5 | “New instructions”, “new instructions:” |
| important_override | 0.5 | “Important: override” and variants |
| override_safety | 0.6 | “Override safety” and variants |
| act_as_role | 0.4 | “Act as [role]” pattern |
| context_reset | 0.4 | Context reset manipulation |
| rule_replacement | 0.4 | Rule replacement attempts |
| system_markers | 0.3 | <system> tags, [system] brackets, system command markers |
| special_tokens | 0.3 | Special token delimiters (<\|...\|> patterns) |
| role_markers | 0.2 | Role boundary markers, assistant role markers |
The 13 categories in the table above are evaluated each scan. Pattern constants
are imported from injection-patterns.ts.
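The once-per-category rule can be sketched as follows. This is a minimal illustration, not the Input Guard itself: the three regexes are stand-ins for the constants in injection-patterns.ts, `scoreInput` is a hypothetical name, and clamping the sum to 1.0 is an assumption inferred from the stated 0.0-1.0 scale.

```typescript
// Hypothetical subset of categories; the real patterns live in injection-patterns.ts.
interface Category {
  name: string;
  weight: number;
  patterns: RegExp[];
}

const CATEGORIES: Category[] = [
  { name: "ignore_instructions", weight: 0.6, patterns: [/ignore\s+(all|previous)\s+instructions/i] },
  { name: "role_assumption", weight: 0.4, patterns: [/you\s+are\s+now\s+an?\b/i] },
  { name: "system_markers", weight: 0.3, patterns: [/<system>/i, /\[system\]/i] },
];

function scoreInput(text: string): { score: number; matched: string[] } {
  let total = 0;
  const matched: string[] = [];
  for (const category of CATEGORIES) {
    // Boolean per category: the weight is added once, however many patterns hit.
    if (category.patterns.some((p) => p.test(text))) {
      total += category.weight;
      matched.push(category.name);
    }
  }
  // Assumption: the sum is clamped to the documented 0.0-1.0 range.
  return { score: Math.min(1, total), matched };
}
```

With these stand-in patterns, a message containing both `<system>` and `[system]` contributes 0.3 once, not twice.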
The guard detects scrambled-middle variants of 8 key jailbreak terms. Each
match adds 0.3 to the score. A word is a typoglycemia variant if it has the
same length, same first and last characters, same sorted middle characters, but
is not an exact match.
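The variant test above maps directly onto a small comparison function. The sketch below is illustrative: the function names are hypothetical, lowercasing before comparison is an assumption, and the term list is passed in by the caller because the 8 real terms are not listed here.

```typescript
// Sort a word's interior letters so scrambled variants compare equal.
function sortedMiddle(word: string): string {
  return word.slice(1, -1).split("").sort().join("");
}

// True when `candidate` is a scrambled-middle variant of `target`.
function isTypoglycemiaVariant(candidate: string, target: string): boolean {
  const a = candidate.toLowerCase(); // assumption: comparison is case-insensitive
  const b = target.toLowerCase();
  if (a === b) return false; // an exact match is not a variant
  if (a.length !== b.length) return false;
  return (
    a[0] === b[0] &&
    a[a.length - 1] === b[b.length - 1] &&
    sortedMiddle(a) === sortedMiddle(b)
  );
}

const TYPO_WEIGHT = 0.3; // per-match weight stated in the text

// Adds 0.3 for every word that is a variant of any watched term.
function typoglycemiaScore(text: string, terms: string[]): number {
  const words = text.split(/\W+/).filter(Boolean);
  const hits = words.filter((w) => terms.some((t) => isTypoglycemiaVariant(w, t))).length;
  return hits * TYPO_WEIGHT;
}
```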
Content inside fenced code blocks (triple backticks) and inline code (single
backticks) is stripped before pattern matching. This minimizes false positives
on technical content that legitimately discusses prompt injection or system
commands.
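A minimal sketch of the stripping step, assuming two regex passes (fenced blocks first, so the inline pass never sees fence backticks). The name `stripCode` and the choice to replace code with a single space are assumptions, not the guard's actual implementation.

```typescript
// Build the fence marker programmatically so this example stays readable.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "[\\s\\S]*?" + FENCE, "g");
const INLINE_CODE = /`[^`\n]*`/g;

// Removes code regions so pattern matching only sees prose.
function stripCode(text: string): string {
  return text
    .replace(FENCED_BLOCK, " ") // fenced blocks (non-greedy, may span lines)
    .replace(INLINE_CODE, " "); // inline spans on a single line
}
```

An unterminated fence is left in place by this sketch, so text after a stray opening fence is still scanned.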
External content from untrusted sources (emails, webhooks, APIs, web tools) is
wrapped with random delimiters and security warnings before passing to the LLM.

Delimiter generation: Each session gets a deterministic random delimiter
from AsyncLocalStorage context (or a fresh 24-hex-character delimiter from
randomBytes(12)). The wrapping format is:

    <<<UNTRUSTED_{delimiter}>>>
    Source: Email
    From: sender@example.com
    Subject: Help request
    ---
    [content here]
    <<<END_UNTRUSTED_{delimiter}>>>

Source types: email, webhook, api, channel_metadata, web_search,
web_fetch, document, unknown.

Marker sanitization: If the content itself contains delimiter patterns
(static <<<EXTERNAL_UNTRUSTED_CONTENT>>> or dynamic <<<UNTRUSTED_hex>>>),
they are replaced with [[MARKER_SANITIZED]] before wrapping. This also
handles fullwidth Unicode equivalents to prevent bypass via character
substitution.

Suspicious pattern detection: Content is scanned against 17 injection
patterns from injection-patterns.ts. When patterns are found, an
onSuspiciousContent callback fires with the source, matched patterns, content
length, and sender.
Source: wrapExternalContent() in packages/core/src/security/external-content.ts
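The wrapping and marker-sanitization steps can be sketched as below. This is a simplified illustration, not wrapExternalContent() itself: `wrapUntrusted` and its signature are hypothetical, the END_ form of the static marker is an assumption, and the fullwidth-Unicode handling and suspicious-pattern callback are omitted.

```typescript
import { randomBytes } from "node:crypto";

// Matches static and dynamic wrapper markers that appear inside the content.
const MARKER = /<<<(?:END_)?(?:EXTERNAL_UNTRUSTED_CONTENT|UNTRUSTED_[0-9a-f]+)>>>/gi;

// Hypothetical simplified wrapper; 12 random bytes yield a 24-hex-character delimiter.
function wrapUntrusted(
  content: string,
  source: string,
  delimiter: string = randomBytes(12).toString("hex"),
): string {
  // Neutralize anything in the content that imitates a wrapper marker.
  const sanitized = content.replace(MARKER, "[[MARKER_SANITIZED]]");
  return [
    `<<<UNTRUSTED_${delimiter}>>>`,
    `Source: ${source}`,
    "---",
    sanitized,
    `<<<END_UNTRUSTED_${delimiter}>>>`,
  ].join("\n");
}
```

Because the delimiter is random, content cannot predict the closing marker, and any marker-shaped text it does contain is replaced before wrapping.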
Scans skill body content for dangerous patterns across 6 categories. Pure
function with no side effects — callers handle audit events and blocking
decisions.
| Category | Rules | Severity | Description |
|---|---|---|---|
| exec_injection | 4 rules | CRITICAL | Subshell injection $(cmd), backtick injection, eval(), pipe to shell |
| env_harvesting | 3 rules | WARN | printenv, /proc/self/environ, env dump piped to encoding/exfiltration |
| crypto_mining | 3 rules | CRITICAL/WARN | stratum:// protocol, known miner binaries (CRITICAL); pool domains (WARN) |
|  |  |  | Long base64 80+ chars (WARN), long hex 20+ sequences (WARN); base64 decode piped (CRITICAL) |
| xml_breakout | 2 rules | CRITICAL | Closing skill XML tags, system-level message tags (breakout attempts) |
Total: 18 scan rules across 6 categories in the skill scanner. (The
broader injection-pattern library aggregates 65 patterns across 8
categories — see the table below.)
Source: scanSkillContent() and CONTENT_SCAN_RULES in packages/skills/src/prompt/content-scanner.ts
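The "pure function" contract can be illustrated with a small scanner that only returns findings. The four rules below are hypothetical stand-ins drawn from the descriptions in the table, not the real CONTENT_SCAN_RULES, and `scanContent` is an illustrative name.

```typescript
type Severity = "CRITICAL" | "WARN";

interface ScanRule {
  id: string;
  category: string;
  severity: Severity;
  pattern: RegExp;
}

interface Finding {
  ruleId: string;
  category: string;
  severity: Severity;
}

// Illustrative rules only; the real list is CONTENT_SCAN_RULES.
const RULES: ScanRule[] = [
  { id: "subshell", category: "exec_injection", severity: "CRITICAL", pattern: /\$\([^)]*\)/ },
  { id: "eval", category: "exec_injection", severity: "CRITICAL", pattern: /\beval\s*\(/ },
  { id: "printenv", category: "env_harvesting", severity: "WARN", pattern: /\bprintenv\b/ },
  { id: "stratum", category: "crypto_mining", severity: "CRITICAL", pattern: /stratum:\/\// },
];

// Pure: returns findings and leaves audit events and blocking to the caller.
function scanContent(body: string): Finding[] {
  return RULES.filter((rule) => rule.pattern.test(body)).map((rule) => ({
    ruleId: rule.id,
    category: rule.category,
    severity: rule.severity,
  }));
}
```

A caller might block when any finding is CRITICAL and only log WARN findings; that policy lives outside the scanner.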
Skill body content passes through a 4-step pipeline before reaching the system
prompt:
| Step | Operation | Purpose |
|---|---|---|
| 1 | Strip HTML comments | Remove <!-- hidden content --> that could contain injection |
| 2 | NFKC normalization | Decompose fullwidth/ligature characters to canonical form |
| 3 | Strip invisible characters | Remove zero-width characters including Unicode tag block bypass |
| 4 | Enforce size limit | Truncate to maxBodyLength with [TRUNCATED] marker |
Size enforcement applies to the final sanitized output, not the raw input.
This prevents unnecessary truncation when HTML comments inflate the raw size.
Source: sanitizeSkillBody() in packages/skills/src/prompt/sanitizer.ts
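The four steps can be sketched in order as a single function. This is an illustration, not sanitizeSkillBody() itself: `sanitizeBody` is a hypothetical name and the exact set of invisible code points (zero-width characters, word joiner, BOM, and the U+E0000 tag block) is an assumption.

```typescript
// Assumed invisible set: zero-width chars, word joiner, BOM, Unicode tag block.
const INVISIBLE = /[\u200B-\u200D\u2060\uFEFF\u{E0000}-\u{E007F}]/gu;

function sanitizeBody(raw: string, maxBodyLength: number): string {
  let text = raw.replace(/<!--[\s\S]*?-->/g, ""); // 1. strip HTML comments
  text = text.normalize("NFKC");                  // 2. NFKC normalization
  text = text.replace(INVISIBLE, "");             // 3. strip invisible characters
  if (text.length > maxBodyLength) {              // 4. size limit on the sanitized output
    text = text.slice(0, maxBodyLength) + "[TRUNCATED]";
  }
  return text;
}
```

Because the length check runs last, a body whose raw size exceeds the limit only because of HTML comments passes through untruncated.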
Every action in the system is classified by risk level. Unknown actions default
to "destructive" (fail-closed principle).
| Classification | Description | Example Actions |
|---|---|---|
| read | No side effects, safe to auto-approve | file.read, memory.search, config.read |
| mutate | Modifiable side effects, logged | file.write, memory.store, message.send |
| destructive | Irreversible or high-risk, may require confirmation | file.delete, memory.clear, system.shutdown |
The registry contains 178 registered actions across 21 categories. After
bootstrap, the registry is locked via lockRegistry() to prevent runtime
classification downgrades by malicious plugins.

For the complete action registry, see Action Classifier.
Source: classifyAction() and ACTION_REGISTRY in packages/core/src/security/action-classifier.ts
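The fail-closed default and the post-bootstrap lock can be sketched together. The factory below is a hypothetical shape for illustration; the real classifyAction(), ACTION_REGISTRY, and lockRegistry() may be structured differently.

```typescript
type Risk = "read" | "mutate" | "destructive";

function createClassifier(initial: Record<string, Risk>) {
  const registry = new Map<string, Risk>(Object.entries(initial));
  let locked = false;
  return {
    // Registration is only possible before the registry is locked.
    register(action: string, risk: Risk): void {
      if (locked) throw new Error(`registry locked; cannot register ${action}`);
      registry.set(action, risk);
    },
    lock(): void {
      locked = true;
    },
    // Fail-closed: anything not registered is treated as destructive.
    classify(action: string): Risk {
      return registry.get(action) ?? "destructive";
    },
  };
}
```

Once `lock()` has been called, a plugin that tries to re-register `file.delete` as `read` gets an error instead of a silent downgrade.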
The approval system controls whether agent-initiated actions proceed
automatically, require human confirmation, or are denied.

Three modes:

| Mode | Behavior |
|---|---|
| auto | Action proceeds without human confirmation |
| require | Action pauses until human approves or denies |
| deny | Action is rejected immediately |
Configuration layers: Comis has two distinct configuration sections that
interact:

- security.actionConfirmation: Quick toggle for destructive/sensitive action confirmation
- approvals: Full rule-based approval workflow with pattern matching
The approvals system evaluates rules in order (first match wins). Each rule
matches action types by pattern and specifies a mode, timeout, and minimum
trust level.

Approval rule fields:

| Field | Type | Default | Description |
|---|---|---|---|
| actionPattern | string | (required) | Pattern matching action types |
| mode | "auto" \| "require" \| "deny" | "auto" | Approval behavior |
| timeoutMs | number | 300000 (5 min) | Timeout for human approval (0 = no timeout) |
| minTrustLevel | "untrusted" \| "basic" \| "verified" \| "admin" | "verified" | Trust level required to auto-approve |
Source: ApprovalRuleSchema and ApprovalsConfigSchema in packages/core/src/config/schema-approvals.ts
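First-match-wins evaluation with the documented defaults might look like the sketch below. Several details are assumptions and not taken from the schema: `*` as the wildcard syntax, the ascending ordering of trust levels, and the fallback to "require" when an "auto" rule is hit below its minimum trust level. `resolveMode` is a hypothetical name.

```typescript
type Mode = "auto" | "require" | "deny";
type TrustLevel = "untrusted" | "basic" | "verified" | "admin";

interface ApprovalRule {
  actionPattern: string;
  mode?: Mode;                // default "auto"
  timeoutMs?: number;         // default 300000
  minTrustLevel?: TrustLevel; // default "verified"
}

// Assumption: trust levels are ordered from least to most trusted.
const TRUST_ORDER: TrustLevel[] = ["untrusted", "basic", "verified", "admin"];

// Assumption: "*" is a wildcard; the real pattern syntax may differ.
function matchesPattern(pattern: string, action: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`).test(action);
}

function resolveMode(rules: ApprovalRule[], action: string, trust: TrustLevel): Mode | undefined {
  const rule = rules.find((r) => matchesPattern(r.actionPattern, action)); // first match wins
  if (!rule) return undefined; // no rule matched; the caller decides the fallback
  const mode = rule.mode ?? "auto";
  const minTrust = rule.minTrustLevel ?? "verified";
  // Assumption: "auto" below the minimum trust level falls back to "require".
  if (mode === "auto" && TRUST_ORDER.indexOf(trust) < TRUST_ORDER.indexOf(minTrust)) {
    return "require";
  }
  return mode;
}
```

Rule order matters: placing a specific `file.delete` rule before a broad `file.*` rule keeps the broad rule from auto-approving deletions.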
Validates URLs before any outbound HTTP request to prevent Server-Side Request
Forgery. Uses DNS-pinned validation: the URL is resolved to an IP address and
checked against blocked ranges before the actual fetch.

Blocked IP ranges:

| Range | Description |
|---|---|
| private | RFC 1918 addresses (10.x, 172.16-31.x, 192.168.x) |
| loopback | 127.0.0.0/8 and ::1 |
| linkLocal | 169.254.0.0/16 and fe80::/10 |
| uniqueLocal | fc00::/7 (IPv6 private) |
| unspecified | 0.0.0.0 and :: |
| reserved | IANA reserved ranges |
Cloud metadata IPs (explicitly blocked):

| IP | Service |
|---|---|
| 169.254.169.254 | AWS, GCP, Azure instance metadata |
| 169.254.170.2 | AWS ECS task metadata |
| 100.100.100.200 | Alibaba Cloud metadata |
Protocol check: Only http: and https: protocols are allowed.

Validation sequence:

1. Parse URL (reject invalid)
2. Check protocol (reject non-HTTP/HTTPS)
3. DNS resolution (reject unresolvable hostnames)
4. Cloud metadata IP check (reject explicit metadata IPs)
5. IP range classification (reject private/loopback/link-local/reserved)
Source: validateUrl(), BLOCKED_RANGES, and CLOUD_METADATA_IPS in packages/core/src/security/ssrf-guard.ts
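The synchronous parts of the validation sequence (steps 1, 2, 4, and 5) can be sketched as below. This is an IPv4-only illustration with hypothetical function names: DNS resolution, the IPv6 ranges, and the IANA reserved ranges are left out, and the address passed to `classifyIpv4` is assumed to be the one DNS returned.

```typescript
const CLOUD_METADATA = new Set(["169.254.169.254", "169.254.170.2", "100.100.100.200"]);

function parseIpv4(ip: string): number[] | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// Steps 1-2: returns a rejection reason, or null when the URL may proceed to DNS.
function checkUrl(rawUrl: string): string | null {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return "invalid URL";
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return "protocol not allowed";
  return null;
}

// Steps 4-5: returns the blocked category for a resolved address, or null when allowed.
function classifyIpv4(ip: string): string | null {
  const octets = parseIpv4(ip);
  if (!octets) return "invalid";
  if (CLOUD_METADATA.has(ip)) return "cloudMetadata"; // checked before the range rules
  const [a, b] = octets;
  if (a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168)) return "private";
  if (a === 127) return "loopback";
  if (a === 169 && b === 254) return "linkLocal";
  if (octets.every((o) => o === 0)) return "unspecified";
  return null;
}
```

Checking the resolved address, not the hostname string, is what stops a public-looking hostname that resolves to 127.0.0.1 or a metadata IP.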
Pre-sandbox shell-substitution check applied to every system.exec command
string. Implemented as a quote-aware state machine (ShellQuoteTracker) that
tracks normal / single / double / backtick context.
| Pattern | Behavior |
|---|---|
| Command substitution $(...) | Rejected outside single quotes |
| Backtick substitution `...` | Rejected outside single quotes |
| Process substitution <(...) / >(...) | Rejected |
| Zsh process substitution =(...) | Rejected |
| Zsh equals expansion =cmd at word boundary | Rejected |
Returns null if safe, or an error message describing the dangerous pattern.
The validator runs before the OS sandbox, so a rejected command never reaches
bubblewrap/sandbox-exec. Quote-awareness avoids false positives on legitimate
strings like printf '$(echo)' where the shell does not interpret the
substitution.
Source: validateExecCommand() and ShellQuoteTracker in packages/skills/src/builtin/exec-security.ts
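A quote-aware scan of this kind can be sketched as a small state machine. The version below is a simplification and not ShellQuoteTracker: it needs no backtick state because it rejects at the first live backtick, it treats process substitution as rejected anywhere outside single quotes (an assumption, since the table does not qualify it), and it omits the two Zsh forms. `checkSubstitution` is a hypothetical name.

```typescript
type QuoteState = "normal" | "single" | "double";

// Returns null if safe, or an error message naming the dangerous pattern.
function checkSubstitution(command: string): string | null {
  let state: QuoteState = "normal";
  for (let i = 0; i < command.length; i++) {
    const ch = command[i];
    const next = command[i + 1];
    if (state === "single") {
      // Everything inside single quotes is literal, including $( and backticks.
      if (ch === "'") state = "normal";
      continue;
    }
    if (ch === "\\") {
      i++; // skip the escaped character
      continue;
    }
    if (ch === "`") return "backtick substitution";
    if (ch === "$" && next === "(") return "command substitution $(...)";
    if ((ch === "<" || ch === ">") && next === "(") return "process substitution";
    if (ch === '"') {
      state = state === "double" ? "normal" : "double";
    } else if (ch === "'" && state === "normal") {
      state = "single"; // a single quote inside double quotes is literal
    }
  }
  return null;
}
```

The single-quote branch is what lets `printf '$(echo)'` pass while `echo "$(whoami)"` is rejected: double quotes do not stop the shell from expanding the substitution.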
Validates file paths to prevent directory traversal attacks. Returns a resolved,
validated absolute path that is guaranteed to stay within the base directory.

Attack vectors defended:

| Vector | Defense |
|---|---|
| Basic traversal (../) | Path resolution + prefix check |
| URL-encoded traversal (%2e%2e%2f) | decodeURIComponent before resolution |
| Prefix attacks (/uploads vs /uploads-evil) | Trailing separator in prefix check |
| Null byte injection | Explicit \0 check in all segments |
| Symlink escapes | Walk each path component, check lstat for symlinks resolving outside base |
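The first four defenses in the table can be sketched without touching the filesystem. `safePath` is a hypothetical name, and the symlink walk (which needs lstat on each component) is deliberately omitted from this illustration.

```typescript
import * as path from "node:path";

// Returns an absolute path inside baseDir, or throws.
function safePath(baseDir: string, userPath: string): string {
  // Decode first so %2e%2e%2f is seen as ../ (malformed encodings throw URIError).
  const decoded = decodeURIComponent(userPath);
  if (decoded.includes("\0")) throw new Error("null byte in path");
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, decoded);
  // The trailing separator stops /uploads from matching /uploads-evil.
  if (resolved !== base && !resolved.startsWith(base + path.sep)) {
    throw new Error("path escapes base directory");
  }
  return resolved;
}
```

A bare `startsWith(base)` would accept `/uploads-evil/x` for a base of `/uploads`; appending the separator closes that gap.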
What it does. For API-key CLIs driven from the exec sandbox, the credential broker acts as an in-process HTTPS MITM proxy. The real key never enters the sandbox: the broker resolves it per-request from SecretManager and injects it at the TLS boundary. On Linux, the credentialed sandbox runs in broker-only network mode, where --unshare-net is applied and the broker unix socket is the only bind-mounted network path. This kernel-level enforcement is validated on the Linux production host class: direct egress from inside the namespace fails (network unreachable), while the bound broker socket stays reachable.

The broker is fail-closed: a missing binding returns 403, a missing secret returns 502, and a forged proxy token returns 407. No path forwards the request without a valid credential. All broker activity is audited via broker:* events carrying agentId and traceId.

Network modes (SandboxOptions.network):
Driven-CLI sandbox; only broker unix socket reachable (Linux only)
Source: packages/infra/src/credential-broker/mitm-broker.ts — fail-closed gates; packages/skills/src/tools/builtin/sandbox/bwrap-provider.ts — broker-only branch. See Credential Broker for the full deep-dive.
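The fail-closed gates can be pictured as a chain where every failure returns a status and only full success yields a credential. The sketch below is hypothetical throughout: the type shapes, the `evaluateGates` name, and the order of the checks are assumptions; only the three status codes come from the text.

```typescript
interface BrokerRequest {
  proxyToken: string;
  host: string;
}

interface BrokerState {
  validTokens: Set<string>;
  bindings: Map<string, string>; // host -> secret name (hypothetical shape)
  secrets: Map<string, string>;  // stand-in for SecretManager
}

type GateResult = { status: 407 | 403 | 502 } | { status: 200; credential: string };

// Assumption: gates run in this order; each failure short-circuits.
function evaluateGates(req: BrokerRequest, state: BrokerState): GateResult {
  if (!state.validTokens.has(req.proxyToken)) return { status: 407 }; // forged proxy token
  const secretName = state.bindings.get(req.host);
  if (secretName === undefined) return { status: 403 };               // missing binding
  const credential = state.secrets.get(secretName);
  if (credential === undefined) return { status: 502 };               // missing secret
  return { status: 200, credential };
}
```

There is no branch that forwards a request without a credential, which is the property the fail-closed description calls for.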
Scans LLM responses for secret patterns, canary token leakage, and prompt
extraction attempts. Critical findings are blocked and redacted; warning-level
findings are reported but not redacted.
If a canaryToken is provided in the scan context and found in the response,
it is redacted as [REDACTED:canary] with critical severity. This indicates
the agent leaked its canary token, suggesting prompt extraction succeeded.
Canary tokens are injected into system prompts to detect prompt extraction
attacks. If the token appears in the agent’s output, it means the system prompt
was leaked.

Generation: HMAC-SHA256 of "canary:{sessionKey}" using a configured
secret. The first 16 hex characters are used, prefixed with CTKN_.

    CTKN_a1b2c3d4e5f67890

Properties:

- Deterministic per session: Same session key and secret always produce the same canary
- Format: CTKN_ prefix followed by 16 hex characters
- Detection: The output guard checks if the canary token appears anywhere in the response
Source: generateCanaryToken() and detectCanaryLeakage() in packages/core/src/security/canary-token.ts
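The generation rule translates almost directly into Node's crypto API. The function names below are illustrative stand-ins for generateCanaryToken() and detectCanaryLeakage(), whose real signatures are not shown here.

```typescript
import { createHmac } from "node:crypto";

// HMAC-SHA256 over "canary:{sessionKey}", truncated to 16 hex chars, prefixed with CTKN_.
function makeCanary(sessionKey: string, secret: string): string {
  const digest = createHmac("sha256", secret).update(`canary:${sessionKey}`).digest("hex");
  return `CTKN_${digest.slice(0, 16)}`;
}

// A plain substring check: the token appearing anywhere counts as leakage.
function leaksCanary(response: string, canary: string): boolean {
  return response.includes(canary);
}
```

Because the token is derived from the session key and secret, it never needs to be stored: the output guard can recompute it for any session it scans.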
Defense-in-depth regex-based credential scrubbing for log strings. This is a
second line of defense after Pino’s structured field redaction: it catches
credentials embedded in free-text log messages.

Pino’s structured-field redactor (the first line of defense) auto-redacts
the canonical credential field names: apiKey, token, password,
secret, authorization, botToken, privateKey, cookie,
webhookSecret. The Log Sanitizer covers credentials that escape into
free-text log messages.

18 credential patterns processed in order (more specific patterns first):
| Pattern | Replacement | Description |
|---|---|---|
| Anthropic API keys | sk-ant-[REDACTED] | sk-ant-... keys |
| OpenAI project keys | sk-proj-[REDACTED] | sk-proj-... keys |
| Generic sk- keys | sk-[REDACTED] | Any sk- prefixed key (20+ chars) |
| Bearer tokens | Bearer [REDACTED] | Bearer token in text |
| Telegram bot tokens | [REDACTED_BOT_TOKEN] | digits:alphanumeric format |
| AWS access keys | AKIA[REDACTED] | AKIA followed by 16 chars |
| AWS secret keys | $1[REDACTED_AWS_SECRET] | 40-char base64-like after common prefixes |
| Stripe keys | sk_[REDACTED] | sk_live_ or sk_test_ keys |
| Google API keys | AIza[REDACTED] | AIzaSy... keys |
| Slack app tokens | xapp-[REDACTED] | xapp- prefixed tokens |
| SendGrid keys | SG.[REDACTED] | SG. prefixed keys |
| JWTs | [REDACTED_JWT] | Three dot-separated base64url segments |
| DB connection strings | [REDACTED_CONN_STRING] | postgres://, mysql://, etc. |
| URL passwords | ://$1:[REDACTED]@ | Passwords in URL credentials |
| Discord bot tokens | [REDACTED_DISCORD_TOKEN] | Dot-separated token segments |
| Hex secrets | [REDACTED_HEX] | 40+ character hex strings |
| GitHub tokens | gh[REDACTED] | ghp_, gho_, etc. (36+ chars) |
Size limit: Inputs exceeding 1 MB are returned as-is to prevent ReDoS on
oversized strings.
Source: sanitizeLogString() and CREDENTIAL_PATTERNS in packages/core/src/security/log-sanitizer.ts
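The ordered-pattern approach and the size guard can be sketched with a handful of entries. The regexes below are illustrative approximations, not the real CREDENTIAL_PATTERNS, `sanitizeLine` is a hypothetical name, and measuring the 1 MB limit in string length is an assumption.

```typescript
const MAX_LENGTH = 1024 * 1024; // assumption: the 1 MB limit is measured in string length

// Ordered most-specific first, so sk-ant-... is not swallowed by the generic sk- rule.
const PATTERNS: Array<[RegExp, string]> = [
  [/sk-ant-[A-Za-z0-9_-]+/g, "sk-ant-[REDACTED]"],
  [/sk-proj-[A-Za-z0-9_-]+/g, "sk-proj-[REDACTED]"],
  [/sk-[A-Za-z0-9_-]{20,}/g, "sk-[REDACTED]"],
  [/Bearer\s+[A-Za-z0-9._~+\/=-]+/g, "Bearer [REDACTED]"],
  [/AKIA[A-Z0-9]{16}/g, "AKIA[REDACTED]"],
];

function sanitizeLine(line: string): string {
  if (line.length > MAX_LENGTH) return line; // oversized input is returned as-is
  return PATTERNS.reduce((out, [pattern, replacement]) => out.replace(pattern, replacement), line);
}
```

Running the generic `sk-` rule first would rewrite an Anthropic key as `sk-[REDACTED]` and lose the provider hint, which is why order matters.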
Pre-storage security scan for memory content. Prevents memory poisoning attacks
where adversaries store prompt injection payloads for later retrieval via RAG.
The validator uses the same detectSuspiciousPatterns() function from
external-content.ts for pattern detection. Critical patterns are a subset:
execution-oriented patterns that are dangerous when stored for later RAG
retrieval.
Source: validateMemoryWrite() in packages/core/src/security/memory-write-validator.ts
Secrets are encrypted using AES-256-GCM with HKDF-SHA256 key derivation. Each
encryption operation generates unique cryptographic material.

Algorithm chain:

1. Generate 32-byte random salt
2. Derive encryption key via HKDF-SHA256 from master key + salt
3. Generate 12-byte random IV (AES-GCM standard nonce)
4. Encrypt with AES-256-GCM
5. Output: ciphertext, IV, 16-byte authentication tag, salt
EncryptedSecret fields:

| Field | Type | Size | Description |
|---|---|---|---|
| ciphertext | Buffer | Variable | AES-256-GCM encrypted data |
| iv | Buffer | 12 bytes | Initialization vector (AES-GCM standard nonce) |
| authTag | Buffer | 16 bytes | GCM authentication tag |
| salt | Buffer | 32 bytes | Random salt for HKDF key derivation |
Master key requirements: At least 32 bytes, provided as hex (64+ chars) or
base64 (44+ chars). Only the first 32 bytes are used.
Source: createSecretsCrypto() and EncryptedSecret in packages/core/src/security/secret-crypto.ts
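The algorithm chain can be sketched with Node's built-in crypto module. This is an illustration, not createSecretsCrypto(): the function names are hypothetical, and the HKDF info label is an assumption because the text does not state one.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync, randomBytes } from "node:crypto";

interface EncryptedSecret {
  ciphertext: Buffer;
  iv: Buffer;
  authTag: Buffer;
  salt: Buffer;
}

// Assumption: this info label is hypothetical; the source does not state the real one.
const HKDF_INFO = "example-secret-encryption";

// Derives a 32-byte AES key from the first 32 bytes of the master key plus a per-secret salt.
function deriveKey(masterKey: Buffer, salt: Buffer): Buffer {
  return Buffer.from(hkdfSync("sha256", masterKey.subarray(0, 32), salt, HKDF_INFO, 32));
}

function encryptSecret(masterKey: Buffer, plaintext: string): EncryptedSecret {
  const salt = randomBytes(32); // fresh salt per operation
  const iv = randomBytes(12);   // fresh 12-byte GCM nonce per operation
  const cipher = createCipheriv("aes-256-gcm", deriveKey(masterKey, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { ciphertext, iv, authTag: cipher.getAuthTag(), salt };
}

function decryptSecret(masterKey: Buffer, secret: EncryptedSecret): string {
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(masterKey, secret.salt), secret.iv);
  decipher.setAuthTag(secret.authTag); // final() throws if the tag does not verify
  return Buffer.concat([decipher.update(secret.ciphertext), decipher.final()]).toString("utf8");
}
```

Since both salt and IV are regenerated on every call, encrypting the same plaintext twice produces different keys and different ciphertexts.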
Multi-source resolver for SecretRef values in YAML config. It supports three
providers; for the env case, reads are routed through SecretManager so every
credential read is auditable.
Maps bearer tokens to client identities and scopes. Uses
crypto.timingSafeEqual() for constant-time comparison and compares
every entry in the token store even after a match, to prevent
timing-based enumeration of valid tokens.
| Component | Behavior |
|---|---|
| Comparison | crypto.timingSafeEqual(), constant-time |
| Iteration | All entries scanned even on hit |
| Wildcard scope | "*" grants all scopes |
| Scope check | checkScope(token, requested) returns boolean |
Source: createTokenStore(), checkScope(), extractBearerToken() in packages/gateway/src/auth/token-auth.ts
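The scan-everything lookup can be sketched as below. The names and the entry shape are hypothetical, and hashing both sides with SHA-256 before comparison is an assumption made here because crypto.timingSafeEqual() throws on inputs of different length; the real store may handle length differently.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

interface TokenEntry {
  token: string;
  clientId: string;
  scopes: string[];
}

// Assumption: hashing gives timingSafeEqual equal-length inputs for any token length.
const sha256 = (value: string): Buffer => createHash("sha256").update(value).digest();

function lookupToken(store: TokenEntry[], presented: string): TokenEntry | undefined {
  const presentedDigest = sha256(presented);
  let found: TokenEntry | undefined;
  for (const entry of store) {
    // No early exit: every entry is compared, even after a match.
    const isMatch = timingSafeEqual(sha256(entry.token), presentedDigest);
    if (isMatch && found === undefined) found = entry;
  }
  return found;
}

// "*" grants every scope; otherwise the requested scope must be listed.
function hasScope(entry: TokenEntry, requested: string): boolean {
  return entry.scopes.includes("*") || entry.scopes.includes(requested);
}
```

Scanning the full store on every request makes the work done independent of where, or whether, the presented token appears in it.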
Validates server certificate, key, and CA at startup; verifies clients
against the configured CA at connection time. All validation happens via
Node’s built-in X509Certificate class — no third-party crypto.
| Check | When |
|---|---|
| Server cert PEM format and expiry | Daemon startup (fail-fast) |
| Server key PEM format | Daemon startup |
| CA cert PEM format and expiry | Daemon startup |
| Client cert validation | Per connection (Hono TLS layer) |
| Client CN extraction | Per connection (extractClientCN) |
Source: validateCertificates(), extractClientCN() in packages/gateway/src/auth/mtls-verifier.ts
The credential store backend is selected globally via security.storage.
| Setting | Type | Default | Description |
|---|---|---|---|
| security.storage | "encrypted" \| "file" \| "env" | "encrypted" | Credential storage backend: encrypted (AES-256-GCM SQLite), file (plaintext JSON at 0600), or env (read-only, reads .env/process.env). The database path is fixed at <dataDir>/secrets.db and is not configurable. |
Source: SecurityConfigSchema in packages/core/src/config/schema-security.ts