How memory recall works
Every time you send a message, your agent runs a single recall orchestrator, MemoryRecall, that searches, fuses, optionally reranks, scores, filters, and
de-duplicates its long-term memory before answering:
- Search: Comis runs every retrieval lane in parallel, namely keyword (FTS5), meaning (vector, when embeddings exist), and the optional entity lane.
- Fuse: the lanes are merged with Reciprocal Rank Fusion (RRF, k=60). Fusion order is the default recall ordering.
- Rerank (opt-in, default skipped): when enabled, a cross-encoder re-scores the top fused candidates for sharper relevance.
- Score: recency, temporal, proof, and trust boosts are applied.
- Trust-filter: only the eligible trust levels survive (external is excluded by default).
- Dedup: near-duplicates are collapsed and the result is trimmed to maxResults/maxContextChars, then injected into the agent’s context.
What gets recalled
Memories are ranked by relevance. Only the top results (up to maxResults) that
meet a minimum relevance score (minScore) are included in the agent’s context.
This keeps the agent focused on the most useful information rather than flooding
it with everything it has ever seen.
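The selection rule above can be sketched as a filter followed by a cut-off. This is a minimal illustration, not the Comis implementation: the ScoredMemory type and the selectTopMemories function are hypothetical names, and treating minScore as an inclusive threshold is an assumption. The default values (5 and 0.1) come from the configuration table on this page.

```typescript
// Hypothetical shape of a scored recall hit; not the actual Comis type.
interface ScoredMemory {
  id: string;
  content: string;
  score: number; // relevance score, assumed to lie in the 0-1 range
}

// Keep hits at or above minScore (inclusive threshold is an assumption),
// order them best first, and cap the list at maxResults.
function selectTopMemories(
  hits: ScoredMemory[],
  maxResults = 5,
  minScore = 0.1,
): ScoredMemory[] {
  return hits
    .filter((hit) => hit.score >= minScore) // filter() copies, so the input is not mutated
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

With maxResults set to 1, a list scored 0.05, 0.9, and 0.4 would yield only the 0.9 hit: the 0.05 hit fails minScore and the 0.4 hit falls outside the cap.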
The total amount of memory context is also capped by maxContextChars (default
4000 characters). This prevents memories from taking up too much of the agent’s
thinking space, leaving room for your current conversation and the agent’s
instructions.
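A rough sketch of how such a character budget might be enforced is shown here. The function name applyContextBudget is hypothetical, measuring size with string length is an assumption, and so is the choice to keep scanning after an oversized entry. The pipeline description on this page states that a hit which would push the section over budget is dropped rather than truncated, and the sketch follows that.

```typescript
// Append already-formatted memory entries until maxResults or
// maxContextChars is reached. Hypothetical helper, not the Comis code.
function applyContextBudget(
  entries: string[],
  maxResults = 5,
  maxContextChars = 4000,
): string[] {
  const included: string[] = [];
  let usedChars = 0;
  for (const entry of entries) {
    if (included.length >= maxResults) break;
    // An entry that would overflow the budget is dropped whole, never truncated.
    // Continuing to later, smaller entries is an assumption of this sketch.
    if (usedChars + entry.length > maxContextChars) continue;
    included.push(entry);
    usedChars += entry.length;
  }
  return included;
}
```

Dropping instead of truncating means the agent never sees half a memory, at the cost of occasionally leaving part of the budget unused.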
Each memory also has a trust level. By default, only system memories
(created by Comis itself) and learned memories (from your conversations) are
included. Memories tagged as external (from outside sources like web
searches) are excluded by default for safety. You can change this by adjusting
includeTrustLevels.
If a recalled memory comes from an external source, it is marked with a
warning label so the agent knows to treat it with appropriate caution.
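The trust rules above could look roughly like this in code. The Memory type, both function names, and the exact wording of the warning label are illustrative assumptions; only the three trust levels and the default of system plus learned come from this page.

```typescript
type TrustLevel = "system" | "learned" | "external";

// Hypothetical memory shape for illustration only.
interface Memory {
  content: string;
  trust: TrustLevel;
}

// Keep only memories whose trust level is in includeTrustLevels.
function filterByTrust(
  memories: Memory[],
  includeTrustLevels: TrustLevel[] = ["system", "learned"],
): Memory[] {
  const allowed = new Set<TrustLevel>(includeTrustLevels);
  return memories.filter((memory) => allowed.has(memory.trust));
}

// Prefix external memories with a warning label.
// The label text is made up; the real marker format is not documented here.
function formatForContext(memory: Memory): string {
  return memory.trust === "external"
    ? `[external, treat with caution] ${memory.content}`
    : memory.content;
}
```

Passing ["system", "learned", "external"] as the second argument opts external memories back in, and they then carry the label when formatted.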
When is memory recall used?
Memory recall runs automatically before every agent response. You do not need to do anything to activate it: as long as rag.enabled is true (the
default), your agent will check its memory every time.
You can also configure your agent to use memory tools directly. These tools
let the agent search, store, and manage memories as part of its reasoning
process. See Memory for details on the types of memories
your agent can store and retrieve.
How retrieval works under the hood
For developers and operators who want to know what is actually happening, the MemoryRecall orchestrator runs these stages:
- Query formulation: The current user message and recent conversation context are used as the search query.
- Search (N lanes): The query fans out to every retrieval lane on the memory store: SQLite FTS5 (BM25 text ranking), vector similarity (cosine distance over embeddings) when an embedding provider is configured, and the optional entity lane when rag.entityLane.enabled is on.
- Fuse (N-lane RRF): The lanes are merged with Reciprocal Rank Fusion (score = 1 / (k + rank), k=60), which is robust against the lanes’ incompatible score scales. Fusion order is the default recall order.
- Rerank (opt-in, default skipped): When rag.rerank.enabled is true, the top fused candidates (cap maxCandidates, default 40) are re-scored by an on-device cross-encoder under an 800ms timeout; on timeout or unavailability, recall falls back to the fused order. The default is false, so this stage is normally skipped.
- Score: Multiplicative boosts are applied to the reranked-or-fused score: recencyAlpha, temporalAlpha, proofAlpha, and trustAlpha (the rag.scoring.* knobs). Trust here is a ranking signal with the tie-break system > learned > external, not only a filter.
- Trust-filter: Hits are filtered by includeTrustLevels. By default, only system and learned memories pass; external memories (web pages, third-party API responses) are excluded unless you opt in.
- Dedup + budget: Hits are deduplicated by content fingerprint (first 200 characters), then appended to a memory section until maxResults or maxContextChars is reached. If a single hit would push the section over budget, it is dropped (not truncated).
- Provenance and sanitization: Each included memory is formatted with its date, trust level, and source. External content is wrapped in safety markers so the model treats it as untrusted input.
- Injection: The formatted memory section is appended to the system prompt as the dynamic preamble (so it never invalidates the cached prefix; see Compaction).
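To make the Fuse stage concrete, here is a minimal sketch of Reciprocal Rank Fusion using the formula and k=60 quoted above. The function name and the representation of a lane as an ordered list of memory ids are assumptions for illustration, as is the use of 1-based ranks; the real logic lives in the Comis source files.

```typescript
// Each lane is a list of memory ids ordered best first.
// A memory's fused score is the sum of 1 / (k + rank) over the lanes that returned it.
function reciprocalRankFusion(
  lanes: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const lane of lanes) {
    lane.forEach((id, index) => {
      const rank = index + 1; // 1-based rank (assumption of this sketch)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; this is the fusion order.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

Because only ranks enter the formula, a BM25 score and a cosine distance never need to be put on a common scale. For example, with lanes ["a", "b"] and ["b", "c"], b ranks first because it collects a contribution from both lanes.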
The MemoryRecall orchestrator lives in packages/agent/src/rag/memory-recall.ts, and the
underlying lane search engine lives in packages/memory/src/hybrid-search.ts.
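The fingerprint-based dedup in the Dedup + budget stage can be approximated as follows. This is a sketch under assumptions: it uses the raw first 200 characters as the key with no hashing or normalization, and the function name is hypothetical. The actual fingerprint may be computed differently.

```typescript
// Collapse hits whose content shares the same leading characters.
// Input is assumed to be ordered best first, so the first copy seen is kept.
function dedupeByFingerprint<T extends { content: string }>(
  hits: T[],
  prefixLength = 200,
): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const hit of hits) {
    const fingerprint = hit.content.slice(0, prefixLength);
    if (seen.has(fingerprint)) continue; // near-duplicate of an earlier hit
    seen.add(fingerprint);
    unique.push(hit);
  }
  return unique;
}
```

A prefix fingerprint is cheap and catches the common case of the same fact stored twice with a different tail, though it would miss duplicates that differ in their opening words.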
What gets indexed
RAG only retrieves what has been written to memory. Three things write to memory automatically:
- System facts: Hardcoded knowledge configured by you or platform admins.
- Learned facts: Extracted from past conversations by the background memory review job (packages/agent/src/memory/memory-review-job.ts).
- Tool-stored facts: Anything your agent saves via memory_store during a conversation.
Outside content (web pages, third-party API responses) is saved as external trust-level entries only if
your agent explicitly stores it.
Configuration
| Option | Type | Default | What it does |
|---|---|---|---|
| rag.enabled | boolean | true | Enable automatic memory recall before each response |
| rag.maxResults | number | 5 | Maximum number of memories to include |
| rag.maxContextChars | number | 4000 | Maximum characters of memory context to add |
| rag.minScore | number | 0.1 | Minimum relevance score (0-1) for a memory to be included |
| rag.includeTrustLevels | array | ["system", "learned"] | Which trust levels to include in recall |
| rag.rerank.enabled | boolean | false | Opt-in cross-encoder reranking (see Memory) |
| rag.rerank.maxCandidates | number | 40 | Max candidates re-scored when rerank is enabled |
| rag.rerank.timeoutMs | number | 800 | Rerank wall-clock budget; on timeout falls back to fusion order |
| rag.scoring.recencyAlpha | number | 0.2 | Recency (record-time) boost weight |
| rag.scoring.temporalAlpha | number | 0.2 | Event-time proximity boost weight |
| rag.scoring.proofAlpha | number | 0.1 | Proof-count (consolidation) boost weight |
| rag.scoring.trustAlpha | number | 0.1 | Trust-level boost weight, plus the system > learned > external tie-break |
| rag.entityLane.enabled | boolean | false | Opt-in one-hop entity associative lane (see Memory) |
| rag.entityLane.seedCount | number | 5 | Top hits that seed the entity self-join |
| rag.entityLane.perEntityCap | number | 200 | Max shared-entity neighbours the lane returns |
These options are set in ~/.comis/config.yaml.
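For readers who prefer to see the option tree as a type, the defaults in the table can be mirrored in TypeScript as below. The RagConfig interface, the DEFAULT_RAG_CONFIG constant, and the withOverrides helper are hypothetical names written for this page; they are not the actual Comis schema. The key nesting simply follows the dotted option names, and the values are the documented defaults.

```typescript
// Hypothetical typed mirror of the rag.* options; not the real Comis schema.
interface RagConfig {
  enabled: boolean;
  maxResults: number;
  maxContextChars: number;
  minScore: number;
  includeTrustLevels: Array<"system" | "learned" | "external">;
  rerank: { enabled: boolean; maxCandidates: number; timeoutMs: number };
  scoring: {
    recencyAlpha: number;
    temporalAlpha: number;
    proofAlpha: number;
    trustAlpha: number;
  };
  entityLane: { enabled: boolean; seedCount: number; perEntityCap: number };
}

// Values copied from the Default column of the table.
const DEFAULT_RAG_CONFIG: RagConfig = {
  enabled: true,
  maxResults: 5,
  maxContextChars: 4000,
  minScore: 0.1,
  includeTrustLevels: ["system", "learned"],
  rerank: { enabled: false, maxCandidates: 40, timeoutMs: 800 },
  scoring: { recencyAlpha: 0.2, temporalAlpha: 0.2, proofAlpha: 0.1, trustAlpha: 0.1 },
  entityLane: { enabled: false, seedCount: 5, perEntityCap: 200 },
};

// Shallow merge: a nested group such as rerank is replaced wholesale if overridden.
function withOverrides(overrides: Partial<RagConfig>): RagConfig {
  return { ...DEFAULT_RAG_CONFIG, ...overrides };
}
```

For example, withOverrides({ maxResults: 10 }) raises the result cap while leaving every other default in place.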
- Memory: The different types of memories your agent stores.
- Search: How Comis finds relevant memories using text and meaning matching.
- Embeddings: Setting up meaning-based search for better recall.
