How search works
Comis uses two search methods and combines their results:
- Text matching: Comis looks for memories that contain the same words as the search query. This is fast and precise for exact topics. If you ask about “project deadlines,” text matching finds memories that literally contain the words “project” and “deadlines.”
- Meaning matching: Comis compares the meaning of your query against stored memories, finding related information even if the exact words are different. If you ask about “project deadlines,” meaning matching might also find a memory about “task due dates” because the meaning is similar. This uses embeddings to understand meaning.
- Combined results: The lanes are merged into a single ranked list, duplicates are removed, and the most relevant results are ranked to the top.
This combination means your agent benefits from both precise word matching and flexible meaning matching. When the optional entity lane is enabled, its associative hits join the same merge as a third lane.
When search is used
Memory search happens automatically in two situations:
- During memory recall (RAG): Before every agent response, Comis searches for relevant memories and includes them in the agent’s context. See Memory Recall (RAG) for how this works.
- Through memory tools: Your agent can explicitly search its memory as part of its reasoning process. For example, if you ask “what did we decide about the logo?” the agent might search its memory for previous discussions about logos.
Search quality
The quality of search results depends on which search methods are available:
- Both methods active (recommended): The best results come from combining text matching and meaning matching. This is the default when embeddings are configured.
- Text matching only: If embeddings are not available, search still works but only finds memories with matching words. A search for “pasta recipe” would not find a memory about “how to cook spaghetti” because the words are different even though the meaning is the same.
The auto provider handles this automatically.
If the meaning-matching system is not available (embeddings not configured),
Comis falls back to text matching only. Search still works — it is just less
flexible at finding related information when different words are used.
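The difference between the two methods can be sketched with a toy example. This is an illustration only, not Comis code: the function names are hypothetical, the word splitting is far simpler than a real full-text index, and the three-dimensional vectors are made-up stand-ins for real embeddings.

```typescript
// Text matching (simplified): count query words that literally appear in the memory.
function sharedWords(query: string, memory: string): number {
  const memoryWords = new Set(memory.toLowerCase().split(/\s+/));
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => memoryWords.has(word)).length;
}

// Meaning matching (simplified): cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// The two phrases share no words, so text matching alone finds nothing.
const overlap = sharedWords("pasta recipe", "how to cook spaghetti");

// Hypothetical embeddings: vectors pointing in similar directions score close to 1.
const queryVector = [0.9, 0.1, 0.3];
const memoryVector = [0.8, 0.2, 0.4];
const similarity = cosineSimilarity(queryVector, memoryVector);
```

Here `overlap` is zero while `similarity` is high, which is the situation where only meaning matching would surface the memory.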
Relevance scoring
Each search result gets a relevance score between 0 and 1. Higher scores mean a closer match to your query. The RAG configuration lets you set a minimum score threshold (minScore, default 0.1) so only sufficiently
relevant memories are included.
You generally do not need to adjust the scoring — the defaults work well for
most use cases. If your agent is recalling too many irrelevant memories,
increase minScore. If it is missing important context, lower it.
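As a rough sketch of what a minimum score threshold does, the snippet below drops results under the threshold and keeps the rest in descending score order. The `ScoredMemory` shape and the `applyMinScore` helper are hypothetical, and treating the threshold as inclusive is an assumption; only the `minScore` name and its 0.1 default come from the text above.

```typescript
// Hypothetical result shape for illustration.
interface ScoredMemory {
  id: string;
  score: number; // relevance between 0 and 1
}

// Keep results whose score is at least minScore (inclusive comparison is an assumption),
// highest score first.
function applyMinScore(results: ScoredMemory[], minScore = 0.1): ScoredMemory[] {
  return results
    .filter((result) => result.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

const candidates: ScoredMemory[] = [
  { id: "a", score: 0.05 },
  { id: "b", score: 0.4 },
  { id: "c", score: 0.1 },
];

const withDefault = applyMinScore(candidates);      // drops "a"
const stricter = applyMinScore(candidates, 0.3);    // keeps only "b"
```

Raising the threshold trades recall for precision: fewer memories reach the agent, but those that do are closer matches.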
Reference: how the lanes are merged
The hybrid search lives in packages/memory/src/hybrid-search.ts. Several
retrieval lanes combine into the final ranking:
- FTS5 BM25: The full-text index (memory_fts) returns a BM25 rank per match (lower is better). Stop words are filtered for Latin scripts; Unicode-aware tokenisation handles CJK, Cyrillic, and Arabic without stripping content.
- sqlite-vec cosine KNN: When embeddings exist for stored memories, a vector KNN query against vec_memories returns a similarity-ranked list.
- Entity lane (opt-in): When rag.entityLane.enabled is on, a one-hop self-join over shared linked entities contributes a third ranked list. See Memory.
- N-lane Reciprocal Rank Fusion (RRF): All active lanes are merged using the classic RRF formula score = 1 / (k + rank) with k = 60. RRF is robust against the lanes’ score scales being incompatible, and adding or dropping a lane (e.g. the entity lane, or vector when no embeddings exist) just changes how many lists feed the same formula.
If sqlite-vec is not available or the query has no embedding, search falls
back gracefully to FTS5 only.
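The fusion step can be sketched in a few lines. This is a generic Reciprocal Rank Fusion illustration, not the code in hybrid-search.ts: the function name and input shape are hypothetical, each lane is assumed to be a list of unique memory ids ordered best first, and a memory's fused score is the sum of 1 / (k + rank) over every lane that returned it.

```typescript
// Generic RRF sketch: each lane is a list of memory ids, best match first.
function reciprocalRankFusion(
  lanes: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const lane of lanes) {
    lane.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // One entry per memory id (duplicates across lanes collapse), highest score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Hypothetical lane outputs.
const ftsLane = ["m1", "m2", "m3"];
const vecLane = ["m2", "m4", "m1"];

const fused = reciprocalRankFusion([ftsLane, vecLane]);
// Text-only fallback: the same function with a single lane.
const ftsOnly = reciprocalRankFusion([ftsLane]);
```

Because only ranks enter the formula, BM25 ranks and cosine similarities never need to be normalised against each other, and a memory found by several lanes naturally outranks one found by a single lane at a similar position.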
Cross-encoder reranking (opt-in)
After fusion, recall can optionally rerank the top fused candidates with a cross-encoder before scoring. This stage is opt-in (rag.rerank.enabled
defaults to false), so by default the fused RRF order is the recall order.
When rag.rerank.enabled: true:
- The top fused candidates (cap rag.rerank.maxCandidates, default 40) are re-scored by an on-device cross-encoder (bge-reranker-v2-m3 Q8_0, run through node-llama-cpp, the same runtime used for local embeddings).
- A cross-encoder reads the query and a candidate together and emits a direct relevance score, so after reranking the cross-encoder score becomes the primary order and the fused RRF order is discarded.
- The stage is bounded by rag.rerank.timeoutMs (default 800); if it times out or the reranker is unavailable, recall falls back to the fused order.
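The bounded rerank-with-fallback behaviour might look roughly like the sketch below. Everything here is hypothetical: the `Reranker` type stands in for the real cross-encoder call, the helper names are invented, the timeout is assumed to be in milliseconds, and appending candidates beyond the cap in their fused order is an assumption the text above does not confirm.

```typescript
type Candidate = { id: string; text: string };
// Stand-in for the cross-encoder: returns one relevance score per candidate.
type Reranker = (query: string, candidates: Candidate[]) => Promise<number[]>;

// Reject if the work does not settle within timeoutMs; always clear the timer.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("rerank timed out")), timeoutMs);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

async function rerankWithFallback(
  query: string,
  fused: Candidate[],
  rerank: Reranker,
  maxCandidates = 40,
  timeoutMs = 800,
): Promise<Candidate[]> {
  const head = fused.slice(0, maxCandidates);
  const tail = fused.slice(maxCandidates); // assumption: kept after the reranked head
  try {
    const scores = await withTimeout(rerank(query, head), timeoutMs);
    if (scores.length !== head.length) return fused; // malformed output: keep fused order
    const reordered = head
      .map((candidate, i) => ({ candidate, score: scores[i] }))
      .sort((a, b) => b.score - a.score)
      .map((entry) => entry.candidate);
    return [...reordered, ...tail];
  } catch {
    // Timeout or reranker failure: fall back to the fused order.
    return fused;
  }
}
```

The fallback path returns the input untouched, so enabling the stage can change the order of results but not make recall fail.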
Filters supported
Search results can be filtered by:
- tenantId: multi-tenant isolation (required, set by the caller).
- agentId: restrict to memories owned by one agent.
- trustLevel: system, learned, or external (see Memory).
- memoryType: working, episodic, semantic, or procedural.
- expiresAt: already-expired entries are silently excluded.
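To make the filter semantics concrete, here is an in-memory sketch of the same checks. It is an illustration under assumptions: the record and filter shapes are hypothetical, each optional filter is assumed to take a single value, `expiresAt` is assumed to be an epoch timestamp in milliseconds, and a record without `expiresAt` is assumed never to expire. The real filtering may well happen inside the database query instead.

```typescript
type TrustLevel = "system" | "learned" | "external";
type MemoryType = "working" | "episodic" | "semantic" | "procedural";

// Hypothetical record shape for illustration.
interface MemoryRecord {
  tenantId: string;
  agentId: string;
  trustLevel: TrustLevel;
  memoryType: MemoryType;
  expiresAt?: number; // epoch ms; absent means no expiry (assumption)
}

// Hypothetical filter shape: tenantId is required, the rest are optional.
interface SearchFilters {
  tenantId: string;
  agentId?: string;
  trustLevel?: TrustLevel;
  memoryType?: MemoryType;
}

function matchesFilters(
  memory: MemoryRecord,
  filters: SearchFilters,
  now: number = Date.now(),
): boolean {
  if (memory.tenantId !== filters.tenantId) return false;
  if (filters.agentId !== undefined && memory.agentId !== filters.agentId) return false;
  if (filters.trustLevel !== undefined && memory.trustLevel !== filters.trustLevel) return false;
  if (filters.memoryType !== undefined && memory.memoryType !== filters.memoryType) return false;
  // Expired entries are excluded regardless of the other filters.
  if (memory.expiresAt !== undefined && memory.expiresAt <= now) return false;
  return true;
}
```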
Calling search from outside the agent
The web dashboard and external integrations call search through the JSON-RPC endpoint. The handler is registered as memory.search.
Each result identifies the lane that matched it (fts, vec, or both) and includes the underlying memory record. The memory.browse
RPC offers a non-search paginated view of memories when you just want to list
them by date.
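For orientation, a request to the handler could be assembled as in the sketch below. Only the method name memory.search comes from this page; the envelope follows the JSON-RPC 2.0 convention, while the parameter names (`query`, `limit`) and the helper function are hypothetical placeholders. Check the actual RPC schema and transport before relying on any of them.

```typescript
// Standard JSON-RPC 2.0 request envelope.
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: P;
}

// Hypothetical parameter names, shown for illustration only.
interface MemorySearchParams {
  query: string;
  limit?: number;
}

function buildMemorySearchRequest(
  id: number,
  params: MemorySearchParams,
): JsonRpcRequest<MemorySearchParams> {
  return { jsonrpc: "2.0", id, method: "memory.search", params };
}

const request = buildMemorySearchRequest(1, { query: "logo decision", limit: 5 });
// Serialized body, ready to send over whichever transport the endpoint uses.
const body = JSON.stringify(request);
```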
Configuration
Search does not have its own configuration section. Search behavior is controlled through:
- Memory: Where memories are stored and how they are organized.
- RAG: How many results to include, minimum relevance score, and trust level filtering.
- Embeddings: Whether meaning matching is available and which provider powers it.
Memory
The different types of memories your agent stores.
Embeddings
Setting up meaning-based search for better results.
Memory Recall (RAG)
How recalled memories are used in agent responses.
