What it does: Gives agents internet access. web_search queries search engines, and web_fetch reads page content. Both tools include security protections against accessing internal networks.

Who it’s for: Pretty much every agent. Web tools are the most commonly used agent capabilities: they let your agent look up current information, read documentation, check prices, verify facts, and stay informed about topics that go beyond its training data.

Both tools are enabled by default in the full tool policy profile and ship in the cron-minimal profile, so scheduled jobs can call web_search out of the box.

The web_search tool queries a search engine and returns results. Comis supports 8 search providers, and you can configure one or more as a fallback chain.

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | The search query |
| count | number | No | Number of results to return (1-10) |
| deepFetch | number | No | Auto-fetch full content for the top N result pages (0-5, default 0). When greater than 0, each of the top N results includes inline fullContent so you can skip a separate web_fetch call. |
| country | string | No | 2-letter country code for region-specific results (e.g., DE, US). Default: US. |
| search_lang | string | No | ISO language code for results (e.g., de, en, fr) |
| freshness | string | No | Filter results by discovery time. Shortcuts: pd (past day), pw (past week), pm (past month), py (past year). Date ranges in the form YYYY-MM-DDtoYYYY-MM-DD are supported by Brave, Tavily, and Exa. The parameter is honored by Brave, DuckDuckGo, Tavily, Exa, and SearXNG, and ignored by grok, perplexity, and jina. |
| provider | string | No | Override search provider for this call (brave, perplexity, grok, duckduckgo, searxng, tavily, exa, jina). Defaults to the configured provider with fallback chain. |
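
A sketch of how several of these parameters combine in one call, written in the same tool-call notation this page uses in its end-to-end example. The query text is purely illustrative; every parameter name and value format comes from the table above.

```yaml
# Illustrative call: German-language results for Germany, limited to a date range.
tool: web_search
query: "Energieeffizienzgesetz Änderungen"
count: 5
country: DE
search_lang: de
freshness: 2024-01-01to2024-06-30   # date ranges are supported by Brave, Tavily, and Exa
provider: brave                     # pinned to a provider that supports date ranges
```

Pinning the provider here avoids sending a date-range freshness value to a provider that ignores the parameter.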

Search Providers

Comis supports 8 search providers. Each has different strengths, and some require API keys while others are free.
| Provider | API Key Required | Description |
| --- | --- | --- |
| brave | Yes | Brave Search API: fast, privacy-focused |
| perplexity | Yes | Perplexity AI via OpenRouter: AI-enhanced results |
| grok | Yes | xAI Responses API: Grok-powered search |
| duckduckgo | No | DuckDuckGo: no API key needed, uses HTML scraping |
| searxng | No | SearXNG: self-hosted meta-search engine |
| tavily | Yes | Tavily: AI-optimized search API |
| exa | Yes | Exa: neural search for precise results |
| jina | Yes | Jina Reader Search: content-aware search |

Provider Details

Perplexity
AI-enhanced search results delivered through OpenRouter. Returns synthesized answers alongside traditional search results.
API key: Required. Set OPENROUTER_API_KEY in your environment.
Best for: Questions that benefit from AI synthesis, such as “explain the differences between X and Y” or “what are the pros and cons of Z.”

Grok
Powered by xAI’s Grok model via the Responses API. Provides real-time search with AI-generated summaries.
API key: Required. Set XAI_API_KEY in your environment.
Best for: Real-time information and current events. Grok’s training data is frequently updated.

DuckDuckGo
Free search with no API key required. Uses HTML scraping to retrieve results from DuckDuckGo.
API key: Not required.
Best for: A zero-configuration fallback. Works immediately without any setup. Results may be less structured than API-based providers.
Note: DuckDuckGo uses HTML scraping rather than an official API, so result formatting may vary. It is a reliable fallback when API-based providers are unavailable.

SearXNG
A self-hosted meta-search engine that aggregates results from multiple search engines. You run your own SearXNG instance.
API key: Not required, but you need a running SearXNG instance. Set SEARXNG_URL to point to your instance (e.g., http://localhost:8888).
Best for: Privacy-conscious setups where you want full control over the search infrastructure. SearXNG queries multiple engines (Google, Bing, DuckDuckGo, etc.) on your behalf without sending your data to third parties.

Tavily
Purpose-built search API designed for AI agents. Returns clean, structured results optimized for LLM consumption.
API key: Required. Set TAVILY_API_KEY in your environment.
Best for: Agents that need well-structured search results. Tavily is specifically designed for AI-agent use cases and returns content pre-formatted for LLM processing.

Exa
Neural search engine that understands meaning, not just keywords. Finds results based on semantic similarity to your query.
API key: Required. Set EXA_API_KEY in your environment.
Best for: Research-style queries where you need precise, relevant results. Exa excels at finding specific technical content, academic papers, and niche topics.

Jina
Content-aware search and reading service. Retrieves search results with extracted, readable content.
API key: Required. Set JINA_API_KEY in your environment.
Best for: Queries where you need the full content of search results, not just titles and snippets. Jina extracts and returns the readable text from each result page.

Configuring Search Providers

Configure your preferred search providers in config.yaml. Providers are tried in order — if the first one fails, the next is attempted automatically.
```yaml
skills:
  builtinTools:
    webSearch:
      providers:
        - brave
        - duckduckgo   # Free fallback if Brave fails
```
You can list as many providers as you like. The agent uses the first available provider by default, or you can specify a provider in the tool call.
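
As a sketch of the per-call override, the following call asks for a specific provider instead of relying on the configured order. The query is illustrative, and this page does not say whether the fallback chain still applies when a provider is named explicitly, so treat that as an open question.

```yaml
# Illustrative call: request Exa for this one query, whatever the configured order is.
tool: web_search
query: "transformer attention mechanism survey"
provider: exa
count: 5
```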

Environment Variables Reference

| Provider | Environment Variable |
| --- | --- |
| Brave | BRAVE_API_KEY |
| Perplexity | OPENROUTER_API_KEY |
| Grok | XAI_API_KEY |
| DuckDuckGo | (none required) |
| SearXNG | SEARXNG_URL |
| Tavily | TAVILY_API_KEY |
| Exa | EXA_API_KEY |
| Jina | JINA_API_KEY |

A good starter configuration is brave as the primary provider (fast and reliable) with duckduckgo as a free fallback that requires no API key.

web_fetch — Page Content Fetching

The web_fetch tool retrieves content from a URL and extracts the readable text using Readability — the same technology behind browser reader modes. This strips away navigation, ads, and other clutter, leaving just the main content.

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | HTTP or HTTPS URL to fetch |
| extractMode | string | No | Extraction mode: markdown (default) or text |
| maxChars | number | No | Maximum characters to return (default: 50000, clamped by configured cap) |
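
A minimal sketch of a call that uses the non-default extraction mode and a smaller character budget. The URL is a placeholder, not a real documentation page.

```yaml
# Illustrative call: plain-text extraction with a reduced size limit.
tool: web_fetch
url: "https://example.com/docs/getting-started"   # placeholder URL
extractMode: text    # plain text instead of the default markdown
maxChars: 8000       # below the 50000 default; still subject to the configured cap
```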

How It Works

  1. The URL is validated against SSRF (Server-Side Request Forgery) rules
  2. The page content is downloaded
  3. Readability extracts the main article or content area
  4. The clean text is returned to the agent

Important Notes

  • SSRF protection: web_fetch cannot access internal network addresses (localhost, private IPs like 10.x.x.x or 192.168.x.x, or cloud metadata endpoints like 169.254.169.254). This prevents attacks where a crafted URL tricks the tool into reaching your internal services.
  • Content truncation: Very large pages are truncated to prevent overwhelming the agent’s context window. The tool returns the most important content within the size limit (see maxChars).
  • Clean output: The Readability extraction removes navigation menus, sidebars, ads, and other non-content elements. The result is clean, readable text similar to what you see in a browser’s “Reader Mode.”

Warning: web_fetch retrieves content from the open internet. Be cautious about fetching URLs provided by untrusted users, as the content could contain misleading or harmful information. SSRF protection prevents access to internal resources, but the fetched content itself is not vetted for accuracy.

web_search vs. web_fetch

These two tools complement each other:
| Use Case | Tool | Why |
| --- | --- | --- |
| Find information about a topic | web_search | Returns multiple search results with titles and snippets |
| Read a specific web page | web_fetch | Downloads and extracts the full content of a single URL |
| Research a question | Both | Search first to find relevant pages, then fetch the most promising results for detailed reading |

Agents often use both tools together: web_search to find relevant URLs, then web_fetch to read the full content of the best results. For pages that require JavaScript rendering or interactive navigation (single-page apps, sites behind login screens), use the Browser tool instead — it provides a full headless browser.

End-to-end example: fact-check workflow

A common pattern: a user makes a claim and asks the agent to verify it. The agent searches for primary sources, then fetches the most authoritative result for detailed reading.
```
You: Verify the claim that the EU AI Act took effect on August 1, 2024.
     Quote the official source.
```
The agent runs a two-step workflow:
```yaml
# 1. Search for primary sources
tool: web_search
query: "EU AI Act effective date official Commission"
provider: brave
freshness: py
count: 5
```
The search returns results from europa.eu, EUR-Lex, and a few news outlets. The agent picks the EUR-Lex link because it is the canonical source.
```yaml
# 2. Fetch the full content of the official document
tool: web_fetch
url: "https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ:L_202401689"
extractMode: markdown
maxChars: 20000
```
web_fetch returns the cleaned Readability extract. The agent quotes the relevant article and entry-into-force clause, links the source, and presents the verified answer to the user.

For queries that need fresh data (rate updates, market prices, breaking news), where search snippets alone may be stale or too short, set deepFetch: 3 on the web_search call. The tool then inlines the full content of the top three results, sparing a separate web_fetch.
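
As a sketch of the deepFetch variant, the search and the page reads can be collapsed into a single call. The query is illustrative; the parameters are the ones documented for web_search.

```yaml
# Illustrative single-call variant: search, then inline the full content of the top 3 results.
tool: web_search
query: "EUR USD exchange rate today"
freshness: pd      # past day
count: 5
deepFetch: 3       # the top three results carry inline fullContent
```

This trades a larger search response for fewer tool calls, which tends to suit short, time-sensitive lookups better than long documents.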

SSRF protection

Both web_fetch and web_search validate URLs against the SSRF guard before sending any HTTP request. The guard rejects:
  • localhost, 127.0.0.0/8, ::1
  • RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
  • Link-local (169.254.0.0/16) including AWS / GCP / Azure metadata endpoints
  • IPv6 unique-local (fc00::/7) and link-local (fe80::/10)
The browser tool applies the same guard on navigate and open. Comis treats SSRF as a defense-in-depth concern: the network layer and the URL validator both enforce it, so a misconfiguration in one does not bypass the other. See Security for the full picture.
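
To make the ranges concrete, here are a few illustrative web_fetch targets that fall inside the blocked ranges listed above. The hosts and paths are made-up examples (apart from the well-known metadata address), and the exact error the tool returns is not documented on this page, so none is shown.

```yaml
# Illustrative only: each URL below resolves into a range the SSRF guard rejects.
- tool: web_fetch
  url: "http://127.0.0.1:8080/admin"                # loopback (127.0.0.0/8)
- tool: web_fetch
  url: "http://10.0.0.5/internal"                   # RFC 1918 private range (10.0.0.0/8)
- tool: web_fetch
  url: "http://169.254.169.254/latest/meta-data/"   # link-local cloud metadata endpoint
- tool: web_fetch
  url: "http://[fd00::1]/"                          # IPv6 unique-local (fc00::/7)
```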

Related pages

  • Browser: Full browser automation for interactive web tasks
  • Built-in Tools: All built-in tools including web tools
  • Agent Tools Overview: See all available agent tools
  • Config Reference: Search provider and web tool configuration options