web_search queries search engines; web_fetch reads page content. Both tools include security protections against accessing internal networks.
Who it’s for: Pretty much every agent. Web tools are the most commonly used agent capabilities — they let your agent look up current information, read documentation, check prices, verify facts, and stay informed about topics that go beyond its training data.
Both tools are enabled by default in the full tool policy profile and ship in the cron-minimal profile so scheduled jobs can call web_search out of the box.
web_search — Multi-Provider Web Search
The web_search tool queries a search engine and returns results. Comis supports 8 search providers; you can configure one or more as a fallback chain.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | The search query |
| count | number | No | Number of results to return (1-10) |
| deepFetch | number | No | Auto-fetch full content for the top N result pages (0-5, default 0). When > 0, each result includes inline fullContent so you can skip a separate web_fetch call. |
| country | string | No | 2-letter country code for region-specific results (e.g., DE, US). Default: US. |
| search_lang | string | No | ISO language code for results (e.g., de, en, fr) |
| freshness | string | No | Filter results by discovery time. Shortcuts: pd (past day), pw (past week), pm (past month), py (past year). The filter is supported by Brave, DuckDuckGo, Tavily, Exa, and SearXNG; it is ignored by grok, perplexity, and jina. A date range in the form YYYY-MM-DDtoYYYY-MM-DD is supported by Brave, Tavily, and Exa. |
| provider | string | No | Override search provider for this call (brave, perplexity, grok, duckduckgo, searxng, tavily, exa, jina). Defaults to the configured provider with fallback chain. |
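The ranges in the parameter table can be checked before a call is issued. The following is a minimal Python sketch, not part of Comis: the helper name build_web_search_args is hypothetical, and clamping out-of-range numbers (rather than rejecting them) is an assumption. Only the parameter names, ranges, defaults, and accepted values come from the table above.

```python
import re

# Values taken from the parameter table.
FRESHNESS_SHORTCUTS = {"pd", "pw", "pm", "py"}
DATE_RANGE = re.compile(r"^\d{4}-\d{2}-\d{2}to\d{4}-\d{2}-\d{2}$")
PROVIDERS = {"brave", "perplexity", "grok", "duckduckgo",
             "searxng", "tavily", "exa", "jina"}


def build_web_search_args(query, count=None, deep_fetch=0, country="US",
                          freshness=None, provider=None):
    """Hypothetical helper: assemble a web_search argument dict."""
    if not query or not query.strip():
        raise ValueError("query is required")
    args = {"query": query.strip(), "country": country.upper()}
    if count is not None:
        # Assumption: clamp into the documented 1-10 range.
        args["count"] = max(1, min(10, int(count)))
    # Assumption: clamp into the documented 0-5 range.
    args["deepFetch"] = max(0, min(5, int(deep_fetch)))
    if freshness is not None:
        if freshness not in FRESHNESS_SHORTCUTS and not DATE_RANGE.match(freshness):
            raise ValueError(f"unsupported freshness value: {freshness!r}")
        args["freshness"] = freshness
    if provider is not None:
        if provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {provider!r}")
        args["provider"] = provider
    return args
```

For example, build_web_search_args("rust async runtime", count=25, freshness="pw") yields a dict whose count has been clamped to 10.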
Search Providers
Comis supports 8 search providers. Each has different strengths, and some require API keys while others are free.
| Provider | API Key Required | Description |
|---|---|---|
| brave | Yes | Brave Search API: fast, privacy-focused |
| perplexity | Yes | Perplexity AI via OpenRouter: AI-enhanced results |
| grok | Yes | xAI Responses API: Grok-powered search |
| duckduckgo | No | DuckDuckGo: no API key needed, uses HTML scraping |
| searxng | No | SearXNG: self-hosted meta-search engine |
| tavily | Yes | Tavily: AI-optimized search API |
| exa | Yes | Exa: neural search for precise results |
| jina | Yes | Jina Reader Search: content-aware search |
Provider Details
Brave Search
Fast, privacy-focused search powered by Brave’s independent search index.
API key: Required. Set BRAVE_API_KEY in your environment.
Best for: General-purpose searching with strong privacy. A good default choice for most setups.
How to get a key: Sign up at brave.com/search/api for a free tier with 2,000 queries per month.
Perplexity
AI-enhanced search results delivered through OpenRouter. Returns synthesized answers alongside traditional search results.
API key: Required. Set OPENROUTER_API_KEY in your environment.
Best for: Questions that benefit from AI synthesis, such as “explain the differences between X and Y” or “what are the pros and cons of Z.”
Grok
Powered by xAI’s Grok model via the Responses API. Provides real-time search with AI-generated summaries.
API key: Required. Set XAI_API_KEY in your environment.
Best for: Real-time information and current events. Grok’s training data is frequently updated.
DuckDuckGo
Free search with no API key required. Uses HTML scraping to retrieve results from DuckDuckGo.
API key: Not required.
Best for: A zero-configuration fallback. Works immediately without any setup. Results may be less structured than API-based providers.
DuckDuckGo uses HTML scraping rather than an official API, so result formatting may vary. It is a reliable fallback when API-based providers are unavailable.
SearXNG
A self-hosted meta-search engine that aggregates results from multiple search engines. You run your own SearXNG instance.
API key: Not required, but you need a running SearXNG instance. Set SEARXNG_URL to point to your instance (e.g., http://localhost:8888).
Best for: Privacy-conscious setups where you want full control over the search infrastructure. SearXNG queries multiple engines (Google, Bing, DuckDuckGo, etc.) on your behalf without sending your data to third parties.
Tavily
Purpose-built search API designed for AI agents. Returns clean, structured results optimized for LLM consumption.
API key: Required. Set TAVILY_API_KEY in your environment.
Best for: Agents that need well-structured search results. Tavily is specifically designed for AI-agent use cases and returns content pre-formatted for LLM processing.
Exa
Neural search engine that understands meaning, not just keywords. Finds results based on semantic similarity to your query.
API key: Required. Set EXA_API_KEY in your environment.
Best for: Research-style queries where you need precise, relevant results. Exa excels at finding specific technical content, academic papers, and niche topics.
Jina
Content-aware search and reading service. Retrieves search results with extracted, readable content.
API key: Required. Set JINA_API_KEY in your environment.
Best for: Queries where you need the full content of search results, not just titles and snippets. Jina extracts and returns the readable text from each result page.
Configuring Search Providers
Configure your preferred search providers in config.yaml. Providers are tried in order: if the first one fails, the next is attempted automatically.
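The try-in-order behavior can be illustrated with a short Python sketch. It is not the Comis implementation: search_with_fallback and the two stub providers are hypothetical names, and treating any raised exception as a provider failure is an assumption, since this page does not define what counts as a failure.

```python
from typing import Callable, Dict, List, Sequence, Tuple

SearchFn = Callable[[str], List[dict]]


def search_with_fallback(query: str, chain: Sequence[str],
                         providers: Dict[str, SearchFn]) -> Tuple[str, List[dict]]:
    """Try each provider in chain order; return (name, results) from the first success."""
    errors: Dict[str, str] = {}
    for name in chain:
        try:
            return name, providers[name](query)
        except Exception as exc:  # assumption: any exception counts as a failure
            errors[name] = repr(exc)
    raise RuntimeError(f"all search providers failed: {errors}")


# Stub providers standing in for real backends (hypothetical, for illustration).
def failing_brave(query: str) -> List[dict]:
    raise TimeoutError("brave timed out")


def stub_duckduckgo(query: str) -> List[dict]:
    return [{"title": f"Result for {query}", "url": "https://example.com"}]
```

With the chain ["brave", "duckduckgo"] and these stubs, the first provider raises and the call returns the DuckDuckGo stub's results. Putting a keyless provider such as duckduckgo last is a natural way to use the zero-configuration fallback described earlier.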
Environment Variables Reference
| Provider | Environment Variable |
|---|---|
| Brave | BRAVE_API_KEY |
| Perplexity | OPENROUTER_API_KEY |
| Grok | XAI_API_KEY |
| DuckDuckGo | (none required) |
| SearXNG | SEARXNG_URL |
| Tavily | TAVILY_API_KEY |
| Exa | EXA_API_KEY |
| Jina | JINA_API_KEY |
web_fetch — Page Content Fetching
The web_fetch tool retrieves content from a URL and extracts the readable text using Readability, the same technology behind browser reader modes. This strips away navigation, ads, and other clutter, leaving just the main content.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | HTTP or HTTPS URL to fetch |
| extractMode | string | No | Extraction mode: markdown (default) or text |
| maxChars | number | No | Maximum characters to return (default: 50000, clamped by configured cap) |
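The interaction between maxChars and the configured cap can be read as "use the requested value, or the default, but never more than the cap". A minimal Python sketch of that reading follows. The function names are hypothetical, the cap is passed in as a plain argument because this page does not name its config key, and cutting from the start of the text is an assumption about how truncation is applied.

```python
DEFAULT_MAX_CHARS = 50_000  # documented default for maxChars


def effective_max_chars(requested, configured_cap):
    """Return the limit actually applied: requested (or default), clamped to the cap."""
    limit = DEFAULT_MAX_CHARS if requested is None else int(requested)
    return max(0, min(limit, int(configured_cap)))


def truncate_content(text, requested=None, configured_cap=DEFAULT_MAX_CHARS):
    """Assumption: keep the leading characters up to the effective limit."""
    return text[: effective_max_chars(requested, configured_cap)]
```

Under this reading, a call with maxChars of 80000 against a cap of 20000 returns at most 20000 characters.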
How It Works
- The URL is validated against SSRF (Server-Side Request Forgery) rules
- The page content is downloaded
- Readability extracts the main article or content area
- The clean text is returned to the agent
Important Notes
- SSRF protection: web_fetch cannot access internal network addresses (localhost, private IPs like 10.x.x.x or 192.168.x.x, or cloud metadata endpoints like 169.254.169.254). This prevents attacks where a crafted URL tricks the tool into reaching your internal services.
- Content truncation: Very large pages are truncated to prevent overwhelming the agent’s context window. The tool returns the most important content within the size limit.
- Clean output: The Readability extraction removes navigation menus, sidebars, ads, and other non-content elements. The result is clean, readable text similar to what you see in a browser’s “Reader Mode.”
web_search vs. web_fetch
These two tools complement each other:
| Use Case | Tool | Why |
|---|---|---|
| Find information about a topic | web_search | Returns multiple search results with titles and snippets |
| Read a specific web page | web_fetch | Downloads and extracts the full content of a single URL |
| Research a question | Both | Search first to find relevant pages, then fetch the most promising results for detailed reading |
Use web_search to find relevant URLs, then web_fetch to read the full content of the best results.
For pages that require JavaScript rendering or interactive navigation (single-page apps, sites behind login screens), use the Browser tool instead — it provides a full headless browser.
End-to-end example: fact-check workflow
A common pattern: a user makes a claim and asks the agent to verify it. The agent searches for primary sources, then fetches the most authoritative result for detailed reading.
web_fetch returns the cleaned Readability extract. The agent quotes the relevant article and entry-into-force clause, links the source, and presents the verified answer to the user.
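The search-then-fetch pattern described above can be sketched in Python with the two tools passed in as plain callables. Everything about the call shape here is an assumption for illustration: gather_sources is a hypothetical helper, results are assumed to be a list of dicts with url and title keys, and web_fetch is assumed to return the extracted text as a string. The sketch also simply takes the top-ranked results, whereas a real agent would judge which source is most authoritative.

```python
def gather_sources(claim, web_search, web_fetch, max_sources=1):
    """Search for a claim, then fetch the top results for detailed reading."""
    results = web_search({"query": claim, "count": 5})
    sources = []
    for result in results[:max_sources]:
        # Parameter names (url, extractMode) come from the web_fetch table.
        content = web_fetch({"url": result["url"], "extractMode": "markdown"})
        sources.append({
            "url": result["url"],
            "title": result.get("title", ""),
            "content": content,
        })
    return sources


# Stand-in tools so the sketch runs without network access (hypothetical).
def fake_search(args):
    return [{"title": "Primary source", "url": "https://example.org/act"},
            {"title": "Commentary", "url": "https://example.org/blog"}]


def fake_fetch(args):
    return f"# Extract of {args['url']}"
```

Calling gather_sources(claim, fake_search, fake_fetch) returns one source dict whose content is the stubbed extract of the first result.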
For sources that need fresh data (rate updates, market prices, breaking news) but are not yet indexed by search engines, set deepFetch: 3 on the web_search call — the tool will inline the top three results’ full content, sparing a separate web_fetch.
SSRF protection
Both web_fetch and web_search validate URLs against the SSRF guard before sending any HTTP request. The guard rejects:
- localhost, 127.0.0.0/8, ::1
- RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
- Link-local (169.254.0.0/16) including AWS / GCP / Azure metadata endpoints
- IPv6 unique-local (fc00::/7) and link-local (fe80::/10)
navigate and open. Comis treats SSRF as a defense-in-depth concern: the network layer and the URL validator both enforce it, so a misconfiguration in one does not bypass the other. See Security for the full picture.
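The rejected ranges listed in this section can be checked with Python's standard ipaddress module. The sketch below is not the Comis guard: is_blocked_ip and is_url_allowed are hypothetical names, and unwrapping IPv4-mapped IPv6 addresses and resolving hostnames before the check are assumptions about what a guard of this kind would reasonably do.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Ranges copied from the list in this section.
BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "127.0.0.0/8", "::1/128",                          # loopback
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",   # RFC 1918
    "169.254.0.0/16",                                  # link-local, cloud metadata
    "fc00::/7", "fe80::/10",                           # IPv6 unique-local, link-local
)]


def is_blocked_ip(ip: str) -> bool:
    addr = ipaddress.ip_address(ip.split("%")[0])  # drop any IPv6 zone id
    # Assumption: treat IPv4-mapped IPv6 (::ffff:a.b.c.d) as the embedded IPv4.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return any(addr in net for net in BLOCKED_NETWORKS)


def is_url_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ipaddress.ip_address(host)
        candidates = [host]  # the URL already contains an IP literal
    except ValueError:
        try:
            # Assumption: resolve the name and check every returned address.
            candidates = [info[4][0] for info in resolve(host, None)]
        except OSError:
            return False
    return not any(is_blocked_ip(c) for c in candidates)
```

A URL validator like this covers only one layer. Checking at resolution time does not by itself stop a hostname from resolving differently when the connection is made, which is one reason enforcing the same rules at the network layer as well is useful.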
Related
Browser
Full browser automation for interactive web tasks
Built-in Tools
All built-in tools including web tools
Agent Tools Overview
See all available agent tools
Config Reference
Search provider and web tool configuration options
