Web Tools - Comis

What it does: Gives agents internet access. web_search queries search engines, and web_fetch reads page content. Both tools include security protections against accessing internal networks.

Who it’s for: Pretty much every agent. Web tools are the most commonly used agent capabilities: they let your agent look up current information, read documentation, check prices, verify facts, and stay informed about topics that go beyond its training data. Both tools are enabled by default in the full tool policy profile and ship in the cron-minimal profile so scheduled jobs can call web_search out of the box.

web_search — Multi-Provider Web Search

The web_search tool queries a search engine and returns results. Comis supports 8 search providers — you can configure one or more as a fallback chain.

Parameters

Parameter	Type	Required	Description
`query`	string	Yes	The search query
`count`	number	No	Number of results to return (1-10)
`deepFetch`	number	No	Auto-fetch full content for the top N result pages (0-5, default 0). When > 0, each result includes inline `fullContent` so you can skip a separate `web_fetch` call.
`country`	string	No	2-letter country code for region-specific results (e.g., `DE`, `US`). Default: `US`.
`search_lang`	string	No	ISO language code for results (e.g., `de`, `en`, `fr`)
`freshness`	string	No	Filter results by discovery time. Shortcuts: `pd` (past day), `pw` (past week), `pm` (past month), `py` (past year). Honored by Brave, DuckDuckGo, Tavily, Exa, and SearXNG; ignored by Grok, Perplexity, and Jina. The date range form `YYYY-MM-DDtoYYYY-MM-DD` is supported only by Brave, Tavily, and Exa.
`provider`	string	No	Override search provider for this call (`brave`, `perplexity`, `grok`, `duckduckgo`, `searxng`, `tavily`, `exa`, `jina`). Defaults to the configured provider with fallback chain.
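The constraints in the table above can be checked before a call is issued. The following is a minimal sketch, not Comis's actual validation: the helper name `validate_web_search_args` and the error messages are hypothetical, and only the ranges, shortcuts, and provider names come from the table.

```python
import re
from datetime import date

# Values taken from the parameter table; everything else here is illustrative.
FRESHNESS_SHORTCUTS = {"pd", "pw", "pm", "py"}
PROVIDERS = {"brave", "perplexity", "grok", "duckduckgo",
             "searxng", "tavily", "exa", "jina"}
DATE_RANGE = re.compile(r"^(\d{4}-\d{2}-\d{2})to(\d{4}-\d{2}-\d{2})$")


def _valid_freshness(value) -> bool:
    """Accept a shortcut or a well-formed, ordered YYYY-MM-DDtoYYYY-MM-DD range."""
    if not isinstance(value, str):
        return False
    if value in FRESHNESS_SHORTCUTS:
        return True
    match = DATE_RANGE.match(value)
    if not match:
        return False
    try:
        start, end = (date.fromisoformat(part) for part in match.groups())
    except ValueError:  # e.g. 2024-02-30 is not a real date
        return False
    return start <= end


def validate_web_search_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    query = args.get("query")
    if not isinstance(query, str) or not query.strip():
        problems.append("query is required and must be a non-empty string")
    count = args.get("count")
    if count is not None and not (isinstance(count, int) and 1 <= count <= 10):
        problems.append("count must be an integer from 1 to 10")
    deep = args.get("deepFetch", 0)
    if not (isinstance(deep, int) and 0 <= deep <= 5):
        problems.append("deepFetch must be an integer from 0 to 5")
    country = args.get("country", "US")
    if not (isinstance(country, str) and len(country) == 2 and country.isalpha()):
        problems.append("country must be a 2-letter code")
    freshness = args.get("freshness")
    if freshness is not None and not _valid_freshness(freshness):
        problems.append("freshness must be pd, pw, pm, py, or a date range")
    provider = args.get("provider")
    if provider is not None and provider not in PROVIDERS:
        problems.append("provider is not one of the supported providers")
    return problems
```

A check like this only catches malformed arguments. Whether a given provider honors `freshness` still depends on the provider, as the table notes.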

Search Providers

Each provider has different strengths. Six require an API key; DuckDuckGo and SearXNG do not, although SearXNG needs a self-hosted instance.

Provider	API Key Required	Description
`brave`	Yes	Brave Search API — fast, privacy-focused
`perplexity`	Yes	Perplexity AI via OpenRouter — AI-enhanced results
`grok`	Yes	xAI Responses API — Grok-powered search
`duckduckgo`	No	DuckDuckGo — no API key needed, uses HTML scraping
`searxng`	No	SearXNG — self-hosted meta-search engine
`tavily`	Yes	Tavily — AI-optimized search API
`exa`	Yes	Exa — neural search for precise results
`jina`	Yes	Jina Reader Search — content-aware search

Provider Details

Brave Search

Fast, privacy-focused search powered by Brave’s independent search index.

API key: Required. Set BRAVE_API_KEY in your environment.

Best for: General-purpose searching with strong privacy. A good default choice for most setups.

How to get a key: Sign up at brave.com/search/api for a free tier with 2,000 queries per month.

Perplexity

AI-enhanced search results delivered through OpenRouter. Returns synthesized answers alongside traditional search results.

API key: Required. Set OPENROUTER_API_KEY in your environment.

Best for: Questions that benefit from AI synthesis, such as “explain the differences between X and Y” or “what are the pros and cons of Z.”

Grok

Powered by xAI’s Grok model via the Responses API. Provides real-time search with AI-generated summaries.

API key: Required. Set XAI_API_KEY in your environment.

Best for: Real-time information and current events. Grok’s training data is frequently updated.

DuckDuckGo

Free search with no API key required. Uses HTML scraping to retrieve results from DuckDuckGo.

API key: Not required.

Best for: A zero-configuration fallback. Works immediately without any setup. Results may be less structured than API-based providers.

DuckDuckGo uses HTML scraping rather than an official API, so result formatting may vary. It is a reliable fallback when API-based providers are unavailable.

SearXNG

A self-hosted meta-search engine that aggregates results from multiple search engines. You run your own SearXNG instance.

API key: Not required, but you need a running SearXNG instance. Set SEARXNG_URL to point to your instance (e.g., http://localhost:8888).

Best for: Privacy-conscious setups where you want full control over the search infrastructure. SearXNG queries multiple engines (Google, Bing, DuckDuckGo, etc.) on your behalf without sending your data to third parties.

Tavily

Purpose-built search API designed for AI agents. Returns clean, structured results optimized for LLM consumption.

API key: Required. Set TAVILY_API_KEY in your environment.

Best for: Agents that need well-structured search results. Tavily is specifically designed for AI-agent use cases and returns content pre-formatted for LLM processing.

Exa

Neural search engine that understands meaning, not just keywords. Finds results based on semantic similarity to your query.

API key: Required. Set EXA_API_KEY in your environment.

Best for: Research-style queries where you need precise, relevant results. Exa excels at finding specific technical content, academic papers, and niche topics.

Jina

Content-aware search and reading service. Retrieves search results with extracted, readable content.

API key: Required. Set JINA_API_KEY in your environment.

Best for: Queries where you need the full content of search results, not just titles and snippets. Jina extracts and returns the readable text from each result page.

Configuring Search Providers

Configure your preferred search providers in config.yaml. Providers are tried in order — if the first one fails, the next is attempted automatically.

skills:
  builtinTools:
    webSearch:
      providers:
        - brave
        - duckduckgo   # Free fallback if Brave fails

You can list as many providers as you like. The agent uses the first available provider by default, or you can specify a provider in the tool call.
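The ordered fallback described above can be pictured as a simple loop. This is a sketch of the general pattern only, assuming each provider is exposed as a callable. The names `search_with_fallback` and `backends` are hypothetical and do not describe Comis internals.

```python
from typing import Callable

# A backend takes a query string and returns a list of result dicts (assumed shape).
SearchFn = Callable[[str], list[dict]]


def search_with_fallback(
    query: str,
    providers: list[str],
    backends: dict[str, SearchFn],
) -> tuple[str, list[dict]]:
    """Try providers in configured order and return the first successful result."""
    errors: dict[str, str] = {}
    for name in providers:
        backend = backends.get(name)
        if backend is None:
            errors[name] = "not configured"
            continue
        try:
            return name, backend(query)
        except Exception as exc:  # any failure moves on to the next provider
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")
```

With `providers` set to `["brave", "duckduckgo"]`, a Brave timeout would fall through to DuckDuckGo, and the caller learns which provider actually answered.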

Environment Variables Reference

Provider	Environment Variable
Brave	`BRAVE_API_KEY`
Perplexity	`OPENROUTER_API_KEY`
Grok	`XAI_API_KEY`
DuckDuckGo	(none required)
SearXNG	`SEARXNG_URL`
Tavily	`TAVILY_API_KEY`
Exa	`EXA_API_KEY`
Jina	`JINA_API_KEY`

A good starter configuration is brave as the primary provider (fast and reliable) with duckduckgo as a free fallback that requires no API key.
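Before committing to a provider chain, it can help to see which entries are usable with the current environment. The sketch below uses only the variable names from the table above; the helper `usable_providers` is hypothetical and is not part of Comis.

```python
import os
from typing import Mapping, Optional

# Provider -> required environment variable (None means nothing is required).
REQUIRED_ENV: dict[str, Optional[str]] = {
    "brave": "BRAVE_API_KEY",
    "perplexity": "OPENROUTER_API_KEY",
    "grok": "XAI_API_KEY",
    "duckduckgo": None,
    "searxng": "SEARXNG_URL",
    "tavily": "TAVILY_API_KEY",
    "exa": "EXA_API_KEY",
    "jina": "JINA_API_KEY",
}


def usable_providers(chain: list[str], env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the providers from `chain` whose required variable is set and non-empty."""
    env = os.environ if env is None else env
    usable = []
    for name in chain:
        if name not in REQUIRED_ENV:
            raise ValueError(f"unknown provider: {name}")
        variable = REQUIRED_ENV[name]
        if variable is None or env.get(variable):
            usable.append(name)
    return usable
```

For the starter configuration, `usable_providers(["brave", "duckduckgo"])` returns both names when BRAVE_API_KEY is set and only `duckduckgo` otherwise.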

web_fetch — Page Content Fetching

The web_fetch tool retrieves content from a URL and extracts the readable text using Readability — the same technology behind browser reader modes. This strips away navigation, ads, and other clutter, leaving just the main content.

Parameters

Parameter	Type	Required	Description
`url`	string	Yes	HTTP or HTTPS URL to fetch
`extractMode`	string	No	Extraction mode: `markdown` (default) or `text`
`maxChars`	number	No	Maximum characters to return (default: 50000, clamped by configured cap)
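The interaction between `maxChars`, its default, and the configured cap can be sketched as follows. This is one plausible reading of "clamped by configured cap", not the confirmed implementation; `configured_cap` and both function names are hypothetical.

```python
from typing import Optional

DEFAULT_MAX_CHARS = 50_000  # default from the parameter table


def effective_max_chars(requested: Optional[int], configured_cap: int) -> int:
    """Use the requested limit (or the default), never exceeding the configured cap."""
    wanted = DEFAULT_MAX_CHARS if requested is None else requested
    return max(0, min(wanted, configured_cap))


def truncate(text: str, limit: int) -> str:
    """Cut the text at `limit` characters, leaving shorter text untouched."""
    return text if len(text) <= limit else text[:limit]
```

Under this reading, a call with `maxChars: 20000` and a cap of 50000 returns at most 20000 characters, while a call with no `maxChars` and a cap of 30000 returns at most 30000.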

How It Works

1. The URL is validated against SSRF (Server-Side Request Forgery) rules
2. The page content is downloaded
3. Readability extracts the main article or content area
4. The clean text is returned to the agent

Important Notes

SSRF protection — web_fetch cannot access internal network addresses (localhost, private IPs like 10.x.x.x or 192.168.x.x, or cloud metadata endpoints like 169.254.169.254). This prevents attacks where a crafted URL tricks the tool into reaching your internal services.
Content truncation — Very large pages are truncated to prevent overwhelming the agent’s context window. The tool returns the most important content within the size limit.
Clean output — The Readability extraction removes navigation menus, sidebars, ads, and other non-content elements. The result is clean, readable text similar to what you see in a browser’s “Reader Mode.”

web_fetch retrieves content from the open internet. Be cautious about fetching URLs provided by untrusted users, because the content could contain misleading or harmful information. SSRF protection prevents access to internal resources, but the fetched content itself is not checked for accuracy or safety.

web_search vs. web_fetch

These two tools complement each other:

Use Case	Tool	Why
Find information about a topic	`web_search`	Returns multiple search results with titles and snippets
Read a specific web page	`web_fetch`	Downloads and extracts the full content of a single URL
Research a question	Both	Search first to find relevant pages, then fetch the most promising results for detailed reading

Agents often use both tools together: web_search to find relevant URLs, then web_fetch to read the full content of the best results. For pages that require JavaScript rendering or interactive navigation (single-page apps, sites behind login screens), use the Browser tool instead — it provides a full headless browser.

End-to-end example: fact-check workflow

A common pattern: a user makes a claim and asks the agent to verify it. The agent searches for primary sources, then fetches the most authoritative result for detailed reading.

You: Verify the claim that the EU AI Act took effect on August 1, 2024.
     Quote the official source.

The agent runs a two-step workflow:

# 1. Search for primary sources
tool: web_search
query: "EU AI Act effective date official Commission"
provider: brave
freshness: py
count: 5

The search returns results from europa.eu, EUR-Lex, and a few news outlets. The agent picks the EUR-Lex link because it is the canonical source.

# 2. Fetch the full content of the official document
tool: web_fetch
url: "https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ:L_202401689"
extractMode: markdown
maxChars: 20000

web_fetch returns the cleaned Readability extract. The agent quotes the relevant article and entry-into-force clause, links the source, and presents the verified answer to the user.

For queries that depend on fresh page content (rate updates, market prices, breaking news), where search snippets alone may be too short or out of date, set deepFetch: 3 on the web_search call. The tool then inlines the full content of the top three results, sparing a separate web_fetch.
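The search-then-fetch pattern can be sketched as a small helper. Everything about the calling convention here is an assumption made for illustration: `call_tool` is a hypothetical dispatcher, results are assumed to be a list of dicts with a `url` field, and web_fetch is assumed to return the extracted text. Only the tool names, parameter names, and the `fullContent` field come from this page.

```python
from typing import Any, Callable

# Hypothetical dispatcher: (tool name, arguments) -> tool output.
ToolCall = Callable[[str, dict], Any]


def research(query: str, call_tool: ToolCall, deep_fetch: int = 0, top_n: int = 3) -> list[dict]:
    """Search, then make sure each of the top results carries full page text."""
    results = call_tool("web_search", {"query": query, "count": 5, "deepFetch": deep_fetch})
    pages = []
    for result in results[:top_n]:
        # Results covered by deepFetch already include fullContent inline.
        content = result.get("fullContent")
        if not content:
            # Otherwise fall back to a separate web_fetch call for that URL.
            content = call_tool("web_fetch", {"url": result["url"], "extractMode": "markdown"})
        pages.append({"url": result["url"], "content": content})
    return pages
```

With `deep_fetch=3` and `top_n=3`, no separate web_fetch calls should be needed if every result comes back with `fullContent`.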

SSRF protection

Both web_fetch and web_search validate URLs against the SSRF guard before sending any HTTP request. The guard rejects:

localhost, 127.0.0.0/8, ::1
RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
Link-local (169.254.0.0/16) including AWS / GCP / Azure metadata endpoints
IPv6 unique-local (fc00::/7) and link-local (fe80::/10)

The browser tool applies the same guard on navigate and open. Comis treats SSRF as a defense-in-depth concern: the network layer and the URL validator both enforce it, so a misconfiguration in one does not bypass the other. See Security for the full picture.
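The address ranges listed above can be checked with Python's standard `ipaddress` module. This is a minimal sketch of a URL validator of this kind, not the Comis guard itself; the function names are hypothetical. It only inspects IP literals and the name `localhost`; a real guard would also resolve hostnames and check every resolved address.

```python
import ipaddress
from urllib.parse import urlsplit

# CIDR blocks from the list above.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
        "169.254.0.0/16", "::1/128", "fc00::/7", "fe80::/10",
    )
]


def is_blocked_address(addr: str) -> bool:
    """True if the IP literal falls inside any blocked range."""
    ip = ipaddress.ip_address(addr)  # raises ValueError for non-IP strings
    # Unwrap IPv4-mapped IPv6 such as ::ffff:10.0.0.1 so it cannot slip through.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return any(ip in network for network in BLOCKED_NETWORKS)


def is_url_allowed(url: str) -> bool:
    """Reject non-HTTP(S) URLs, localhost, and IP literals in blocked ranges."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:  # malformed URL, e.g. an unclosed IPv6 bracket
        return False
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        return not is_blocked_address(host)
    except ValueError:
        # Not an IP literal. DNS resolution is deliberately out of scope here.
        return True
```

Checking the URL string alone is not sufficient in practice, since a public hostname can resolve to a private address. That is one reason the page describes enforcement at both the URL validator and the network layer.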

Related

Browser

Full browser automation for interactive web tasks

Built-in Tools

All built-in tools including web tools

Agent Tools Overview

See all available agent tools

Config Reference

Search provider and web tool configuration options
