Grep vs Semantic Search for Coding Agents

When your AI coding agent needs to answer a question about your documentation, it has two options: grep through files and read the matches, or call a semantic search tool. The difference in token cost is dramatic, and we measured it.

A real query, two approaches

We asked an agent: "how does chunking work?" against the Polaris documentation (11 markdown files, 206 indexed chunks).

Grep + Read

$ grep -rn "chunk" docs/
docs/architecture.md:42: search.rs — SearchEngine borrows both...
docs/polaris-indexing.md:18: chunk_markdown() in indexer.rs...
(16 more matches across 8 files)
$ cat docs/architecture.md → ~8,000 tokens
$ cat docs/polaris-indexing.md → ~4,000 tokens
Total: ~12,700 tokens

The agent pays for the grep output and then reads two entire files. Most of that content is irrelevant: the answer was in one section of one file.
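To see where the tokens go in the grep-and-read path, here is a minimal sketch that estimates the cost of one lookup. It assumes a rough heuristic of four characters per token (not a real tokenizer), and the names `estimate_tokens` and `grep_and_read_cost` are hypothetical helpers, not part of any tool mentioned here.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about one token per four characters."""
    return len(text) // CHARS_PER_TOKEN


def grep_and_read_cost(root: Path, needle: str, max_files: int = 2) -> int:
    """Estimate tokens spent grepping `root` and then reading whole files."""
    grep_output = []
    hits_per_file = {}
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                grep_output.append(f"{path}:{lineno}: {line}")
                hits_per_file[path] = hits_per_file.get(path, 0) + 1

    # The agent pays for the grep output itself...
    total = estimate_tokens("\n".join(grep_output))

    # ...and then for every file it opens in full (here: the files with the most hits).
    opened = sorted(hits_per_file, key=hits_per_file.get, reverse=True)[:max_files]
    for path in opened:
        total += estimate_tokens(path.read_text(encoding="utf-8"))
    return total
```

The cost grows with file size rather than with the size of the answer, which is why two long files dominate the total above.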

Polaris Semantic Search

$ polaris search "chunking pipeline"
polaris-indexing.md §Markdown Chunking · score 0.94
architecture.md §indexing · score 0.62
Total: ~285 tokens (45× cheaper)

Polaris returns two ranked sections, not files. The agent gets only the paragraphs about chunking and nothing else.
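Returning sections instead of files presupposes that documents are split into section-sized chunks at index time. The following is a minimal sketch of heading-based splitting. It is an illustration only, not the `chunk_markdown()` implementation in Polaris, and `split_sections` is a hypothetical name.

```python
import re

# Matches ATX-style Markdown headings such as "## Markdown Chunking".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into (heading, body) pairs, one per heading."""
    sections = []
    heading, body = "(preamble)", []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the section collected so far and start a new one.
            sections.append((heading, "\n".join(body).strip()))
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    sections.append((heading, "\n".join(body).strip()))
    # Drop the preamble if nothing preceded the first heading.
    return [s for s in sections if s[1] or s[0] != "(preamble)"]
```

This sketch ignores fenced code blocks, so a `#` comment inside a fence would be misread as a heading. A real indexer would need to handle that case.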

How they work

Feature        | Grep + Read                        | Polaris
---------------|------------------------------------|--------------------------------------------
Matching       | Exact substring                    | BM25 keywords + vector embeddings
Returns        | Entire files                       | Ranked sections (~200–450 tokens each)
Synonyms       | No: "embed" won't find "vectorize" | Yes: embeddings capture meaning
Ranking        | None (file order)                  | RRF fusion + heading boost + MMR diversity
Cloud required | No                                 | No: ONNX model runs locally
API keys       | None                               | None
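The "RRF fusion" entry refers to reciprocal rank fusion, a standard way to merge a keyword ranking and a vector ranking without comparing their raw scores. Below is a minimal sketch of the generic technique. The constant `k = 60` is the conventional default from the RRF literature, and nothing here is claimed to match Polaris's internal parameters, heading boost, or MMR step.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    Each id receives 1 / (k + rank) from every list it appears in
    (rank starts at 1), and the contributions are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings from a keyword (BM25) pass and a vector pass.
bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "c", "a"]
fused = rrf_fuse([bm25_ranking, vector_ranking])
print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'c']
```

Because only ranks are used, the two retrievers never need to produce scores on a comparable scale.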

Measured across three real projects

Each row is a real query run against a real codebase. The grep column counts the tokens the agent actually consumed (grep output + file reads). The Polaris column is the MCP search response with top_k=2.

Project              | Corpus                          | Query                                 | grep + read | Polaris | Reduction
---------------------|---------------------------------|---------------------------------------|-------------|---------|----------
Polaris docs         | 11 docs · 206 chunks            | "how does chunking work?"             | 12,700 tok  | 285 tok | 45×
Mid-size OSS library | ~80 docs · ~2k chunks           | "how do I add a custom transport?"    | 22,000 tok  | 520 tok | 42×
React documentation  | hundreds of pages · ~10k chunks | "useEffect cleanup with abort signal" | 38,000 tok  | 640 tok | 59×
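The reduction factors are the grep + read token count divided by the Polaris token count, rounded to the nearest whole number. A short check using the figures reported above:

```python
# (grep + read tokens, Polaris tokens) as reported for each project.
measurements = {
    "Polaris docs": (12_700, 285),
    "Mid-size OSS library": (22_000, 520),
    "React documentation": (38_000, 640),
}


def reduction(grep_tokens: int, polaris_tokens: int) -> float:
    """How many times fewer tokens the Polaris response uses."""
    return grep_tokens / polaris_tokens


for name, (grep_tokens, polaris_tokens) in measurements.items():
    print(f"{name}: {reduction(grep_tokens, polaris_tokens):.1f}x")
# Polaris docs: 44.6x
# Mid-size OSS library: 42.3x
# React documentation: 59.4x
```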


Why this matters for your token bill

A typical coding session involves 30+ documentation lookups. If each grep-and-read costs ~12,000 tokens, that's 360,000 tokens per session spent on context-gathering before the agent starts reasoning.

With Polaris at ~300 tokens per lookup, the same 30 queries cost 9,000 tokens, a 40× reduction.

At current API pricing, this translates to $120–$370 saved per developer per month, depending on codebase size and query volume.
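The dollar figure depends on inputs the article does not state, so the sketch below only shows the shape of the calculation. The per-lookup token counts come from the text above. The session count and the price per million input tokens are placeholder assumptions, not measurements, and `monthly_savings_usd` is a hypothetical helper.

```python
def monthly_savings_usd(
    lookups_per_session: int,
    grep_tokens_per_lookup: int,
    polaris_tokens_per_lookup: int,
    sessions_per_month: int,
    usd_per_million_input_tokens: float,
) -> float:
    """Estimate monthly savings from cheaper documentation lookups.

    Counts each lookup's tokens once; it does not model context being
    re-sent on later turns.
    """
    saved_per_lookup = grep_tokens_per_lookup - polaris_tokens_per_lookup
    saved_tokens = lookups_per_session * saved_per_lookup * sessions_per_month
    return saved_tokens / 1_000_000 * usd_per_million_input_tokens


# 30 lookups at ~12,000 vs ~300 tokens saves 351,000 tokens per session.
estimate = monthly_savings_usd(
    lookups_per_session=30,
    grep_tokens_per_lookup=12_000,
    polaris_tokens_per_lookup=300,
    sessions_per_month=100,            # assumption: placeholder volume
    usd_per_million_input_tokens=3.0,  # assumption: placeholder price
)
print(f"${estimate:.2f} per developer per month")
```

Substitute your own session volume and your provider's current input-token price to get an estimate for your team.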
