Grep vs Semantic Search for Coding Agents

When your AI coding agent needs to answer a question about your documentation, it has two options: grep through files and read the matches, or call a semantic search tool. The difference in token cost is large, and we measured it: semantic search used 42× to 59× fewer tokens per lookup across three projects.

A real query, two approaches

We asked an agent: "how does chunking work?" against the Polaris documentation (11 markdown files, 206 indexed chunks).

Grep + Read

$ grep -rn "chunk" docs/

docs/architecture.md:42: search.rs — SearchEngine borrows both...
docs/polaris-indexing.md:18: chunk_markdown() in indexer.rs...
(16 more matches across 8 files)

$ cat docs/architecture.md → ~8,000 tokens

$ cat docs/polaris-indexing.md → ~4,000 tokens

Total: ~12,700 tokens (~12,000 from the two file reads plus ~700 of grep output)

The agent reads two entire files, yet the answer sits in one section of one file. Most of what it reads is irrelevant.
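To make the cost of this pattern concrete, here is a minimal sketch that estimates the tokens consumed by a grep followed by full-file reads. It is an illustration only: the 4-characters-per-token heuristic, the in-memory `docs` mapping, and the function names are assumptions, not how the figures above were measured.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return len(text) // 4


def grep_and_read_cost(docs: dict[str, str], needle: str, max_reads: int = 2) -> int:
    """Estimate tokens for grepping `docs` for `needle`, then reading
    the first `max_reads` matching files in full."""
    grep_output = []
    matched_files = []
    for name, text in docs.items():
        # Emulate `grep -n`: one "file:line: content" entry per matching line.
        hits = [
            f"{name}:{number}: {line}"
            for number, line in enumerate(text.splitlines(), start=1)
            if needle in line
        ]
        if hits:
            grep_output.extend(hits)
            matched_files.append(name)

    grep_tokens = estimate_tokens("\n".join(grep_output))
    # The agent then reads whole files, not just the matching lines.
    read_tokens = sum(estimate_tokens(docs[name]) for name in matched_files[:max_reads])
    return grep_tokens + read_tokens


# Hypothetical miniature corpus: one matching file, one irrelevant file.
sample_docs = {
    "a.md": "chunk here\n" + "x" * 389,
    "b.md": "no match",
}
sample_cost = grep_and_read_cost(sample_docs, "chunk")
```

Almost all of `sample_cost` comes from the full-file read rather than from the grep output, which matches the breakdown in the session above.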

Polaris Semantic Search

$ polaris search "chunking pipeline"

→ polaris-indexing.md §Markdown Chunking · score 0.94
→ architecture.md §indexing · score 0.62

Total: ~285 tokens (~45× fewer than grep + read)

Polaris returns two ranked sections, not files. The agent gets exactly the paragraphs about chunking — nothing else.
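The idea of returning sections rather than files can be sketched in a few lines. The version below splits a markdown document at its headings and returns only the best-scoring sections. The term-overlap score is a deliberately simple stand-in, not Polaris's ranking, and every name here is hypothetical.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)")


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split a markdown document into (heading, body) pairs at ATX headings."""
    sections = []
    heading, body = "(preamble)", []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the previous section, skipping an empty preamble.
            if body or heading != "(preamble)":
                sections.append((heading, "\n".join(body).strip()))
            heading, body = match.group(1).strip(), []
        else:
            body.append(line)
    sections.append((heading, "\n".join(body).strip()))
    return sections


def top_sections(markdown: str, query: str, top_k: int = 2) -> list[tuple[str, str]]:
    """Return the `top_k` sections with the most query-term occurrences."""
    terms = set(re.findall(r"\w+", query.lower()))

    def score(section: tuple[str, str]) -> int:
        heading, body = section
        words = re.findall(r"\w+", f"{heading} {body}".lower())
        return sum(1 for word in words if word in terms)

    return sorted(split_sections(markdown), key=score, reverse=True)[:top_k]


# Hypothetical document used only for illustration.
DOC = """# Indexing
Overview of the indexer.

## Markdown Chunking
Each file is split into chunks at headings.

## Storage
Vectors are stored on disk.
"""

best = top_sections(DOC, "chunking", top_k=1)
```

Here `best` holds only the "Markdown Chunking" section, so the caller pays for one short section instead of the whole file.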

How they work

Feature	Grep + Read	Polaris
Matching	Exact substring	BM25 keywords + vector embeddings
Returns	Entire files	Ranked sections (~200–450 tokens each)
Synonyms	No — "embed" won't find "vectorize"	Yes — embeddings capture meaning
Ranking	None (file order)	RRF fusion + heading boost + MMR diversity
Cloud required	No	No — ONNX model runs locally
API keys	None	None
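The Ranking row mentions RRF (reciprocal rank fusion), which merges several ranked lists by summing 1 / (k + rank) for each result. The sketch below shows the standard form of the technique with the commonly used constant k = 60. The source does not say which constant Polaris uses or how the heading boost and MMR steps are applied, so treat the constant and the result identifiers as assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) to a result's score, where rank
    is 1-based. k = 60 is a conventional default, assumed here.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from a keyword ranker and a vector ranker.
bm25_ranking = ["indexing#chunking", "architecture#indexing", "cli#search"]
vector_ranking = ["indexing#chunking", "embedding#model", "architecture#indexing"]

fused = rrf_fuse([bm25_ranking, vector_ranking])
```

A section ranked highly by both lists ends up first, while a section found by only one ranker still appears lower down. This is how a fused ranking can surface a synonym match that exact keyword search would miss.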

Measured across three real projects

Each row is a real query run against a real project. The grep column counts the tokens the agent actually consumed (grep output + file reads). The Polaris column counts the tokens in the MCP search response with top_k=2.

Project	Corpus	Query	grep + read	Polaris	Reduction
Polaris docs	11 docs · 206 chunks	"how does chunking work?"	12,700 tok	285 tok	45×
Mid-size OSS library	~80 docs · ~2k chunks	"how do I add a custom transport?"	22,000 tok	520 tok	42×
React documentation	hundreds of pages · ~10k chunks	"useEffect cleanup with abort signal"	38,000 tok	640 tok	59×
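Each reduction factor above is the grep + read token count divided by the Polaris token count, rounded to the nearest whole number. A short check using the reported figures (variable and function names are illustrative):

```python
# (grep + read tokens, Polaris tokens) as reported for each project.
measurements = {
    "Polaris docs": (12_700, 285),
    "Mid-size OSS library": (22_000, 520),
    "React documentation": (38_000, 640),
}


def reduction_factor(grep_tokens: int, polaris_tokens: int) -> float:
    """How many times more tokens grep + read consumed than Polaris."""
    return grep_tokens / polaris_tokens


factors = {
    project: round(reduction_factor(grep_tokens, polaris_tokens))
    for project, (grep_tokens, polaris_tokens) in measurements.items()
}
# factors == {"Polaris docs": 45, "Mid-size OSS library": 42, "React documentation": 59}
```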

Why this matters for your token bill

A typical coding session involves 30+ documentation lookups. If each grep-and-read costs ~12,000 tokens, that's 360,000 tokens per session just for context-gathering — before the agent even starts reasoning.

With Polaris at ~300 tokens per lookup, the same 30 lookups cost about 9,000 tokens. That is a 40× reduction, or roughly 351,000 tokens saved per session.
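The per-session arithmetic can be reproduced directly from the round numbers quoted above: 30 lookups, ~12,000 tokens per grep-and-read, ~300 tokens per Polaris lookup. The function name is illustrative.

```python
def session_tokens(lookups: int, tokens_per_lookup: int) -> int:
    """Total context-gathering tokens for one coding session."""
    return lookups * tokens_per_lookup


LOOKUPS_PER_SESSION = 30  # "30+ documentation lookups" from the text

grep_total = session_tokens(LOOKUPS_PER_SESSION, 12_000)   # 360,000 tokens
polaris_total = session_tokens(LOOKUPS_PER_SESSION, 300)   # 9,000 tokens

tokens_saved = grep_total - polaris_total                  # 351,000 tokens
reduction = grep_total / polaris_total                     # 40.0
```

Because both totals scale linearly with the number of lookups, the 40× ratio holds for any session length. Only the absolute saving changes.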

At current API pricing, this translates to $41–$123 saved per developer per month, depending on codebase size and query volume.
