Projects | Sahil Ashar

LLM Tool-Selection Benchmark (Search-Agent Harness)

June 12, 2026 · in-progress

A search-agent evaluation harness that measures how well LLMs pick the right retrieval tool with the right arguments, scored on Cost Per Correct (CPC).

#ai-infra #retrieval #evaluation #llm
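
The project summary names Cost Per Correct (CPC) but does not define it. As a rough sketch, one plausible reading is total spend across all trials divided by the number of trials where the model picked the expected tool with the expected arguments. The `Trial` record, its field names, the exact-match rule, and the dollar-based cost are all assumptions for illustration, not the harness's actual schema or scoring rule.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Trial:
    """One benchmark trial (hypothetical schema, not taken from the harness)."""
    expected_tool: str
    expected_args: dict[str, Any]
    chosen_tool: str
    chosen_args: dict[str, Any]
    cost_usd: float  # assumed: what this trial cost to run, in dollars


def is_correct(trial: Trial) -> bool:
    # Assumed rule: both the tool name and the full argument dict must match exactly.
    return (
        trial.chosen_tool == trial.expected_tool
        and trial.chosen_args == trial.expected_args
    )


def cost_per_correct(trials: list[Trial]) -> float:
    """Total cost of all trials divided by the number of correct trials.

    Failed trials still count toward cost, so a cheap but inaccurate model
    is penalized. Returns infinity when no trial is correct.
    """
    total_cost = sum(t.cost_usd for t in trials)
    num_correct = sum(1 for t in trials if is_correct(t))
    if num_correct == 0:
        return float("inf")
    return total_cost / num_correct


trials = [
    Trial("web_search", {"query": "rust async"}, "web_search", {"query": "rust async"}, 0.5),
    Trial("doc_lookup", {"id": 7}, "web_search", {"query": "doc 7"}, 0.25),
    Trial("doc_lookup", {"id": 9}, "doc_lookup", {"id": 9}, 0.25),
]
cpc = cost_per_correct(trials)
```

Under this reading, lower is better, and the wrong pick in the second trial raises the score because its cost is still charged while it adds nothing to the count of correct answers.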