Subagent-reuse: Open-Source MCP Server Cuts Claude Code Token Waste by Recycling Agent Context

A developer named Amruth has released subagent-reuse, an open-source MCP (Model Context Protocol) server designed to cut down on redundant token consumption in Claude Code's multi-agent workflows. The tool addresses a structural inefficiency in how Claude Code operates: when it spawns subagents — Plan agents for architecture research, Explore agents for codebase search, and general agents for implementation — each one starts from scratch, independently reading the same files and rebuilding context that prior agents have already established. In a single session, the same source files can be read three or more times across different agents, with each read burning tokens on work that has already been done.

Subagent-reuse sits between Claude Code and its subagents. It scans Claude Code's native session storage at ~/.claude/projects/ and parses the JSONL transcripts there to extract which files each agent has already read or modified. When a new task arrives via the route_task tool, the server scores existing agents on structural signals (file overlap, modification history, directory proximity, and recency) and either routes the task to a suitable existing agent or spins up a new one; a score of 40 or above triggers a REUSE decision. Staleness is tracked using SHA-256 content hashes organized in a Merkle tree, so the system can issue targeted warnings about changed files rather than invalidating an entire context match. Where no strong structural match exists, the server surfaces summaries of existing agents and delegates the final routing judgment to the LLM itself, deliberately sidestepping NLP-based text matching in favor of the model's own semantic reasoning.
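To make the routing idea concrete, here is a minimal Python sketch of structural scoring followed by a staleness check. It is an illustration under stated assumptions, not the project's code: only the four signal names, the threshold of 40, and the use of SHA-256 come from the description above. The AgentRecord type, the function names, the weights (50/20/20/10), and the one-hour recency window are all hypothetical, and the sketch compares flat per-file hashes rather than building the Merkle tree the project is said to use.

```python
import hashlib
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, List, Optional, Set, Tuple

REUSE_THRESHOLD = 40  # from the article: 40 or above triggers REUSE


@dataclass
class AgentRecord:
    """Hypothetical summary of what one subagent has touched."""
    agent_id: str
    files_read: Set[str] = field(default_factory=set)
    files_modified: Set[str] = field(default_factory=set)
    last_active: float = 0.0  # Unix timestamp of last activity
    file_hashes: Dict[str, str] = field(default_factory=dict)  # path -> SHA-256 hex


def score_agent(agent: AgentRecord, task_files: Set[str], now: float) -> float:
    """Score 0-100 from four structural signals. Weights are assumptions."""
    if not task_files:
        return 0.0
    touched = agent.files_read | agent.files_modified
    # Fraction of the task's files this agent has already seen.
    overlap = len(task_files & touched) / len(task_files)
    # Fraction of the task's files this agent has edited.
    modified = len(task_files & agent.files_modified) / len(task_files)
    # Fraction of the task's directories this agent has worked in.
    agent_dirs = {str(PurePosixPath(p).parent) for p in touched}
    task_dirs = {str(PurePosixPath(p).parent) for p in task_files}
    proximity = len(task_dirs & agent_dirs) / len(task_dirs)
    # Linear decay to zero over one hour (assumed window).
    age_minutes = max(0.0, (now - agent.last_active) / 60)
    recency = max(0.0, 1 - age_minutes / 60)
    return 50 * overlap + 20 * modified + 20 * proximity + 10 * recency


def route(agents: List[AgentRecord], task_files: Set[str],
          now: float) -> Tuple[str, Optional[str]]:
    """Return ("REUSE", agent_id) if the best score clears the threshold."""
    best = max(agents, key=lambda a: score_agent(a, task_files, now), default=None)
    if best is not None and score_agent(best, task_files, now) >= REUSE_THRESHOLD:
        return ("REUSE", best.agent_id)
    return ("NEW", None)


def stale_files(agent: AgentRecord, current: Dict[str, bytes]) -> List[str]:
    """List files whose current SHA-256 differs from the hash the agent recorded."""
    return sorted(
        path for path, old in agent.file_hashes.items()
        if path in current and hashlib.sha256(current[path]).hexdigest() != old
    )
```

In this sketch, an agent that has seen half of a task's files, edited one of them, worked in the same directory, and been active moments ago lands comfortably above the threshold, while an agent with no overlap scores only on recency and is passed over. A stale file produces a per-file warning instead of disqualifying the agent, which mirrors the targeted-warning behavior described above.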

Amruth framed subagent-reuse as the middle layer of a three-MCP optimization stack in a companion Medium article. The bottom layer is cocoindex-code, a semantic code search MCP using AST-aware chunking and local embeddings that the author claims can reduce token usage on code exploration by roughly 70% compared to grep-based approaches. The top layer is claude-mem, which handles cross-session memory persistence. Installation requires a single npx subagent-reuse --setup command that auto-configures Claude Code and approves tool permissions. The project exposes seven MCP tools — route_task, get_context, recall, list_agents, register_agent, log_work, and mark_done — and recommends adding instructions to CLAUDE.md to enforce the routing workflow automatically.

The zero-NLP design is the clearest signal of where Amruth is placing his bet. Rather than building a semantic index or training a classifier, the system leans entirely on file-system signals and hands ambiguous cases back to the model. That keeps the routing layer cheap and avoids adding a heavyweight dependency that would offset the token savings it exists to produce. The CLAUDE.md integration pattern is the practical enforcement mechanism: without it, developers have to remember to call route_task manually on every subagent invocation; with it, the orchestration becomes automatic. How well file-overlap heuristics hold up in large monorepos with deep cross-module dependencies remains an open question, and Amruth's own usage data is likely to answer it fastest.