The output from an AI coding agent rarely stays consistent across a long session. Ask it to scaffold a new feature and the first few hundred lines tend to be sharp — sensible abstractions, clean variable naming, reasonable error handling. Keep the session running for several hours, let the conversation history grow, and something quietly shifts. The code still compiles. The logic mostly holds. But experienced developers have started naming what they're seeing: context rot.
Rishu Goyal built Tarvos to address this directly. Released this week as an open-source project, it introduces what Goyal calls Relay Architecture — an orchestration layer that replaces single long-running agent sessions with a chain of fresh instances, each picking up where the last left off before degradation sets in. The default handoff threshold is 100,000 tokens, though it's configurable.
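The threshold-based handoff can be pictured with a small simulation. This is only a sketch under stated assumptions: the `Session` class, the `run_relay` function, and the idea of checking the token count before each unit of work are hypothetical and not taken from the Tarvos codebase. Only the 100,000-token default comes from the project description.

```python
from dataclasses import dataclass
from typing import List

# Default from the article; Tarvos describes this value as configurable.
DEFAULT_HANDOFF_THRESHOLD = 100_000


@dataclass
class Session:
    """Hypothetical stand-in for one agent instance in the relay."""
    index: int
    tokens_used: int = 0


def run_relay(step_costs: List[int],
              threshold: int = DEFAULT_HANDOFF_THRESHOLD) -> List[Session]:
    """Assign work steps to sessions, starting a fresh session once the
    current one has reached the token threshold.

    The check happens before each step, so a session may overshoot the
    threshold by the cost of its final step (an assumption of this sketch).
    """
    sessions = [Session(index=1)]
    for cost in step_costs:
        current = sessions[-1]
        if current.tokens_used >= threshold:
            # Hand off: a new instance starts with an empty context.
            current = Session(index=len(sessions) + 1)
            sessions.append(current)
        current.tokens_used += cost
    return sessions


if __name__ == "__main__":
    for s in run_relay([60_000, 50_000, 30_000, 80_000]):
        print(s.index, s.tokens_used)
```

In this toy run the first session takes two steps before crossing the threshold, and the remaining work lands in a second, fresh session.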
The mechanics are deliberately simple. Every agent in the relay reads the same master plan, a phased PRD written in markdown, fresh from disk at the start of each session. Nothing from that document accumulates in conversation history. When a session ends, the outgoing agent writes a handoff note called the Baton, capped at 40 lines, capturing what was completed, what comes next, and any gotchas the next agent needs to know. The 40-line cap is a design decision: a longer baton would start recreating the problem it was meant to solve.
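A line-capped handoff note might look something like the following. The section names (`Completed`, `Next`, `Gotchas`), the `render_baton` function, and the choice to raise an error when the cap is exceeded are all assumptions made for illustration; the actual Baton format in Tarvos may differ. Only the 40-line limit and the three kinds of content come from the description above.

```python
from typing import List

MAX_BATON_LINES = 40  # cap described in the article


def render_baton(completed: List[str],
                 next_steps: List[str],
                 gotchas: List[str],
                 max_lines: int = MAX_BATON_LINES) -> str:
    """Render a handoff note and refuse to produce one over the line cap.

    Hypothetical layout: three headed sections with one bullet per item.
    """
    lines = ["Completed:"]
    lines += [f"- {item}" for item in completed]
    lines.append("Next:")
    lines += [f"- {item}" for item in next_steps]
    lines.append("Gotchas:")
    lines += [f"- {item}" for item in gotchas]

    if len(lines) > max_lines:
        # Forcing the agent to summarize keeps the note from growing
        # into a second conversation history.
        raise ValueError(
            f"baton has {len(lines)} lines, limit is {max_lines}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_baton(
        completed=["Phase 1: checkout endpoint scaffolded"],
        next_steps=["Phase 2: webhook handler"],
        gotchas=["Test keys are read from the environment"],
    ))
```

Rejecting an oversized note, rather than truncating it silently, is one possible design choice: truncation could drop the gotchas at the end, which are arguably the most valuable part.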
Agents signal their own progress through three phrases — PHASE_COMPLETE, PHASE_IN_PROGRESS, and ALL_PHASES_COMPLETE — that the Tarvos orchestrator listens for without needing to understand what the agent is actually building. If a session crashes before the baton gets written, a recovery mechanism reconstructs it from git history.
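Because the orchestrator only listens for fixed phrases, the detection step can be very small. The sketch below assumes the signals appear as plain text in the agent's output and that the most recent one wins; the function names and the mapping from signal to orchestrator action are hypothetical, not the project's actual logic.

```python
import re
from typing import Optional

# The three signal phrases named in the article.
SIGNALS = ("PHASE_COMPLETE", "PHASE_IN_PROGRESS", "ALL_PHASES_COMPLETE")

# Word boundaries prevent matching a phrase embedded in a longer identifier.
_SIGNAL_RE = re.compile(r"\b(" + "|".join(SIGNALS) + r")\b")


def last_signal(agent_output: str) -> Optional[str]:
    """Return the most recent signal phrase in the output, or None."""
    matches = _SIGNAL_RE.findall(agent_output)
    return matches[-1] if matches else None


def next_action(signal: Optional[str]) -> str:
    """Map a signal to an orchestrator action (assumed mapping)."""
    if signal == "ALL_PHASES_COMPLETE":
        return "stop"
    if signal == "PHASE_COMPLETE":
        return "start_next_phase"
    if signal == "PHASE_IN_PROGRESS":
        return "continue_phase"
    # No signal at all: the session may have crashed before reporting.
    return "recover"


if __name__ == "__main__":
    output = "ran tests\nPHASE_IN_PROGRESS\nfixed lint\nPHASE_COMPLETE\n"
    print(next_action(last_signal(output)))
```

Nothing here inspects what the agent built, which matches the description of an orchestrator that does not need to understand the work itself.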
At version 0.1.0, the only supported underlying agent is Claude Code, extended through a custom skill file dropped into the user's local Claude configuration. Every session runs in an isolated git worktree on its own branch. A TUI dashboard lets developers review and explicitly accept or reject changes before anything merges to main — a meaningful guardrail for a system where multiple agent sessions fire in sequence without direct oversight.
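Per-session isolation of this kind maps onto git's own `git worktree add -b <branch> <path> <commit-ish>` command. The sketch below builds that command for a given session; the branch naming scheme (`relay/session-N`), the `.worktrees` directory, and both function names are assumptions for illustration and are not taken from Tarvos.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_command(repo_root: str, session_id: int,
                     base: str = "main") -> List[str]:
    """Build the git command that creates an isolated worktree on a new
    branch for one session. Naming conventions here are hypothetical."""
    branch = f"relay/session-{session_id}"
    path = Path(repo_root) / ".worktrees" / f"session-{session_id}"
    return [
        "git", "-C", str(repo_root),
        "worktree", "add",
        "-b", branch,      # create a new branch for this session
        str(path),         # directory for the new working tree
        base,              # commit-ish the branch starts from
    ]


def create_worktree(repo_root: str, session_id: int,
                    base: str = "main") -> None:
    """Run the command; raises CalledProcessError if git fails."""
    subprocess.run(worktree_command(repo_root, session_id, base), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so this is safe anywhere.
    print(" ".join(worktree_command(".", 1)))
```

Keeping each session on its own branch means a rejected session can be discarded without touching main, which is what makes the explicit accept-or-reject review step practical.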
The demo Goyal ships with the project shows five successive agents completing a four-phase Stripe payments integration in 29 minutes. It's a controlled demonstration, not a general benchmark, but it is useful for illustrating throughput.
Tarvos is not a multi-agent system in the conventional sense. There's no parallel workstream coordination, no inter-agent messaging, no tool-calling mesh. It's a relay in the literal sense: one agent hands off to the next in sequence, each starting with a fresh context. The question left unanswered at this stage is how the architecture handles the messier reality of projects where plans shift mid-execution: a dependency turns out to be broken, requirements change, a phase takes three times longer than anticipated. The Baton format is tight by design, which fits the goal of keeping context small, but tight handoff notes also leave less room for capturing the contextual nuance that tends to accumulate in real-world work.
Support for additional coding agents is listed as planned. With an MIT license, contributing guide, code of conduct, and security policy already in place, the project looks like something Goyal intends to develop past the proof-of-concept stage. For teams running into the context ceiling on AI coding work, Relay Architecture is worth watching.