Andrej Karpathy has deployed Autoresearch Hub (autoresearchhub.com), a distributed platform where contributors donate H100 GPU time to run autonomous ML research agents. The setup is minimal: participants paste agent instructions into Claude Code on their own hardware, turning their machines into nodes in a shared experiment network. At the time of writing, the platform had logged 1,949 experiments across 3 contributors, with 8 benchmark improvements and a best validation bits-per-byte score of 0.965377.

PR #92 on Karpathy's karpathy/autoresearch repository — 36,100 stars, 5,000 forks — describes autoresearchhub.com as a live staging environment while he iterates on the agent's core instruction set. That PR is the clearest evidence tying Karpathy to the platform directly.

The research loop runs on incremental hill-climbing: agents propose experiment combinations encoded as short identifiers (WD081+WD013+VEWD005, for instance), execute them, and merge winning configurations into a shared lineage that future agents build on. This <a href="/news/2026-03-14-pi-autoresearch-autonomous-experiment-loop-llm-training-frontend-metrics">autonomous experiment loop design</a> has since inspired implementations adapted for other optimization domains. Contributor dumko2001 flagged a structural problem in the PR discussion: because the system hard-discards any run that fails to beat the current best, the search can get permanently stuck at a local optimum. The proposed fix, maintaining side branches or stochastically accepting worse runs, is the standard remedy when greedy search boxes itself in.
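The dynamic dumko2001 described can be sketched in a few lines. This is a toy illustration, not the platform's actual code: the function names, the toy score landscape, and the `accept_worse_prob` knob are all invented here. With `accept_worse_prob=0.0` the loop hard-discards any run that fails to beat the current configuration and gets stuck at a local optimum; a small acceptance probability for worse runs (a simulated-annealing-style escape) lets the search tunnel through.

```python
import random

def hill_climb(initial, propose, evaluate, steps=100, accept_worse_prob=0.0, seed=0):
    """Greedy hill-climb over experiment configurations.

    evaluate() returns a score where lower is better (think bits-per-byte).
    accept_worse_prob=0.0 reproduces the hard-discard behavior dumko2001
    flagged; a nonzero value occasionally keeps a worse run as a side branch.
    """
    rng = random.Random(seed)
    current, current_score = initial, evaluate(initial)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = propose(current, rng)
        score = evaluate(candidate)
        if score < best_score:                      # record the best run ever seen
            best, best_score = candidate, score
        if score < current_score:                   # winner: merge into the lineage
            current, current_score = candidate, score
        elif rng.random() < accept_worse_prob:      # side branch: keep a worse run
            current, current_score = candidate, score
        # else: hard-discard, which is exactly what traps greedy search
    return best, best_score

# Toy landscape: index 1 (score 2) is a local optimum, index 4 (score 0) the global one.
scores = [3, 2, 4, 1, 0]

def propose(i, rng):
    """Mutate the current configuration by stepping to a neighboring index."""
    return min(max(i + rng.choice((-1, 1)), 0), len(scores) - 1)

greedy = hill_climb(1, propose, scores.__getitem__, steps=500)
escaped = hill_climb(1, propose, scores.__getitem__, steps=500, accept_worse_prob=0.5)
```

Starting at the local optimum, the greedy run never moves because both neighbors score worse, while the stochastic variant eventually accepts a worse run, crosses the barrier, and finds the global optimum.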

The platform briefly attracted controversy on Hacker News, where commenters accused it of being a clone of ensue-network.ai's autoresearch project. The thread pointing to PR #92 shifted the read: if Karpathy's repository predates ensue-network.ai's work, the borrowing likely ran the other way. Provenance is genuinely contested, and neither side has made a definitive public statement.

Claude Code's role here is a departure from its typical use. Instead of helping developers write software, it's serving as the execution layer for autonomous scientific search on frontier GPU hardware — something Anthropic hasn't explicitly positioned it for, but which the tool's architecture apparently supports without modification. The approach mirrors how <a href="/news/2026-03-15-34-agent-claude-code-team-openclaw-alternative">other developers have orchestrated Claude Code agents</a> into specialized teams for autonomous systems. Karpathy's next stated task is refining the agent instruction set, and how the hill-climbing limitation gets addressed will determine whether the platform can scale past its current three contributors.