HatmanStack, a GitHub user, has published Claude Forge — an open-source pipeline that borrows its design logic from Generative Adversarial Networks and applies it to software development with Anthropic's Claude Code CLI. The project assigns five AI agent roles across two camps: generators (Planner and Implementer) that produce artifacts, and discriminators (Plan Reviewer, Code Reviewer, and Final Reviewer) that challenge them. The premise is that adversarial pressure between producers and critics pushes code quality higher over successive iterations, the same dynamic that makes GANs useful in machine learning.

The workflow starts with two Claude Code slash commands. Running /brainstorm kicks off an interactive design session — exploring the codebase and posing up to 15 clarifying questions — before outputting a structured spec. Running /pipeline then executes the full plan-implement-review cycle against that spec. Each agent gets its own isolated context window with no shared history, so reviewers aren't anchored to earlier decisions. Feedback between roles flows through a shared feedback.md file, written as rhetorical prompts — "Consider," "Think about," "Reflect" — rather than direct instructions, nudging generating agents toward self-correction. A structured signal set (GO, NO-GO, PLAN_APPROVED, CHANGES_REQUESTED, and others) tells the orchestrator where to route work next.
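The signal-driven routing described above can be sketched as a small dispatch table. The signal names (GO, NO-GO, PLAN_APPROVED, CHANGES_REQUESTED) come from the project; the stage names and the routing logic itself are illustrative assumptions, not the project's actual orchestrator:

```python
from enum import Enum, auto

class Signal(Enum):
    """Signals named in the project; values here are for illustration only."""
    GO = auto()
    NO_GO = auto()
    PLAN_APPROVED = auto()
    CHANGES_REQUESTED = auto()

def route(signal: Signal) -> str:
    """Map a reviewer signal to a next step (hypothetical stage names)."""
    return {
        Signal.PLAN_APPROVED: "implementer",      # plan cleared review; start building
        Signal.CHANGES_REQUESTED: "planner",      # feedback.md goes back to the generator
        Signal.GO: "done",                        # final quality gate passed
        Signal.NO_GO: "escalate_to_human",        # pipeline stops; a person decides
    }[signal]
```

Because reviewers communicate only through signals and feedback.md, the orchestrator needs nothing more than a lookup like this to decide where work flows next.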

A few hard constraints govern the pipeline. Iteration loops cap at three to prevent runaway refinement. Plan documents are read-only for every agent except the Planner, which alone can revise them. If the Final Reviewer returns a NO-GO, the pipeline doesn't retry — it escalates to a human instead, a deliberate choice that keeps people in the decision chain at the highest-stakes quality gate. The Implementer works in test-driven style and commits atomically per phase. If the pipeline times out or hits a token limit, it can be resumed from disk, with progress persisted to date-stamped plan directories.
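The capped generator/discriminator loop can be sketched as follows. The three-iteration cap is stated by the project; the callable names, the `Verdict` shape, and the choice to escalate when the cap is exhausted are assumptions for illustration:

```python
from dataclasses import dataclass

MAX_ITERATIONS = 3  # the project's stated cap on refinement loops

@dataclass
class Verdict:
    approved: bool
    feedback: str  # rhetorical prompts destined for feedback.md

def refine(produce, review, escalate):
    """Run a generator under adversarial review, bounded by the iteration cap.

    `produce`, `review`, and `escalate` are hypothetical callables standing in
    for the generating agent, the reviewing agent, and the human hand-off.
    """
    artifact = produce(feedback=None)
    for _ in range(MAX_ITERATIONS):
        verdict = review(artifact)
        if verdict.approved:                            # approval: hand off downstream
            return artifact
        artifact = produce(feedback=verdict.feedback)   # regenerate against the critique
    return escalate(artifact)                           # cap reached: a human takes over
```

The cap is what keeps the adversarial dynamic productive rather than cyclical: after three unsuccessful rounds, the loop yields to a person instead of burning tokens on further refinement.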

The project is MIT-licensed and on GitHub. Its central argument — that agents critiquing each other's work produces better software than deferring to a single model — is straightforward enough, and the explicit lift from GAN architecture gives it a cleaner conceptual frame than most multi-agent pipelines, which tend to accumulate roles without much theory behind how those roles should interact.