Frontend developer Ankur Sethi spent four weeks building a fully functional programming language — and never read a single line of the code Claude wrote. The language, named Cutlet after his cat, builds and runs on macOS and Linux, executes real programs, and ships with a working REPL. Rather than using AI as a coding assistant, Sethi positioned himself entirely as a product owner and spec writer, delegating every implementation decision to Claude.
Cutlet isn't a toy. It includes a Raku-inspired @ meta-operator that vectorizes binary operations across arrays, a zip-to-map operator, boolean array indexing for concise filter expressions, user-defined functions integrated with the meta-operator, prototypal inheritance, mixins, loops, and a mark-and-sweep garbage collector. Sethi's project choice wasn't accidental: language interpreters are self-contained, testable from the command line with purely textual inputs and outputs, and well-represented in LLM training data — a combination that suits autonomous generation unusually well.
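To make the array features concrete, here is a minimal Python sketch of what operators of this kind typically do. It is not Cutlet code: Cutlet's actual syntax is not shown here, and the helper names vectorize, zip_to_map, and mask_index are hypothetical. The scalar broadcasting and the length check are assumptions about reasonable behavior, not documented Cutlet semantics.

```python
from operator import add, gt


def vectorize(op, left, right):
    """Apply a binary operation element-wise, in the spirit of a meta-operator.

    Assumption: a scalar on either side is broadcast across the list.
    """
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if left_is_list and right_is_list:
        if len(left) != len(right):
            raise ValueError("operands must have the same length")
        return [op(a, b) for a, b in zip(left, right)]
    if left_is_list:
        return [op(a, right) for a in left]
    if right_is_list:
        return [op(left, b) for b in right]
    return op(left, right)


def zip_to_map(keys, values):
    """Pair two sequences into a mapping (the idea behind a zip-to-map operator)."""
    return dict(zip(keys, values))


def mask_index(items, mask):
    """Keep the items whose matching mask entry is truthy (boolean indexing)."""
    return [item for item, keep in zip(items, mask) if keep]


temps = [12, 25, 31, 8]
sums = vectorize(add, [1, 2, 3], [10, 20, 30])   # [11, 22, 33]
is_warm = vectorize(gt, temps, 20)               # [False, True, True, False]
warm = mask_index(temps, is_warm)                # [25, 31]
lookup = zip_to_map(["a", "b"], [1, 2])          # {"a": 1, "b": 2}
```

Combining a vectorized comparison with boolean indexing is what makes a filter expression concise: the comparison produces the mask, and the mask selects the elements.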
His workflow inverts how most developers use AI tools. Instead of iterating on prompts, Sethi front-loaded the project with rigorous spec writing, ran Claude inside a Docker sandbox with full filesystem and execution permissions, and used automated tests as his only feedback mechanism: "run make test and make check until there are no more errors." He deliberately avoided MCPs, browser integrations, and anything requiring visual verification, a choice informed by earlier, frustrating attempts to use LLMs for responsive layout and data visualization work. The result is what he calls a four-part framework for agentic engineering: problem selection, communicating intent through precise specs, environment setup, and loop monitoring.
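The test-driven loop can be pictured with a short Python sketch. This is an illustration under stated assumptions, not Sethi's actual tooling: loop_until_green, run_check, and the revise callback (standing in for the agent editing the code) are hypothetical names, and the round limit is an added safeguard.

```python
import subprocess
from typing import Callable, Sequence


def run_check(command: Sequence[str]) -> tuple[bool, str]:
    """Run one command and report (passed, combined output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def loop_until_green(
    commands: Sequence[Sequence[str]],
    revise: Callable[[list[str]], None],
    run: Callable[[Sequence[str]], tuple[bool, str]] = run_check,
    max_rounds: int = 20,
) -> bool:
    """Re-run every check until all pass or the round limit is reached.

    `revise` is a hypothetical hook: it receives the failure output and is
    expected to change the code before the next round.
    """
    for _ in range(max_rounds):
        failures = []
        for command in commands:
            passed, output = run(command)
            if not passed:
                failures.append(output)
        if not failures:
            return True  # every check is green
        revise(failures)
    return False  # gave up: still failing after max_rounds


# Example wiring, assuming a Makefile with `test` and `check` targets:
# loop_until_green([["make", "test"], ["make", "check"]], revise=my_agent_step)
```

The only signal that flows back into the loop is textual command output, which is why a project with purely command-line inputs and outputs fits this setup.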
By his estimate, the approach compressed roughly six months of solo development into four weeks and produced something he admits is beyond his own systems-programming abilities. What the experiment leaves open is a harder question about software craft: when the feedback loop is a test runner rather than a developer reading code, what gets lost, and does that loss matter?