3.6B-Parameter Models Found What Mythos Found, For Pennies

AISLE, a security platform that's been finding real zero-days since mid-2025, ran a straightforward experiment: take the vulnerabilities Anthropic revealed in their Mythos announcement, isolate the relevant code, and feed it to small open-weights models. Eight out of eight models, including one with just 3.6 billion parameters costing $0.11 per million tokens, detected Mythos's flagship FreeBSD exploit. A 5.1B-parameter model recovered the core chain of a 27-year-old OpenBSD bug. Their takeaway: AI cybersecurity capability is 'jagged', meaning it doesn't scale smoothly with model size, and the real competitive advantage lives in the orchestration system, not the model weights.

But the comparison isn't quite apples-to-apples, and Hacker News commenters called that out fast. As johnfn put it, Mythos "scoured the entire continent for gold" by autonomously scanning whole codebases, while AISLE "pointed at a particular acre of land" by feeding already-isolated vulnerable code to smaller models. Tptacek drew a parallel to Heartbleed: spotting a bug in isolated code is one thing. Finding it buried in millions of lines, tracing how attacker-controlled data actually reaches that vulnerable function, is where the real difficulty lives.

Both points are valid. AISLE's production record speaks for itself: 15 CVEs in OpenSSL, 5 in curl, over 180 externally validated CVEs across 30+ projects. They've done this while being deliberately model-agnostic, routing tasks to whichever model performs best for that specific job. The real argument is narrower. Anthropic's announcement blends a pipeline of very different tasks (scanning, detection, triage, patching, exploitation) into one narrative, creating the impression that all of them require a frontier-scale model. AISLE's experience suggests otherwise.

The practical implication is worth sitting with. If small, cheap models can handle much of the detection work when properly targeted, you don't need to ration one expensive model's attention across a massive codebase. You can run cheap models everywhere, scanning broadly, and accept lower per-token intelligence in exchange for sheer coverage. A thousand adequate detectives searching everywhere will find more bugs than one brilliant detective guessing where to look. The question that remains open is whether that targeting itself, knowing where to point the cheap models, requires the expensive intelligence. AISLE says their scaffold handles it. Mythos says the model does. The market will sort out who's right. Mythos is the company behind the announcement, and Anthropic eyes classified Mythos AI deal with US intelligence.