Anthropic's Mythos made headlines last week for autonomously finding zero-day vulnerabilities, including a 27-year-old OpenBSD bug and a FreeBSD remote code execution exploit. But AISLE, which has been running its own vulnerability discovery system since mid-2025, tested those same bugs on smaller, cheaper open-weights models. The results are uncomfortable for anyone betting that model scale is the competitive moat in AI security. All eight models AISLE tested detected the FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens. A 5.1B parameter open model recovered the core chain of the OpenBSD bug.

AISLE calls this the 'jagged frontier.' Cybersecurity capability doesn't scale smoothly with model size. No single model consistently tops rankings across different security tasks. AI vulnerability detection is really a pipeline of distinct jobs (scanning, detection, triage, patch generation, exploit construction), and small models are good enough for much of the detection work. Deploy cheap models broadly, scanning everything, rather than rationing one expensive model. As AISLE puts it: 'A thousand adequate detectives searching everywhere will find more bugs than one brilliant detective who has to guess where to look.'

The pushback matters, though. Hacker News commenters pointed out that AISLE used older model versions (Qwen3 instead of Qwen3.5, DeepSeek R1 instead of V3.2) and omitted GLM-5.1, which some consider a leading open-weight model. More importantly, AISLE tested models on code where the vulnerability was already identified, not on raw discovery. Finding a known bug and finding an unknown bug are different problems. Still, AISLE's track record speaks for itself: 15 CVEs in OpenSSL, 5 in curl, over 180 validated CVEs across 30+ projects, all using a model-agnostic approach.

Project Glasswing, with its $100M in credits and $4M in donations to open source security organizations, is Anthropic's bet that democratizing access to frontier cybersecurity AI will strengthen the software supply chain. AISLE used those credits to run this very analysis. The irony is sharp. Their findings suggest the infrastructure and expertise around the model matter more than the model itself. For anyone building AI agent systems in security, that's the real takeaway.