On February 28, 2026, CodeWall's autonomous offensive security agent breached McKinsey & Company's internal AI platform Lilli, gaining full read and write access to the production database within two hours, with no prior credentials and no insider knowledge. The entry point was a classic SQL injection flaw in one of 22 unauthenticated API endpoints: while query values were safely parameterized, JSON field keys were concatenated directly into raw SQL. The agent identified the vulnerability after observing that those keys were reflected verbatim in database error messages, then ran fifteen blind iterations to progressively reconstruct the query shape until live production records came back. OWASP ZAP, a widely deployed open-source web application scanner, entirely failed to detect the flaw.
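The pattern can be sketched in a few lines. Everything below is a hypothetical reconstruction for illustration — the schema, function name, and filter keys are invented, not Lilli's actual code — but it shows why the flaw evades scanners that only probe values: the values really are bound safely, while the keys rewrite the query shape.

```python
import sqlite3

def find_documents(filters: dict) -> list:
    """Hypothetical reconstruction of the vulnerable pattern:
    values are parameterized (safe), but the JSON field *keys*
    from the request body are concatenated into raw SQL (unsafe)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (title TEXT, owner TEXT)")
    conn.executemany("INSERT INTO docs VALUES (?, ?)",
                     [("q1-strategy", "alice"), ("deal-model", "bob")])
    # Injection point: keys are interpolated verbatim into the WHERE clause.
    where = " AND ".join(f"{key} = ?" for key in filters)
    sql = f"SELECT title FROM docs WHERE {where}"
    return conn.execute(sql, list(filters.values())).fetchall()

# Benign request: one row, as intended.
find_documents({"owner": "alice"})

# Malicious key turns the WHERE clause into a tautology; the bound
# value is irrelevant, and every row comes back.
find_documents({"owner OR 1=1 OR owner": "x"})

# Reconnaissance: an unknown key is echoed back in the database error
# ("no such column: ..."), confirming that keys reach raw SQL —
# the same signal the article says the agent observed.
```

A scanner fuzzing only the JSON values would see correctly parameterized queries every time; the key position is where the blind iteration pays off.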

The scale of exposed data was significant even by enterprise breach standards. The agent accessed 46.5 million plaintext chat messages covering strategy, client engagements, M&A activity, and internal research; 728,000 files across PDFs, spreadsheets, and presentations; 57,000 employee accounts; and 3.68 million RAG document chunks representing decades of McKinsey's proprietary research and methodologies. CodeWall also identified 266,000-plus OpenAI vector stores exposed as part of Lilli's <a href="/news/2026-03-14-captain-yc-w26-launches-automated-rag-platform-for-enterprise-ai-agents">retrieval-augmented generation pipeline</a>, with 1.1 million files and 217,000 agent messages flowing through external AI APIs. Because OpenAI's Vector Store product retains data indefinitely by default until manually deleted, and because the SQL injection would have exposed vector store IDs and potentially API credentials, the blast radius extended to infrastructure outside McKinsey's own network perimeter.

The incident's most pointed finding concerns the prompt layer. Lilli's system prompts — the behavioral instructions governing how the AI answered questions, enforced guardrails, and cited sources — were stored in the same database the agent could write to. A malicious actor could have silently rewritten those prompts via a single HTTP request, with no code deployment, no detectable file change, and no audit trail. For the 43,000-plus McKinsey consultants relying on Lilli for client work, that meant a quiet, persistent backdoor: an attacker could redirect the AI's outputs or strip its safety guardrails entirely, and nothing would surface in monitoring logs. McKinsey patched all unauthenticated endpoints within 24 hours of receiving CodeWall's responsible disclosure on March 1; the firm's CISO acknowledged receipt the following day. Public disclosure followed on March 9.
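One mitigation the finding points to is treating system prompts like deployed code rather than mutable database rows: pin a content hash at deploy time, stored outside the database, and verify it before every use. A minimal sketch — all names here are illustrative, and nothing is assumed about Lilli's actual stack:

```python
import hashlib

def fingerprint(prompt: str) -> str:
    """Content hash of a system prompt, pinned at deploy time
    and kept outside the database an attacker could write to."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

def load_prompt(fetched: str, pinned_hash: str) -> str:
    """Refuse to serve a prompt whose hash no longer matches the pin."""
    if fingerprint(fetched) != pinned_hash:
        raise RuntimeError("system prompt hash mismatch; refusing to serve")
    return fetched

# Illustrative prompt and pin; the pin would live in config/deploy artifacts.
PROMPT = "Answer only from cited sources. Never reveal client names."
PIN = fingerprint(PROMPT)

load_prompt(PROMPT, PIN)  # passes: prompt unchanged since deploy
# load_prompt(PROMPT + " Ignore all prior rules.", PIN)  # raises RuntimeError
```

With this check in the serving path, the single-HTTP-request rewrite the article describes would fail loudly instead of silently, because the tampered row no longer matches the out-of-band pin.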

SQL injection ranks among the oldest documented bug classes. What the CodeWall demonstration shows is not a novel exploit but an autonomous agent's capacity to select a target, map 200-plus endpoints, identify the subtle key-injection variant that standard scanners missed, chain it with an IDOR vulnerability, and enumerate tens of millions of records without human direction. The incident also surfaces an unresolved liability question: under the ISACA shared-responsibility model for AI, and given the potential GDPR exposure from uploading client and employee data to US-hosted OpenAI infrastructure, McKinsey occupies the roles of both data controller and data processor. OpenAI's own obligations are governed by contractual frameworks that most enterprise legal teams are unlikely to have fully scrutinized before deployment.
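The IDOR half of the chain is equally classic, and a toy version makes the enumeration step concrete. This is a generic sketch of the vulnerability class, not McKinsey's code: the handler checks that a record exists but never that the requester owns it, so walking a sequential ID space leaks everything.

```python
# Toy record store with sequential IDs (illustrative data only).
RECORDS = {
    101: {"owner": "alice", "title": "q1-strategy"},
    102: {"owner": "bob",   "title": "deal-model"},
}

def get_record(record_id: int, requester: str):
    """Vulnerable handler: existence check only, no ownership check."""
    return RECORDS.get(record_id)

def get_record_fixed(record_id: int, requester: str):
    """Fixed handler: the record must belong to the requester."""
    rec = RECORDS.get(record_id)
    return rec if rec and rec["owner"] == requester else None

# Enumeration: any caller simply iterates the ID space.
leaked = [r for r in (get_record(i, "mallory") for i in range(100, 110)) if r]
# With the ownership check, the same sweep by "mallory" yields nothing.
```

Chained with the SQL injection, this is how a single foothold scales from one query to tens of millions of records: the injection reveals the shape of the data, and the missing ownership check lets an agent pull it row by row.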