Mixedbread has released Wholembed v3, a unified omnimodal, multilingual, late-interaction retrieval model that the company is positioning as foundational infrastructure for the agentic AI era. The headline result comes on LIMIT, where Wholembed v3 posted a Recall@5 of 92.45, the first time a semantic model has cleared BM25 lexical retrieval, which scored 85.7. Every competing dense embedding from OpenAI, Cohere, Voyage, and Google finished well behind, a gap that illustrates how persistently documents that read like structured data have resisted semantic approaches.
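Late interaction differs from single-vector dense retrieval in that each query and document is represented by a matrix of per-token embeddings, and relevance is scored by matching every query token against its best-matching document token (the MaxSim operator popularized by ColBERT). A minimal sketch of that scoring step, alongside the Recall@5 metric cited above, using toy vectors rather than anything from Wholembed itself:

```python
import numpy as np

def maxsim_score(query_tokens: np.ndarray, doc_tokens: np.ndarray) -> float:
    """Late-interaction (MaxSim) relevance: for each query token embedding,
    take the similarity to its best-matching document token, then sum.
    Both inputs are (num_tokens, dim) matrices of (typically normalized)
    per-token embeddings."""
    sims = query_tokens @ doc_tokens.T          # (n_query, n_doc) similarity matrix
    return float(sims.max(axis=1).sum())        # best doc match per query token

def recall_at_k(ranked_doc_ids, relevant_ids, k: int = 5) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    hits = len(set(ranked_doc_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)
```

Because scoring happens per token rather than per document, a single exact-match token can dominate the score, which is one intuition for why late interaction can recover lexical-style matches that a single pooled vector smooths away.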
Mixedbread also targeted BrowseComp-Plus, a benchmark that tests retrieval in the context of real agentic workloads rather than isolated document lookups. Many of its queries require an agent to chain dozens of searches before it can confirm an answer — the kind of deep research task that exposes a retriever's failure modes quickly. Strong performance there positions Wholembed v3 as a core component in the emerging stack for autonomous research agents, not just an incremental embedding upgrade.
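At the retrieval layer, the multi-hop tasks described above reduce to a loop: the agent issues a query, reads what comes back, and reformulates until it is satisfied or runs out of budget. A minimal sketch, where `search` and `extract_next_query` are hypothetical caller-supplied stand-ins for a retriever and an LLM-based reformulator (neither is a real Mixedbread API):

```python
def multi_hop_search(question, search, extract_next_query, max_hops=30):
    """Chain retrieval calls until the reformulator signals it can answer
    (returns None) or the hop budget is exhausted. Returns all evidence
    gathered along the way."""
    evidence = []
    query = question
    for _ in range(max_hops):
        hits = search(query, k=5)          # one retrieval call per hop
        evidence.extend(hits)
        query = extract_next_query(question, evidence)
        if query is None:                  # agent is satisfied it can answer
            break
    return evidence
```

A retriever that misfires on even a small fraction of hops compounds across dozens of iterations of this loop, which is why benchmarks like BrowseComp-Plus surface failure modes that single-shot lookups hide.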
The model handles text, images, audio, and video retrieval across hundreds of languages — a scope designed for the messy reality of enterprise data, where useful information is as likely to live in scanned PDFs, screenshots, or instructional videos as in clean prose. Wholembed v3 is now the default on all new Mixedbread Search stores, and was co-designed with the company's custom retrieval infrastructure to deliver high throughput and low latency without requiring developers to manage data pipelines themselves.
The release is a direct challenge to embedding offerings from OpenAI, Cohere, Voyage, and Google, arriving at a moment when retrieval quality is increasingly understood as a hard ceiling on what AI agents can accomplish. If agentic systems are only as good as the knowledge they can reliably surface, Mixedbread is betting that the retrieval layer — long treated as solved or commoditized — is where the next major capability gains will be won.