Google wants AI running inside your browser, not on some distant server. Chrome's Prompt API, documented by Google engineers Thomas Steiner and Alexandra Klepper, lets developers send natural language requests directly to the Gemini Nano model built into Chrome. The pitch is simple: AI features running locally with zero data leaving your machine. No API keys. No cloud bills. Tools like Gemma Gem are exploring how deep this local integration can go.
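The documented flow is a session object you prompt with plain text. A minimal sketch of that shape follows; the wrapper name `askLocalModel` is hypothetical, and the `LanguageModel` global reflects the origin-trial API surface, which has shifted between Chrome releases, so treat this as illustrative rather than definitive.

```javascript
// Hedged sketch of Chrome's Prompt API: create a session backed by the
// on-device Gemini Nano model, send a prompt, get text back locally.
// `askLocalModel` is an illustrative helper, not part of the API.
async function askLocalModel(question) {
  if (typeof LanguageModel === "undefined") {
    // Not running in a Chrome build with the Prompt API enabled.
    return "Prompt API unavailable in this environment";
  }
  // Create a session; inference runs on-device, nothing leaves the machine.
  const session = await LanguageModel.create();
  const answer = await session.prompt(question);
  session.destroy(); // release on-device resources when done
  return answer;
}
```

No API key or endpoint configuration appears anywhere in that flow, which is the whole pitch.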

The catch is size. Developers report the model download is far larger than Chrome itself. Your first request won't be fast. You need at least 22 GB of free space, and if storage drops below 10 GB after installation, Chrome silently removes the model. It only runs on Windows 10/11, macOS 13+, Linux, and certain Chromebooks. No mobile support. These friction points matter when you're asking developers to bet on a browser-specific API.
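Because the model may be absent, mid-download, or evicted, the documented pattern is to check availability before prompting and to watch download progress. A sketch under those assumptions; `ensureModel` is a hypothetical helper, and the state strings and `downloadprogress` event follow the origin-trial docs and may change.

```javascript
// Availability check before using the on-device model. States per the
// documented API: "unavailable", "downloadable", "downloading", "available".
// `ensureModel` is an illustrative wrapper, not part of the API.
async function ensureModel() {
  if (typeof LanguageModel === "undefined") {
    return { status: "unsupported" }; // browser lacks the Prompt API
  }
  const availability = await LanguageModel.availability();
  if (availability === "unavailable") {
    // Hardware or free-space requirements not met; Chrome won't download.
    return { status: "unavailable" };
  }
  // "downloadable"/"downloading"/"available": create() triggers or reuses
  // the multi-gigabyte model download, reporting progress as it goes.
  const session = await LanguageModel.create({
    monitor(m) {
      m.addEventListener("downloadprogress", (e) => {
        // e.loaded is documented as a 0..1 fraction of the download.
        console.log(`Model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
  return { status: "ready", session };
}
```

The slow first request in the paragraph above is this download path: until the model is on disk, `create()` is where the wait happens.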

Gemini Nano also struggles to stay coherent in conversations beyond two rounds, pushing some developers toward alternatives such as transformers.js running Qwen 0.9B.
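The multi-turn behavior lives in the session: each `prompt()` call sees the accumulated history, which is where the reported degradation past two rounds shows up. A sketch, assuming the origin-trial session API; `shortConversation` is a hypothetical wrapper.

```javascript
// Multi-turn sketch: one session carries context across prompt() calls.
// `shortConversation` is an illustrative helper, not part of the API.
async function shortConversation(turns) {
  if (typeof LanguageModel === "undefined") {
    return []; // Prompt API not available in this environment
  }
  const session = await LanguageModel.create();
  const replies = [];
  for (const turn of turns) {
    // Each prompt() includes the session's prior turns as context,
    // so quality reportedly drops as the conversation grows.
    replies.push(await session.prompt(turn));
  }
  session.destroy();
  return replies;
}
```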

The browser wars angle is worth watching. Google bakes Gemini Nano directly into Chrome. Apple delegates to system frameworks like Core ML. Mozilla runs open models like Llama through WebGPU. Microsoft pushes cloud-based Copilot integration in Edge. Each approach reflects a different philosophy about who should control AI in the browser. Google's approach locks developers into Gemini. Mozilla's doesn't. That tension will shape what gets built.