Neuroscope is an open-source inference server that embeds real-time Sparse Autoencoder (SAE) feature extraction directly into a model's forward pass, allowing developers and researchers to observe which semantic concepts a language model activates as it generates each token. Built in Rust on top of the mistral.rs inference engine, the project targets Google's Gemma 2 2B IT model and hooks into layer 20 of the transformer, routing activations through a pre-trained SAE from Google's Gemma Scope suite. The result is a dual-API server: a standard OpenAI-compatible chat endpoint on port 8080 and a Server-Sent Events stream on port 8081 that emits per-token concept labels such as "geography or place names" or "European countries and capitals" in near real time.
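The concept stream is plain Server-Sent Events, so consuming it is mostly a matter of splitting on blank lines and decoding `data:` payloads. A minimal sketch of that parsing step follows; the JSON shape (a `token` field plus a `labels` list) is an assumption for illustration, not Neuroscope's documented wire format.

```python
import json

def parse_sse_events(raw: str):
    """Parse a raw SSE body into decoded JSON payloads.

    Assumes each event carries a single `data:` line whose payload is a
    JSON object like {"token": "Paris", "labels": ["geography or place
    names"]} -- a hypothetical shape, not Neuroscope's actual schema.
    """
    events = []
    for block in raw.split("\n\n"):          # SSE events are separated by blank lines
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(json.loads(line[len("data:"):].strip()))
    return events
```

A real client would read these blocks incrementally from the port-8081 stream while sending the chat request to the OpenAI-compatible endpoint on port 8080.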

The project builds directly on Google DeepMind's Gemma Scope, a comprehensive open suite of sparse autoencoders released for Gemma 2 2B and 9B and documented in an accompanying arXiv paper (2408.05147). Neuroscope operationalizes that research artifact for local, interactive inference rather than post-hoc analysis. Feature labels — the human-readable descriptions of what each SAE feature represents — can be generated automatically via LLM APIs, with DeepSeek V3 through OpenRouter as the default labeler and fallbacks to Anthropic's Claude or other OpenAI-compatible endpoints. Pre-generated labels can also be pulled in roughly one minute via a built-in CLI command, with Neuronpedia serving as a secondary fallback source.
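The Gemma Scope SAEs are JumpReLU autoencoders, so the per-token extraction step reduces to one affine map followed by a learned per-feature threshold, after which the strongest surviving features are mapped to labels. A toy sketch of that step, with invented weights standing in for the released Gemma Scope checkpoint parameters:

```python
import numpy as np

def sae_encode(x, W_enc, b_enc, theta):
    """JumpReLU encode: keep a pre-activation only if it exceeds the
    learned threshold theta for that feature; otherwise output exactly 0,
    which is what makes the feature vector sparse."""
    pre = x @ W_enc + b_enc
    return np.where(pre > theta, pre, 0.0)

def top_features(acts, k=3):
    """Indices and values of the k strongest active features -- the part
    the server would map to human-readable concept labels."""
    order = np.argsort(acts)[::-1][:k]
    return [(int(i), float(acts[i])) for i in order if acts[i] > 0]
```

In Neuroscope's case `x` would be the layer-20 residual-stream activation for the current token, and the returned indices would be looked up in the label table.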

What distinguishes Neuroscope from prior interpretability tooling is its positioning as a streaming side-channel alongside a live inference API rather than an offline analysis tool. By exposing activations in real time, it opens up workflows that were previously impractical: debugging model reasoning token by token, auditing concept drift across prompts, or building interactive visualizers for educational use. The project supports macOS Metal, NVIDIA CUDA, and CPU-only builds, and requires only a HuggingFace account with Gemma 2 access and around 5 GB of disk space, making it accessible to <a href="/news/2026-03-15-godex-building-a-free-ai-coding-agent-with-mcp-servers-and-local-llms-via-ollama">hobbyist researchers</a> on consumer hardware.
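As a taste of the concept-drift workflow, one could tally the labels emitted across a generation and compare the dominant concepts between two prompts. The sketch below invents its label lists; in practice they would come off the port-8081 stream.

```python
from collections import Counter

def concept_profile(token_labels):
    """Aggregate per-token label lists into one frequency profile
    for a whole generation."""
    counts = Counter()
    for labels in token_labels:
        counts.update(labels)
    return counts

def drift(profile_a, profile_b, k=5):
    """Labels ranking in the top-k of one profile but not the other:
    a crude signal that the model's active concepts have shifted."""
    top_a = {label for label, _ in profile_a.most_common(k)}
    top_b = {label for label, _ in profile_b.most_common(k)}
    return top_a ^ top_b  # symmetric difference
```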

The creator describes Neuroscope as a weekend project built for fun — experimental, vibe-coded, and not production-hardened. The scope is intentionally narrow: one model, one layer. Adding support for a different architecture would mean both porting the inference hook and sourcing a compatible SAE suite. The creator's README is upfront about this: the project works, but it was not designed for extension. What it does demonstrate concretely is that wiring SAE activations into a live inference loop is <a href="/news/2026-03-14-opentoys-open-source-ai-toy-platform-esp32-voice-cloning">tractable</a> on a laptop — the harder engineering question for anyone who wants to take the idea further.