Systalyze just open-sourced Utilyze, a GPU monitoring tool that exposes a dirty secret in AI infrastructure: your dashboards are probably lying to you. The standard utilization metric reported by nvidia-smi, nvtop, and every major cloud monitoring service doesn't measure how hard your GPU is actually working. It measures whether it's doing anything at all. If a single kernel is running, even one barely using the silicon, that reads as 100% utilization. Systalyze says real compute throughput can be as low as 1% while standard tools report full saturation.

The problem is structural. According to Systalyze, nvidia-smi samples a binary signal, effectively asking "is at least one kernel scheduled right now?", and averages it over a window. An NVIDIA H100 has 132 Streaming Multiprocessors and 16,896 CUDA cores in total, yet the standard metric treats one busy CUDA core the same as thousands running full tilt. That misleading number drives real waste: unnecessary GPU purchases, inflated energy bills, and teams believing they're hardware-constrained when they're not.
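The gap between the two definitions is easy to see in a toy simulation (this is an illustrative sketch, not Utilyze's or NVIDIA's actual code): compute the binary "any kernel resident?" average alongside the mean fraction of SMs actually busy, over the same activity trace.

```python
# Illustrative sketch: contrast nvidia-smi-style "utilization" with a
# throughput-style average over the same simulated activity trace.
# Each sample records the fraction of the GPU's SMs doing work (assumption).

NUM_SMS = 132  # H100 SXM Streaming Multiprocessor count

def nvidia_smi_style_util(samples):
    """Binary question per sample: was ANY kernel resident? Then average."""
    return 100.0 * sum(1 for s in samples if s > 0) / len(samples)

def throughput_style_util(samples):
    """Average fraction of SMs actually busy across the window."""
    return 100.0 * sum(samples) / len(samples)

# A kernel that keeps exactly one of 132 SMs busy for the whole window:
trace = [1 / NUM_SMS] * 1000

print(nvidia_smi_style_util(trace))   # 100.0
print(throughput_style_util(trace))   # ~0.76
```

Both functions see the identical workload, yet the binary metric reports full saturation while the throughput view shows the GPU under one percent busy, which is exactly the discrepancy Systalyze describes.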

Utilyze reads hardware performance counters instead, measuring actual compute throughput with what Systalyze claims is negligible overhead. The tool is still early at version 0.1.3. Hacker News users point out it lacks basics like memory usage, temperature, and fan speed readings. Questions about NVIDIA Jetson support remain unanswered. But for teams trying to squeeze more out of expensive GPU hardware, Utilyze fills a measurement gap that's been quietly costing the industry real money.
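The counter-based approach Systalyze describes boils down to a ratio of work done to work possible. A minimal sketch of that idea, with illustrative names and numbers (not Utilyze's API or actual counter layout), might divide active SM-cycles by the total cycles available across all SMs:

```python
# Sketch of the counter-based principle (names and figures are illustrative):
# hardware counters report cycles in which each SM was executing; throughput
# is active cycles divided by the total SM-cycles available in the window.

def sm_activity(active_cycles_per_sm, elapsed_cycles):
    """Fraction (0..1) of available SM-cycles actually spent executing."""
    total_possible = elapsed_cycles * len(active_cycles_per_sm)
    return sum(active_cycles_per_sm) / total_possible

# 132 SMs, one of which ran for the whole window while the rest sat idle:
counters = [1_000_000] + [0] * 131
print(round(100 * sm_activity(counters, 1_000_000), 2))  # 0.76
```

A binary utilization metric would report this same scenario as 100%, since some kernel was always resident; the cycle ratio instead reflects how much of the silicon did work.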