After five months of independent benchmarking, SemiAnalysis delivered a blunt verdict on AMD's MI300X: it can't compete with Nvidia's H100 or H200 for AI training. Better specs on paper, lower total cost of ownership. Doesn't matter. The software stack is too broken.

The testing team, led by Dylan Patel, Daniel Nishball, and Reyk Knuhtsen, gave AMD every chance. Over months of work they identified and fixed AMD software bugs, shared code and results with both companies, and held debugging calls with engineers on each side. The Nvidia setup worked from day one. AMD's hardware needed custom builds and hand-holding from AMD's own engineering team to get anywhere near competitive performance.

On large language model training workloads, even the bug-fixed MI300X couldn't match the H200's throughput. The gap was large enough that the lower price doesn't close it. And customers can't access those custom fixes; they get the public stable release, which still falls short.
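The price-versus-throughput tradeoff can be sketched as a tiny calculation. The prices and token rates below are hypothetical placeholders chosen only to illustrate the shape of the argument; they are not SemiAnalysis's measured numbers.

```python
# Illustrative only: all figures are assumed, not benchmark data.

def cost_per_unit_throughput(gpu_price_usd: float, tokens_per_sec: float) -> float:
    """Dollars of hardware per token/sec of training throughput (lower is better)."""
    return gpu_price_usd / tokens_per_sec

# Hypothetical numbers: a 25% cheaper GPU that delivers 40% less throughput.
h200_cost = cost_per_unit_throughput(gpu_price_usd=40_000, tokens_per_sec=1_000)
mi300x_cost = cost_per_unit_throughput(gpu_price_usd=30_000, tokens_per_sec=600)

print(h200_cost)    # 40.0 dollars per token/sec
print(mi300x_cost)  # 50.0 dollars per token/sec
```

With these placeholder figures, the discount doesn't compensate: the cheaper chip ends up more expensive per unit of delivered training throughput, which is the core of the "lower price doesn't close it" point.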

AMD knows the software is the problem. CEO Lisa Su appointed Vamsi Boppana to lead a unified AI Group, and the company now employs more software engineers than hardware engineers. Acquisitions of Nod.ai, Mipsology, and Xilinx were supposed to accelerate the catch-up. But merging those separate codebases into something that works like CUDA is a massive, unfinished job.

CUDA has years of momentum. Nvidia extends its lead while AMD fills gaps. For anyone buying AI training infrastructure, the math is simple. Nvidia costs more but actually works.