Ollama built its reputation as the easiest way to run local LLMs. That reputation deserves an asterisk. According to a detailed takedown by Zetaphor at Sleeping Robots, Ollama spent over a year refusing to credit llama.cpp, the inference engine created by Georgi Gerganov that actually powered everything. The MIT license requires including a copyright notice. Ollama didn't bother. GitHub issues requesting compliance went unanswered for hundreds of days. When co-founder Michael Chiang finally responded, he added a single line at the bottom of the README and mentioned plans to move away from llama.cpp entirely.
They did move away in mid-2025, swapping in a custom backend built directly on ggml. It went poorly. Bugs that llama.cpp had solved years ago reappeared. Models like GPT-OSS 20B broke. Gerganov himself flagged that Ollama had made bad changes to ggml. Benchmarks shared by the community show llama.cpp running 1.8x faster on identical hardware, 161 tokens per second versus Ollama's 89. On CPU the gap stretches to 30-50%.
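You don't have to take the screenshots on faith. llama.cpp ships a benchmarking tool, and Ollama can report its own throughput, so the comparison is reproducible on your own hardware. A minimal sketch, assuming you've built llama.cpp and that the model names below are placeholders for a GGUF and an Ollama model you already have:

```sh
# Measure prompt processing and token generation speed with llama.cpp's
# bundled benchmark (prints tokens per second for both phases).
./llama-bench -m ./models/your-model-Q4_K_M.gguf

# For the Ollama side of the comparison, --verbose prints timing stats,
# including the eval rate in tokens per second, after each response.
ollama run your-model --verbose
```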
Then there's the naming problem. When DeepSeek released its R1 model family, Ollama listed the small distilled versions as simply "DeepSeek-R1" in its library. Running "ollama run deepseek-r1" pulls an 8B Qwen-derived distillate, not the real 671B-parameter model. DeepSeek itself shipped those models under "R1-Distill" names. Hugging Face got it right. Ollama stripped the distinction, and social media flooded with confused users wondering why "DeepSeek-R1" performed poorly on their laptops.
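The confusion is easy to reproduce and nearly as easy to check. A quick sketch, assuming current Ollama library tags, where the bare name resolves to a small default and the full model hides behind an explicit tag:

```sh
# The bare name pulls the library's default tag: an 8B Qwen-derived distill.
ollama run deepseek-r1

# Inspect what actually landed on disk: architecture and parameter count.
ollama show deepseek-r1

# The real 671B model has to be asked for by explicit tag, and it is
# hundreds of gigabytes of weights, so it is not running on your laptop.
ollama run deepseek-r1:671b
```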
The founders, Jeffrey Morgan and Michael Chiang, previously built Kitematic, a Docker GUI that Docker Inc. acquired in 2015. The playbook reads the same. Find a powerful open-source tool with bad UX. Wrap it. Build a user base. Take VC money through Y Combinator. Then shift toward proprietary offerings, like the closed-source desktop app they shipped in July 2025 without a license. If you're running local models, use llama.cpp directly. Or try LM Studio. Build your own agents locally with AMD's GAIA SDK. AMD's Lemonade is another solid alternative for local LLM server needs.
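Going direct is less work than the wrapper's pitch implies. A minimal sketch, assuming you've built llama.cpp and have a GGUF model locally; the paths and port are placeholders:

```sh
# One-off generation from the terminal.
./llama-cli -m ./models/your-model.gguf -p "Summarize the MIT license in one sentence."

# Or run a local OpenAI-compatible HTTP server that your tools can point at.
./llama-server -m ./models/your-model.gguf --port 8080
```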