Ollama
by Ollama
Ollama is an open-source tool that lets developers and users run large language models locally on their own hardware with a single command. It provides a simple CLI, a REST API compatible with the OpenAI API format, and a curated model library covering hundreds of models including Llama, Mistral, Gemma, Qwen, and more. Ollama supports GPU acceleration via NVIDIA CUDA, AMD ROCm, and Apple Metal, making local inference fast and accessible across macOS, Linux, and Windows.
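Since the REST API follows the OpenAI chat-completions format, a single chat turn can be sketched with nothing but the standard library. This is a minimal sketch, assuming Ollama's default port 11434, a server already started with `ollama serve`, and a pulled `llama3` model; the helper names `build_request` and `ask` are illustrative, not part of Ollama itself:

```python
import json
import urllib.request

# Ollama serves an OpenAI-compatible endpoint at this path by default.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    # OpenAI-style chat-completions payload; stream=False asks for one JSON body
    # instead of a stream of chunks.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask(model: str, prompt: str) -> str:
    # Requires a running Ollama server with the model already pulled.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    # Response shape mirrors the OpenAI API: choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Because the request and response shapes match the OpenAI API, existing OpenAI client code can usually be pointed at the local server just by changing the base URL.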
Overall Score: 9
Scores
- Capability: 8
- Ease of Use: 9
- Documentation: 8
- Reliability: 8
- Value: 10
- Momentum: 9
Details
- Status: active
- Pricing: open-source
- Launch Date:
- Website: https://ollama.com
- Last Updated:
Key Features
- One-command local LLM serving (`ollama run llama3`)
- OpenAI-compatible REST API for easy integration
- Large model library with 200+ models (Llama, Mistral, Gemma, Qwen, etc.)
- GPU acceleration across NVIDIA, AMD, and Apple Silicon
- Modelfile system for customizing and creating derivative models
- Cross-platform support: macOS, Linux, and Windows
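The Modelfile system listed above lets a user derive a customized model from an existing one. A minimal sketch, where the base model, temperature value, and system prompt are illustrative choices:

```
FROM llama3
PARAMETER temperature 0.7
SYSTEM "You are a terse assistant that answers in one sentence."
```

Saving this as `Modelfile` and running `ollama create my-terse-model -f Modelfile` builds the derivative model, which can then be served like any other with `ollama run my-terse-model`.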