Ollama is the most popular local-LLM runtime as of 2026. It handles model downloads and quantisation, and exposes each model behind an OpenAI-compatible HTTP API. Most self-hosted agents support Ollama as a provider out of the box.
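Because the API is OpenAI-compatible, talking to a local model needs nothing beyond a plain HTTP POST. A minimal sketch, assuming a default Ollama install listening on `localhost:11434` and an already-pulled model (the model name `llama3` here is an assumption, substitute whatever you have pulled):

```python
import json
import urllib.request

# Ollama's default OpenAI-compatible base URL (assumes a default local install)
OLLAMA_BASE = "http://localhost:11434/v1"

def build_payload(prompt: str, model: str = "llama3") -> dict:
    # Same request-body shape the OpenAI chat-completions API expects,
    # which is why OpenAI-speaking agents work against Ollama unchanged.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, model: str = "llama3") -> str:
    # POST to /v1/chat/completions and pull the assistant reply out of
    # the standard OpenAI-style response envelope.
    req = urllib.request.Request(
        f"{OLLAMA_BASE}/chat/completions",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Calling `chat("Why is the sky blue?")` requires a running Ollama server; the payload builder works offline, which is also why agent frameworks can swap Ollama in for a hosted provider by changing only the base URL.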
Found a definition that's wrong, dated, or in need of sharpening? Email us — we update with attribution unless you'd rather we didn't.