PocketClaw · vol. 1 · 2026

GGUF

File format for storing quantised LLM weights, used by llama.cpp and Ollama.

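GGUF (often expanded as "GGML Universal File") is the de facto standard format for quantised models run locally in 2026. Each file bundles the quantised weights with the metadata needed to load them, such as the tokenizer and model hyperparameters. A typical Llama 3.3 70B Q4 GGUF file is around 40 GB. Hugging Face hosts thousands of pre-quantised GGUFs for popular models.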
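The 40 GB figure can be sanity-checked with back-of-envelope arithmetic. The sketch below is an illustration, not a llama.cpp API: the helper name is hypothetical, the parameter count is taken from the "70B" in the model name, and 4.5 bits per weight is an assumption based on the basic Q4_0 layout (32 four-bit weights plus one 16-bit scale per block). Real Q4 files usually keep some tensors at higher precision and carry metadata, so actual sizes run somewhat larger.

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough file size in decimal gigabytes, ignoring metadata and tokenizer data."""
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1e9


# Assumed inputs: 70e9 parameters, 4.5 bits per weight (Q4_0-style blocks).
size_gb = estimate_gguf_size_gb(70e9, 4.5)
print(f"{size_gb:.1f} GB")  # prints "39.4 GB"
```

The same helper gives a quick estimate for other quantisation levels: pass the effective bits per weight for the scheme in question.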

Related terms

llama.cpp · Ollama · Quantisation
