The headline finding
For most self-hosted AI workloads, a GPU is the wrong answer. Idle power matters more than people realise; tokens-per-watt-hour matters more than tokens-per-second. The Mac Mini M4 wins on €/M-tokens at moderate steady traffic. Even the Raspberry Pi 5 wins on absolute monthly bill at low traffic. The single-GPU box only wins when it's busy.
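The €/M-tokens comparison comes down to folding idle power and amortized hardware into one number. Here's a minimal sketch of that arithmetic — every figure (power draws, throughputs, prices, amortization period) is an assumed placeholder, not a measurement from our benchmarks:

```python
# Illustrative cost-per-million-tokens model. All numeric inputs below are
# assumptions for illustration, not measured benchmark figures.
def eur_per_m_tokens(tok_per_s: float, active_w: float, idle_w: float,
                     busy_hours_per_day: float, hw_eur: float,
                     amort_years: float = 3.0, eur_per_kwh: float = 0.30) -> float:
    """Cost per million tokens, including idle power and amortized hardware."""
    busy_s = busy_hours_per_day * 3600
    idle_s = 24 * 3600 - busy_s
    tokens_per_day = tok_per_s * busy_s
    # watt-seconds -> kWh (divide by 3.6e6)
    kwh_per_day = (active_w * busy_s + idle_w * idle_s) / 3.6e6
    eur_per_day = kwh_per_day * eur_per_kwh + hw_eur / (amort_years * 365)
    return eur_per_day / (tokens_per_day / 1e6)

# A lightly loaded GPU box vs. a Mac-Mini-class machine (assumed figures):
gpu = eur_per_m_tokens(tok_per_s=60, active_w=350, idle_w=40,
                       busy_hours_per_day=2, hw_eur=1500)
mac = eur_per_m_tokens(tok_per_s=25, active_w=40, idle_w=5,
                       busy_hours_per_day=2, hw_eur=700)
```

At two busy hours a day the slower, cheaper machine comes out ahead on €/M-tokens — exactly the idle-power effect described above. Crank `busy_hours_per_day` up and the ordering flips.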
Where GPUs actually pay off
Batch inference. Big models. Hot workloads. If your box is busy more than ~12 hours a day on 7B+ models, the GPU is the right call. If it sits idle 90% of the time, you're paying €75/year in electricity for the privilege.
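The idle-electricity figure is simple to reproduce. The 40 W idle draw and €0.22/kWh tariff here are assumptions picked to show the arithmetic, not our measured numbers:

```python
# Sanity-checking the "~€75/year idle" order of magnitude.
# 40 W idle draw and €0.22/kWh are assumed illustrative values.
idle_w = 40
eur_per_kwh = 0.22
kwh_per_year = idle_w / 1000 * 24 * 365    # ~350 kWh/year
idle_cost_eur = kwh_per_year * eur_per_kwh  # ~€77/year
```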
Two new tools
We shipped a cost calculator that shows the crossover month between self-hosting and OpenAI/Anthropic given your monthly token volume. And a hardware sizer that recommends the cheapest of our four hosts that hits a target tokens-per-second on a chosen model size. Both are at /calculator. No signup, no email gate.
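The crossover-month logic the calculator performs can be sketched as follows. The function and all inputs (hardware price, power cost, API rate) are hypothetical placeholders, not the calculator's actual code or pricing data:

```python
# Sketch of a crossover-month calculation: the first month where cumulative
# self-hosting cost (hardware + power) drops below the cumulative API bill.
# All names and numbers are illustrative assumptions.
def crossover_month(hw_eur: float, power_eur_per_month: float,
                    m_tokens_per_month: float, api_eur_per_m: float):
    api_monthly = m_tokens_per_month * api_eur_per_m
    if api_monthly <= power_eur_per_month:
        return None  # at this volume, self-hosting never pays off
    for month in range(1, 121):  # give up past 10 years
        if hw_eur + power_eur_per_month * month < api_monthly * month:
            return month
    return None

# e.g. a €700 machine, €5/month in power, 20M tokens/month at €2.50/M:
crossover_month(700, 5.0, 20, 2.50)
```

At those assumed numbers the self-hosted box pays for itself in just over a year; at 1M tokens/month the API bill never catches the power bill and the function returns `None`.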
Like this issue? Subscribe to get the next one in your inbox every Thursday morning (UTC).