TL;DR
Pocket AI — running self-hosted AI agents and local language models on portable hardware you own — went from a hobbyist niche in 2024 to a genuinely competitive deployment model in 2026. This guide covers the six device categories that matter (single-board computers, mini PCs, Apple Silicon, Framework laptops, refurbished phones, edge endpoints), pairs them with the right self-hosted agents (OpenClaw, Hermes, Nanobot, NanoClaw, IronClaw, ZeroClaw, Moltworker), benchmarks the local LLM ceiling on each, and gives concrete buying advice for the six scenarios we see most often.
If you only want recommendations: Raspberry Pi 5 (8 GB) for getting started cheap. €450 generic Intel mini PC for the workhorse. Mac Mini M4 (24+ GB) for the local-LLM endgame. The rest is detail.
Section 1 — Why pocket AI
The cloud-AI bet — “datacenters with billions in capex are the only viable host for useful intelligence” — is the dominant 2025/2026 narrative. It is also incomplete.
Three things changed between late 2024 and early 2026 that quietly made pocket AI viable.
Hardware caught up. A €100 Raspberry Pi 5 has roughly 3x the single-thread performance of the original Pi 4 and 2x the memory ceiling. Mini PCs at €400-500 ship with 32 GB RAM and 12-core CPUs as standard. Apple's M4 Mac Mini puts 64 GB of unified memory in a desk-friendly form factor for under €2,200 — enough to run Llama 3.3 70B at usable speeds. None of this was true two years ago.
Self-hosted agents matured. Hermes Agent ships sandbox-on, multi-LLM, production-grade. Nanobot fits in 4,000 lines of Python. ZeroClaw runs fully local. OpenClaw, post-2026.4, is finally safe to deploy. Each of these is small enough to run on portable hardware without the “you need 16 vCPUs to run an agent” assumptions of 2024.
Local LLMs got useful. Llama 3.3 70B in 4-bit quantisation runs at 7-10 tokens per second on Apple M4 Pro hardware. Mistral 7B at 4-bit hits 20+ tokens per second on a €450 mini PC. Qwen 2.5 Coder 7B is meaningfully useful for programming tasks at home. Cloud-tier models still dominate the capability frontier — but the gap closed enough that “run locally” is a credible answer for most agent workloads.
The pocket AI bet, restated: for most personal and small-team self-hosted AI use cases, a €100-€2,000 portable host running an appropriate agent and (optionally) a local LLM is genuinely competitive with cloud-AI alternatives — at a fraction of the recurring cost and with zero data residency questions.
The rest of this guide makes that bet concrete.
Section 2 — The six device categories
Every portable host for self-hosted AI in 2026 fits into one of six categories. We treat each in turn.
2.1 Single-board computers (SBCs)
The Raspberry Pi 5 is the canonical example. The Orange Pi 5 Plus is the closest credible competitor. Other entries: Banana Pi BPI-M4, Khadas Edge2, Radxa Rock 5B+. All are ARM-based, all are passively or quietly cooled, all run Linux as a first-class OS.
Strengths: cheap (€80-250), low power (5-15 W typical), small (credit-card-sized), genuinely silent. The Pi 5 specifically has the biggest software ecosystem of any SBC ever — every agent we cover that isn't macOS-native runs on it without exotic tweaking.
Weaknesses: limited RAM ceiling (8 GB on Pi 5, up to 32 GB on Orange Pi 5 Plus), constrained for browser-automation tools, local LLM ceiling tops out around 3B-parameter quantised models. The Pi 5's NVMe is fine but M.2-via-HAT is slightly fiddly. The Orange Pi's software ecosystem is better than it used to be but still not as smooth as the Raspberry Pi.
Realistic agent score: 7/10 for general agent hosting, 4/10 for local-LLM workloads.
2.2 Mini PCs
The category spans the Intel NUC line (now manufactured by ASUS) and generic competitors from Beelink, Geekom, Minisforum, and the ASUS PN series. €300-€800 buys a fanless or near-silent x86 machine with 16-64 GB RAM.
Strengths: x86 means anything that runs on Linux runs here, including the heaviest agent runtimes and Chromium-based browser automation tools. RAM ceiling at 32-64 GB is enough for Mistral 7B, Llama 3 8B Q4, even larger models with the right quantisation. CPU IPC competitive with laptops 3-5x the price. NVMe storage standard.
Weaknesses: not truly portable (mains power assumed), still desk-bound, power draw 15-55 W under load. Linux compatibility is good but not universal — verify the specific SKU before buying.
Realistic agent score: 9/10. This is the workhorse category.
2.3 Apple Silicon (Mac Mini, MacBook Pro, Mac Studio)
Apple's unified memory architecture is the cheat code for local LLM inference at the personal scale. A Mac Mini M4 with 48 GB unified memory runs Llama 3.3 70B 4-bit quantised at usable speeds. A Mac Studio M3 Ultra with 192 GB runs essentially anything you can quantise.
Strengths: best-in-class memory bandwidth for LLM inference at the small-form-factor tier. Power efficiency is genuinely impressive (11-60 W typical for Mac Mini). The Neural Engine accelerates a subset of model operations meaningfully. macOS is a real operating system that actually ships fixes.
Weaknesses: macOS is a lock-in. Asahi Linux exists and works well for many use cases, but is not yet production-grade for server workloads. Pricing is opaque — RAM upgrades are effectively non-optional and Apple charges €200/8 GB for them. No PCIe expansion to speak of.
Realistic agent score: 10/10 for macOS-native workloads, 6/10 if you need Linux specifically.
2.4 Framework laptops
Framework Laptop 13 and 16 are the credible “laptop that doubles as an agent host” option. Repairable, modular, with strong Linux support and reasonable specs (up to 64 GB RAM).
Strengths: one machine for both agent hosting and daily-driver work. The DIY mainboard option means the laptop can become a desktop later (and the old laptop chassis can host agents on its own). Repair-friendly: when something breaks, you fix it instead of replacing the whole thing.
Weaknesses: laptops as servers always involve compromises — battery hardware that ages, sleep behaviour that's never quite right for always-on workloads, fan curves tuned for portability rather than sustained load. A €1,500 Framework laptop is a good laptop, but a €450 mini PC is a better always-on agent host.
Realistic agent score: 8/10 for development-machine duty, 5/10 for always-on hosting.
2.5 Refurbished / repurposed phones
The PocketClaw origin story device. Used Android phones (€15-50) running Termux + proot can host working, if very limited, agents.
Strengths: cheap, low-power, surprisingly capable in narrow workloads. The right answer for hobbyist exploration and demonstration projects. Educational value is real.
Weaknesses: severely limited RAM (1-8 GB depending on model), software stack is a constant fight (Termux is great but not standard Linux, proot adds overhead), no production deployment is realistic.
Realistic agent score: 3/10. Hobbyist tier.
2.6 Edge endpoints (Pi Zero 2 W and similar)
Sub-€25 SBCs that are too small to host a primary agent, but useful as distributed tool hosts in a multi-device setup. Read sensors, control GPIO, run small local utilities — all in a 1-2 W package.
Strengths: cheap enough to scatter across a workspace. Pair with a primary host (Pi 5, mini PC, Mac Mini) and the network of Zero 2 Ws acts as the eyes, ears and hands of a single agent.
Weaknesses: cannot host a primary agent. 512 MB RAM is a hard constraint. No path to local LLM inference of any kind.
Realistic agent score: 4/10 — but in the right architecture, indispensable.
Section 3 — Self-hosted agents on pocket hardware
The agent-hardware compatibility matrix. We've installed and tested each combination on the device tier specified.
| Agent | Pi 5 | Mini PC | Mac Mini | Framework | Old phone | Pi Zero 2 W |
|---|---|---|---|---|---|---|
| OpenClaw 2026.4+ | ⚠️ no browser | ✓ full | ✓ full | ✓ full | ✗ | ✗ |
| Hermes Agent | ⚠️ no browser | ✓ full | ✓ full | ✓ full | ✗ | ✗ |
| Nanobot | ✓ full | ✓ full | ✓ full | ✓ full | ⚠️ minimal | ✗ |
| NanoClaw | ✗ macOS only | ✗ macOS only | ✓ full | ✗ macOS only | ✗ | ✗ |
| IronClaw | ⚠️ light | ✓ full | ⚠️ light | ⚠️ light | ✗ | ✗ |
| ZeroClaw | ⚠️ tiny LLMs | ✓ small LLMs | ✓ full | ⚠️ medium LLMs | ✗ | ✗ |
| Moltworker | n/a (Workers) | n/a (Workers) | n/a (Workers) | n/a (Workers) | ✗ | ✗ |
Reading the table: ✓ means works comfortably; ⚠️ means works with specific constraints; ✗ means don't try.
The realistic match-ups for first deployments:
- Raspberry Pi 5 (8 GB) + Hermes Agent (no browser tool) — €100
- €450 mini PC + Hermes Agent (full) — best balance of capability
- Mac Mini M4 (24+ GB) + ZeroClaw + Llama 3 8B — fully local, no cloud dependency
- Mac Mini M4 (48+ GB) + ZeroClaw + Llama 3.3 70B — the local-LLM endgame
- Old Android phone + Termux + Nanobot — €30, hobbyist proof of concept
Section 4 — Local LLM benchmarks
We've run the same five agentic tasks on each tier with the most appropriate quantised LLM. Tokens-per-second numbers are end-to-end on the agent's prompts (not pure inference benchmarks, which would be faster).
| Hardware | Model (Q4) | Tok/s avg | Multi-step pass rate | Cost @ 24/7 |
|---|---|---|---|---|
| Pi 5 (8 GB) | Phi-3 mini 3.8B | 6 | 35% | ~€2/mo electricity |
| Generic mini PC | Mistral 7B | 18 | 62% | ~€8/mo |
| Mac Mini M4 (24 GB) | Llama 3 8B | 38 | 71% | ~€6/mo |
| Mac Mini M4 (48 GB) | Llama 3.3 70B | 9 | 84% | ~€10/mo |
| Mac Studio M3 Ultra | Llama 3.3 70B | 22 | 87% | ~€20/mo |
| Cloud (Claude 4.5 Sonnet) | n/a | n/a (latency-bound) | 92% | ~€50-300/mo |
“Multi-step pass rate” is the share of our standard 5-task suite the agent completed correctly without manual intervention. Cloud LLMs win this metric at the top end. Local LLMs on Mac M4 (48 GB) are within striking distance of cloud capability for most workloads, especially with longer task budgets.
The economics flip dramatically if you measure per-month-amortised cost on sustained workloads. A €1,800 Mac Mini paid back over 24 months is ~€75/month plus electricity — break-even with Claude API usage at ~€100-150/month, which an active agent can easily exceed.
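The amortisation arithmetic above folds into two one-line functions. A minimal sketch using the figures from this section (hardware prices, electricity estimates, and the €100-150/month API comparison are this guide's estimates, not measured data):

```python
def monthly_cost(hardware_eur: float, months: int, electricity_eur_mo: float) -> float:
    """Amortised monthly cost of an always-on local host."""
    return hardware_eur / months + electricity_eur_mo

def breaks_even(local_eur_mo: float, cloud_api_eur_mo: float) -> bool:
    """True when the local host undercuts the cloud API bill."""
    return local_eur_mo < cloud_api_eur_mo

# Mac Mini M4 (48 GB) over 24 months, ~€10/mo electricity
mac_mini = monthly_cost(1800, 24, 10)   # 85.0 -> €85/mo all in

# €450 mini PC over 24 months, ~€8/mo electricity
mini_pc = monthly_cost(450, 24, 8)      # 26.75 -> ~€27/mo all in

print(breaks_even(mac_mini, 150))  # True: a heavy API user comes out ahead
print(breaks_even(mini_pc, 20))    # False: light users should stay on the API
```

The point of running the numbers yourself: break-even depends entirely on your cloud bill, not on the hardware price alone.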
Section 5 — Six common scenarios with concrete picks
We get asked the same six scenarios over and over. Each gets one concrete recommendation.
Scenario 5.1 — “I want to start. Cheap. Just learning.”
Buy: Raspberry Pi 5 (8 GB) + official 27W power supply + microSD (64 GB) + active cooler. Total: ~€140.
Install: Raspberry Pi OS Bookworm, Docker, Hermes Agent (no browser tool), Tailscale.
Use: Hermes Agent calling Claude or OpenRouter for the LLM, with the filesystem and shell tools enabled. Connect via Tailscale to the dashboard from your laptop.
Why this: lowest entry cost with a software stack that's been worked over by the largest hobbyist community in computing. If something breaks, someone has documented the fix. If you want to upgrade later, every component (except the Pi itself) carries forward.
Scenario 5.2 — “I want my agent to do real work, including browser automation.”
Buy: Geekom IT13 (i7, 32 GB RAM, 1 TB NVMe) or equivalent €450-500 generic Intel mini PC. Total: ~€500.
Install: Debian 12 or Ubuntu 22.04 LTS, Docker, Caddy as reverse proxy, Tailscale, Hermes Agent (full), Ollama with Mistral 7B Q4 as a fallback LLM.
Use: Hermes Agent with the full tool set including browser. Configure the LLM stack to prefer Claude or GPT for primary work, fall back to local Mistral 7B for cost-sensitive subtasks.
Why this: this is the workhorse setup. €500 buys hardware that handles 99% of small-team agent workloads without compromise. Linux on x86 is the broadest possible software target. Adding a local LLM as fallback removes API rate-limit anxiety.
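The cloud-primary, local-fallback routing described above can be sketched as a small wrapper. This is an illustrative pattern, not Hermes Agent's actual configuration mechanism; the two lambdas stand in for real API clients:

```python
from typing import Callable

def make_router(primary: Callable[[str], str],
                fallback: Callable[[str], str]) -> Callable[..., str]:
    """Prefer the cloud model; fall back to the local one on error
    or when the caller flags the request as cost-sensitive."""
    def complete(prompt: str, cost_sensitive: bool = False) -> str:
        if cost_sensitive:
            return fallback(prompt)   # local Mistral 7B: no per-token cost
        try:
            return primary(prompt)    # cloud model: higher capability
        except Exception:
            return fallback(prompt)   # rate limit / outage: degrade gracefully
    return complete

# Stand-ins for a cloud client and a local Ollama client
cloud = lambda p: f"[cloud] {p}"
local = lambda p: f"[local] {p}"

complete = make_router(cloud, local)
print(complete("summarise the logs"))                           # [cloud] ...
print(complete("bulk-classify 10k rows", cost_sensitive=True))  # [local] ...
```

The design choice worth copying is the silent fallback path: the agent keeps working through an API outage, which is exactly the "rate-limit anxiety" the local model removes.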
Scenario 5.3 — “I want a local LLM as my primary, cloud-free operation.”
Buy: Mac Mini M4 (24 GB unified memory, 512 GB SSD). Total: ~€1,099.
Install: macOS (default) or Asahi Linux if you accept the friction. Ollama with Llama 3 8B Q4 as primary, Mistral 7B Q4 as fallback. NanoClaw (Apple-native) or ZeroClaw (Linux/macOS, fully offline).
Use: agent runs locally, LLM runs locally, no cloud API calls necessary. The Neural Engine accelerates a subset of inference operations on macOS. Apple-native containers provide strong sandboxing for tool execution.
Why this: cheapest entry point to credible local-LLM capability. 24 GB is enough for 8B-class models with comfortable headroom. Mac Mini's power efficiency means always-on is genuinely cheap. macOS is opinionated but stable.
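Under this setup the agent talks to Ollama over plain HTTP on localhost. A minimal sketch against Ollama's `/api/generate` endpoint; the model tag and default port are assumptions matching the install above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_payload(prompt: str, model: str = "llama3:8b") -> dict:
    """Request body for a single non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3:8b") -> str:
    """Send one prompt to the local Ollama server, return the completion text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because everything is loopback traffic, "no cloud API calls" is literally true: nothing leaves the machine unless you point `OLLAMA_URL` elsewhere.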
Scenario 5.4 — “I want to run 70B-class local models.”
Buy: Mac Mini M4 Pro (48 GB unified memory, 1 TB SSD). Total: ~€1,899.
Install: macOS, Ollama with Llama 3.3 70B Q4 (preferred) or Qwen 2.5 72B Q4. ZeroClaw for the agent layer.
Use: same as 5.3 but with substantially higher capability. 70B-class models in 4-bit quantisation handle most agentic tasks within striking distance of cloud LLMs.
Why this: at this tier you're meaningfully buying the “local AI endgame” for non-frontier work. The 48 GB unified memory ceiling is the relevant spec — the chip is fast enough that bandwidth and RAM are the real constraints.
Scenario 5.5 — “I want one machine for both work and AI.”
Buy: Framework Laptop 13 (Ryzen 7 7840U, 32 GB RAM, 1 TB SSD). Total: ~€1,650.
Install: Linux (Fedora KDE works well), Hermes Agent in Docker, Ollama with Mistral 7B Q4 for personal-use LLM workloads.
Use: laptop for work, agent runs in the background. Power management is the main concern — configure the agent to suspend when on battery, run when plugged in.
Why this: if you genuinely have one budget for “a computer” and you want it to be a serious self-hosted AI host as well, Framework is the most credible single-machine option in 2026. The repair/upgrade story matters more than spec headlines for a five-year machine.
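The "run when plugged in, suspend on battery" rule from the scenario above is easy to script on Linux. A sketch assuming the standard sysfs power-supply layout; the mains adapter's name (`AC`, `ACAD`, ...) varies by machine, which is why the function matches on type rather than name:

```python
from pathlib import Path

def on_ac_power(power_supply_root: str = "/sys/class/power_supply") -> bool:
    """True if a mains adapter reports 'online'. Battery-only -> False."""
    for supply in Path(power_supply_root).glob("*"):
        type_file, online_file = supply / "type", supply / "online"
        if type_file.exists() and type_file.read_text().strip() == "Mains":
            return online_file.read_text().strip() == "1"
    return False  # no mains supply found: assume battery

# A supervisor loop would gate the agent on this check:
# if on_ac_power(): resume_agent() else: pause_agent()
```

Run it from a systemd timer or cron every minute and the laptop behaves like a polite server whenever it is docked.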
Scenario 5.6 — “I want to host distributed agent infrastructure.”
Buy: Raspberry Pi 5 (8 GB) as primary + 3-5 Pi Zero 2 W as edge endpoints + a wired Gigabit switch. Total: ~€220.
Install: primary runs Hermes Agent + MCP server registry. Each Pi Zero runs a tiny MCP server exposing one tool (sensor reader, GPIO controller, local file watcher, etc.).
Use: the primary agent invokes tools across the distributed Pi Zero network. Each endpoint is single-purpose and replaceable.
Why this: pocket AI doesn't have to be one device. The distributed model where one primary host orchestrates many cheap edge endpoints is genuinely powerful for IoT-adjacent workloads, smart home setups, sensor networks. The MCP protocol makes the integration clean.
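One way to picture an edge endpoint: a single-purpose HTTP tool the primary agent calls over the LAN. This is illustrative plumbing, not the MCP wire protocol itself, and `read_sensor` is a hypothetical stand-in for whatever the Pi Zero actually measures:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def read_sensor() -> dict:
    """Hypothetical stand-in for a real GPIO/sensor read on the Pi Zero."""
    return {"temperature_c": 21.5}

def handle_tool(path: str) -> tuple:
    """Dispatch for exactly one tool; each Pi Zero hosts one of these."""
    if path == "/tool/read_sensor":
        return 200, read_sensor()
    return 404, {"error": "unknown tool"}

class ToolHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = handle_tool(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(body).encode())

# To serve on the Pi Zero (blocks forever):
# HTTPServer(("0.0.0.0", 8080), ToolHandler).serve_forever()
```

The single-purpose constraint is the point: when an endpoint fails, you reflash a €20 board instead of debugging a monolith.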
Section 6 — What pocket AI is not
Three things pocket AI is not, and we want to be honest about them.
It is not a frontier-AI replacement. GPT-5, Claude 4.7 Opus, Gemini 3 Ultra still beat any local model on hard tasks — multi-step reasoning across long contexts, code generation at scale, novel problem-solving. If your agent needs frontier intelligence, no pocket hardware substitutes for cloud APIs.
It is not always cheaper. A €450 mini PC paid back over 24 months is €19/month plus electricity (~€8/month), so €27/month all in. A heavy Claude API user can spend that in two days. Light users spend nothing close. Run the actual numbers for your workload before claiming “saves money”.
It is not zero-maintenance. Self-hosting means OS updates, agent updates, occasional certificate renewals, occasional debugging when something breaks. If you want zero ops, hosted services exist for a reason. Pocket AI assumes you're willing to do the work.
We say all this because the pocket-AI thesis only works if you go in with realistic expectations. The bet is good, but it is not magic.
Section 7 — What we expect for 2026 and 2027
The trajectory we expect:
- Hardware continues to compress. Mid-2026 should bring the Pi 6 (or an equivalent under another name) with a higher RAM ceiling, while the mini-PC tier keeps getting cheaper per gigabyte.
- Local LLMs continue to close the gap. Llama 4 (rumoured for a 2026 release) and its peers should push quantised local capability further toward the cloud frontier.
- Agents consolidate. Hermes Agent becoming the dominant default is the likeliest outcome, with the smaller projects surviving as specialist picks.
- Pocket AI as a movement grows. The combination of OpenClaw-style agents, cheap capable hardware and usable local LLMs keeps pulling new users toward self-hosting.
Closing notes
Pocket AI is the bet that owning the layer of intelligence that runs your own data is worth the operational complexity it costs. We think the bet is increasingly winning — but only for users with the right workloads, the right expectations, and the right willingness to do the ops work.
If you're picking your first pocket-AI device today, our recommendation is one of three:
1. Start cheap: Raspberry Pi 5 (8 GB), ~€140 total
2. Workhorse: €450 generic Intel mini PC, ~€500 total
3. Local-LLM endgame: Mac Mini M4 (24-48 GB), €1,099-1,899
Anything else is a specialist pick or an explicit tradeoff you're making.
The hardware reviews live at [/pocket](/pocket). The agent comparisons at [/agents](/agents). The CVE tracker for security at [/cves](/cves). The newsletter at [/newsletter](/newsletter). And we'll be updating this guide quarterly as the hardware and software landscape moves.
Related guides
- [The complete OpenClaw timeline](/guides/openclaw-complete-history)
- [Self-hosted AI agents 2026 — landscape report](/guides/self-hosted-ai-landscape-2026)
- [5 best OpenClaw alternatives](/guides/openclaw-alternatives-2026)
- [Migrate from OpenClaw to Hermes Agent](/guides/migrate-openclaw-to-hermes)