This is PocketClaw's Q2 2026 state-of-the-ecosystem report: a quarterly snapshot of self-hosted AI agents, the CVEs they ship with, the hardware they run on, and the providers around them. Numbers are from our public dashboards (live tracker on /agents and /cves) cross-referenced with project release Atom feeds, NIST NVD, and HN/Reddit mentions over the period 1 March – 31 May 2026.
The report has six sections. Skip to whichever matters to you.
1. The agent landscape
Active projects we tracked in Q2: 12. New entries: 0 of consequence. Quietly archived: 1 (NanoClaw Lite, merged upstream into NanoClaw proper). The market is consolidating.
| Agent | Stars (1 Mar) | Stars (31 May) | Δ | Releases | Open CVEs |
|---|---|---|---|---|---|
| OpenClaw | 84,210 | 88,412 | +4.2K | 14 | 1 |
| Hermes Agent | 31,840 | 38,920 | +7.1K | 7 | 0 |
| Nanobot | 18,600 | 22,140 | +3.5K | 5 | 0 |
| NanoClaw | 9,450 | 11,870 | +2.4K | 6 | 1 |
| IronClaw | 4,210 | 7,820 | +3.6K | 4 | 0 |
| ZeroClaw | 1,840 | 4,260 | +2.4K | 9 | 0 |
| Moltworker | 14,720 | 13,210 | -1.5K | 2 | 1 |
Hermes Agent gained the most stars in absolute terms (+7.1K); ZeroClaw gained the most in percentage terms (+132%), with IronClaw close behind (+86%). OpenClaw growth has decelerated but is still positive: the post-crisis stabilisation is real and users are returning. ZeroClaw more than doubled its star count from a small base; it remains the niche-of-niches, but the niche is growing. Moltworker lost stars on net, the only project to do so.
Takeaway: the “Hermes is eating OpenClaw” narrative we floated in March is half-right. Hermes is gaining, but OpenClaw is also still gaining, both at the expense of older agents and of people who would otherwise have rolled their own. The ecosystem grew by ~22K aggregate stars; new entrants account for none of it.
2. Security
Disclosed CVEs in Q2: 6 across 4 projects. Of those, 3 critical (CVSS ≥ 9.0), 2 high, 1 medium. Median time-from-disclosure-to-patch fell from 14 days in Q1 to 4 days in Q2 — a meaningful improvement we attribute to two projects (OpenClaw and Hermes) hiring dedicated security engineers in February.
Notable disclosures:
- CVE-2026-25898 (Hermes Agent): RCE via plugin sandbox escape
- CVE-2026-26133 (NanoClaw): credential leak via verbose error output
- CVE-2026-26211 (Moltworker): hardcoded JWT secret in pre-built images
The Moltworker case is the one we keep flagging. The class of vulnerability (hardcoded secrets in pre-built images) is preventable. The fact that ~200 pipelines still pull the vulnerable image is preventable: pinning to a specific image digest would stop it. None of this is being prevented as widely as it should be.
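The digest-pinning guard is cheap to automate. A minimal sketch of a CI lint that fails any image reference still using a mutable tag (the function names and the example references are illustrative, not from any PocketClaw tooling):

```python
import re

# An image reference is pinned only if it carries an immutable digest
# (name@sha256:<64 hex chars>); anything tag-based can silently change
# underneath you, which is exactly how the Moltworker image kept spreading.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def is_pinned(image_ref: str) -> bool:
    """Return True if the image reference is pinned to a digest."""
    return bool(DIGEST_RE.search(image_ref))

def lint_images(image_refs: list[str]) -> list[str]:
    """Return the references that should fail a pin-required CI check."""
    return [ref for ref in image_refs if not is_pinned(ref)]

offenders = lint_images([
    "moltworker/moltworker:latest",              # mutable tag: fails
    "moltworker/moltworker@sha256:" + "a" * 64,  # digest-pinned: passes
])
print(offenders)  # only the ':latest' reference survives the lint
```

Wire this over the image lines of your compose files and fail the build when `offenders` is non-empty.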
Takeaway: the security layer is improving. The patch latency falling to 4 days is the single biggest infrastructure win of the quarter. The remaining pain is on the user side — users not upgrading, not pinning, not subscribing to advisory feeds. We are doing what we can to surface advisories more aggressively in /cves and on every agent profile page.
3. Hardware
Q2 was the quarter Apple Silicon won self-hosted AI. The Mac Mini M4 showed up in roughly 38% of the home-lab setups we surveyed (n=412, PocketClaw newsletter readers, May 2026). The Raspberry Pi 5 dropped from 44% in Q1 to 31% in Q2. Intel NUCs and equivalent mini PCs held steady at ~18%. Single-GPU boxes: 8%. Cloud VPS for inference: 5%.
The shift wasn't a Pi 5 problem (Pi 5 deployments are up in absolute terms); it was a Mac Mini gain. The €749 Mac Mini M4 16 GB became the boring-default answer to “what should I run my agent on?” between March and May. Reasons: unified memory makes 7B-13B models genuinely usable, idle power is the lowest in class, macOS hosts Docker reasonably well, and the secondhand market drops the price further every month.
Hardware to watch in Q3: the rumoured Pi 5 16 GB. If Raspberry Pi ships the 16 GB SKU at the rumoured €120-130 mark, it changes the calculation for the Pi-vs-Mac-Mini decision. Eight gigs is the ceiling that holds a lot of users back.
Takeaway: if you're buying hardware in Q3 2026, the boring answer is a Mac Mini M4 16 GB. The better-for-some-users answer is “wait three months for the Pi 5 16 GB.” The wrong answer is a $1,200 GPU box that sits idle 90% of the day.
4. Providers (LLM and infrastructure)
The interesting story in Q2 is that the boring providers got more boring (in a good way) and the exciting ones got less exciting.
Anthropic shipped Claude 4.7 Opus in May. Reasoning quality improvement is real but the price kept it niche for self-hosted agents — most users stayed on Sonnet 4.5. Anthropic prompt caching saw broader adoption and is now the single biggest cost-saver for long agent system prompts.
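The caching savings are easy to quantify. A back-of-envelope sketch, assuming Anthropic's published multipliers at time of writing (cache writes billed at 1.25× the base input price, cache reads at 0.1×; verify current pricing before relying on these numbers):

```python
def uncached_input_cost(system_tokens: int, calls: int, price_per_mtok: float) -> float:
    """Cost of re-sending the full system prompt on every call."""
    return system_tokens * (price_per_mtok / 1_000_000) * calls

def cached_input_cost(system_tokens: int, calls: int, price_per_mtok: float,
                      write_mult: float = 1.25, read_mult: float = 0.10) -> float:
    """Cost with prompt caching: one cache write on the first call,
    then cheap cache reads on every call after it."""
    per_tok = price_per_mtok / 1_000_000
    return system_tokens * per_tok * (write_mult + read_mult * (calls - 1))

# e.g. a 20K-token agent system prompt, 500 calls/day, $3/MTok input
without = uncached_input_cost(20_000, 500, 3.0)
with_cache = cached_input_cost(20_000, 500, 3.0)
print(f"${with_cache:.2f} vs ${without:.2f} per day")
```

At those assumptions the daily system-prompt input cost drops by roughly 90%, which is why caching, not price cuts, was the cost story of the quarter.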
OpenAI shipped two updates we noticed: gpt-4o-mini got a small price cut (-7%, mid-April) and the structured-output JSON schema mode got more reliable. The €/M-tokens cost gap between Anthropic Haiku and gpt-4o-mini narrowed but didn't close.
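For the structured-output mode, the request shape is a JSON-schema wrapper passed as `response_format`. A minimal payload builder, assuming the documented `json_schema` format (the schema contents here are illustrative):

```python
def json_schema_response_format(name: str, schema: dict, strict: bool = True) -> dict:
    """Build the response_format block for OpenAI structured outputs."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": strict, "schema": schema},
    }

# Example: force the model to emit a tool-call-shaped object
fmt = json_schema_response_format(
    "agent_action",
    {
        "type": "object",
        "properties": {
            "tool": {"type": "string"},
            "args": {"type": "object"},
        },
        "required": ["tool", "args"],
        "additionalProperties": False,
    },
)
```

You would pass `fmt` as the `response_format` argument of a chat-completions request; with `strict` enabled the model's output is constrained to validate against the schema.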
Groq remained the fastest LPU shop and started showing up as an upstream in OpenRouter — lowering the friction for self-hosted agents that want low-latency Llama 70B inference without operating GPU infrastructure.
OpenRouter itself had a quiet quarter. Stable, reliable, no major incidents. The kind of quarter that doesn't make the news but matters more than the news.
Hetzner remained the EU-hosting answer most of our self-hosted-AI readership uses. Hostinger rolled out a Compute Pro tier with promising pricing in April, but reliability reports are mixed.
Takeaway: providers are stable. The major cost moves of Q2 came from Anthropic prompt caching adoption, not from price cuts. The right LLM strategy for a self-hosted agent in Q3 is “Claude Haiku + prompt caching for cheap calls; Sonnet 4.5 for reasoning; OpenRouter fallback for resilience.”
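That strategy reduces to a small routing table with a fallback hop. A sketch of the decision logic (the model identifiers and the `route()` helper are illustrative, not a real client):

```python
CHEAP = "claude-haiku"          # prompt-cached cheap tier
REASONING = "claude-sonnet-4.5" # reasoning tier
FALLBACK = "openrouter"         # resilience path when the primary is down

def route(task: str, primary_up: bool = True) -> tuple[str, str]:
    """Pick (provider, model) per the Q3 strategy: Haiku for cheap calls,
    Sonnet 4.5 for reasoning, OpenRouter as the fallback provider."""
    model = REASONING if task == "reasoning" else CHEAP
    provider = "anthropic" if primary_up else FALLBACK
    return provider, model

print(route("summarise"))                     # cheap tier, primary provider
print(route("reasoning", primary_up=False))   # reasoning tier, fallback path
```

The point of keeping this explicit (rather than buried in a framework) is that the fallback path gets exercised and tested before you need it.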
5. Models (open-weight)
The open-weight model story in Q2: Qwen 2.5 Coder 7B is the price-performance winner for code workloads. Mistral Small 22B remains the mid-tier sweet spot for general workloads. Llama 3.3 70B is the capability ceiling for solo-developer hardware budgets.
New releases of note: Qwen 3.0 dropped in early May with a 32B “thinking” model that is genuinely impressive on reasoning benchmarks. We're still benchmarking it for agent suitability; preliminary numbers say it punches above its weight on long-context tasks. Expect a deep-dive next quarter.
llama.cpp / Ollama / mlx-lm: all stable. Ollama added unified memory optimisations in the May release that meaningfully help Mac Mini M4 performance on 13B+ models. mlx-lm continues to be the right choice for serious Mac users; the speed advantage over Ollama on Apple Silicon is real (~1.4× on our benchmarks).
Takeaway: there is no “right” open-weight model — there are right matches between model and workload. Code workloads → Qwen Coder. General workloads → Mistral Small. Maximum capability local → Llama 3.3 70B. New: try Qwen 3.0 32B-thinking for any reasoning-heavy task.
6. Infrastructure (the boring layer)
Tailscale: still the default mesh VPN for self-hosted AI access patterns. Free tier covers most home labs. Headscale (self-hosted coordination server) saw modest adoption gains.
Caddy: still the easiest reverse-proxy-with-TLS for self-hosted agents. Nginx remains the right call when you need rate limiting and fail2ban integration.
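The Caddy claim is best made by showing it: TLS certificates are provisioned automatically from a config this short. A minimal Caddyfile for putting an agent UI behind HTTPS (hostname and upstream port are placeholders for your own setup):

```
agent.example.com {
    reverse_proxy localhost:3000
}
```

That is the entire reverse-proxy-with-TLS story; the equivalent Nginx setup is several times longer, which is the trade you accept when you need its rate limiting and fail2ban integration.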
Docker vs Podman: Docker still dominates because tutorials assume it. Podman gained share in regulated environments where rootless-by-default matters. RHEL/Fedora users default to Podman; everyone else defaults to Docker.
Vector stores: Qdrant overtook Weaviate among PocketClaw readers in Q2 (43% vs 31%). Reasons: lighter footprint, simpler REST API, and a preference for BYO-embeddings over bundled vectorizers. Weaviate remains the right call for larger orgs needing built-in RBAC and multi-tenancy.
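The "simpler REST API" point is concrete: with BYO embeddings, upsert and search are each a single JSON body. A sketch of the two payloads (collection name, IDs, and vector values are illustrative; check the Qdrant docs for the exact endpoint paths on your version):

```python
# PUT /collections/notes/points  — upsert points with your own embeddings
upsert_body = {
    "points": [
        {
            "id": 1,
            "vector": [0.12, -0.05, 0.33, 0.91],  # your embedding, any model
            "payload": {"text": "hello"},          # arbitrary metadata
        },
    ]
}

# POST /collections/notes/points/search  — nearest-neighbour query
search_body = {
    "vector": [0.10, -0.02, 0.30, 0.88],  # query embedding
    "limit": 3,                            # top-k results
}
```

No bundled vectorizer in the loop: you embed with whatever model you already run, and the store only ever sees vectors and payloads.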
Takeaway: the infrastructure layer is settling into a clear set of defaults. Docker + Caddy + Tailscale + Qdrant covers ~70% of self-hosted AI deployments and is genuinely the right choice for those 70%. The remaining 30% needs intentional choices and a writeup about why.
Looking ahead to Q3 2026
Three things we're watching:
1. The Pi 5 16 GB. If it ships at the rumoured price, the hardware default for self-hosted AI shifts again.
2. OpenClaw 2027.0 cadence. OpenClaw's post-crisis governance overhaul promised quarterly major releases; Q3 will be the first test of whether the cadence holds.
3. Local model agents. Several agents (Hermes, IronClaw, ZeroClaw) are pushing toward local-first defaults. If two of the three actually deliver in Q3, the cost calculus on /calculator/cost shifts dramatically toward self-hosting.
We will publish the Q3 report in early September 2026. Subscribe to [the newsletter](/newsletter) to be notified.
Methodology and sources
All star counts and CVE figures are pulled from public APIs (GitHub GraphQL, NIST NVD JSON feed). The reader-survey numbers are from a PocketClaw newsletter survey conducted 12-19 May 2026 with n=412 respondents, self-selected. The hardware market shares are not representative of the broader self-hosted-AI population — they represent our specific readership.
Editorial opinion is mine ([Robin Monteiro](/author/robin)) unless attributed. Corrections welcome at contact@pocketclaw.dev — they are incorporated into a quarterly addendum.
Reference reading:
- [Live agent tracker](/agents) — the data this report is built on
- [CVE archive](/cves) — every disclosed CVE we tracked this quarter
- [Pocket AI hardware hub](/pocket) — full hardware reviews
- [Provider reviews](/providers) — long-form takes on each provider