PocketClaw · vol. 1 · 2026

Self-hosted document RAG on a €25 VPS

End-to-end document Q&A pipeline using Hermes Agent + Qdrant + bge-large-en-v1.5 embeddings. PDFs in, citations out. No OpenAI dependency.

Prerequisites

  • Hermes Agent running (see Hermes Agent setup guide)
  • Qdrant running (see Docker Compose stack guide)
  • Python 3.11+ on the VPS
  • Your document collection (PDFs, Markdown, or text)

Steps

  1. Install the ingestion toolchain

    We use sentence-transformers for the embedding model and pypdf for PDF parsing. Neither needs a GPU; everything below runs on a CPU-only host.

    python3 -m venv ~/.venvs/rag
    source ~/.venvs/rag/bin/activate
    pip install sentence-transformers pypdf qdrant-client
  2. Pre-download the embedding model

    bge-large-en-v1.5 is ~1.3 GB; do this once. If your VPS has under 4 GB RAM, switch to BAAI/bge-small-en-v1.5 and use vector size 384 instead of 1024 in step 4.

    python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('BAAI/bge-large-en-v1.5')"
  3. Write the chunker

    A naive chunker (split every 1,000 characters) ruins retrieval quality on technical documents. The version below respects paragraph boundaries; since Markdown tables contain no blank lines, this keeps them intact as well. Save it as ~/agent-stack/scripts/chunk.py.

    cat > ~/agent-stack/scripts/chunk.py <<'EOF'
    import re
    
    def chunk_text(text: str, max_tokens: int = 400, overlap: int = 50):
        """Split on blank lines and pack paragraphs up to ~max_tokens.
        A nonzero overlap carries the last paragraph into the next chunk."""
        paragraphs = re.split(r'\n\s*\n', text)
        chunks, buf = [], []
        buf_len = 0
        for p in paragraphs:
            plen = len(p) // 4  # rough token estimate (~4 chars per token)
            if buf_len + plen > max_tokens and buf:
                chunks.append("\n\n".join(buf))
                buf = buf[-1:] if overlap else []  # paragraph-level overlap
                buf_len = sum(len(x) // 4 for x in buf)
            buf.append(p)
            buf_len += plen
        if buf:
            chunks.append("\n\n".join(buf))
        return chunks
    EOF
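Before wiring the chunker into the ingester, it is worth sanity-checking its packing and overlap behaviour. The demo below repeats the function from chunk.py so it runs on its own; the three synthetic paragraphs are just illustrative:

```python
import re

def chunk_text(text: str, max_tokens: int = 400, overlap: int = 50):
    paragraphs = re.split(r'\n\s*\n', text)
    chunks, buf = [], []
    buf_len = 0
    for p in paragraphs:
        plen = len(p) // 4  # rough token estimate (~4 chars per token)
        if buf_len + plen > max_tokens and buf:
            chunks.append("\n\n".join(buf))
            buf = buf[-1:] if overlap else []  # carry last paragraph forward
            buf_len = sum(len(x) // 4 for x in buf)
        buf.append(p)
        buf_len += plen
    if buf:
        chunks.append("\n\n".join(buf))
    return chunks

# Three 200-char paragraphs (~50 "tokens" each) against an 80-token budget
paras = ["a" * 200, "b" * 200, "c" * 200]
chunks = chunk_text("\n\n".join(paras), max_tokens=80)
print(len(chunks))                       # 3
print(chunks[1].startswith("a" * 200))   # True: chunk 2 repeats paragraph 1
```

Each chunk starts with the last paragraph of the previous one, which is what keeps answers that straddle a paragraph boundary retrievable.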
  4. Write the ingester

    Walks a directory, parses, chunks, embeds, upserts to Qdrant.

    cat > ~/agent-stack/scripts/ingest.py <<'EOF'
    import sys, glob
    from pathlib import Path
    from pypdf import PdfReader
    from sentence_transformers import SentenceTransformer
    from qdrant_client import QdrantClient
    from qdrant_client.http import models as qm
    from chunk import chunk_text
    
    COLLECTION = "docs"
    QDRANT_URL = "http://localhost:6333"
    BATCH = 64  # upsert in batches to keep memory flat
    
    def read_pdf(path):
        reader = PdfReader(path)
        return "\n".join(p.extract_text() or "" for p in reader.pages)
    
    def main(folder: str):
        model = SentenceTransformer("BAAI/bge-large-en-v1.5")
        client = QdrantClient(url=QDRANT_URL)
        if COLLECTION not in [c.name for c in client.get_collections().collections]:
            client.create_collection(
                collection_name=COLLECTION,
                vectors_config=qm.VectorParams(size=1024, distance=qm.Distance.COSINE),
            )
        points, idx = [], 0
        for path in glob.glob(f"{folder}/**/*", recursive=True):
            p = Path(path)
            if p.suffix.lower() not in (".pdf", ".md", ".txt"):
                continue
            text = read_pdf(p) if p.suffix.lower() == ".pdf" else p.read_text(errors="ignore")
            for chunk in chunk_text(text):
                vec = model.encode(chunk).tolist()
                # Sequential ids make nightly re-runs idempotent: the same
                # corpus yields the same ids, so upsert overwrites in place.
                points.append(qm.PointStruct(id=idx, vector=vec, payload={"source": str(p), "text": chunk}))
                idx += 1
                if len(points) >= BATCH:
                    client.upsert(COLLECTION, points)
                    points = []
        if points:
            client.upsert(COLLECTION, points)
        print(f"Ingested {idx} chunks")
    
    if __name__ == "__main__":
        main(sys.argv[1])
    EOF
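At query time, a client searches the collection via Qdrant's REST endpoint, POST /collections/docs/points/search. The snippet below only builds and inspects that request body offline; actually sending it requires the running Qdrant instance and a query vector produced by the same embedding model used at ingestion:

```python
import json

TOP_K = 8  # matches the retrieval config used later with Hermes

def build_search_request(query_vector, top_k=TOP_K):
    # Body for POST /collections/docs/points/search (Qdrant REST API)
    return {
        "vector": query_vector,
        "limit": top_k,
        "with_payload": True,  # return the stored source/text payload too
    }

body = build_search_request([0.0] * 1024)  # bge-large vectors are 1024-d
print(json.dumps({k: (len(v) if isinstance(v, list) else v) for k, v in body.items()}))
```

Setting with_payload is what lets the answering LLM cite `source` for each retrieved chunk.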
  5. Run the ingester nightly via cron

    Edit your crontab to run the ingester at 03:00 every night. Create the log directory first, and note that cron may not set $USER in the command's environment, so substitute your actual username if in doubt.

    mkdir -p ~/agent-stack/logs
    crontab -e
    # Add (replace $USER with your username if cron does not set it):
    0 3 * * * /home/$USER/.venvs/rag/bin/python /home/$USER/agent-stack/scripts/ingest.py /home/$USER/documents >> /home/$USER/agent-stack/logs/ingest.log 2>&1
  6. Wire Hermes to query the collection

    Hermes 2026.4 ships a built-in retrieval tool. Configure it to point at the Qdrant collection. The exact config is in the Hermes documentation; the relevant fields are `collection: docs`, `vector_size: 1024`, `top_k: 8`.
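The shape of that config block will be roughly as follows. The field names here are illustrative, not authoritative, so verify them against the Hermes retrieval-tool documentation before copying:

```yaml
# Hypothetical Hermes retrieval-tool config -- check field names in the docs
retrieval:
  provider: qdrant
  url: http://localhost:6333
  collection: docs
  vector_size: 1024
  top_k: 8
  embedding_model: BAAI/bge-large-en-v1.5  # must match the ingestion model
```

Whatever the exact field names, the embedding model configured here must be the same one used by ingest.py, or query vectors and stored vectors will live in different spaces.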

Troubleshooting

Embedding model OOMs on a 4 GB VPS
Switch to BAAI/bge-small-en-v1.5 and recreate the collection with vector size 384. Quality drops a bit; RAM use drops a lot.
PDF text extraction garbled on scanned documents
pypdf handles digital PDFs only. For scanned PDFs, add an OCR step before ingestion: `ocrmypdf input.pdf output.pdf`.
Retrieval returns mostly irrelevant chunks
Tune top_k upward (8 → 12), and review the chunker — overly aggressive paragraph splitting on documents with lots of bullet lists is a common cause.

Where to go from here

Once retrieval works, add a re-ranker (e.g. bge-reranker-large) to improve the precision of returned chunks before they hit the LLM.
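The wiring for that is simple: retrieve a generous top_k from Qdrant, score each (query, chunk) pair with the cross-encoder, and keep the best few. The sketch below uses a stand-in scoring function so it runs without downloading the model; in real use, `score_pairs` would be `sentence_transformers.CrossEncoder("BAAI/bge-reranker-large").predict`:

```python
def rerank(query, chunks, score_pairs, keep=3):
    # score_pairs: callable taking [(query, chunk), ...] and returning scores,
    # e.g. CrossEncoder("BAAI/bge-reranker-large").predict in real use
    scores = score_pairs([(query, c) for c in chunks])
    ranked = sorted(zip(chunks, scores), key=lambda t: t[1], reverse=True)
    return [c for c, _ in ranked[:keep]]

# Stand-in scorer: counts word overlap between query and chunk
def dummy_score(pairs):
    return [len(set(q.split()) & set(c.split())) for q, c in pairs]

chunks = ["qdrant stores vectors", "cron runs nightly", "vectors enable search"]
best = rerank("how are vectors stored", chunks, dummy_score, keep=2)
print(best[0])  # "qdrant stores vectors"
```

Because the cross-encoder sees query and chunk together, it is far more precise than cosine similarity alone; the usual pattern is to over-retrieve (top_k of 20 or so) and let the re-ranker cut the list down.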
