The problem
You have years of accumulated notes, READMEs, internal wikis, and code comments. Useful information lives in there, but finding it costs you 5-10 minutes per lookup. RAG turns those minutes into seconds.
Recommended setup
| Component | Recommendation |
| --- | --- |
| Agent | ZeroClaw (bundled offline RAG) or Hermes Agent + Postgres pgvector |
| Hardware | Mac Mini M4 (24 GB+ RAM) for fast local indexing, or any mini PC if you index occasionally |
| LLM | Local Mistral 7B Q4 or Llama 3 8B Q4 via Ollama for response generation |
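Before indexing anything, it's worth confirming which quantized model tags you actually have pulled locally. A minimal sketch using Ollama's local REST API, assuming Ollama is running on its default port (11434); the tags shown in the comment are examples, not requirements:

```python
# List the models Ollama has pulled locally via its /api/tags endpoint.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = json.loads(resp.read())["models"]

for m in models:
    # e.g. "mistral:7b-instruct-q4_K_M" or "llama3:8b-instruct-q4_K_M"
    print(m["name"])
```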
How it works
Point ZeroClaw at your notes folder. It chunks documents, generates embeddings locally with BGE-small (no API calls), and stores them in SQLite. On query, it retrieves the top-K relevant chunks and sends them to the local LLM along with your question, returning an answer plus source citations.
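For reference, here is a minimal sketch of that index-and-query loop. This is not ZeroClaw's code: it assumes sentence-transformers for the BGE-small embeddings, naive fixed-size chunking, brute-force cosine similarity over SQLite rows, and a Mistral Q4 tag on Ollama, all of which are my own illustrative choices.

```python
# Sketch of a local RAG loop: chunk -> embed -> store in SQLite -> retrieve -> ask LLM.
import json
import sqlite3
import urllib.request
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

EMBEDDER = SentenceTransformer("BAAI/bge-small-en-v1.5")  # local, no API call

def chunk(text: str, size: int = 800, overlap: int = 100):
    # Naive character-based chunking; real tools split on headings/paragraphs.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text), 1), step)]

def build_index(notes_dir: str, db_path: str = "rag.db") -> None:
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS chunks "
               "(id INTEGER PRIMARY KEY, source TEXT, body TEXT, embedding BLOB)")
    db.execute("DELETE FROM chunks")
    for path in Path(notes_dir).rglob("*.md"):
        pieces = chunk(path.read_text(errors="ignore"))
        vecs = EMBEDDER.encode(pieces, normalize_embeddings=True)
        for body, vec in zip(pieces, vecs):
            db.execute("INSERT INTO chunks (source, body, embedding) VALUES (?, ?, ?)",
                       (str(path), body, np.asarray(vec, dtype=np.float32).tobytes()))
    db.commit()

def query(question: str, db_path: str = "rag.db", k: int = 5) -> str:
    db = sqlite3.connect(db_path)
    rows = db.execute("SELECT source, body, embedding FROM chunks").fetchall()
    q = EMBEDDER.encode([question], normalize_embeddings=True)[0]
    # Embeddings are normalized, so the dot product is cosine similarity.
    scored = sorted(
        rows,
        key=lambda r: float(np.dot(q, np.frombuffer(r[2], dtype=np.float32))),
        reverse=True,
    )[:k]
    context = "\n\n".join(f"[{src}]\n{body}" for src, body, _ in scored)
    prompt = (f"Answer using only this context and cite sources:\n{context}\n\n"
              f"Question: {question}")
    payload = json.dumps({"model": "mistral:7b-instruct-q4_K_M",
                          "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request("http://localhost:11434/api/generate", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```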
Reality check
I run this over 800+ notes (Obsidian vault), 12,000 chunks, indexed in 14 minutes on a Mac Mini M4. Query latency: ~2 seconds end-to-end. Answers are useful when the relevant content is in the corpus; useless when it isn't (the model won't make stuff up — that's the point of RAG over fine-tuning). Re-indexing on file changes: incremental, ~5 seconds for normal edit cadence.
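The incremental re-indexing isn't magic. One way to get the same behavior (an assumption on my part, not ZeroClaw's implementation) is to track each file's modification time and only re-embed what changed, reusing the rag.db from the sketch above:

```python
# Detect changed notes by mtime so only stale files get re-chunked and re-embedded.
import sqlite3
from pathlib import Path

def changed_files(notes_dir: str, db_path: str = "rag.db"):
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, mtime REAL)")
    seen = dict(db.execute("SELECT path, mtime FROM files").fetchall())
    stale = []
    for path in Path(notes_dir).rglob("*.md"):
        mtime = path.stat().st_mtime
        if seen.get(str(path)) != mtime:
            stale.append(path)
            db.execute("INSERT OR REPLACE INTO files (path, mtime) VALUES (?, ?)",
                       (str(path), mtime))
    db.commit()
    # Re-embed only these files, then replace their rows in the chunks table.
    return stale
```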
What breaks
- Documents in obscure formats (proprietary databases, scanned PDFs without OCR)
- Queries that need synthesis across many chunks (RAG is good at retrieval, average at synthesis)
- Outdated content if you don't refresh the index
Alternative setups
Hermes Agent + Postgres with pgvector if you want the vector store as a separate service that other tools can query. Nanobot + a hand-rolled SQLite implementation if you want to read every line yourself.
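The appeal of the pgvector route is that the store is plain SQL anything can hit. A rough sketch, assuming psycopg2, the pgvector extension, and the same 384-dimensional BGE-small embeddings as above; the table and column names are mine, not Hermes Agent's:

```python
# Store and query note chunks in Postgres via pgvector's cosine-distance operator.
import psycopg2
from sentence_transformers import SentenceTransformer

EMBEDDER = SentenceTransformer("BAAI/bge-small-en-v1.5")  # 384-dim vectors

conn = psycopg2.connect("dbname=notes")
with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""CREATE TABLE IF NOT EXISTS chunks (
                       id BIGSERIAL PRIMARY KEY,
                       source TEXT,
                       body TEXT,
                       embedding VECTOR(384))""")

def top_k(question: str, k: int = 5):
    vec = EMBEDDER.encode([question], normalize_embeddings=True)[0]
    literal = "[" + ",".join(f"{x:.6f}" for x in vec) + "]"
    with conn.cursor() as cur:
        # <=> is pgvector's cosine-distance operator; smaller means more similar.
        cur.execute("SELECT source, body FROM chunks "
                    "ORDER BY embedding <=> %s::vector LIMIT %s",
                    (literal, k))
        return cur.fetchall()
```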