What is homelab?
A local AI platform that scales transparently into a cluster.
Every machine that runs the homelab daemon speaks the OpenAI API on port 42420. On a laptop alone it is a smart proxy: it loads models on demand, manages VRAM automatically, and evicts idle models to make room for new ones. With a cluster behind it the same endpoint routes large model requests to whichever node has the space to run them — without the caller knowing or caring which machine did the work.
The hub (this server, port 42421) is the cluster control plane. It owns the node registry, placement engine, model state machine, events, and metrics. Fileserver services that are always on the LAN point here directly because they always have cluster access and want placement from the start.
IDE · coding agent · script fileserver service
│ (deep_research, Home Assistant, n8n)
│ │
▼ ▼
daemon :42420 hub :42421
/v1/chat/completions /v1/chat/completions
│ │
│ Tier 1: model hot locally │
├─ proxy to local Ollama ─────────────►│
│ │
│ Tier 2: model fits locally │ placement engine
├─ load locally + proxy ──────────────►├─ picks best node ──► cluster node :11434
│ │
│ Tier 3: model too large │
└─ delegate to hub ──────────────────►─┘
│
heartbeat + ack + commands
│
daemon (other nodes)
| Component | Port | Where it runs | What it does |
|---|---|---|---|
| daemon | 42420 | Every participating node | OpenAI /v1/ gateway, local Ollama manager, heartbeat sender |
| hub | 42421 | One machine (this server) | Cluster control plane — node registry, placement, events, metrics |
| mcp | 8092 | Any machine (usually local) | MCP bridge exposing cluster tools to AI agents |
| Who | Points at | Why |
|---|---|---|
| IDE (Cursor, Continue, avante) | daemon :42420 | Always local; daemon handles cluster routing silently |
| Coding agent (Goose, aider) | daemon :42420 | Same — local first, cluster as needed |
| Script / curl on laptop | daemon :42420 | Same — stable local endpoint |
| Fileserver service | hub :42421 | Always on LAN; wants cluster placement from the start |
| n8n, Home Assistant, AnythingLLM | hub :42421 | Always on LAN; one hop fewer, full cluster routing |
| MCP-aware agent | mcp :8092 | Full cluster visibility via structured tools |
Any tool that speaks the OpenAI API works with homelab out of the box. Set the base URL and optionally an API key — no SDK changes needed.
Base URL: http://localhost:42420/v1 API key: none (or your daemon key)
Base URL: http://<hub-ip>:42421/v1 API key: <your service key>
See Tools for per-app config snippets.
MCP-aware agents (Cursor, Claude Desktop, Goose) get structured access to the cluster: list nodes, load/unload models, inspect VRAM, read files from managed hosts.
http://localhost:8092/sse
cluster_nodes model_list model_load model_unload node_ollama_url infra_list infra_read
Tools do not need to pre-load models. The hub's /v1/ endpoint handles everything automatically.
| Cluster state | Result |
|---|---|
| Model hot on a node | Routed instantly |
| Model on disk, not loaded | Loaded on demand |
| Model currently sharded | Routed to shard planner |
| Model not installed | Non-200 error — surface to user |
| No live nodes | 503 — back off, do not retry |
External applications (like stem-worker) can integrate without touching
homelab.yaml. Drop a manifest into ~/.homelab/apps/
and the daemon discovers it automatically.
~/.homelab/apps/<name>.yaml
The daemon starts the app on demand, stops it when idle, and downloads any declared file dependencies automatically.