Connect your tools
Copy-paste config for every IDE, agent, and service — pre-filled with this hub's address. Manage API keys at Config → API Keys.
✓ 2 nodes live — cluster routing is active. Your tools automatically use the best available node.
🔓 Open mode — no API key needed.
Set api_key in homelab.yaml to require tokens and enable per-service attribution.
Which endpoint to use?
IDE / coding agent / script on this machine
http://localhost:42420/v1
Daemon — loads models on demand, routes to cluster silently
Always-on service on the LAN (n8n, HA, Docker)
http://hub.home.browngregory.com/v1
Hub — direct cluster placement, no extra hop
Ollama-native tool (Open-WebUI, AnythingLLM)
http://hub.home.browngregory.com/ollama
Hub Ollama proxy — same placement, Ollama wire format
MCP-aware agent (Cursor, Claude Desktop, Goose)
http://localhost:8092/sse
Full cluster control — load/unload models, stream events
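The choice above amounts to a small lookup from client type to address. A minimal sketch in Python, assuming only the four addresses listed on this page; the `ENDPOINTS` keys and the `pick_endpoint` helper are hypothetical names used for illustration and are not part of the hub or daemon.

```python
# Illustrative lookup of the endpoints listed above.
# The keys ("local", "lan", "ollama", "mcp") are hypothetical labels.
ENDPOINTS = {
    "local": "http://localhost:42420/v1",                 # daemon on this machine
    "lan": "http://hub.home.browngregory.com/v1",         # hub, OpenAI wire format
    "ollama": "http://hub.home.browngregory.com/ollama",  # hub, Ollama wire format
    "mcp": "http://localhost:8092/sse",                   # MCP server (SSE)
}


def pick_endpoint(kind: str) -> str:
    """Return the base URL for a kind of client, or raise on an unknown kind."""
    try:
        return ENDPOINTS[kind]
    except KeyError:
        raise ValueError(
            f"unknown client kind {kind!r}; expected one of {sorted(ENDPOINTS)}"
        ) from None
```

A script running on this machine would call `pick_endpoint("local")`, while an always-on container would use `pick_endpoint("lan")`.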
IDEs & coding assistants
Cursor
AI-first code editor
Cursor → Settings → Models
OpenAI API Key: none
Override URL: http://localhost:42420/v1
Models (type to add any Ollama model name)
qwen3:8b
qwen3.5:27b
qwen3.5:4b
Continue.dev
VSCode + JetBrains AI plugin
~/.continue/config.json (add to models array)
{
  "models": [
    {
      "title": "homelab — qwen3:8b",
      "provider": "openai",
      "model": "qwen3:8b",
      "apiBase": "http://localhost:42420/v1",
      "apiKey": "none"
    }
  ],
  "tabAutocompleteModel": {
    "title": "homelab — qwen3.5:4b",
    "provider": "openai",
    "model": "qwen3.5:4b",
    "apiBase": "http://localhost:42420/v1",
    "apiKey": "none"
  }
}
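Because the entry is meant to be added to an existing models array rather than replace the file, a merge step helps avoid overwriting models that are already configured. A minimal sketch, assuming config.json is plain JSON with a top-level "models" list as in the snippet above; `merge_model` and `update_config_file` are hypothetical helpers, not part of Continue.

```python
import copy
import json
from pathlib import Path


def merge_model(config: dict, entry: dict) -> dict:
    """Return a copy of `config` with `entry` present in its "models" list.

    An existing model with the same "title" is replaced in place, so running
    the merge twice does not create duplicates.
    """
    merged = copy.deepcopy(config)
    models = merged.setdefault("models", [])
    for i, existing in enumerate(models):
        if existing.get("title") == entry["title"]:
            models[i] = dict(entry)
            break
    else:
        # No model with this title yet: append it.
        models.append(dict(entry))
    return merged


def update_config_file(path: Path, entry: dict) -> None:
    """Read the JSON config at `path`, merge `entry`, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = merge_model(config, entry)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Calling `update_config_file(Path.home() / ".continue" / "config.json", entry)` with the model object shown above would add it while leaving other models untouched; backing up the file first is a sensible precaution.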
aider
AI pair programmer (terminal)
~/.aider.conf.yml
openai-api-base: http://localhost:42420/v1
openai-api-key: none
model: openai/qwen3:8b
Or as env vars
export OPENAI_API_BASE=http://localhost:42420/v1
export OPENAI_API_KEY=none
aider --model openai/qwen3:8b
avante.nvim
Neovim AI assistant
init.lua / lazy.nvim config
require("avante").setup({
  provider = "openai",
  openai = {
    endpoint = "http://localhost:42420/v1",
    model = "qwen3:8b",
    api_key = "none",
  },
})
Coding agents
Goose
Block's open-source coding agent
~/.config/goose/config.yaml
GOOSE_PROVIDER: openai
GOOSE_MODEL: qwen3:8b
OPENAI_HOST: http://localhost:42420
OPENAI_API_KEY: none
extensions:
  homelab:
    type: sse
    url: http://localhost:8092/sse
    name: homelab
    timeout: 300
    enabled: true
Claude Desktop
Anthropic's desktop AI app
claude_desktop_config.json → mcpServers
{
  "mcpServers": {
    "homelab": {
      "url": "http://localhost:8092/sse",
      "type": "sse"
    }
  }
}
Claude Desktop uses its own cloud model for inference. The homelab MCP server gives it cluster management tools (load/unload models, view nodes, run agent tasks).
Services & automation
n8n
Workflow automation
n8n → Credentials → OpenAI API
Base URL: http://hub.home.browngregory.com/v1
API Key: none
Use the hub URL (not the daemon), since n8n runs as an always-on service with LAN access. The hub's placement engine picks the best cluster node for each request.
Home Assistant
Home automation platform
configuration.yaml
conversation:
  agent: homelab
homeassistant_conversation:
  chat_model: qwen3:8b
  api_base: http://hub.home.browngregory.com/v1
  api_key: none
  max_tokens: 1024
AnythingLLM
Document chat + RAG platform
Settings → LLM Provider → Ollama
Ollama Base URL: http://hub.home.browngregory.com/ollama
# Uses the hub's Ollama proxy — routes to best cluster node.
# Then pick any model from the dropdown.
Open-WebUI
Full-featured web chat interface
Docker env / Settings → Connections
OLLAMA_BASE_URL=http://hub.home.browngregory.com/ollama
# All models on all live nodes appear in the model selector.
Python (openai library)
Direct API calls from scripts
Script on this machine — point at the daemon
from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:42420/v1",
    api_key="none",
)
response = client.chat.completions.create(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "hello"}],
)
Always-on service on the LAN — point at the hub
client = OpenAI(base_url="http://hub.home.browngregory.com/v1", api_key="none")
httpx / curl (any HTTP client)
Scripts, agents, custom integrations
Set once as env vars, use everywhere
export OPENAI_BASE_URL=http://hub.home.browngregory.com/v1
export OPENAI_API_KEY=none
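# Then in your code, use the SDK or call the URL directly.
# X-Homelab-Source lets the hub attribute the request to your service:
import os
import httpx

# Read the values exported above; they are shell env vars, not Python names.
base_url = os.environ["OPENAI_BASE_URL"]
api_key = os.environ["OPENAI_API_KEY"]
resp = httpx.post(
    f"{base_url}/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "X-Homelab-Source": "my-service",
    },
    json={"model": "qwen3:8b", "messages": [{"role": "user", "content": "hello"}]},
)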
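For scripts that should not depend on httpx, the same attributed request can be assembled with the standard library. A minimal sketch, assuming the hub accepts the usual OpenAI-style chat payload and reads the X-Homelab-Source header as described above; `build_chat_request` and the fallback values for the environment variables are illustrative choices, not hub behavior.

```python
import json
import os
import urllib.request


def build_chat_request(
    prompt: str, source: str, model: str = "qwen3:8b"
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request tagged with `source`."""
    # Fall back to the hub address from this page if the env var is unset.
    base_url = os.environ.get("OPENAI_BASE_URL", "http://hub.home.browngregory.com/v1")
    # "none" is the placeholder key this page uses in open mode.
    api_key = os.environ.get("OPENAI_API_KEY", "none")
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            "X-Homelab-Source": source,  # lets the hub attribute the request
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(req, timeout=60)` and decode the JSON body of the response.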
Quick test
Check hub health
curl http://hub.home.browngregory.com/health
# → {"ok":true,"port":42421,...}
List available models
curl http://hub.home.browngregory.com/v1/models | python3 -m json.tool
Run a chat completion
curl http://hub.home.browngregory.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3:8b","messages":[{"role":"user","content":"hello"}]}'
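The health and model-list checks can also be scripted, for example as a startup probe in a service. A minimal sketch using only the standard library, assuming /health returns a JSON object with an "ok" field as shown above and that /v1/models returns the OpenAI-style list shape ({"data": [{"id": ...}]}); the list shape and the helper names are assumptions, not something this page confirms.

```python
import json
import urllib.request

HUB = "http://hub.home.browngregory.com"


def get_json(url: str, timeout: float = 10.0) -> dict:
    """GET `url` and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def model_ids(payload: dict) -> list[str]:
    """Extract model names from an OpenAI-style /v1/models payload (assumed shape)."""
    return [item["id"] for item in payload.get("data", [])]


def smoke_test(hub: str = HUB) -> list[str]:
    """Check /health, then return the model names the hub lists."""
    health = get_json(f"{hub}/health")
    if not health.get("ok"):
        raise RuntimeError(f"hub reported unhealthy: {health}")
    return model_ids(get_json(f"{hub}/v1/models"))
```

Calling `smoke_test()` from a machine on the LAN either returns the list of model names or raises if the hub is unreachable or reports itself unhealthy.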