Docs
RunSpace is an OpenAI-compatible API for open models, plus connectors that let Claude — Code, Desktop, or the chat app — reach those models and your own machine. Base URL: run-space.com.
Get a key
Connect your wallet, then go to Dashboard → API Keys → Create key. You see the key (rsk_live_…) once — copy it. Every request authenticates with it.
export RUNSPACE_KEY=rsk_live_...
The API
OpenAI-compatible chat completions. Point any OpenAI client at https://run-space.com/v1; streaming works the same way.
curl https://run-space.com/v1/chat/completions \
  -H "Authorization: Bearer $RUNSPACE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.3-70b","messages":[{"role":"user","content":"hi"}]}'

import os
from openai import OpenAI

# Read the key from the environment; Python does not expand the literal string "$RUNSPACE_KEY".
client = OpenAI(base_url="https://run-space.com/v1", api_key=os.environ["RUNSPACE_KEY"])
resp = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "ship it"}],
)
print(resp.choices[0].message.content)

List the model ids you can pass:
curl https://run-space.com/v1/models -H "Authorization: Bearer $RUNSPACE_KEY"
Streaming, errors & limits
Streaming works like OpenAI — set "stream": true and read the SSE chunks. Errors come back with standard codes:
- 401 — invalid or revoked key
- 402 — out of credits (daily allowance spent)
- 429 — rate limited (per key)
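The status codes above suggest different client reactions. The following is a minimal sketch of one way to encode them. The meanings come from the list; the retry flags and the backoff schedule are assumptions of this sketch, not documented RunSpace behavior, and `advise` and `backoff_delays` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorAdvice:
    meaning: str
    retry: bool  # whether retrying the same request soon is likely to help


# Meanings are copied from the list above; the retry flags are this sketch's assumption.
ADVICE = {
    401: ErrorAdvice("invalid or revoked key", retry=False),
    402: ErrorAdvice("out of credits (daily allowance spent)", retry=False),
    429: ErrorAdvice("rate limited (per key)", retry=True),
}


def advise(status_code: int) -> ErrorAdvice:
    """Return handling advice for an HTTP status code from the API."""
    if status_code in ADVICE:
        return ADVICE[status_code]
    if 200 <= status_code < 300:
        return ErrorAdvice("ok", retry=False)
    # Anything else is not documented here, so do not guess at a retry policy.
    return ErrorAdvice(f"undocumented status {status_code}", retry=False)


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ... capped at cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(advise(429))
print(backoff_delays(4))
```

Under these assumptions only 429 is worth retrying: a 401 needs a new key, and a 402 will not clear until the daily allowance resets.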
Output is capped per call (a server-side max_tokens ceiling) so a single request can't blow the day's allowance. Usage is metered per token at wholesale-plus.
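To make the streaming format concrete, here is a small sketch that reassembles text from SSE lines and records why generation stopped. It assumes RunSpace emits chunks in the OpenAI streaming shape (`data: {...}` lines ending with `data: [DONE]`) and that hitting the output cap is reported as `finish_reason: "length"`, as it is in OpenAI's API. Neither is confirmed here. The sample lines are hand-written and `collect_stream` is a hypothetical helper.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Join streamed text deltas and return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


# Hand-written sample chunks in the OpenAI streaming shape (not real RunSpace output).
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"delta":{"content":"ship "},"finish_reason":null}]}',
    'data: {"choices":[{"delta":{"content":"it"},"finish_reason":null}]}',
    'data: {"choices":[{"delta":{},"finish_reason":"length"}]}',
    "data: [DONE]",
]
text, reason = collect_stream(sample)
print(text, reason)
```

With the OpenAI Python client you would not parse SSE by hand: passing `stream=True` to `client.chat.completions.create` returns an iterator of chunks whose `choices[0].delta.content` carries the same deltas.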
Use with Claude Code
Run Claude Code on open models, and let it hand background work to a local model. Install the router and point Claude Code through it:
ollama pull qwen2.5-coder:7b
npm i -g @musistudio/claude-code-router
Drop in this config (swap in your key), then run ccr code:
// ~/.claude-code-router/config.json
{
  "Providers": [
    { "name": "runspace", "api_base_url": "https://run-space.com/v1/chat/completions",
      "api_key": "$RUNSPACE_KEY",
      "models": ["llama-3.3-70b","qwen3-235b","deepseek-v4"] },
    { "name": "ollama", "api_base_url": "http://localhost:11434/v1/chat/completions",
      "api_key": "ollama", "models": ["qwen2.5-coder:7b"] }
  ],
  "Router": {
    "default": "runspace,llama-3.3-70b",
    "background": "ollama,qwen2.5-coder:7b",
    "longContext": "runspace,qwen3-235b"
  }
}
Background and sub-agent tasks run on your machine; reasoning and long context go to the cloud. Force a route with /model ollama,qwen2.5-coder:7b.
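Each Router entry is a "provider,model" pair that has to match a declared provider and one of its listed models. A quick sanity check along these lines can catch typos before you launch. This is a sketch, not part of the router: `check_routes` is a hypothetical helper, and the embedded config is a trimmed copy of the one above.

```python
import json


def check_routes(config: dict) -> list[str]:
    """Return a list of problems; an empty list means every route resolves."""
    declared = {p["name"]: set(p.get("models", [])) for p in config.get("Providers", [])}
    problems = []
    for route, target in config.get("Router", {}).items():
        if not isinstance(target, str):
            continue  # ignore non-route settings such as numeric options
        provider, _, model = target.partition(",")
        if provider not in declared:
            problems.append(f"{route}: unknown provider '{provider}'")
        elif model not in declared[provider]:
            problems.append(f"{route}: '{model}' not listed under '{provider}'")
    return problems


config = json.loads("""
{
  "Providers": [
    {"name": "runspace", "models": ["llama-3.3-70b", "qwen3-235b", "deepseek-v4"]},
    {"name": "ollama", "models": ["qwen2.5-coder:7b"]}
  ],
  "Router": {
    "default": "runspace,llama-3.3-70b",
    "background": "ollama,qwen2.5-coder:7b",
    "longContext": "runspace,qwen3-235b"
  }
}
""")
print(check_routes(config))
```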
Use with Claude Desktop (local + cloud)
The RunSpace MCP server gives Claude Desktop two tools — list_models and run_model — so it can call cloud open models and your local Ollama on demand. Add it to your Desktop config and restart:
// ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "runspace": {
      "command": "node",
      "args": ["/absolute/path/to/runspace-mcp/index.js"],
      "env": { "RUNSPACE_KEY": "rsk_live_..." }
    }
  }
}
Then ask: “use runspace to run X on deepseek-v4”, or “…on local/qwen2.5-coder:7b” to run it on your machine (it never leaves your network). Local routing requires Ollama running with the model pulled.
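If your Desktop config already lists other MCP servers, the runspace entry has to be merged in rather than pasted over the file. A minimal sketch of that merge on plain dictionaries follows. `add_runspace_server` is a hypothetical helper, and the "other" server in the example is made up for illustration. Reading and writing the actual file is left out.

```python
import copy


def add_runspace_server(config: dict, index_js: str, key: str) -> dict:
    """Return a copy of a Desktop config with the runspace entry added or replaced."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = merged.setdefault("mcpServers", {})
    servers["runspace"] = {
        "command": "node",
        "args": [index_js],
        "env": {"RUNSPACE_KEY": key},
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "npx", "args": ["some-server"]}}}
updated = add_runspace_server(
    existing, "/absolute/path/to/runspace-mcp/index.js", "rsk_live_..."
)
print(sorted(updated["mcpServers"]))
```

Returning a copy keeps the original config available in case the merged result needs to be discarded.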
Use with claude.ai chat
For the web/mobile chat, add RunSpace as a custom connector in Settings → Connectors → Add custom connector. Cloud models only (a hosted server can't see your localhost).
Recommended — sign in (OAuth): use the plain URL, leave the OAuth fields blank. claude.ai walks you through a RunSpace sign-in (connect your wallet → approve); no key to paste.
https://run-space.com/mcp
Or key-in-URL: skip the sign-in by putting your key in the path. Quick, but the key is visible in your connector settings; if it leaks, revoke the key and re-add the connector.
https://run-space.com/mcp/rsk_live_YOURKEY
Then in chat: “list runspace models”, then “run a prompt on glm-5”.
Models
Open weights only, verified available. Pass the id as model.
Billing
Token-gated. Holding the token in your connected wallet unlocks a daily compute allowance by tier; it resets every day and is metered per token at wholesale-plus. No subscription, no top-ups. Calls return 402 once the day's allowance is spent.