Docs

RunSpace is an OpenAI-compatible API for open models, plus connectors that let Claude (Code, Desktop, or the chat app) reach those models and your own machine. Base URL: https://run-space.com (the API is served under /v1).

Get a key

Connect your wallet, then go to Dashboard → API Keys → Create key. You see the key (rsk_live_…) once — copy it. Every request authenticates with it.

export RUNSPACE_KEY=rsk_live_...

The API

OpenAI-compatible chat completions. Point any OpenAI client at https://run-space.com/v1; streaming works the same way.

curl

curl https://run-space.com/v1/chat/completions \
  -H "Authorization: Bearer $RUNSPACE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.3-70b","messages":[{"role":"user","content":"hi"}]}'

python (openai sdk)

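import os

from openai import OpenAI

# Python does not expand "$RUNSPACE_KEY" inside a string; read the exported variable instead.
client = OpenAI(base_url="https://run-space.com/v1", api_key=os.environ["RUNSPACE_KEY"])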

resp = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "ship it"}],
)
print(resp.choices[0].message.content)

List the model ids you can pass:

curl https://run-space.com/v1/models -H "Authorization: Bearer $RUNSPACE_KEY"
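
To fetch the ids from code instead, here is a minimal stdlib sketch. It assumes the /v1/models response follows OpenAI's list shape ({"object": "list", "data": [{"id": ...}]}), which this page implies but does not spell out. The names model_ids and fetch_model_ids are illustrative, not part of any RunSpace SDK.

```python
import json
import urllib.request

MODELS_URL = "https://run-space.com/v1/models"


def model_ids(payload: dict) -> list[str]:
    """Pull the ids out of an OpenAI-style model list (assumed response shape)."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(key: str) -> list[str]:
    """GET /v1/models with the bearer key and return the ids."""
    req = urllib.request.Request(MODELS_URL, headers={"Authorization": f"Bearer {key}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return model_ids(json.load(resp))


# Usage (hits the network, needs a valid key):
#   import os
#   print(fetch_model_ids(os.environ["RUNSPACE_KEY"]))
```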

Streaming, errors & limits

Streaming works like OpenAI — set "stream": true and read the SSE chunks. Errors come back with standard codes:

401 — invalid or revoked key
402 — out of credits (daily allowance spent)
429 — rate limited (per key)
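
One way a client might act on these three codes is sketched below. The action names and the exponential backoff schedule are assumptions for illustration; this page does not prescribe a retry policy or mention a Retry-After header.

```python
def next_step(status: int) -> str:
    """Map an HTTP status from the API to what the caller should do next."""
    if status == 401:
        return "fix-key"         # invalid or revoked key: retrying will not help
    if status == 402:
        return "wait-for-reset"  # daily allowance spent: try again after the reset
    if status == 429:
        return "back-off"        # per-key rate limit: pause, then retry
    if 200 <= status < 300:
        return "ok"
    return "raise"               # anything else: surface the error


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delays in seconds for retrying a 429: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

Only 429 is worth retrying automatically: a 401 needs a new key, and a 402 will not clear until the daily allowance resets.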

Output is capped per call (a server-side max_tokens ceiling) so a single request can't blow the day's allowance. Usage is metered per token, wholesale-plus.
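
For the streaming case, here is a sketch of reading the SSE chunks without an SDK. It assumes the stream uses OpenAI's wire format (lines of the form data: {json}, text under choices[].delta.content, and a closing data: [DONE]). The helper name iter_text is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_text(sse_lines: Iterable[str]) -> Iterator[str]:
    """Yield the text deltas from OpenAI-style SSE lines (assumed wire format)."""
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Feed it the decoded lines of a response to a request that set "stream": true, and join or print the pieces as they arrive.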

Use with Claude Code

Run Claude Code on open models, and let it hand background work to a local model. Install the router and point Claude Code through it:

ollama pull qwen2.5-coder:7b
npm i -g @musistudio/claude-code-router

Drop in this config (swap in your key), then run ccr code:

// ~/.claude-code-router/config.json
{
  "Providers": [
    { "name": "runspace", "api_base_url": "https://run-space.com/v1/chat/completions",
      "api_key": "$RUNSPACE_KEY",
      "models": ["llama-3.3-70b","qwen3-235b","deepseek-v4"] },
    { "name": "ollama", "api_base_url": "http://localhost:11434/v1/chat/completions",
      "api_key": "ollama", "models": ["qwen2.5-coder:7b"] }
  ],
  "Router": {
    "default": "runspace,llama-3.3-70b",
    "background": "ollama,qwen2.5-coder:7b",
    "longContext": "runspace,qwen3-235b"
  }
}

Background and sub-agent tasks run on your machine; reasoning and long context go to the cloud. Force a route with /model ollama,qwen2.5-coder:7b.
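
A typo in a route is easy to miss, so a quick sanity check before running ccr code can help. The sketch below assumes only what the config above shows: each Router value is a "provider,model" string that should match a declared provider and one of its models. check_routes is a hypothetical helper, not part of claude-code-router.

```python
def check_routes(config: dict) -> list[str]:
    """Return one message per Router entry naming an undeclared provider or model."""
    declared = {p["name"]: set(p.get("models", [])) for p in config.get("Providers", [])}
    problems = []
    for route, target in config.get("Router", {}).items():
        if not isinstance(target, str):
            continue  # skip non-route settings such as numeric thresholds
        provider, _, model = target.partition(",")
        if provider not in declared:
            problems.append(f"{route}: unknown provider {provider!r}")
        elif model not in declared[provider]:
            problems.append(f"{route}: {provider!r} does not list model {model!r}")
    return problems
```

An empty list means every route resolves. Pass it the parsed config as a dict; if you load it with json.load, remove the // path comment line first, since strict JSON rejects comments.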

Use with Claude Desktop (local + cloud)

The RunSpace MCP server gives Claude Desktop two tools — list_models and run_model — so it can call cloud open models and your local Ollama on demand. Add it to your Desktop config and restart:

// ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "runspace": {
      "command": "node",
      "args": ["/absolute/path/to/runspace-mcp/index.js"],
      "env": { "RUNSPACE_KEY": "rsk_live_..." }
    }
  }
}

Then ask: “use runspace to run X on deepseek-v4”, or “…on local/qwen2.5-coder:7b” to run it on your machine (it never leaves your network). Local routing requires Ollama running with the model pulled.
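
If your Desktop config already lists other MCP servers, merge the runspace entry rather than overwriting the file. A minimal sketch, assuming the config is plain JSON with the mcpServers layout shown above; add_runspace_server and update_config_file are illustrative names.

```python
import json
from pathlib import Path


def add_runspace_server(config: dict, index_js: str, key: str) -> dict:
    """Return a copy of the Desktop config with the runspace entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["runspace"] = {
        "command": "node",
        "args": [index_js],  # absolute path to runspace-mcp/index.js
        "env": {"RUNSPACE_KEY": key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, index_js: str, key: str) -> None:
    """Read the config (or start empty), merge the entry, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_runspace_server(current, index_js, key)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Existing servers and unrelated settings pass through untouched; restart Claude Desktop after writing the file.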

Use with claude.ai chat

For the web/mobile chat, add RunSpace as a custom connector in Settings → Connectors → Add custom connector. Cloud models only (a hosted server can't see your localhost).

Recommended — sign in (OAuth): use the plain URL, leave the OAuth fields blank. claude.ai walks you through a RunSpace sign-in (connect your wallet → approve); no key to paste.

https://run-space.com/mcp

Or key-in-URL: skip the sign-in by putting your key in the path. Quick, but the key is visible in your connector settings — revoke and re-add if it leaks.

https://run-space.com/mcp/rsk_live_YOURKEY

Then in chat: “list runspace models”, then “run a prompt on glm-5”.

Models

Open weights only, verified available. Pass the id as model.

llama-3.3-70b · Llama 3.3 70B Instruct · 131k
qwen3-235b · Qwen3 235B-A22B · 262k
deepseek-v4 · DeepSeek V4 Pro · 512k
kimi-k2 · Kimi K2.6 · 262k
glm-5 · GLM-5 · 202k
gpt-oss-120b · gpt-oss 120B · 131k
gpt-oss-20b · gpt-oss 20B · 131k

Billing

Token-gated. Holding the token in your connected wallet unlocks a daily compute allowance by tier; it resets every day and is metered per token at wholesale-plus. No subscription, no top-ups. Calls return 402 once the day's allowance is spent.