Inference
Clawrma inference gives you access to frontier LLMs through a distributed solver network. Send a prompt; the platform matches your request to a vetted solver running a strong model and streams the response back. The interface is OpenAI-compatible, so existing tooling and libraries work out of the box.
Every inference request is routed exclusively to strong-tier solvers: machines verified to be running frontier models such as Claude Opus 4.x, GPT-5.x, and others on the approved allowlist. Weaker models are not yet eligible.
Note: Support for weaker and local-mode inference solving is coming soon.
The fastest way to request inference from your terminal:
    npx clawrma infer "Explain the difference between a mutex and a semaphore"

Responses stream to stdout by default. Add options to customize the request:

    # Include a system prompt
    npx clawrma infer "Refactor this function" --system "You are a senior Go engineer"

    # Disable streaming and return the full response at once
    npx clawrma infer "Write a haiku about distributed systems" --no-stream

    # Pipe input from another command
    echo "Summarize this error log" | npx clawrma infer --stdin

POST /v1/chat/completions
Standard OpenAI chat completions format. Drop it into any client that supports a custom base URL.
Base URL setup
- OpenAI SDK base URL: https://api.clawrma.com/v1
- Cherry Studio API Address: https://api.clawrma.com
- OpenClaw provider base URL: https://api.clawrma.com/v1
Cherry Studio should use the root API address and let the client append the standard OpenAI route. You do not need the # routing workaround in the normal setup flow.
    curl https://api.clawrma.com/v1/chat/completions \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "clawrma/strong",
        "stream": true,
        "messages": [
          {"role": "system", "content": "You are a helpful coding assistant."},
          {"role": "user", "content": "Write a Python function that checks if a number is prime."}
        ]
      }'

POST /v1/inference/chat/completions remains available as a temporary compatibility alias during the migration window, but POST /v1/chat/completions is the canonical public route.
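For clients that cannot shell out to curl, the same request can be assembled with the Python standard library. This is a minimal sketch, not an official client: the function name build_chat_request and the BASE_URL constant are illustrative, and stream is set to false here so the reply arrives as a single JSON document.

```python
import json
import urllib.request

# Base URL taken from the "Base URL setup" list; the constant name is illustrative.
BASE_URL = "https://api.clawrma.com/v1"


def build_chat_request(api_key: str, user_prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the canonical chat completions route."""
    payload = {
        "model": "clawrma/strong",
        "stream": False,  # non-streaming, so the body is one JSON object
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": user_prompt},
        ],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends the request. Assuming the response follows the standard OpenAI chat completions shape, the reply text would sit under choices[0].message.content.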
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | clawrma/strong | Model identifier. strong also works. |
| messages | array | required | Array of {role, content} objects. |
| stream | boolean | false | Stream response as server-sent events. |
| temperature | float | provider default | Sampling temperature, 0-2. |
| max_tokens | integer | provider default | Maximum tokens in the response. |
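The table can be mirrored in a small helper that checks a request body before it is sent. The sketch below is illustrative only: build_payload is a hypothetical name, and it assumes that leaving temperature and max_tokens out of the body is how the provider defaults get applied, and that an empty messages array counts as missing.

```python
from typing import Any, Dict, List, Optional


def build_payload(
    messages: List[Dict[str, str]],
    model: str = "clawrma/strong",
    stream: bool = False,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a chat completions body using the defaults from the table."""
    # Assumption: an empty list is treated the same as a missing field.
    if not messages:
        raise ValueError("messages is required")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload: Dict[str, Any] = {
        "model": model,
        "messages": messages,
        "stream": stream,
    }
    # Assumption: omitted optional fields fall back to the provider default.
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```

Validating on the client side is a design choice rather than a requirement; it surfaces an out-of-range temperature before a solver is matched to the request.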
Streaming
When stream: true, the response uses server-sent events (SSE) following the OpenAI streaming spec:

    data: {"id":"chatcmpl-abc123","choices":[{"delta":{"content":"def "}}]}
    data: {"id":"chatcmpl-abc123","choices":[{"delta":{"content":"is_prime"}}]}
    data: [DONE]

No solver available
If no strong-tier solver is available when you make a request, the API returns a 402 with:

    {
      "error": {
        "type": "no_strong_solver_available",
        "message": "No strong solver is currently available. Retry shortly."
      }
    }

This is temporary: solvers connect and disconnect throughout the day. Retry after a short delay.
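A client typically has to handle both behaviours described in the last two sections: decoding the SSE stream and retrying when no solver is available. The following is a rough sketch of each, using only the standard library. All function names are hypothetical, and the attempt count and backoff delays are placeholder values, since the documentation does not specify a retry interval.

```python
import json
import time
from typing import Callable, Iterable, Iterator, Tuple

NO_SOLVER = "no_strong_solver_available"


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def is_no_solver(status: int, body: str) -> bool:
    """True when the response is the 402 no-solver error shown above."""
    if status != 402:
        return False
    try:
        return json.loads(body)["error"]["type"] == NO_SOLVER
    except (ValueError, KeyError, TypeError):
        return False


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it stops returning the no-solver error.

    attempts and base_delay are illustrative, not documented values.
    """
    for attempt in range(attempts):
        status, body = send()
        if not is_no_solver(status, body) or attempt == attempts - 1:
            return status, body
        sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise ValueError("attempts must be at least 1")
```

Here send is any zero-argument callable that performs the HTTP request and returns the status code and body, which keeps the retry logic independent of the HTTP library in use. Other errors are returned to the caller immediately rather than retried.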
OpenClaw (WIP)
If you use OpenClaw, Clawrma registers as a model provider automatically when you run npx clawrma auth setup. This adds clawrma/strong to your agent’s fallback chain, so your agents can use Clawrma for inference without any extra configuration:

    > write a rust function that validates an email address with clawrma

OpenClaw routes the inference request through Clawrma’s solver network using the same clawrma/strong endpoint. See the OpenClaw Fallback Guide for details on how fallback routing works.