You can now run Claude Code on GLM through BharatRouter

Claude Code is Anthropic’s coding client. GLM is an open-weight (MIT) model family. Until now those didn’t mix on BharatRouter — the gateway spoke the OpenAI wire, and Claude Code speaks Anthropic’s. That changed: BharatRouter now serves the Anthropic Messages API (/v1/messages) and translates non-Anthropic models on it. So you can point Claude Code at the gateway and run GLM as the brain — with one br- key in your client and GLM set once as platform BYOK.

How it works

Claude Code sends the Anthropic wire. The gateway translates each request to the OpenAI wire, routes it to a GLM host (Zhipu, Baseten or OpenRouter), then translates the response back into Anthropic messages — streaming and non-streaming both, with GLM’s reasoning surfaced as Anthropic thinking blocks alongside the text. We proved it end-to-end against real GLM-4.7: valid Anthropic responses, thinking + text blocks, stop_reason: end_turn, and translated token usage.

One key in the client, GLM on the platform

The auth model is deliberately simple. Your client carries only a BharatRouter br- key. The GLM provider key is saved once as BYOK on the dashboard — server-side, per org — at Provider keys. You never put a GLM key into Claude Code. Routing, residency and per-key ₹ budgets apply to every call, the same as any other request through the gateway.

Point Claude Code at it

Add this to ~/.claude/settings.json. The br- key rides x-br-api-key because Claude Code reserves Authorization for its own session. Default coding model is glm-4.7; glm-4.7-flash is a free tier:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.bharatrouter.com",
    "ANTHROPIC_CUSTOM_HEADERS": "x-br-api-key: br-...",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-4.7",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-4.7",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.7-flash"
  }
}

If no GLM key is saved on the platform yet, the gateway returns a clear error telling you to set BYOK at the dashboard — save it once and re-run, no client change needed. The full walkthrough is in the cookbook: Run Claude Code on GLM.

Honest about the trade-off

Is open GLM good enough to drive a real coding loop? On our 14-task, execution-checked benchmark, GLM held ~99% of frontier quality at roughly a third of the cost. The catch is latency: GLM’s deep reasoning makes it noticeably slower than a frontier model — the speed gap, not quality or price, is the real cost. That makes this a strong fit for cost-sensitive, quality-sensitive coding where you can spare the seconds, and a poor one for latency-critical, interactive work. See the benchmark for the numbers and method.

This is GLM — open weights, MIT-licensed — running through BharatRouter. Claude Code is Anthropic’s client; we describe wire compatibility, not endorsement. If you’d rather keep real Claude on this surface (subscription-first, with metered overflow), that path still exists: Govern Claude Code. And if you want an open coding agent that speaks the OpenAI wire instead, see Run an open coding agent on GLM.

Recipe and config: Run Claude Code on GLM → · Cookbook scripts: View on GitHub →

Was this helpful?