You can now run Claude Code on GLM through BharatRouter
Claude Code is Anthropic’s coding client. GLM is an open-weight (MIT) model family.
Until now those didn’t mix on BharatRouter — the gateway spoke the OpenAI wire, and
Claude Code speaks Anthropic’s. That changed: BharatRouter now serves the
Anthropic Messages API (/v1/messages) and translates
non-Anthropic models on it. So you can point Claude Code at the gateway and run
GLM as the brain — with one br- key in your client and
GLM set once as platform BYOK.
How it works
Claude Code sends the Anthropic wire. The gateway translates each request to the
OpenAI wire, routes it to a GLM host (Zhipu, Baseten or OpenRouter), then translates
the response back into Anthropic messages — streaming and non-streaming both,
with GLM’s reasoning surfaced as Anthropic thinking blocks alongside the
text. We proved it end-to-end against real GLM-4.7: valid Anthropic
responses, thinking + text blocks, stop_reason: end_turn,
and translated token usage.
One key in the client, GLM on the platform
The auth model is deliberately simple. Your client carries only a
BharatRouter br- key. The GLM provider key is saved
once as BYOK on the dashboard —
server-side, per org — at
Provider keys. You never
put a GLM key into Claude Code. Routing, residency and per-key ₹ budgets apply to every
call, the same as any other request through the gateway.
Point Claude Code at it
Add this to ~/.claude/settings.json. The br- key rides
x-br-api-key because Claude Code reserves Authorization for its
own session. Default coding model is glm-4.7; glm-4.7-flash is a
free tier:
{
"env": {
"ANTHROPIC_BASE_URL": "https://api.bharatrouter.com",
"ANTHROPIC_CUSTOM_HEADERS": "x-br-api-key: br-...",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-4.7",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-4.7",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.7-flash"
}
} If no GLM key is saved on the platform yet, the gateway returns a clear error telling you to set BYOK at the dashboard — save it once and re-run, no client change needed. The full walkthrough is in the cookbook: Run Claude Code on GLM.
Honest about the trade-off
Is open GLM good enough to drive a real coding loop? On our 14-task, execution-checked benchmark, GLM held ~99% of frontier quality at roughly a third of the cost. The catch is latency: GLM’s deep reasoning makes it noticeably slower than a frontier model — the speed gap, not quality or price, is the real cost. That makes this a strong fit for cost-sensitive, quality-sensitive coding where you can spare the seconds, and a poor one for latency-critical, interactive work. See the benchmark for the numbers and method.
This is GLM — open weights, MIT-licensed — running through BharatRouter. Claude Code is Anthropic’s client; we describe wire compatibility, not endorsement. If you’d rather keep real Claude on this surface (subscription-first, with metered overflow), that path still exists: Govern Claude Code. And if you want an open coding agent that speaks the OpenAI wire instead, see Run an open coding agent on GLM.
Recipe and config: Run Claude Code on GLM → · Cookbook scripts: View on GitHub →
Related: zero to a GLM coding agent in one command · can open GLM match a frontier model?