India's AI inference gateway
Indic and global models behind one API, with data-residency you can enforce per request, INR billing, and routing that optimizes for price, latency or uptime. Swap one line, keep your code — and it’s built for AI agents as much as for people.
50 providers · 1000s of models · Indic & global · ₹ INR billing · residency enforced
Sangam · ensemble routing
Ask once; a panel of models deliberates in parallel and a synthesizer reconciles one best answer. Open-weight, reproducible, and BYOK: run it on your own provider keys.
An open-weight panel — Qwen2.5 · Llama 3.1 · Qwen2.5-VL — answers in parallel; transparent and reproducible.
A synthesizer resolves consensus and contradictions into a single best reply.
Pin every stage to India with data_policy: india_only. Or run the panel anywhere, switch to a frontier panel with bharatrouter/sangam, or bring your own keys.
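As a rough picture of the panel-then-synthesizer pattern described above, here is a minimal sketch. It is not Sangam's implementation: `ensemble`, `majorityVote`, and the `{ name, ask }` panel shape are hypothetical names invented for illustration, and the majority vote is a toy stand-in for a real synthesizer, which would more likely be another model call.

```javascript
// Illustrative only: a panel answers in parallel, then a synthesizer reconciles.
// `panel` items and `synthesize` are hypothetical stand-ins, not BharatRouter APIs.
async function ensemble(prompt, panel, synthesize) {
  // allSettled lets one failing panelist drop out without sinking the whole run.
  const settled = await Promise.allSettled(panel.map((m) => m.ask(prompt)));
  const answers = settled
    .map((s, i) => ({ model: panel[i].name, status: s.status, text: s.value }))
    .filter((a) => a.status === "fulfilled")
    .map(({ model, text }) => ({ model, text }));
  if (answers.length === 0) throw new Error("every panel model failed");
  return synthesize(prompt, answers);
}

// Toy synthesizer: pick the most common answer after trimming and lowercasing.
function majorityVote(_prompt, answers) {
  const counts = new Map();
  for (const { text } of answers) {
    const key = text.trim().toLowerCase();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let best = null;
  for (const [key, n] of counts) {
    if (best === null || n > best.n) best = { key, n };
  }
  return best.key;
}
```

Passing the synthesizer in as a function keeps the parallel fan-out separate from the reconciliation step, so either half can be swapped without touching the other.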
Agent identity · Ekam
An API key is a long-lived shared secret — fine for one person, wrong for a fleet of agents spun up and torn down all day. On BharatRouter an agent authenticates with a verifiable, short-lived identity issued by Ekam, Krutrim's agent-identity control plane — no static key sitting in an env var. And the default takes zero setup: create an agent and we provision the identity for you.
A minutes-long token, minted per run and auto-expiring. Nothing durable to leak — one stolen secret isn't a standing breach.
Kill the identity and every live token dies cluster-wide — not one API key at a time, by hand.
Spend, rate limits and budgets resolve to this agent acting for this org — not one shared key you can't tell apart.
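To make the difference from a static key concrete, here is a small in-memory model of short-lived tokens with central revocation. It is a sketch under stated assumptions: `IdentityRegistry` and its methods are hypothetical and do not reflect Ekam's actual API, and real tokens would be cryptographically signed rather than plain objects.

```javascript
// Hypothetical in-memory model of short-lived, centrally revocable identities.
// None of these names come from Ekam; they only illustrate the idea.
class IdentityRegistry {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.revoked = new Set(); // agent ids whose identity has been killed
  }

  // Mint a token for one run; it carries its own expiry and its org attribution.
  mint(agentId, orgId, now = Date.now()) {
    if (this.revoked.has(agentId)) throw new Error("identity revoked");
    return { agentId, orgId, expiresAt: now + this.ttlMs };
  }

  // Revoking the identity invalidates every outstanding token for that agent.
  revoke(agentId) {
    this.revoked.add(agentId);
  }

  // A token is valid only while unexpired and while its identity is still alive.
  isValid(token, now = Date.now()) {
    return now < token.expiresAt && !this.revoked.has(token.agentId);
  }
}
```

Because validity is checked against the identity rather than the token, one `revoke` call covers every token the agent ever minted, and each token still names the agent and org it acts for.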
Three steps
Point your OpenAI base_url at BharatRouter and keep your SDK and code exactly as they are.
Add data_policy or optimize and we pick the best healthy route, failing over automatically when one errors.
Prepaid credits at per-token rates — ₹100 free to start, no card on file, no surprise bills.
One knob · optimize
Set optimize once as a default or override it per request. Every mode reorders the same pool of healthy routes by what you care about, and data_policy: india_only sits on top as a hard filter, never a trade-off.
Lowest ₹/Mtok route first. The default — most calls don’t need anything fancier.
Orders by observed latency, so interactive paths stay snappy; untried routes still get a turn.
Favours the lowest failure rate — sticks with whatever has been answering cleanly.
Reliability first, then latency and price trade off against each other. One knob, sensible defaults.
The full scoring and the auto formula are in the routing docs.
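The "hard filter first, then reorder" idea can be sketched as below. This is an illustration, not the gateway's scoring: the route fields (`pricePerMtok`, `p50LatencyMs`, `failureRate`, `inIndia`, `healthy`) are assumed names, and only `"price"` appears as a literal value in the examples on this page, so `"latency"` and `"uptime"` as mode strings are assumptions too. The auto mode and the turn given to untried routes are left out because their formula is only in the routing docs.

```javascript
// Illustrative ranking only; the real scoring lives in the routing docs.
// Field names and the "latency" / "uptime" mode strings are assumptions.
function rankRoutes(routes, { optimize = "price", dataPolicy } = {}) {
  // Residency is a hard filter applied before any ordering happens.
  const pool = routes.filter(
    (r) => r.healthy && (dataPolicy !== "india_only" || r.inIndia)
  );
  const keyFor = {
    price: (r) => r.pricePerMtok, // lowest price first
    latency: (r) => r.p50LatencyMs, // lowest observed latency first
    uptime: (r) => r.failureRate, // lowest failure rate first
  };
  const key = keyFor[optimize];
  if (!key) throw new Error(`unsupported optimize mode: ${optimize}`);
  // filter() already returned a fresh array, so sorting does not mutate the input.
  return pool.sort((a, b) => key(a) - key(b));
}
```

Filtering before sorting is what makes residency a constraint rather than a preference: a cheaper or faster route outside India never enters the pool, so no mode can pick it.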
Drop-in replacement
Point your base URL at BharatRouter and add data_policy and optimize.
Same call from any stack.
# curl
curl https://api.bharatrouter.com/v1/chat/completions \
  -H "Authorization: Bearer br-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "messages": [{"role":"user","content":"namaste"}],
    "data_policy": "india_only",
    "optimize": "price"
  }'

// Node: npm i openai
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.bharatrouter.com/v1",
  apiKey: "br-...",
});
const r = await client.chat.completions.create({
  model: "qwen2.5-7b-instruct",
  messages: [{ role: "user", content: "namaste" }],
  data_policy: "india_only", optimize: "price", // extras
});

Disable a provider mid-stream and watch requests re-route to a healthy model (Krutrim, a global provider, or a self-hostable endpoint you control) in milliseconds. No code change, no downtime, no foreign kill-switch wired into your roadmap.
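The re-routing described above happens inside the gateway, so your client code does nothing special. Purely to show the shape of the behaviour, here is a minimal sketch of ordered failover. `callWithFailover` and the `{ id, send }` route shape are hypothetical, and circuit-breaker state such as cooldowns is not modelled.

```javascript
// Hypothetical picture of ordered failover; the gateway does this server-side.
// `routes` is an already-ranked list of { id, send } objects (assumed shape).
async function callWithFailover(routes, request) {
  const errors = [];
  for (const route of routes) {
    try {
      const reply = await route.send(request);
      // First route that answers cleanly wins; later routes are never tried.
      return { routeId: route.id, reply };
    } catch (err) {
      // Remember why this route failed, then fall through to the next one.
      errors.push(`${route.id}: ${err.message}`);
    }
  }
  throw new Error(`all routes failed (${errors.join("; ")})`);
}
```

Returning the `routeId` alongside the reply makes it visible which route actually served a request, which helps when checking that a disabled provider is really being skipped.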
The whole gateway
Send data_policy: india_only and requests route only to India-resident endpoints — enforced per request, not promised in a policy doc.
Optimize each call for price, latency or uptime; pin a provider when you must; automatic failover with circuit breakers when you don’t.
Shareable, versioned routing chains — publish, fork and import a tested fallback recipe into your routing in one call.
Plug your own OpenAI-compatible deployment in as a ₹0 fallback step, admitted only after an automated compliance test.
Save a Krutrim or OpenAI key once — encrypted, never shown again — and ride your own account and rates. Free during beta.
Transcribe (Parakeet) and synthesize (Kokoro) speech behind the same API — first-party, India-resident, INR-priced.
Invite teammates with roles, share one prepaid wallet, and group keys and usage by environment.
An MCP server, llms.txt, OpenAPI, a live-health catalog, and ephemeral scoped keys your agent mints itself.
Agents authenticate with a verifiable, short-lived Ekam identity — central revocation, per-agent attribution, billed to the right org. Zero setup by default.
New here? The guides walk through each of these in under two minutes.
Bring your own keys
Save a provider key once and requests ride your own account and rates, with our routing and failover on top. Fallback providers can ride our keys when yours hit limits; the rest route through your key only. Free during beta.
Open GLM matched frontier coding quality at about a third of the cost in our benchmark. Real numbers, reproducible on GitHub.
GLM 99% vs Opus & GPT-5.5 100%, at ~⅓ the cost — frontier wins on speed. 14 tasks × 10 runs, fully reproducible.
Baseten vs OpenRouter vs Zhipu — throughput, cost, and reliability via failover routing.
One command — installs the agent, sets up your keys, runs a first query.
One API for Indic and global models — with residency you can prove, INR pricing, and routing you control.