Circuit breaker resilience · Kimchi CLI-native · RTK token compression · 40+ providers · 214 tests
🚀 Quick Start • 💡 Features • 🔄 Architecture • 📊 Comparison • 🧭 Beginner Guide • 🙏 Credits
🇮🇩 Bahasa Indonesia • 🇻🇳 Tiếng Việt • 🇨🇳 中文 • 🇯🇵 日本語 • 🇷🇺 Русский
Stop wasting money, tokens and hitting limits:
- ❌ Subscription quota expires unused every month
- ❌ Rate limits stop you mid-coding
- ❌ Tool outputs (git diff, grep, ls...) burn tokens fast
- ❌ 1 provider error → all accounts blocked
- ❌ Manual reactivation after quota reset
VansRoute solves this:
- ✅ RTK Token Saver — Auto-compress tool_result content, save 20-40% tokens
- ✅ Circuit Breaker — If a provider dies, trip the breaker and route to the next one. No cascade.
- ✅ Account Semaphore — Per-account concurrency limiter prevents hammering one account
- ✅ Kimchi CLI-native — VansRoute's Kimchi provider mimics the official Kimchi CLI exactly
- ✅ Quota Auto-Reactivation — Kimchi credits run out → auto-parked until next month → auto-reactivated
- ✅ Proxy-Aware Resilience — One dead proxy doesn't block accounts on other proxies
- ✅ Per-API-key ACL — Hand out keys that can only touch specific providers/combos/kinds
- ✅ 40+ Providers — Kimchi, GitHub Copilot, Codex, Gemini, OpenCode, Kilo, NVIDIA, AgentRouter, + more
| Feature | 9Router | OmniRoute | VansRoute |
|---|---|---|---|
| Circuit breaker | ❌ | ✅ TypeScript + DB persistence | ✅ JS, in-memory (no DB dependency) |
| Account semaphore | ❌ | ✅ TypeScript | ✅ JS, ported + proxy-aware |
| Provider-level failure tracking | ❌ | ✅ with 5s dedup | ✅ ported, with dedup bound 10K |
| Provider exhaustion detection | ❌ | ✅ isProviderExhaustedReason() |
✅ ported + tightened regex |
| 429 excluded from breaker | ❌ N/A | ❌ (429 counts) | ✅ Only 5xx/timeout counts |
| Proxy-aware resilience | ❌ | ❌ | ✅ per-proxy breaker + semaphore |
| Kimchi CLI alignment | ❌ | ❌ | ✅ 5 models, per-model caps from models.dev |
| Kimchi quota auto-reactivation | ❌ | ❌ | ✅ monthly auto-reset via instrumentation hook |
| AgentRouter provider | ❌ | ✅ | ✅ |
| Model lockout | ✅ DB flat field | ✅ in-memory Map | ✅ DB flat field (inherited) |
| RTK + Caveman + Ponytail | ✅ | ✅ | ✅ inherited |
| NVIDIA Kimi stream coercion | ✅ | ✅ | ✅ inherited |
| Per-API-key ACL | ✅ fork-only | ❌ | ✅ inherited |
| Format translation | ✅ OpenAI↔Claude↔Gemini↔Kiro | ✅ | ✅ inherited |
| Kimi native tool parser | ✅ | ✅ | ✅ inherited + hardened |
| Combo strategies | 4 (fallback/RR/fusion/capacity) | 17 | 4 (inherited) |
| Settings cache (TPS) | ❌ (3 sync DB reads/req) | ❌ | ✅ 5s TTL cache |
| Connections cache (TPS) | ❌ (1 sync DB read/req) | ❌ | ✅ 2s TTL cache + invalidation |
| Per-provider mutex | ❌ (global mutex) | ❌ | ✅ per-provider parallel selection |
| Provider count | 40+ | 231+ | 40+ + AgentRouter |
- Kimchi CLI alignment — 9router's Kimchi provider masquerades as the official Kimchi CLI, with exactly the same 5 models, capabilities, and temperature rules
- Kimchi quota auto-reactivation — accounts deactivated due to quota exhaustion automatically reactivate at the 1st of each month
- In-memory circuit breaker without DB — simpler than OmniRoute's DB-backed version, no
domainState.jsdependency - Proxy-aware resilience — circuit breaker and semaphore keyed per
provider:proxyHashso one dead proxy doesn't block others - TPS optimization — cached settings + cached connections + per-provider mutex = fewer sync DB reads per request
- OmniRoute's 17 combo strategies (VansRoute has 4)
- OmniRoute's
sessionPoolwith fingerprint rotation - OmniRoute's
autoCombowith complexity routing and task fitness scoring - OmniRoute's 231 providers (VansRoute has 40+)
┌─────────────┐
│ Your CLI │ (Claude Code, Codex, Cursor, Cline, OpenCode...)
│ Tool │
└──────┬──────┘
│ POST http://localhost:20127/v1/chat/completions
↓
┌──────────────────────────────────────────────────────────┐
│ VansRoute Engine │
│ │
│ 1. Auth & ACL check (cached API key validation) │
│ 2. Circuit Breaker check (skip dead proxy buckets) │
│ 3. Account Semaphore (queue if at concurrency cap) │
│ 4. RTK / Caveman / Ponytail token compression │
│ 5. Format translation (OpenAI ↔ Claude ↔ Gemini ↔ Kiro) │
│ 6. Kimchi CLI-aligned param stripping │
│ 7. Executor → upstream provider (via proxy if configured) │
│ 8. Response translation back to client format │
│ │
│ On success: clearProviderFailure() + clearAccountError │
│ On error: recordProviderFailure() → breaker counts │
│ markAccountUnavailable() → try next account │
│ Kimchi quota? → deactivate until month end │
└──────────────────────────────────────────────────────────┘
│
├─→ Kimchi (5 CLI models, quota auto-reactivation)
├─→ AgentRouter ($200 free credits, passthrough)
├─→ GitHub Copilot / Codex (subscription tier)
├─→ Gemini CLI / OpenCode (free tier)
├─→ OpenRouter / NVIDIA / others
└─→ Combo (fallback / round-robin / fusion / capacity)
git clone https://github.com/Vanszs/VansRouter.git
cd VansRouter
pnpm install
pnpm run build
cp -r public .next/standalone/public
cp -r .next/static .next/standalone/.next/static
PORT=20127 node .next/standalone/server.jsOpen http://localhost:20127/dashboard, add provider connections, generate a VansRoute API key, and point your CLI to http://localhost:20127/v1.
npm install -g pm2
PORT=3003 pm2 start .next/standalone/server.js --name vansroute
pm2 savedocker run -d \
-p 20127:20127 \
-v vansroute-data:/home/node/.vansroute \
--name vansroute \
ghcr.io/vanszs/vansroute:latest🪟 Windows (from zero)
- Go to https://nodejs.org → download LTS (e.g.
node-v22.x.x-x64.msi) - Run installer, check "Add to PATH"
- Verify:
node --versionandnpm --version
npm install -g pnpm
pnpm --versionDownload from https://git-scm.com/download/win, install with defaults.
git clone https://github.com/Vanszs/VansRouter.git
cd VansRouter
copy .env.example .env
pnpm installJWT_SECRET=ganti-dengan-string-acak-panjang
INITIAL_PASSWORD=passwordmu
DATA_DIR=C:\Users\NamaKamu\.vansroute
PORT=20127
NODE_ENV=production
NEXT_PUBLIC_BASE_URL=http://localhost:20127
pnpm run buildset PORT=20127
node .next\standalone\server.jsOpen: http://localhost:20127
npm install -g pm2 pm2-windows-startup
pm2 start .next\standalone\server.js --name vansroute
pm2-startup install
pm2 save🐧 Linux / WSL (from zero)
# Install Node.js 20+
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
# Install pnpm
npm install -g pnpm
# Clone
git clone https://github.com/Vanszs/VansRouter.git
cd VansRouter
cp .env.example .env
pnpm install
# Configure
nano .env # set JWT_SECRET, INITIAL_PASSWORD, PORT
# Build & run
pnpm run build
cp -r public .next/standalone/public
cp -r .next/static .next/standalone/.next/static
PORT=20127 node .next/standalone/server.jsnpm install -g pm2
PORT=20127 pm2 start .next/standalone/server.js --name vansroute
pm2 startup
pm2 save🐳 Docker
docker run -d \
-p 20127:20127 \
-v vansroute-data:/home/node/.vansroute \
--name vansroute \
ghcr.io/vanszs/vansroute:latestOr with docker-compose:
docker-compose up -d| Tool | Status | Config Path |
|---|---|---|
| Claude Code | ✅ Full support | ~/.claude/settings.json |
| Codex | ✅ Full support | ~/.codex/config.json |
| Cursor | ✅ Full support | Settings → Models → OpenAI API |
| Cline | ✅ Full support | VS Code extension settings |
| OpenCode | ✅ Full support | opencode.json |
| OpenClaw | ✅ Full support | CLI config |
| GitHub Copilot | ✅ Full support | hub.github.com |
| Gemini CLI | ✅ Full support | ~/.gemini/config.json |
| Antigravity | ✅ Full support | Antigravity settings |
| Copilot | ✅ Full support | VS Code extension |
| Kilo | ✅ Full support | Kilo settings |
| Cline | ✅ Full support | Cline settings |
| DeepSeek TUI | ✅ Full support | DeepSeek config |
| Jcode | ✅ Full support | Jcode settings |
| Droid | ✅ Full support | Droid config |
| Hermes | ✅ Full support | Hermes config |
| Provider | Free Quota | Models |
|---|---|---|
| Kimchi (CLI-aligned) | ~1.6B tokens/month | kimi-k2.7, minimax-m3, kimi-k2.6, nemotron-3-ultra-fp4, glm-5.2-fp8 |
| AgentRouter | $200 free credits | Passthrough (any model) |
| Kilo | Free tier | Various |
| OpenCode | Free tier | Various |
| Gemini CLI | Free tier | Gemini models |
| Provider | Auth | Models |
|---|---|---|
| GitHub Copilot | OAuth | Claude, GPT, Gemini |
| Codex | OAuth | GPT models |
| Antigravity | OAuth | Various |
| Gemini CLI | OAuth | Gemini models |
| Provider | Type | Notable Models |
|---|---|---|
| OpenRouter | Passthrough | 100+ models |
| NVIDIA NIM | Per-model | Kimi K2.6, Llama, GLM |
| SiliconFlow | Per-model | GLM, DeepSeek, Qwen |
| Z.AI | Per-model | GLM series |
| + 30+ more | Various | Various |
| Feature | What it does |
|---|---|
| Circuit Breaker | Per-proxy-bucket state machine: CLOSED → DEGRADED → OPEN → HALF_OPEN. After 5 failures (5xx/timeout only, not 429), provider is skipped until probe succeeds. |
| Account Semaphore | Per provider:account:proxyHash FIFO queue with 30s timeout. Default maxConcurrency = 3. |
| Provider Failure Tracking | recordProviderFailure() with 5s dedup per connection, bounded to 10K entries. |
| Proxy-Aware | Breaker and semaphore keyed by provider:proxyHash. One dead proxy doesn't block others. |
| Quota Auto-Reactivation | Kimchi accounts auto-deactivated on credit exhaustion, auto-reactivated on the 1st of each month. |
| Model | Reasoning | Vision | Temperature | Structured Output |
|---|---|---|---|---|
kimchi/kimi-k2.7 |
✅ toggle off | ✅ + video | ❌ (CLI: false) | ✅ |
kimchi/kimi-k2.6 |
✅ toggle | ✅ + video | ✅ | ✅ |
kimchi/minimax-m3 |
✅ | ✅ | ✅ | — |
kimchi/nemotron-3-ultra-fp4 |
❌ | ❌ | ✅ | — |
kimchi/glm-5.2-fp8 |
✅ (OpenAI) | ❌ | ✅ | — |
| Saver | What it does | Savings |
|---|---|---|
| RTK | Compresses tool_result content before sending upstream |
20-40% |
| Caveman | Instructs the model to respond in ultra-terse mode | 15-75% |
| Ponytail | Forces the laziest (simplest) code output | 10-50% |
| Headroom | External compression service (optional) | Variable |
| Strategy | Description |
|---|---|
| Fallback | Try provider A → if fail → try B → if fail → try C |
| Round Robin | Rotate across providers evenly |
| Fusion | Send to multiple providers in parallel, judge picks best response |
| Capacity | Auto-switch based on provider availability and load |
- Per-API-key ACL — scope each key to specific providers, combos, and model kinds
- Secured
/v1by default — safe to expose on a VPS - Kiro format support — with provider/combo/kind ACLs
- NVIDIA Kimi stream coercion — forces
stream:falseupstream, translates back to SSE - Kimi native tool parser — handles leaked
functions.NAME:ID {JSON}markup - Compact request handling — trims redundant context before dispatch
- Settings cache — 5s TTL, eliminates sync DB reads per request
- Connections cache — 2s TTL with invalidation on writes
| Tier | Providers | Cost | Best For |
|---|---|---|---|
| Free | Kimchi, AgentRouter, Kilo, OpenCode | $0 | Hobby / testing |
| Subscription | GitHub Copilot, Codex, Antigravity | $10-20/mo | Professional coding |
| Pay-per-use | OpenRouter, NVIDIA, SiliconFlow | $0.2-3/1M tokens | Production scale |
| Combo | Mix of above | $0-20/mo | Best of all worlds |
- Connect Claude Code to Kimchi (free) + GitHub Copilot (subscription)
- RTK saves 30% tokens on tool outputs
- Circuit breaker prevents downtime when Kimchi rate-limits
- Per-API-key ACL: dev keys can only use Kimchi (free), prod keys use Copilot
- Combo: fallback from Copilot → Kimchi → AgentRouter
- Usage dashboard tracks spend per key
- PM2 + standalone build = production-grade
- Proxy-aware resilience: if one proxy dies, others keep working
- Kimchi quota auto-reactivation: no manual babysitting
Why is VansRoute faster than 9Router?
VansRoute caches getSettings() (5s TTL) and getProviderConnections() (2s TTL), eliminating 4+ synchronous better-sqlite3 reads per request. The per-provider mutex (replacing the global mutex) allows parallel credential selection across different providers.
What happens when Kimchi credits run out?
The account is automatically deactivated (isActive: false, testStatus: "quota_exhausted", rateLimitedUntil set to the 1st of next month). On the 1st, reactivateExpiredKimchiAccounts() runs via the instrumentation hook and reactivates all exhausted accounts. No manual intervention needed.
What happens when a proxy dies?
The circuit breaker for that specific proxy bucket (provider:proxyHash) trips OPEN after 5 failures. Accounts using that proxy are skipped. Accounts on other proxies continue working normally. The breaker automatically enters HALF_OPEN after the reset timeout to probe recovery.
Does 429 trip the circuit breaker?
No. 429 (rate limit) is per-account, not per-provider. Only 408, 500, 502, 503, 504 count toward the circuit breaker. This prevents false trips when multiple accounts hit rate limits simultaneously.
How do I add a new provider?
- Create
open-sse/providers/registry/{providerId}.jsfollowing theREGISTRY_TEMPLATE - Add models to the provider entry
- Add the provider ID to
src/shared/constants/providers.js - Regenerate the registry index if needed
Is VansRoute compatible with 9Router configs?
Yes. VansRoute retains full format compatibility with 9Router. The DB schema, API endpoints, and CLI tool configurations are identical. You can migrate by copying your .vansroute/ data directory.
# Required
JWT_SECRET=your-random-secret-at-least-32-chars
INITIAL_PASSWORD=your-dashboard-password
PORT=20127
# Optional
DATA_DIR=~/.vansroute # Default: ~/.vansroute
NODE_ENV=production
NEXT_PUBLIC_BASE_URL=http://localhost:20127
ANTIGRAVITY_TIMEOUT_MS=15000 # Antigravity quota fetch timeoutpnpm install
pnpm run build
cp -r public .next/standalone/public
cp -r .next/static .next/standalone/.next/static
PORT=20127 node .next/standalone/server.js- Open
http://localhost:20127/masuk(login) - Go to Providers → add API keys / OAuth
- Go to Settings → generate VansRoute API key
- Go to CLI Tools → configure your tool (Claude Code, Codex, etc.)
- Point your CLI to
http://localhost:20127/v1
VansRoute's Kimchi provider exposes exactly 5 models matching the Kimchi CLI:
{
"model": "kimchi/kimi-k2.6",
"baseURL": "http://localhost:20127/v1",
"apiKey": "your-vansroute-key"
}When credits run out, accounts are auto-parked until the 1st of next month.
Create a combo in the dashboard to enable fallback:
combo/my-fallback:
1. kimchi/kimi-k2.6 (free)
2. openrouter/anthropic/claude-3.5-sonnet (paid)
3. nvidia/moonshotai/kimi-k2.6(high) (free)
| Model ID | Family | Context | Max Output | Reasoning |
|---|---|---|---|---|
kimchi/kimi-k2.7 |
kimi-k2 | 262K | 262K | ✅ (toggle off) |
kimchi/kimi-k2.6 |
kimi-k2 | 262K | 262K | ✅ (toggle) |
kimchi/minimax-m3 |
minimax | 1M | 512K | ✅ |
kimchi/nemotron-3-ultra-fp4 |
nemotron | 128K | 8K | ❌ |
kimchi/glm-5.2-fp8 |
glm | 128K | 128K | ✅ (OpenAI) |
| Model ID | Upstream ID | Max Tokens |
|---|---|---|
nvidia/moonshotai/kimi-k2.6(none) |
moonshotai/kimi-k2.6 | 16384 |
nvidia/moonshotai/kimi-k2.6(low) |
moonshotai/kimi-k2.6 | 16384 |
nvidia/moonshotai/kimi-k2.6(medium) |
moonshotai/kimi-k2.6 | 16384 |
nvidia/moonshotai/kimi-k2.6(high) |
moonshotai/kimi-k2.6 | 16384 |
Any model ID works — AgentRouter routes to the upstream provider.
| Error | Cause | Solution |
|---|---|---|
EADDRINUSE |
Port already in use | Change PORT=20128 or kill the process |
EACCES |
Permission denied | Run as admin (Windows) or check folder permissions |
Model not found |
Wrong model ID | Check /v1/models for available models |
expanded is not defined |
— | ✅ Fixed in VansRoute |
handleViewDetail is not defined |
— | ✅ Fixed in VansRoute |
| Kimchi always 503 | All accounts rate-limited | Wait for cooldown or add more accounts |
| Kimchi quota exhausted | Credits used up | Auto-reactivates on the 1st of next month |
| Provider circuit breaker open | 5+ failures (5xx) | Wait 30s for HALF_OPEN probe |
| Layer | Technology |
|---|---|
| Framework | Next.js 16 (standalone output) |
| Runtime | Node.js 20+ |
| Database | better-sqlite3 (synchronous, cached) |
| Process Manager | PM2 |
| Container | Docker / docker-compose |
| Testing | Vitest (214 tests) |
| Language | JavaScript ESM (.js) |
| Package Manager | pnpm |
POST /v1/chat/completions
Authorization: Bearer <your-vansroute-key>
Content-Type: application/json
{
"model": "kimchi/kimi-k2.6",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}GET /v1/models
Authorization: Bearer <your-vansroute-key>POST /v1/embeddings
Authorization: Bearer <your-vansroute-key>
Content-Type: application/json
{
"model": "nvidia/nv-embedqa-e5-v5",
"input": "text to embed"
}POST /v1/images/generations
Authorization: Bearer <your-vansroute-key>
Content-Type: application/json
{
"model": "openai/gpt-image-1",
"prompt": "a cat"
}POST /v1/messages
Authorization: Bearer <your-vansroute-key>
Content-Type: application/json
{
"model": "kimchi/kimi-k2.6",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Hello"}]
}# All tests
pnpm test
# Unit tests only
pnpm test tests/unit/
# Translator tests
pnpm test tests/translator/
# Targeted resilience + Kimchi audit suite (214 tests)
npx vitest run --config tests/vitest.config.js --pool=forks \
tests/unit/kimi-max-tokens.test.js \
tests/unit/kimchi-cli-derived.test.js \
tests/unit/kimchi-cli-config.test.js \
tests/unit/kimchi-quota-reactivation.test.js \
tests/unit/loop-guard.test.js \
tests/unit/loop-guard-wiring.test.js \
tests/unit/termination-prompt.test.js \
tests/unit/kimi-nvidia-hardening.test.js \
tests/unit/kimi-native-tool-parser.test.js \
tests/translator/thinking-unified.test.js \
tests/unit/circuit-breaker.test.js \
tests/unit/account-semaphore.test.js \
tests/unit/proxy-aware-resilience.test.jsVansRouter/
├── src/
│ ├── shared/utils/
│ │ └── circuitBreaker.js # Circuit breaker (CLOSED→OPEN→HALF_OPEN)
│ ├── sse/
│ │ ├── handlers/chat.js # Retry loop with breaker + semaphore
│ │ └── services/
│ │ └── kimchiQuotaReactivation.js # Monthly auto-reactivation
│ ├── lib/
│ │ ├── network/connectionProxy.js # getProxyHash() for proxy-aware resilience
│ │ └── db/repos/
│ │ ├── settingsRepo.js # Cached getSettings() (5s TTL)
│ │ └── connectionsRepo.js # Cached getProviderConnections() (2s TTL)
│ ├── instrumentation.js # Next.js startup hook
│ └── app/ # Dashboard + API routes
├── open-sse/
│ ├── services/
│ │ ├── accountFallback.js # + provider failure tracking (proxy-aware)
│ │ └── accountSemaphore.js # Per-account concurrency limiter (proxy-aware)
│ ├── utils/
│ │ └── circuitBreaker.js # Circuit breaker
│ ├── providers/registry/
│ │ ├── kimchi.js # 5 CLI-aligned models
│ │ └── agentrouter.js # Passthrough ($200 free)
│ └── handlers/chatCore.js # Core SSE engine
├── tests/unit/
│ ├── circuit-breaker.test.js # 16 tests
│ ├── account-semaphore.test.js # 17 tests
│ ├── proxy-aware-resilience.test.js # 16 tests
│ ├── kimchi-cli-config.test.js # 10 tests
│ ├── kimchi-quota-reactivation.test.js # 10 tests
│ └── ... # Full audit suite
└── .docs/audit/ # Audit documentation
- GitHub Issues: Vanszs/VansRouter/issues
VansRoute builds on the work of two excellent open-source projects:
-
9Router by @decolua — the foundation: provider registry, RTK token saver, format translation, combo strategies, per-API-key ACL, Kimi native tool parser, NVIDIA stream coercion. VansRoute started as a hardened fork of 9Router and retains full format compatibility.
-
OmniRoute by @diegosouzapw — the resilience inspiration: circuit breaker pattern, account semaphore, provider-level failure tracking, provider exhaustion detection. VansRoute ported these concepts from OmniRoute's TypeScript implementation to plain JavaScript ESM, simplified the persistence layer (in-memory instead of DB-backed), and added Kimchi-specific quota auto-reactivation.
Full credit to both projects. VansRoute stands on their work.
MIT — see LICENSE