feat(llm): multi-model routing with availability fallback by qiankunli · Pull Request #217 · alibaba/open-code-review

qiankunli · 2026-06-25T04:21:05Z

Add an ordered model pool so a review fails over to another provider/model when the primary is rate-limited, down, or timing out, instead of failing the file.

- config: new `routing` namespace with `routing.models` (a list of `{provider, model}` entries in priority order, reusing the existing `providers` map for credentials) and `routing.policy` (only `"priority"` today; reserved for future policies, and an unknown value is rejected rather than silently ignored). Namespacing under `routing` keeps it distinct from `providers.<name>.models` (a provider's model catalog) and gives future routing knobs a home.
- `LLMRouter` implements `LLMClient`: it tries members in order, advances on availability errors (429/5xx/network), and short-circuits on client-side errors (400/413/422) and context cancellation. A per-run shared cooldown parks a throttled model so concurrent per-file subtasks skip it.
- Router members use a low SDK retry budget so a rate-limited model fails fast to the next one instead of burning the full backoff (`MaxRetries` is now configurable; the default of 5 is preserved).
- docs: `README.md` / `README.zh-CN.md` config reference and Multi-model fallback.

Without `routing.models`, the current single-model behavior is unchanged; `--model` pins a single endpoint. Tests cover failover / short-circuit / exhaustion / cooldown, error classification, config chain resolution, and policy validation.
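As a rough illustration of chain resolution and policy validation, the sketch below resolves a config into an ordered endpoint list. All type and field names (`Config`, `Routing`, `ModelRef`, `Provider`, `Endpoint`, `resolveChain`) are hypothetical, as is the assumption that an empty policy means "priority"; the project's real config structs and resolver may look different.

```go
package main

import "fmt"

// ModelRef is one routing.models entry (hypothetical shape).
type ModelRef struct{ Provider, Model string }

// Provider holds credentials and a default model (hypothetical shape).
type Provider struct{ APIKey, DefaultModel string }

type Routing struct {
	Models []ModelRef
	Policy string
}

type Config struct {
	Provider  string // top-level single-model settings (assumed names)
	Model     string
	Providers map[string]Provider
	Routing   Routing
}

// Endpoint is a fully resolved provider/model pair.
type Endpoint struct{ Provider, Model, APIKey string }

// resolveChain returns the endpoints to try, in priority order.
func resolveChain(cfg Config) ([]Endpoint, error) {
	switch cfg.Routing.Policy {
	case "", "priority": // assumption: an empty policy defaults to "priority"
	default:
		return nil, fmt.Errorf("routing.policy: unknown value %q", cfg.Routing.Policy)
	}
	refs := cfg.Routing.Models
	if len(refs) == 0 {
		// No routing.models: keep the single-model behavior.
		refs = []ModelRef{{Provider: cfg.Provider, Model: cfg.Model}}
	}
	chain := make([]Endpoint, 0, len(refs))
	for _, ref := range refs {
		p, ok := cfg.Providers[ref.Provider]
		if !ok {
			return nil, fmt.Errorf("routing.models: unknown provider %q", ref.Provider)
		}
		// The model comes only from the entry or the provider default,
		// never from the top-level model when routing.models is set.
		model := ref.Model
		if model == "" {
			model = p.DefaultModel
		}
		if model == "" {
			return nil, fmt.Errorf("routing.models: no model for provider %q", ref.Provider)
		}
		chain = append(chain, Endpoint{Provider: ref.Provider, Model: model, APIKey: p.APIKey})
	}
	return chain, nil
}
```

Rejecting an unknown policy up front, rather than falling back to "priority", matches the behavior stated in the description and keeps a typo in the config from going unnoticed.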


CLAassistant · 2026-06-25T04:21:11Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ qiankunli
❌ liqiankun1111
You have signed the CLA already but the status is still pending? Let us recheck it.

github-actions

🔍 OpenCodeReview found 4 issue(s) in this PR.

✅ 4 posted as inline comment(s)
📝 0 posted as summary

- `resolveModelRef`: clear `sub.Model` so a top-level `model` cannot leak into a routing entry that omits its own model (the model now comes only from `ref.Model` or the provider default).
- `LLMRouter`: when a call fails, stop and return `ctx.Err()` if the shared context is canceled or past its deadline. Every member uses that ctx, so none can succeed; this avoids wasted failover attempts and misleading logs. A per-request timeout (ctx still live) still fails over.
- `order()`: delete expired cooldown entries so the map stays bounded.
- `ResolvedEndpoint.MaxRetries`: clarify that it is internal/router-set, not read from config.

Adds a router test for the context-done short-circuit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

A web edit (5bbe6e9) accidentally pasted the for/if/if header twice in LLMRouter.order(), leaving unbalanced braces that broke the build. Remove the duplicate; the intended if/else cooldown handling is preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions Bot reviewed Jun 25, 2026

Comment thread internal/llm/client.go

Comment thread internal/llm/client.go

Comment thread internal/llm/resolver.go Outdated

Comment thread internal/llm/resolver.go

liqiankun1111 and others added 3 commits June 25, 2026 14:03

Update internal/llm/client.go

5bbe6e9

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

qiankunli wants to merge 4 commits into alibaba:main from qiankunli:feat/llm-multi-model


3 participants
