feat(pricing): automatic gap-fill from models.dev and OpenRouter #457
Open
iamtoruk wants to merge 1 commit into
Keep model pricing automatic instead of hand-coding new models. The bundler now layers three sources in priority order: LiteLLM (broad list prices), hand-curated MANUAL_ENTRIES overrides, then a separate last-resort fallback file gap-filled from models.dev first-party makers (official direct prices) and OpenRouter (resale backstop). New models such as MiniMax-M3 ($0.6/$2.4) now price correctly with no per-model code.

The fallback is written to its own pricing-fallback.json and consulted only case-insensitively as the final step in getModelCosts, so a reseller variant name can never shadow a canonical or aliased match.

Fixes surfaced while building and verifying this:
- Alias precedence: LiteLLM ships snowflake/claude-4-opus ($5), which the bundler strips to a bare claude-4-opus key that shadowed the curated alias to claude-opus-4 ($15 official). An explicit alias for a bare name now wins over a coincidental stripped reseller key; the prefixed gateway price is still returned for the fully-qualified id.
- Zero-stub guard: LiteLLM [0,0] price stubs (e.g. GigaChat-2-Max) are excluded from the case-insensitive index, so a case-mismatched query stays null and keeps firing the unknown-model warning instead of silently reporting $0.
- Negative-sentinel guard: OpenRouter returns -1 for variable/BYOK-priced models. The bundler now rejects any non-positive rate pair (and strips the sentinel from cache fields), so a negative per-token cost can never ship and subtract from spend totals.

Bundler hardening: bareKey strips @pin and date suffixes to match the runtime canonical form, the seen-set dedupes on both the full and bare key shapes, and the bundler logs MANUAL_ENTRIES that are now covered upstream as well as models.dev allowlist drift. buildCosts was extracted so the cache-cost heuristics live in one place. A new data-hygiene test fails CI if a rebundle reintroduces negative, free, or unreachable fallback entries.
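The bundler-side guards described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the RatePair shape, the sanitize and gapFill helpers, the date-suffix patterns, and the choice to reject a pair when either rate is non-positive are all assumptions. Only the name bareKey and the behaviours it illustrates come from the text above.

```typescript
// Hypothetical entry shape; the real bundler's types are not shown in this PR text.
type RatePair = { input: number; output: number; cacheRead?: number; cacheWrite?: number };

// Reduce an id to a bare key: drop a gateway prefix, an "@pin" suffix, and a
// trailing date stamp. The two date formats matched here are assumptions.
function bareKey(id: string): string {
  const noPrefix = id.indexOf("/") >= 0 ? id.slice(id.lastIndexOf("/") + 1) : id;
  const noPin = noPrefix.split("@")[0];
  return noPin.replace(/-(\d{8}|\d{4}-\d{2}-\d{2})$/, "");
}

// Reject free, negative, or non-finite rate pairs; drop negative cache sentinels.
// Assumption: a pair is rejected if EITHER rate is not strictly positive.
function sanitize(entry: RatePair): RatePair | null {
  const ok = (n: number): boolean => isFinite(n) && n > 0;
  if (!ok(entry.input) || !ok(entry.output)) return null;
  const clean: RatePair = { input: entry.input, output: entry.output };
  if (entry.cacheRead !== undefined && entry.cacheRead >= 0) clean.cacheRead = entry.cacheRead;
  if (entry.cacheWrite !== undefined && entry.cacheWrite >= 0) clean.cacheWrite = entry.cacheWrite;
  return clean;
}

// Gap-fill: keep only candidates whose full AND bare key are both unseen.
function gapFill(candidates: Array<[string, RatePair]>, alreadyPriced: string[]): Record<string, RatePair> {
  const seen = new Set<string>();
  alreadyPriced.forEach((id) => {
    seen.add(id.toLowerCase());
    seen.add(bareKey(id).toLowerCase());
  });
  const out: Record<string, RatePair> = {};
  candidates.forEach(([id, raw]) => {
    const full = id.toLowerCase();
    const bare = bareKey(id).toLowerCase();
    if (seen.has(full) || seen.has(bare)) return; // already covered upstream
    const clean = sanitize(raw);
    if (clean === null) return; // stub or sentinel: never ship it
    out[id] = clean;
    seen.add(full);
    seen.add(bare);
  });
  return out;
}

// Illustrative input: ids other than MiniMax-M3 are made up for the example.
const filled = gapFill(
  [
    ["minimax/MiniMax-M3", { input: 0.6, output: 2.4 }],
    ["example/byok-model", { input: -1, output: -1 }],
    ["reseller/claude-opus-4@20250514", { input: 5, output: 5 }],
  ],
  ["claude-opus-4"],
);
```

In this sketch only the MiniMax-M3 entry survives: the BYOK entry is rejected by the rate guard, and the pinned reseller variant is dropped because its bare key is already priced.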
What
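Make model pricing automatic instead of hand-coding new models. The bundler layers three sources in priority order:
- LiteLLM (broad list prices)
- Hand-curated MANUAL_ENTRIES overrides
- A last-resort fallback gap-filled from models.dev first-party makers (official direct prices) and OpenRouter (resale backstop)
The fallback is written to its own pricing-fallback.json and consulted only case-insensitively as the last resort, so a reseller variant name can never shadow a canonical/aliased match.
New models like MiniMax-M3 ($0.6/$2.4) now price correctly with zero per-model code.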
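A rough sketch of the runtime lookup order may make the layering easier to follow. It is not the PR's getModelCosts: the Tables shape, the split into exact, aliases, stripped, and fallback maps, and the function name are assumptions made for illustration. Only the $5 and $15 figures and the MiniMax-M3 pair come from this PR; the other output rates are placeholders.

```typescript
type Costs = { input: number; output: number };

// Hypothetical table layout; the real data structures are not shown in this PR text.
interface Tables {
  exact: Map<string, Costs>;     // ids as shipped upstream or curated by hand
  aliases: Map<string, string>;  // curated alias -> canonical id
  stripped: Map<string, Costs>;  // bare keys derived by stripping a gateway prefix
  fallback: Map<string, Costs>;  // stand-in for pricing-fallback.json
}

// Build a lowercase index; optionally skip [0,0] stubs so they stay unreachable
// through case-insensitive matching.
function lowerIndex(table: Map<string, Costs>, skipStubs: boolean): Map<string, Costs> {
  const index = new Map<string, Costs>();
  table.forEach((costs, id) => {
    if (skipStubs && costs.input === 0 && costs.output === 0) return;
    const key = id.toLowerCase();
    if (!index.has(key)) index.set(key, costs);
  });
  return index;
}

function getModelCostsSketch(model: string, t: Tables): Costs | null {
  // 1. Exact id: a fully-qualified gateway id keeps its gateway price.
  const exact = t.exact.get(model);
  if (exact) return exact;
  // 2. Explicit alias beats a coincidental stripped reseller key.
  const canonical = t.aliases.get(model);
  if (canonical !== undefined) {
    const viaAlias = t.exact.get(canonical);
    if (viaAlias) return viaAlias;
  }
  // 3. Bare key derived from a prefixed upstream id.
  const stripped = t.stripped.get(model);
  if (stripped) return stripped;
  // 4. Case-insensitive match over the primary tables, zero stubs excluded.
  const lower = model.toLowerCase();
  const ci = lowerIndex(t.exact, true).get(lower) ?? lowerIndex(t.stripped, true).get(lower);
  if (ci) return ci;
  // 5. Fallback file, case-insensitive only, as the last resort.
  return lowerIndex(t.fallback, false).get(lower) ?? null;
}

// Example tables. Output rates other than MiniMax-M3's are placeholders.
const tables: Tables = {
  exact: new Map<string, Costs>([
    ["snowflake/claude-4-opus", { input: 5, output: 5 }],
    ["claude-opus-4", { input: 15, output: 15 }],
    ["GigaChat-2-Max", { input: 0, output: 0 }],
  ]),
  aliases: new Map<string, string>([["claude-4-opus", "claude-opus-4"]]),
  stripped: new Map<string, Costs>([["claude-4-opus", { input: 5, output: 5 }]]),
  fallback: new Map<string, Costs>([["MiniMax-M3", { input: 0.6, output: 2.4 }]]),
};

const bareResult = getModelCostsSketch("claude-4-opus", tables);
const gatewayResult = getModelCostsSketch("snowflake/claude-4-opus", tables);
const stubResult = getModelCostsSketch("gigachat-2-max", tables);
const fallbackResult = getModelCostsSketch("minimax-m3", tables);
```

Here the bare claude-4-opus resolves through the alias to the $15 entry, the prefixed id keeps the $5 gateway price, the case-mismatched stub query returns null, and minimax-m3 is found only through the fallback. An exact-case query for a stub would still return the stub in this sketch; how the real code treats that case is not stated in the PR.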
Bugs fixed while building and verifying
- Alias precedence: LiteLLM's snowflake/claude-4-opus ($5 gateway) was stripping to a bare claude-4-opus key that shadowed the curated alias to claude-opus-4 ($15 official). An explicit alias for a bare name now wins; the prefixed gateway price is still returned for the fully-qualified id.
- Zero-stub guard: LiteLLM [0,0] stubs (e.g. GigaChat-2-Max) are excluded from the case-insensitive index, so a case-mismatched query stays null and keeps firing the unknown-model warning instead of silently reporting $0.
- Negative-sentinel guard: OpenRouter returns -1 for variable/BYOK-priced models; the bundler now rejects any non-positive rate pair, so a negative per-token cost can never subtract from spend totals.
Verification
Supersedes #424 (MiniMax-M3 now auto-prices to the same $0.6/$2.4 with no hardcoding).