feat(bailian): add native support for Alibaba Cloud Bailian (百炼) #392
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -128,5 +128,5 @@ Full reference: `.env.template` and `config/config.yaml` | |||||||||
| - **Resilience:** Configured via `config/config.yaml` - global `resilience.retry.*` and `resilience.circuit_breaker.*` defaults with optional per-provider overrides under `providers.<name>.resilience.retry.*` and `providers.<name>.resilience.circuit_breaker.*`. Retry defaults: `max_retries` (3), `initial_backoff` (1s), `max_backoff` (30s), `backoff_factor` (2.0), `jitter_factor` (0.1). Circuit breaker defaults: `failure_threshold` (5), `success_threshold` (2), `timeout` (30s) | ||||||||||
| - **Metrics:** `METRICS_ENABLED` (false), `METRICS_ENDPOINT` (/metrics) | ||||||||||
| - **Guardrails:** Configured via `config/config.yaml` only (except `GUARDRAILS_ENABLED` env var) | ||||||||||
| - **Providers:** `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, `USE_GOOGLE_GEMINI_NATIVE_API` (true by default; false uses Gemini's OpenAI-compatible chat API), `XAI_API_KEY`, `GROQ_API_KEY`, `OPENROUTER_API_KEY`, `ZAI_API_KEY`, `ZAI_BASE_URL` (optional Z.ai endpoint override), `MINIMAX_API_KEY`, `MINIMAX_BASE_URL` (optional MiniMax endpoint override), `XIAOMI_API_KEY`, `XIAOMI_BASE_URL` (optional Xiaomi MiMo endpoint override), `AZURE_API_KEY`, `AZURE_BASE_URL` (Azure OpenAI deployment base URL), `AZURE_API_VERSION` (optional Azure API version), `ORACLE_API_KEY` (Oracle API key), `ORACLE_BASE_URL` (Oracle OpenAI-compatible base URL), `<PROVIDER>[_SUFFIX]_MODELS` (comma-separated configured model list for any provider type), `OLLAMA_BASE_URL`, `VLLM_BASE_URL`, `VLLM_API_KEY` (optional upstream vLLM bearer token) | ||||||||||
| - **Providers:** `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, `USE_GOOGLE_GEMINI_NATIVE_API` (true by default; false uses Gemini's OpenAI-compatible chat API), `XAI_API_KEY`, `GROQ_API_KEY`, `OPENROUTER_API_KEY`, `ZAI_API_KEY`, `ZAI_BASE_URL` (optional Z.ai endpoint override), `MINIMAX_API_KEY`, `MINIMAX_BASE_URL` (optional MiniMax endpoint override), `XIAOMI_API_KEY`, `XIAOMI_BASE_URL` (optional Xiaomi MiMo endpoint override), `BAILIAN_API_KEY`, `BAILIAN_BASE_URL` (optional Bailian base URL for region switching; default `https://dashscope.aliyuncs.com/compatible-mode/v1`), `AZURE_API_KEY`, `AZURE_BASE_URL` (Azure OpenAI deployment base URL), `AZURE_API_VERSION` (optional Azure API version), `ORACLE_API_KEY` (Oracle API key), `ORACLE_BASE_URL` (Oracle OpenAI-compatible base URL), `<PROVIDER>[_SUFFIX]_MODELS` (comma-separated configured model list for any provider type), `OLLAMA_BASE_URL`, `VLLM_BASE_URL`, `VLLM_API_KEY` (optional upstream vLLM bearer token) | ||||||||||
| - **Provider model metadata:** `providers.<name>.models` accepts either model IDs (strings) or `{id, metadata}` objects. When `metadata` is supplied (`display_name`, `context_window`, `max_output_tokens`, `modes`, `capabilities`, `pricing`, …) it is merged onto the remote ai-model-list entry during enrichment, with operator values winning per-field. Primary use case: advertising context windows, capabilities, and pricing for local models (Ollama) and other custom endpoints whose IDs are not in the upstream registry. | ||||||||||
|
Comment on lines +131 to 132 (Contributor)
Add `BAILIAN_MODELS`. This list is still incomplete without the embedding allowlist variable that the new Bailian docs and env template already advertise.
✍️ Suggested update
- - `BAILIAN_BASE_URL` (optional Bailian base URL for region switching; default `https://dashscope.aliyuncs.com/compatible-mode/v1`), `AZURE_API_KEY`, ...
+ - `BAILIAN_BASE_URL` (optional Bailian base URL for region switching; default `https://dashscope.aliyuncs.com/compatible-mode/v1`),
+ `BAILIAN_MODELS` (optional model allowlist for embeddings), `AZURE_API_KEY`, ...
||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1125,7 +1125,7 @@ func TestLoad_ConfigExample_UsesNestedModelCacheSettings(t *testing.T) { | |
| t.Fatalf("expected Cache.Model.Redis to be nil in example config, got %+v", result.Config.Cache.Model.Redis) | ||
| } | ||
| gotProviders := result.Config.Server.EnabledPassthroughProviders | ||
| wantProviders := []string{"openai", "anthropic", "openrouter", "zai", "vllm", "deepseek"} | ||
| wantProviders := []string{"openai", "anthropic", "openrouter", "zai", "vllm", "deepseek", "bailian"} | ||
| if !reflect.DeepEqual(gotProviders, wantProviders) { | ||
| t.Fatalf("Server.EnabledPassthroughProviders = %v, want %v", gotProviders, wantProviders) | ||
|
Comment on lines 1127 to 1130 (Contributor)
Keep the config tests isolated from Bailian env leakage.
🔧 Suggested fix
"OLLAMA_API_KEY", "OLLAMA_BASE_URL", "OLLAMA_MODELS",
+ "BAILIAN_API_KEY", "BAILIAN_BASE_URL", "BAILIAN_MODELS",
||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,104 @@ | ||||||||||||||||||||||||||||
| --- | ||||||||||||||||||||||||||||
| title: "Alibaba Cloud Bailian" | ||||||||||||||||||||||||||||
| description: "Configure Alibaba Cloud Bailian (百炼 / DashScope) in GoModel, including the max_tokens compatibility shim for models like Qwen." | ||||||||||||||||||||||||||||
| icon: "cloud" | ||||||||||||||||||||||||||||
| --- | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| Bailian (百炼) is Alibaba Cloud's model-as-a-service platform for the Qwen | ||||||||||||||||||||||||||||
| family of models. GoModel routes to Bailian through its OpenAI-compatible | ||||||||||||||||||||||||||||
| endpoint (`/compatible-mode/v1`). | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| Because Bailian deprecated `max_tokens` in April 2026 in favor of | ||||||||||||||||||||||||||||
| `max_completion_tokens`, GoModel automatically maps the standard | ||||||||||||||||||||||||||||
| `max_tokens` field to `max_completion_tokens` for every request — no | ||||||||||||||||||||||||||||
| client change required. | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ## Configure | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ```bash | ||||||||||||||||||||||||||||
| BAILIAN_API_KEY=... | ||||||||||||||||||||||||||||
| ``` | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| Or in `config.yaml`: | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ```yaml | ||||||||||||||||||||||||||||
| providers: | ||||||||||||||||||||||||||||
| bailian: | ||||||||||||||||||||||||||||
| type: bailian | ||||||||||||||||||||||||||||
| api_key: "${BAILIAN_API_KEY}" | ||||||||||||||||||||||||||||
| # base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1" | ||||||||||||||||||||||||||||
| ``` | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ## Base URLs | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| Bailian's OpenAI-compatible API is available in multiple regions. Set | ||||||||||||||||||||||||||||
| `BAILIAN_BASE_URL` to switch: | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| | Region | URL | | ||||||||||||||||||||||||||||
| | ------ | --- | | ||||||||||||||||||||||||||||
| | Beijing (default) | `https://dashscope.aliyuncs.com/compatible-mode/v1` | | ||||||||||||||||||||||||||||
| | Singapore | `https://{workspace-id}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1` | | ||||||||||||||||||||||||||||
| | Frankfurt | `https://{workspace-id}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1` | | ||||||||||||||||||||||||||||
| | Hong Kong | `https://{workspace-id}.cn-hongkong.maas.aliyuncs.com/compatible-mode/v1` | | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ## Model IDs | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| Common Qwen model identifiers — check the [Bailian model | ||||||||||||||||||||||||||||
| list](https://www.alibabacloud.com/help/zh/model-studio/model-list) for the | ||||||||||||||||||||||||||||
| current catalog: | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| | Model | Example ID | | ||||||||||||||||||||||||||||
| | ----- | ---------- | | ||||||||||||||||||||||||||||
| | Qwen 3.7 Max | `qwen3.7-max` | | ||||||||||||||||||||||||||||
| | Qwen 3.7 Plus | `qwen3.7-plus` | | ||||||||||||||||||||||||||||
| | Qwen 3.6 Flash | `qwen3.6-flash` | | ||||||||||||||||||||||||||||
| | Qwen 3 Max | `qwen3-max` | | ||||||||||||||||||||||||||||
| | Qwen 3 Plus | `qwen3-plus` | | ||||||||||||||||||||||||||||
| | Qwen 3 Flash | `qwen3-flash` | | ||||||||||||||||||||||||||||
| | Qwen 3 Coder Plus | `qwen3-coder-plus` | | ||||||||||||||||||||||||||||
| | Text Embedding | `text-embedding-v3` | | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ## `max_tokens` compatibility | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| Bailian deprecated `max_tokens` on 2026-04-20 (effective 2026-05-30). | ||||||||||||||||||||||||||||
| All Bailian models now require `max_completion_tokens` instead. | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| GoModel transparently maps the standard `max_tokens` parameter to | ||||||||||||||||||||||||||||
| `max_completion_tokens` for every bailian model — send `max_tokens` as you | ||||||||||||||||||||||||||||
| normally would, and GoModel rewrites it before forwarding to Bailian. | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ```bash | ||||||||||||||||||||||||||||
| # max_tokens=4096 is automatically sent as max_completion_tokens=4096 | ||||||||||||||||||||||||||||
| curl http://localhost:8080/v1/chat/completions \ | ||||||||||||||||||||||||||||
| -H "Content-Type: application/json" \ | ||||||||||||||||||||||||||||
| -d '{ | ||||||||||||||||||||||||||||
| "model": "qwen3-max", | ||||||||||||||||||||||||||||
| "messages": [{"role": "user", "content": "Hello"}], | ||||||||||||||||||||||||||||
| "max_tokens": 4096 | ||||||||||||||||||||||||||||
| }' | ||||||||||||||||||||||||||||
| ``` | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ## Supported features | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| | Feature | Supported | | ||||||||||||||||||||||||||||
| | ------- | :-------: | | ||||||||||||||||||||||||||||
| | Chat completions | ✅ | | ||||||||||||||||||||||||||||
| | Streaming chat | ✅ | | ||||||||||||||||||||||||||||
| | Responses (`/v1/responses`) | ✅ (translated to chat) | | ||||||||||||||||||||||||||||
| | Embeddings | ✅ (configure model IDs via `BAILIAN_MODELS`) | | ||||||||||||||||||||||||||||
| | Files (`/v1/files`) | ✅ | | ||||||||||||||||||||||||||||
| | Batches (`/v1/batches`) | ✅ | | ||||||||||||||||||||||||||||
| | Passthrough (`/p/bailian/...`) | ✅ | | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| <Note> | ||||||||||||||||||||||||||||
| Embedding models (`text-embedding-v3`, `text-embedding-v4`) are served by | ||||||||||||||||||||||||||||
| the compatible-mode API but are **not** auto-discovered from the upstream | ||||||||||||||||||||||||||||
| `/v1/models` endpoint. Set `BAILIAN_MODELS=text-embedding-v3,text-embedding-v4` | ||||||||||||||||||||||||||||
| or use `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to make them available. | ||||||||||||||||||||||||||||
| </Note> | ||||||||||||||||||||||||||||
|
Comment on lines +93 to +98 (Contributor)
Don't make …
✍️ Suggested wording
- Set `BAILIAN_MODELS=text-embedding-v3,text-embedding-v4`
- or use `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to make them available.
+ Set `BAILIAN_MODELS=text-embedding-v3,text-embedding-v4`.
+ Optionally use `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` if you want
+ only the configured models exposed.
||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| ## References | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| - [Bailian documentation](https://www.alibabacloud.com/help/zh/model-studio/) | ||||||||||||||||||||||||||||
| - [OpenAI-compatible API reference](https://www.alibabacloud.com/help/zh/model-studio/compatibility-with-openai-responses-api) | ||||||||||||||||||||||||||||
| - [Qwen model list](https://www.alibabacloud.com/help/zh/model-studio/model-list) | ||||||||||||||||||||||||||||