Implement fallback LLM provider logic#248

Open
palakjaiswal16 wants to merge 1 commit into
rishabh0510rishabh:mainfrom
palakjaiswal16:fix/llm-fallback-provider

Conversation

@palakjaiswal16

@palakjaiswal16 palakjaiswal16 commented May 26, 2026

Description

Implemented automatic fallback logic for LLM providers so the troubleshoot flow can try backup providers when the primary provider raises LLMProviderError. This prevents the AI troubleshoot endpoint from failing immediately when a configured provider such as OpenRouter is unavailable.

Related Issues

Fixes #194

Changes Made

  • Added FallbackProvider to try providers sequentially until one succeeds.
  • Added configurable ENVFORGE_LLM_PROVIDER_FALLBACKS support with default real-provider order: openrouter -> openai -> ollama.
  • Updated troubleshoot audit metadata to record the provider/model that actually succeeded after fallback.
  • Added unit tests for fallback behavior and provider-chain construction.
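
A minimal sketch of the sequential fallback idea described above, not the PR's actual code. `StubProvider`, `FallbackSketch`, the `active` attribute, and the simplified two-argument `complete()` signature are hypothetical names chosen for illustration, and `LLMProviderError` is redefined locally as a stand-in for the project's exception.

```python
from __future__ import annotations

import asyncio


class LLMProviderError(Exception):
    """Local stand-in for the project's provider error (assumed to be an Exception subclass)."""


class StubProvider:
    """Hypothetical provider used only for this illustration."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail

    async def complete(self, system: str, user: str) -> str:
        if self.fail:
            raise LLMProviderError(f"{self.name} unavailable")
        return f"ok from {self.name}"


class FallbackSketch:
    """Tries each provider in order and returns the first successful result."""

    def __init__(self, providers: list[StubProvider]) -> None:
        if not providers:
            raise ValueError("at least one provider is required")
        self.providers = providers
        self.active = providers[0]  # replaced by whichever provider succeeds

    async def complete(self, system: str, user: str) -> str:
        failures: list[str] = []
        for provider in self.providers:
            try:
                result = await provider.complete(system, user)
            except LLMProviderError as exc:
                # Remember why this provider failed, then move on to the next one.
                failures.append(f"{provider.name}: {exc}")
                continue
            self.active = provider
            return result
        # Every provider failed: report all reasons in a single error.
        raise LLMProviderError("all providers failed: " + "; ".join(failures))


chain = FallbackSketch([StubProvider("openrouter", fail=True), StubProvider("openai")])
result = asyncio.run(chain.complete("system", "user"))
```

Tracking the provider that actually answered (here `active`) is what would let audit metadata record the real provider and model after a fallback, as the third bullet describes.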

Verification

  • Added unit tests
  • Ran pytest tests/unit/ai successfully
  • Manually tested via the API / CLI
  • (If applicable) Generated scripts pass SafetyFilter

Documentation

  • Updated docs/FEATURES.md (if adding a feature/profile)
  • Updated CHANGELOG.md
  • Code is fully documented and type-hinted

Summary by CodeRabbit

Release Notes

  • New Features
    • Added multi-provider fallback orchestration: when the primary LLM provider fails, the system automatically attempts configured fallback providers in sequence, improving overall reliability and reducing service disruptions.
    • Added new configuration option to customize the order of fallback providers, with sensible defaults provided if not specified.

@vercel

vercel Bot commented May 26, 2026

@palakjaiswal16 is attempting to deploy a commit to the rishabhmishra0510-5147's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented May 26, 2026

Contributor
📝 Walkthrough

Walkthrough

This PR implements automatic fallback logic for multiple LLM providers. It adds a FallbackProvider wrapper that tries providers sequentially, configuration for custom fallback order, and service integration to expose provider metadata. Tests validate both the fallback orchestration and chain configuration behavior.

Changes

Multi-Provider Fallback Orchestration

| Layer / File(s) | Summary |
| --- | --- |
| FallbackProvider abstraction and configuration (`backend/app/ai/providers/__init__.py`, `backend/app/config.py`) | FallbackProvider class wraps multiple LLMProvider instances and exposes provider_name, model, and last_token_usage attributes. complete() tries providers sequentially and aggregates failures into a single LLMProviderError. stream() yields from the first successful provider or re-raises if failure occurs mid-stream. Module constants define SUPPORTED_PROVIDERS and DEFAULT_FALLBACK_ORDER. Settings adds envforge_llm_provider_fallbacks field for custom fallback order. |
| Provider chain configuration and factory (`backend/app/ai/providers/__init__.py`) | get_provider() builds a provider chain using _provider_chain(), which validates the primary provider, applies optional comma-separated fallback override, computes defaults (openrouter → openai → ollama), and deduplicates. _build_provider() factory instantiates concrete provider types (MockProvider, OpenRouterProvider, OpenAIProvider, OllamaProvider) with settings and raises LLMProviderError for unknown names. Returns single provider or FallbackProvider wrapper when multiple are available. |
| Service integration for provider attributes (`backend/app/ai/service.py`) | AITroubleshootService.troubleshoot() derives provider_name from optional provider.provider_name attribute using getattr() (falling back to class name). Ensures provider_name and model_name are assigned inside the LLM call try block for audit and persistence logic. |
| Fallback provider test coverage (`backend/tests/unit/ai/test_fallback_provider.py`) | DummyProvider test double conditionally fails or returns responses and tracks calls and token usage. SettingsStub enables fallback configuration testing. Tests verify FallbackProvider.complete() switches to next provider on error and exposes attributes from successful fallback. Tests verify aggregated error with all provider names and reasons when all fail. Tests verify _provider_chain() applies default and configured fallback order. |
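
The chain-building step summarized above (validate the primary, apply an optional comma-separated override, otherwise use the default order, then deduplicate) could look roughly like the sketch below. The function name `provider_chain`, its parameters, and the choice to silently skip unknown fallback names are assumptions for illustration; the mock provider is left out for brevity, and the PR's `_provider_chain()` may differ on all of these points.

```python
from __future__ import annotations

# Assumed values, mirroring the default order stated in the PR description.
SUPPORTED_PROVIDERS = ("openrouter", "openai", "ollama")
DEFAULT_FALLBACK_ORDER = ("openrouter", "openai", "ollama")


def provider_chain(primary: str, fallbacks: str | None = None) -> list[str]:
    """Return the ordered, deduplicated provider names to try."""
    primary = primary.strip().lower()
    if primary not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unknown provider: {primary}")

    if fallbacks:
        # Comma-separated override, e.g. the value of ENVFORGE_LLM_PROVIDER_FALLBACKS.
        candidates = [name.strip().lower() for name in fallbacks.split(",") if name.strip()]
    else:
        candidates = list(DEFAULT_FALLBACK_ORDER)

    chain: list[str] = []
    for name in [primary, *candidates]:
        # Keep the first occurrence only; unknown names are skipped in this sketch.
        if name in SUPPORTED_PROVIDERS and name not in chain:
            chain.append(name)
    return chain


default_chain = provider_chain("openai")
custom_chain = provider_chain("openrouter", "ollama, openai, ollama")
```

Putting the primary first and deduplicating afterwards means a primary that also appears in the fallback list is tried only once.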

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

level:advanced, type:refactor, quality:clean

Suggested reviewers

  • rishabh0510rishabh

Poem

🐰 Behold the fallback chain so grand,
When one provider fails to stand,
The next in line steps up to try,
No troublesome request will die!
Sequential guardians, strong and true,
Always ready with provider two! 🔗

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.53% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Implement fallback LLM provider logic' directly and concisely describes the main change in the changeset. |
| Linked Issues check | ✅ Passed | The PR successfully implements all requirements from issue #194: a FallbackProvider wrapper with sequential provider fallback, configurable fallback chain, and integration with the troubleshoot service. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing fallback LLM provider logic as specified in issue #194; no out-of-scope modifications detected. |

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (2)
backend/tests/unit/ai/test_fallback_provider.py (1)

36-45: ⚡ Quick win

Add fallback stream-path tests to cover this new behavior.

DummyProvider.stream() is implemented, but this suite doesn’t currently verify fallback orchestration for FallbackProvider.stream(). Add at least one success-after-failure stream test (and ideally all-fail stream test) to lock in parity with complete() behavior.

Proposed test additions
+@pytest.mark.asyncio
+async def test_fallback_provider_stream_tries_next_provider_after_error():
+    primary = DummyProvider("primary", should_fail=True)
+    fallback = DummyProvider("fallback")
+    provider = FallbackProvider([primary, fallback])
+
+    chunks = [chunk async for chunk in provider.stream("system", "user", DummyResponse)]
+
+    assert chunks == ["ok from fallback"]
+    assert primary.calls == 1
+    assert fallback.calls == 1
+    assert provider.provider_name == "DummyProvider"
+    assert provider.model == "fallback-model"
+
+
+@pytest.mark.asyncio
+async def test_fallback_provider_stream_raises_after_all_providers_fail():
+    provider = FallbackProvider([
+        DummyProvider("primary", should_fail=True),
+        DummyProvider("fallback", should_fail=True),
+    ])
+
+    with pytest.raises(LLMProviderError):
+        [chunk async for chunk in provider.stream("system", "user", DummyResponse)]
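
For context on what such stream tests would exercise, here is a hedged sketch of the streaming behavior the walkthrough describes: fall back only if a provider fails before emitting anything, and re-raise if it fails mid-stream. `StubStreamProvider`, `stream_with_fallback`, `collect`, and the `fail_after` parameter are hypothetical names, and the argument-free `stream()` signature is a simplification, not the project's API.

```python
from __future__ import annotations

import asyncio
from typing import AsyncIterator


class LLMProviderError(Exception):
    """Local stand-in for the project's provider error."""


class StubStreamProvider:
    """Hypothetical provider that fails after emitting `fail_after` chunks."""

    def __init__(self, name: str, chunks: list[str], fail_after: int | None = None) -> None:
        self.name = name
        self.chunks = chunks
        self.fail_after = fail_after

    async def stream(self) -> AsyncIterator[str]:
        for index, chunk in enumerate(self.chunks):
            if index == self.fail_after:
                raise LLMProviderError(f"{self.name} failed after {index} chunk(s)")
            yield chunk


async def stream_with_fallback(providers: list[StubStreamProvider]) -> AsyncIterator[str]:
    failures: list[str] = []
    for provider in providers:
        emitted = False
        try:
            async for chunk in provider.stream():
                emitted = True
                yield chunk
            return  # this provider finished cleanly
        except LLMProviderError as exc:
            if emitted:
                # Partial output already reached the caller; switching providers
                # now would splice two different responses together.
                raise
            failures.append(f"{provider.name}: {exc}")
    raise LLMProviderError("all providers failed: " + "; ".join(failures))


async def collect(providers: list[StubStreamProvider]) -> list[str]:
    return [chunk async for chunk in stream_with_fallback(providers)]


chunks = asyncio.run(collect([
    StubStreamProvider("primary", ["x"], fail_after=0),
    StubStreamProvider("fallback", ["ok ", "done"]),
]))
```

A mid-stream failure case may also be worth a test, since it is the one path where fallback is deliberately not attempted.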
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/tests/unit/ai/test_fallback_provider.py` around lines 36 - 45, Add
unit tests that exercise FallbackProvider.stream orchestration similar to the
existing complete() tests: create two DummyProvider instances with .should_fail
toggled so the first raises LLMProviderError from its stream() and the second
yields "ok from ..." and assert the FallbackProvider.stream() AsyncIterator
yields the expected string and that the failing provider's .calls incremented
once and the succeeding provider was called; also add an all-fail stream test
where every DummyProvider has should_fail=True and assert the
FallbackProvider.stream() raises the propagated LLMProviderError and each
provider's .calls was incremented. Use the same helper/setup used by existing
tests and reference DummyProvider.stream, FallbackProvider.stream, and
LLMProviderError when locating code to modify.
backend/app/ai/providers/__init__.py (1)

207-213: 💤 Low value

Consider adding openai_base_url to Settings for configuration parity.

The openai_base_url setting referenced here is not defined in Settings (config.py), so the getattr default will always be used. This works for standard OpenAI but prevents users from configuring custom endpoints (Azure OpenAI, proxies).

Suggested addition to config.py
# In Settings class, add alongside other OpenAI settings:
openai_base_url: str = "https://api.openai.com/v1"
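
The effect of the missing field can be illustrated with a small stand-alone sketch. The two dataclasses below are hypothetical stand-ins for the project's Settings class (which may be built differently), and `resolve_base_url` and the proxy URL are invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


@dataclass
class SettingsWithoutField:
    """Stand-in for a settings object that never declares openai_base_url."""

    openai_api_key: str = "placeholder-key"


@dataclass
class SettingsWithField:
    """Stand-in for a settings object after the suggested field is added."""

    openai_api_key: str = "placeholder-key"
    openai_base_url: str = DEFAULT_OPENAI_BASE_URL


def resolve_base_url(settings: object) -> str:
    # Same lookup pattern the review comment describes: getattr with a default.
    return getattr(settings, "openai_base_url", DEFAULT_OPENAI_BASE_URL)


# Without the field there is nothing to override, so the default always wins.
missing = resolve_base_url(SettingsWithoutField())

# With the field declared, a custom endpoint (hypothetical URL) is honored.
custom = resolve_base_url(SettingsWithField(openai_base_url="https://proxy.example.test/v1"))
```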
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/ai/providers/__init__.py` around lines 207 - 213, The Settings
class is missing an openai_base_url attribute while OpenAIProvider is
instantiated using getattr(settings, "openai_base_url", ...), so add a new
attribute openai_base_url: str = "https://api.openai.com/v1" to the Settings
class (config) to allow configuration of custom endpoints (Azure/proxy); update
any relevant type hints/tests that reference Settings and ensure OpenAIProvider
construction (the code calling getattr(settings, "openai_base_url")) will use
the configured value instead of always falling back to the default.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a2b7716e-fdac-4734-a099-d89fb626075b

📥 Commits

Reviewing files that changed from the base of the PR and between 3330691 and 3ef0e44.

📒 Files selected for processing (4)
  • backend/app/ai/providers/__init__.py
  • backend/app/ai/service.py
  • backend/app/config.py
  • backend/tests/unit/ai/test_fallback_provider.py

@rishabh0510rishabh

Owner

Hey @palakjaiswal16, some tests are failing. Please address them carefully.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[AI]: Implement automatic fallback LLM provider logic

2 participants