fix(backend): persist troubleshooting sessions during streaming#571
fix(backend): persist troubleshooting sessions during streaming#571MayurKharat0390 wants to merge 3 commits into
Conversation
|
@MayurKharat0390 is attempting to deploy a commit to the rishabhmishra0510-5147's projects Team on Vercel. A member of the Team first needs to authorize it. |
|
Warning Review limit reached
More reviews will be available in 49 minutes and 1 second. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughThis PR introduces Python 3.11 compatibility shims for older Python versions, implements database persistence with error handling in the AI troubleshooting service, and extends the CLI verify command to support multiple ML frameworks (TensorFlow, JAX, PyTorch) with framework-specific GPU/CUDA checks, replacing hardcoded PyTorch-only logic. ChangesPython 3.11 Compatibility Shims
AI Session Persistence with Error Handling
Framework-Agnostic Verify Command
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
# Conflicts: # cli/envforge_agent/audit/sources.py
| # Consume the stream generator (should still yield chunks despite persistence error) | ||
| chunks = [c async for c in service.stream_troubleshoot(request, db)] | ||
| assert chunks == [llm_result_json] |
There was a problem hiding this comment.
Bug: A trailing comma in _persist_session creates a tuple for AISuggestion.safe_commands, causing a database type error when persisting to an ARRAY(String) column which expects a list.
Severity: HIGH
Suggested Fix
In backend/app/ai/service.py, remove the trailing comma in the assignment to AISuggestion.safe_commands within the _persist_session method. The line should be safe_commands=fix.safe_commands if fix.safe_commands else None to ensure a list or None is passed, not a tuple.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: backend/tests/unit/ai/test_persist_session.py#L261-L263
Potential issue: In the `_persist_session` method, the `safe_commands` field of an
`AISuggestion` object is assigned a value using `(fix.safe_commands if fix.safe_commands
else None,)`. The trailing comma incorrectly creates a 1-element tuple instead of a list
or `None`. When SQLAlchemy attempts to persist this tuple to the PostgreSQL database, it
will raise a type error because the corresponding `ARRAY(String)` column expects a flat
list of strings or `None`. This issue is not caught by unit tests because
`_persist_session` is mocked.
Did we get this right? 👍 / 👎 to inform future reviews.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
cli/envforge_agent/audit/sources.py (1)
13-22: ⚡ Quick winRelying on
pip._vendor.tomliis fragile.The fallback to
pip._vendor.tomli(lines 19-20) depends on pip's internal vendored packages, which are not a stable public API and may be removed or relocated across pip versions.📦 Recommended fix: Declare `tomli` as a conditional dependency
Instead of relying on
pip._vendor, addtomlias a conditional dependency for Python < 3.11 in your project's dependency specification:# pyproject.toml or similar [project.dependencies] tomli = {version = ">=2.0.0", python = "<3.11"}Then simplify the import chain to:
-try: - import tomllib -except ImportError: - try: - import tomli as tomllib - except ImportError: - try: - from pip._vendor import tomli as tomllib - except ImportError: - tomllib = None +try: + import tomllib +except ImportError: + try: + import tomli as tomllib + except ImportError: + tomllib = NoneThis approach ensures reliable TOML support across Python versions without depending on pip internals.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cli/envforge_agent/audit/sources.py` around lines 13 - 22, Replace the fragile pip._vendor fallback: stop importing from pip._vendor.tomli and instead require the third‑party tomli as a conditional dependency for Python <3.11; change the import logic around the tomllib/tomli resolution (the tomllib variable in this module) to only try: import tomllib except ImportError: import tomli as tomllib except ImportError: tomllib = None, and add tomli >=2.0.0 to your project dependencies with a Python marker (python < "3.11") so TOML support is deterministic without relying on pip internals.cli/envforge_agent/cli.py (2)
469-471: 💤 Low valueGeneric error message doesn't explain GPU check failure.
Line 470 sets a generic error message "GPU device check returned False" which doesn't help users diagnose the root cause (e.g., missing drivers, wrong library variant, CUDA toolkit mismatch).
Consider including framework-specific troubleshooting hints in the error message:
res = { "status": "FAIL", "message": f"{framework_name} installed but CUDA not available", - "error": "GPU device check returned False", + "error": data.get("error") or "GPU device check returned False. Verify CUDA drivers and toolkit are installed.", }This preserves any detailed error captured during the device check (e.g., from the JAX devices call) or provides a more actionable fallback message.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cli/envforge_agent/cli.py` around lines 469 - 471, The error dict currently uses a generic "GPU device check returned False"; update the code that builds this response (where framework_name is used) to include the actual device-check failure detail (e.g. a captured exception or returned error variable such as device_check_error) and append framework-specific troubleshooting hints (for example: missing drivers, wrong package variant, CUDA toolkit mismatch) as a fallback if no detailed error is available; ensure the final "error" field preserves any detailed message from the device-check routine and the "message" field keeps the existing f"{framework_name} installed but CUDA not available" context plus one-line actionable advice.
372-379: ⚡ Quick winProfile name parsing may be ambiguous for profiles containing multiple framework keywords.
The current substring matching checks for "tensorflow"/"tf" first, then "jax", but a profile named something like "jax-tensorflow-comparison" would incorrectly match TensorFlow due to the order of checks.
💡 Suggested improvement: Use more specific matching or explicit mapping
Consider using a more explicit approach that prioritizes exact or prefix matches:
# Determine target framework based on profile target_framework = "torch" if profile: p_lower = profile.lower() - if "tensorflow" in p_lower or "tf" in p_lower: + if p_lower.startswith("tensorflow") or p_lower.startswith("tf-"): target_framework = "tensorflow" - elif "jax" in p_lower: + elif p_lower.startswith("jax"): target_framework = "jax"Alternatively, maintain an explicit profile-to-framework mapping if the set of supported profiles is known.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cli/envforge_agent/cli.py` around lines 372 - 379, The profile parsing for target_framework (variable target_framework, input profile and p_lower) is ambiguous because substring checks like "tf" can accidentally match in multi-keyword names; replace the fragile substring logic with an explicit mapping or more specific matching: either build a PROFILE_TO_FRAMEWORK dict and look up exact or prefix keys from p_lower (falling back to default "torch"), or use regex word-boundary/prefix checks (e.g., match whole tokens "jax" or "tensorflow" rather than substrings) so profiles like "jax-tensorflow-comparison" map deterministically to the intended framework; update the code that sets target_framework to use this mapping or bounded matching.backend/tests/unit/ai/test_persist_session.py (1)
265-270: ⚡ Quick winAssert the failed-persistence audit clears
session_id.This test still passes if the service forwards the generated session UUID after
_persist_session()fails, but that value no longer has a validai_sessionsparent. Adding thesession_id is Noneassertion here would catch the FK break in the new failure path.Suggested assertion
mock_persist.assert_awaited_once() mock_log_audit.assert_awaited_once() call_kwargs = mock_log_audit.call_args.kwargs + assert call_kwargs["session_id"] is None assert call_kwargs["safety_passed"] is False assert call_kwargs["safety_violation"] == "DB persistence failure"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/tests/unit/ai/test_persist_session.py` around lines 265 - 270, Add an assertion to ensure the audit entry clears the session_id when persistence fails: after obtaining call_kwargs = mock_log_audit.call_args.kwargs (used already for safety_passed and safety_violation checks), assert that call_kwargs["session_id"] is None so the test fails if the service forwards a non-parented UUID; update the test in backend/tests/unit/ai/test_persist_session.py where mock_persist and mock_log_audit are asserted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/__init__.py`:
- Around line 1-9: Extract the duplicated Python-version compatibility shims
(the block that sets datetime.UTC and defines enum.StrEnum/StrEnum) into a
single shared module (e.g., compat.py) that applies the shim when
sys.version_info < (3, 11); then replace the duplicated blocks in __init__.py
and tests/conftest.py with a single import of that compat module so datetime.UTC
and enum.StrEnum are provided from one source of truth. Ensure the compat module
contains the same logic that defines datetime.UTC and class StrEnum(str,
enum.Enum) and that both places simply import it (so references to datetime.UTC
and enum.StrEnum continue to work).
In `@backend/app/ai/service.py`:
- Around line 275-280: When persisting fails we must not pass the rolled-back
session's foreign-key to the audit insert; change the call to _log_audit so that
when persist_failed is True it uses session_id=None instead of the generated
session_id (e.g., in the call inside _persist_session / post-persistence logic
where you currently pass session_id=session_id). Update the argument
construction so session_id = None if persist_failed else session_id, and keep
safety_passed and safety_violation as-is so the "DB persistence failure" audit
record can be inserted without FK constraint errors.
---
Nitpick comments:
In `@backend/tests/unit/ai/test_persist_session.py`:
- Around line 265-270: Add an assertion to ensure the audit entry clears the
session_id when persistence fails: after obtaining call_kwargs =
mock_log_audit.call_args.kwargs (used already for safety_passed and
safety_violation checks), assert that call_kwargs["session_id"] is None so the
test fails if the service forwards a non-parented UUID; update the test in
backend/tests/unit/ai/test_persist_session.py where mock_persist and
mock_log_audit are asserted.
In `@cli/envforge_agent/audit/sources.py`:
- Around line 13-22: Replace the fragile pip._vendor fallback: stop importing
from pip._vendor.tomli and instead require the third‑party tomli as a
conditional dependency for Python <3.11; change the import logic around the
tomllib/tomli resolution (the tomllib variable in this module) to only try:
import tomllib except ImportError: import tomli as tomllib except ImportError:
tomllib = None, and add tomli >=2.0.0 to your project dependencies with a Python
marker (python < "3.11") so TOML support is deterministic without relying on pip
internals.
In `@cli/envforge_agent/cli.py`:
- Around line 469-471: The error dict currently uses a generic "GPU device check
returned False"; update the code that builds this response (where framework_name
is used) to include the actual device-check failure detail (e.g. a captured
exception or returned error variable such as device_check_error) and append
framework-specific troubleshooting hints (for example: missing drivers, wrong
package variant, CUDA toolkit mismatch) as a fallback if no detailed error is
available; ensure the final "error" field preserves any detailed message from
the device-check routine and the "message" field keeps the existing
f"{framework_name} installed but CUDA not available" context plus one-line
actionable advice.
- Around line 372-379: The profile parsing for target_framework (variable
target_framework, input profile and p_lower) is ambiguous because substring
checks like "tf" can accidentally match in multi-keyword names; replace the
fragile substring logic with an explicit mapping or more specific matching:
either build a PROFILE_TO_FRAMEWORK dict and look up exact or prefix keys from
p_lower (falling back to default "torch"), or use regex word-boundary/prefix
checks (e.g., match whole tokens "jax" or "tensorflow" rather than substrings)
so profiles like "jax-tensorflow-comparison" map deterministically to the
intended framework; update the code that sets target_framework to use this
mapping or bounded matching.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 36350f60-22c1-4f2f-a4d3-a726baac6357
📒 Files selected for processing (7)
backend/app/__init__.pybackend/app/ai/service.pybackend/tests/conftest.pybackend/tests/unit/ai/test_persist_session.pycli/envforge_agent/audit/sources.pycli/envforge_agent/cli.pycli/tests/test_agent.py
| import sys | ||
| if sys.version_info < (3, 11): | ||
| import datetime | ||
| datetime.UTC = datetime.timezone.utc | ||
|
|
||
| import enum | ||
| class StrEnum(str, enum.Enum): | ||
| pass | ||
| enum.StrEnum = StrEnum |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Extract duplicated compatibility shims to a shared module.
This exact compatibility code is duplicated in backend/tests/conftest.py (lines 3-11). Maintaining identical shims in multiple locations increases the risk of inconsistent fixes and violates DRY.
♻️ Proposed refactor to eliminate duplication
Create a shared compatibility module:
backend/app/compat.py
"""Python version compatibility shims."""
import sys
if sys.version_info < (3, 11):
import datetime
datetime.UTC = datetime.timezone.utc
import enum
class StrEnum(str, enum.Enum):
pass
enum.StrEnum = StrEnumThen replace this file's content:
-import sys
-if sys.version_info < (3, 11):
- import datetime
- datetime.UTC = datetime.timezone.utc
-
- import enum
- class StrEnum(str, enum.Enum):
- pass
- enum.StrEnum = StrEnum
+# Apply compatibility shims for Python < 3.11
+from app.compat import * # noqa: F403, F401And update backend/tests/conftest.py:
-import sys
-if sys.version_info < (3, 11):
- import datetime
- datetime.UTC = datetime.timezone.utc
-
- import enum
- class StrEnum(str, enum.Enum):
- pass
- enum.StrEnum = StrEnum
-
+# Apply compatibility shims for Python < 3.11 before any app imports
+import sys
+if sys.version_info < (3, 11):
+ from app.compat import * # noqa: F403, F401🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/__init__.py` around lines 1 - 9, Extract the duplicated
Python-version compatibility shims (the block that sets datetime.UTC and defines
enum.StrEnum/StrEnum) into a single shared module (e.g., compat.py) that applies
the shim when sys.version_info < (3, 11); then replace the duplicated blocks in
__init__.py and tests/conftest.py with a single import of that compat module so
datetime.UTC and enum.StrEnum are provided from one source of truth. Ensure the
compat module contains the same logic that defines datetime.UTC and class
StrEnum(str, enum.Enum) and that both places simply import it (so references to
datetime.UTC and enum.StrEnum continue to work).
| await self._log_audit( | ||
| db, | ||
| session_id=session_id, | ||
| input_hash=input_hash, | ||
| safety_passed=True, | ||
| safety_violation=None, | ||
| safety_passed=not persist_failed, | ||
| safety_violation="DB persistence failure" if persist_failed else None, |
There was a problem hiding this comment.
Use session_id=None on the persistence-failure audit path.
AIAuditLog.session_id is a foreign key to ai_sessions.id, and _persist_session() rolls back before re-raising. When persist_failed is True, this generated session_id has no parent row, so the audit insert will fail on the next flush/commit and the "DB persistence failure" audit record never lands.
Suggested fix
await self._log_audit(
db,
- session_id=session_id,
+ session_id=None if persist_failed else session_id,
input_hash=input_hash,
safety_passed=not persist_failed,
safety_violation="DB persistence failure" if persist_failed else None,
provider=provider_name,
tokens_used=0,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| await self._log_audit( | |
| db, | |
| session_id=session_id, | |
| input_hash=input_hash, | |
| safety_passed=True, | |
| safety_violation=None, | |
| safety_passed=not persist_failed, | |
| safety_violation="DB persistence failure" if persist_failed else None, | |
| await self._log_audit( | |
| db, | |
| session_id=None if persist_failed else session_id, | |
| input_hash=input_hash, | |
| safety_passed=not persist_failed, | |
| safety_violation="DB persistence failure" if persist_failed else None, |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/ai/service.py` around lines 275 - 280, When persisting fails we
must not pass the rolled-back session's foreign-key to the audit insert; change
the call to _log_audit so that when persist_failed is True it uses
session_id=None instead of the generated session_id (e.g., in the call inside
_persist_session / post-persistence logic where you currently pass
session_id=session_id). Update the argument construction so session_id = None if
persist_failed else session_id, and keep safety_passed and safety_violation
as-is so the "DB persistence failure" audit record can be inserted without FK
constraint errors.
| " result['cuda_version'] = getattr(jaxlib, '__version__', 'unknown')\n" | ||
| " except Exception:\n" | ||
| " pass\n" |
There was a problem hiding this comment.
JAX CUDA version extraction uses jaxlib.version instead of CUDA version.
Lines 410-411 set result['cuda_version'] to jaxlib.__version__, but this is the jaxlib library version, not the CUDA toolkit version. This mislabels the data and could confuse users comparing CUDA versions across frameworks.
🔧 Suggested fix: Extract actual CUDA version or use a different field name
Either rename the field to reflect what's being captured:
try:
import jaxlib
- result['cuda_version'] = getattr(jaxlib, '__version__', 'unknown')
+ result['jaxlib_version'] = getattr(jaxlib, '__version__', 'unknown')
except Exception:
passOr attempt to extract the actual CUDA version from jaxlib's build configuration if available. For consistency with PyTorch and TensorFlow, the field name should accurately reflect whether it contains a CUDA version or a library version.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| " result['cuda_version'] = getattr(jaxlib, '__version__', 'unknown')\n" | |
| " except Exception:\n" | |
| " pass\n" | |
| " result['jaxlib_version'] = getattr(jaxlib, '__version__', 'unknown')\n" | |
| " except Exception:\n" | |
| " pass\n" |
|
Hey @MayurKharat0390 the backend and CLI tests are failing please fix them and consider coderabit review as well |
🔍 PR Action RequiredHi @MayurKharat0390, We detected some items on this Pull Request that require attention: ❌ Failing CI ChecksThe following check runs or commit statuses are failing (ignoring vercel): Please resolve the issues above to proceed. Last updated: Fri, 05 Jun 2026 05:16:41 GMT |
Description
This PR fixes a critical data loss bug where AI troubleshooting sessions and recommended fixes generated via the SSE streaming endpoint (
stream_troubleshootinsidebackend/app/ai/service.py) were never persisted to theai_sessionsorai_suggestionsdatabase tables.Clients using streaming (default in frontend UI) suffered from complete session history loss.
Changes
AI Streaming Session Persistence (
backend/app/ai/service.py):_persist_sessioninside a try-except block to gracefully handle database write failures without terminating the SSE response stream.safety_passed=Falseand logssafety_violation="DB persistence failure", matching the synchronous endpoint.Added Unit Tests (
backend/tests/unit/ai/test_persist_session.py):test_stream_troubleshoot_persists_session: Verifies session persistence and successful audit logging when streaming.test_stream_troubleshoot_persists_session_failure: Verifies that if DB persistence fails during streaming, the SSE chunks are still yielded to the client, the error is handled, and the audit log records safety failure.Python 3.10 Compatibility Patches (
backend/app/__init__.py&backend/tests/conftest.py):datetime.UTCto fallback todatetime.timezone.utcon Python < 3.11.enum.StrEnumto fallback to a custom string-based enum subclass on Python < 3.11.Verification
uv run --extra dev pytest tests/):354 passed, 44 warnings in 122.32sSummary by CodeRabbit
Release Notes
New Features
Improvements