AI analysis of migration failures by OmkarDeshpande7 · Pull Request #1998 · platform9/vjailbreak

OmkarDeshpande7 · 2026-06-05T10:46:07Z

What this PR does / why we need it

Which issue(s) this PR fixes

(optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged)

fixes #

Special notes for your reviewer

Testing done

Screen.Recording.2026-06-05.at.4.16.27.PM.mov

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ndpoint, and tests

Implements Phase 2 of the Analyse with AI feature: adds analyzer.py with extract_error_keywords, build_user_message, build_github_issue, parse_claude_response, query_rag, and analyze_migration; adds /analyze-migration POST endpoint to server.py; adds error_catalog.md with known vJailbreak error patterns; updates Dockerfile and docker-compose.yml for the vjailbreak-ai service. All 16 tests pass (13 analyzer + 3 server).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…or context ConfigMap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…8s Secret

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…Issue fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… tests (T051-T053)

…ote to deployment

Add graphify-out/ to .gitignore and untrack all 833 files. graphify regenerates these on every session, dirtying git status and polluting unrelated commits. Files remain on disk for local use.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

AI calls used /vpw/v1/... directly. In production, these requests did not match the nginx ingress rewrite rule (/dev-api/sdk/(.*) → /$1) and fell through to the UI nginx location / block, which requires Basic Auth. That caused the htpasswd popup on the GlobalSettings and AI Analysis pages. All other vpwned API calls (helpers.ts, vddk.ts, version.ts) already use the /dev-api/sdk/vpw/v1/... pattern correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
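The effect of the rewrite rule can be illustrated with a small Python sketch. This is not the ingress configuration itself: the function name `rewrite_ingress_path` is hypothetical, and the regular expression only mirrors the `/dev-api/sdk/(.*) → /$1` rule quoted in the commit message.

```python
import re
from typing import Optional

# Mirrors the ingress rewrite rule quoted above: /dev-api/sdk/(.*) -> /$1
SDK_PREFIX_RULE = re.compile(r"^/dev-api/sdk/(.*)$")


def rewrite_ingress_path(path: str) -> Optional[str]:
    """Return the upstream path for an SDK request, or None when the rule does not match.

    None stands for "falls through to the UI location / block", which is
    where the Basic Auth prompt described above came from.
    """
    match = SDK_PREFIX_RULE.match(path)
    if match is None:
        return None
    return "/" + match.group(1)


# A prefixed call is rewritten to the vpwned route; a bare call is not matched.
print(rewrite_ingress_path("/dev-api/sdk/vpw/v1/ai/analyze"))
print(rewrite_ingress_path("/vpw/v1/ai/analyze"))
```

Under this reading, only requests that carry the `/dev-api/sdk` prefix ever reach vpwned, which is why the AI client had to adopt the same prefix as the other API helpers.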

…pod logs

Status.AgentName stores the node name (pod.Spec.NodeName), not the pod name. Spec.PodRef is the correct field holding the v2v-helper pod name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…lhost

vpwned pod has no nginx on localhost:80. Debug logs served by vjailbreak-ui-service.migration-system.svc.cluster.local/debug-logs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Pydantic rejects null for dict[str, Any] fields. migration_plan, migration_template, network_mapping, and storage_mapping are nil when absent, so they serialize as null and vjailbreak-ai responds with 422. nilToEmptyMap converts nil → {} before JSON serialization.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
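The fix described above lives in the Go caller (`nilToEmptyMap`). The Python sketch below is only an analogue showing the same normalization on a decoded JSON payload; `none_to_empty_map` and `normalize_payload` are hypothetical names, and the field list is taken from the commit message.

```python
import json
from typing import Any, Optional

# Field names taken from the commit message above.
MAPPING_FIELDS = (
    "migration_plan",
    "migration_template",
    "network_mapping",
    "storage_mapping",
)


def none_to_empty_map(value: Optional[dict]) -> dict:
    """Hypothetical Python analogue of the Go nilToEmptyMap helper."""
    return {} if value is None else value


def normalize_payload(payload: dict) -> dict:
    """Return a copy where missing or null mapping fields become empty objects."""
    normalized: dict = dict(payload)
    for field in MAPPING_FIELDS:
        normalized[field] = none_to_empty_map(normalized.get(field))
    return normalized


# A payload as it would arrive when the Go side marshals nil maps as null.
raw: Any = json.loads('{"migration_plan": null, "network_mapping": {"a": "b"}}')
print(json.dumps(normalize_payload(raw), sort_keys=True))
```

Normalizing at the sender, as the PR does, keeps the receiving schema strict. Declaring the fields as optional on the Python side would be an alternative design, but the PR does not take it.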

- Go: log full payload at DEBUG level before sending to vjailbreak-ai
- Go: log response body when vjailbreak-ai returns non-200
- Python: add RequestValidationError handler that logs request body + Pydantic errors at ERROR level

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…l in JSON

Go nil slice marshals to null; Pydantic rejects null for list[str].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…history

- Track followUpLoading separately from loading so the result panel stays visible (no full-screen spinner) while a follow-up request is in flight; Send button shows inline spinner instead
- Include initial user turn in history so the model has context for follow-up questions (was only storing assistant turn)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Per-source log budget strategy:
- v2v pod logs: raw tail 100k chars (failures always at end)
- controller logs: error extraction context=5/tail=200, cap 50k chars
- debug logs: error extraction context=3/no-tail, cap 50k chars, max 5 files

Worst-case total ~100k tokens, well under the 200k API limit.

Also adds deploy/vjailbreak-ai-context-configmap.yaml with default operator context covering vJailbreak architecture, common failure patterns, and analysis guidelines for more accurate AI responses.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
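The budget strategy above can be sketched in Python as follows. The PR's extractor is implemented in Go inside vpwned, so this is an analogue rather than the actual code. The keyword pattern, the function names, and the choice to apply the 50k cap per debug file are all assumptions made for illustration.

```python
import re

# Assumed keyword set; the PR does not list the patterns its extractor matches.
ERROR_PATTERN = re.compile(r"error|fail|panic|fatal", re.IGNORECASE)


def tail_chars(text: str, limit: int) -> str:
    """Keep only the last `limit` characters of `text`."""
    if limit <= 0:
        return ""
    return text[-limit:]


def extract_error_lines(text: str, context: int, tail: int, cap: int) -> str:
    """Keep lines within `context` of an error match plus the last `tail` lines.

    The selection is then trimmed from the front so it never exceeds `cap` chars.
    """
    lines = text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    if tail > 0:
        keep.update(range(max(0, len(lines) - tail), len(lines)))
    selected = "\n".join(lines[i] for i in sorted(keep))
    return tail_chars(selected, cap)


def budget_logs(v2v: str, controller: str, debug_files: list) -> dict:
    """Apply the per-source limits listed in the commit message."""
    return {
        # v2v pod logs: raw tail, since failures are expected at the end.
        "v2v": tail_chars(v2v, 100_000),
        # controller logs: error lines with 5 lines of context plus the last 200 lines.
        "controller": extract_error_lines(controller, context=5, tail=200, cap=50_000),
        # debug logs: at most 5 files, no tail; the cap is assumed to be per file.
        "debug": [
            extract_error_lines(text, context=3, tail=0, cap=50_000)
            for text in debug_files[:5]
        ],
    }
```

Trimming from the front when the cap is exceeded keeps the most recent lines, which matches the commit's reasoning that failures sit at the end of the log.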

Backend: detect follow-up (question + non-empty history) and switch to a conversational system prompt instead of the JSON-forcing analysis prompt. Return is_followup:true so the frontend knows not to replace the structured result.

Frontend: only call setResult() on initial analysis. Follow-up responses are appended to history only. Render history.slice(2) as a Q&A thread below the initial analysis so the conversation is visible.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
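A minimal sketch of the backend check described above, assuming a follow-up means a non-blank question together with a non-empty history. The prompt texts are placeholders, and `ANALYSIS_SYSTEM_PROMPT`, `is_followup`, and `select_prompt` are hypothetical names; only `FOLLOWUP_SYSTEM_PROMPT` is named elsewhere in this PR.

```python
from typing import Optional, Sequence, Tuple

# Placeholder prompt texts; the real prompts are not shown in the PR.
ANALYSIS_SYSTEM_PROMPT = "Analyse the migration failure and reply with JSON only."
FOLLOWUP_SYSTEM_PROMPT = "Answer the follow-up question in plain prose."


def is_followup(question: Optional[str], history: Sequence[dict]) -> bool:
    """A request counts as a follow-up only with a question AND prior turns."""
    return bool(question and question.strip()) and len(history) > 0


def select_prompt(question: Optional[str], history: Sequence[dict]) -> Tuple[str, bool]:
    """Return the system prompt to use and the is_followup flag for the response."""
    followup = is_followup(question, history)
    prompt = FOLLOWUP_SYSTEM_PROMPT if followup else ANALYSIS_SYSTEM_PROMPT
    return prompt, followup


# Initial analysis: no history yet, so the JSON-forcing prompt is selected.
print(select_prompt("Why did it fail?", []))
```

Returning the flag alongside the prompt lets the caller echo `is_followup` in the response, so the frontend can decide whether to replace the structured result or only append to the thread.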

Claude Haiku was returning JSON for follow-up questions because the conversation history stored the raw JSON response as the assistant's message, causing it to pattern-match the format. Two fixes:
1. Frontend: store human-readable text in history (root cause + fix steps + summary) instead of raw JSON for the initial analysis turn
2. Backend: strengthen FOLLOWUP_SYSTEM_PROMPT with explicit "NO JSON" instruction and clearer prose-only directive

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
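The first fix is made in the TypeScript frontend. The Python sketch below only illustrates the idea of flattening a structured result into prose before it is stored as the assistant turn. The keys `root_cause`, `fix_steps`, and `summary` are assumed from the wording of the commit message, and `analysis_to_history_text` is a hypothetical name.

```python
from typing import Any, Dict


def analysis_to_history_text(result: Dict[str, Any]) -> str:
    """Render a structured analysis as plain text suitable for conversation history."""
    parts = []
    if result.get("root_cause"):
        parts.append(f"Root cause: {result['root_cause']}")
    steps = result.get("fix_steps") or []
    if steps:
        numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
        parts.append("Fix steps:\n" + numbered)
    if result.get("summary"):
        parts.append(f"Summary: {result['summary']}")
    return "\n\n".join(parts)


example = {
    "root_cause": "disk full on target",
    "fix_steps": ["free space", "retry the migration"],
    "summary": "conversion stopped while writing the disk image",
}
print(analysis_to_history_text(example))
```

Because the stored turn no longer contains braces or keys, the model has no JSON example in its context to imitate when it answers the follow-up.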

Backend: build_github_issue is now called for all confidence levels. should_open stays true only for none/low confidence (AI recommendation), but prefill_url is always present.

Frontend: add a subtle "Still having issues? Open a GitHub Issue" link at the bottom of the structured analysis for high/medium confidence results, right-aligned in small secondary text.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
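The backend behaviour described above could look roughly like the sketch below. The signature of the PR's `build_github_issue` is not shown, so `build_issue_link` is a hypothetical stand-in, and the issue title format is an assumption. The URL relies on GitHub's `issues/new` page accepting `title` and `body` query parameters.

```python
from urllib.parse import urlencode

# Repository taken from the PR title; the title format below is assumed.
ISSUES_NEW_URL = "https://github.com/platform9/vjailbreak/issues/new"
LOW_CONFIDENCE = {"none", "low"}


def build_issue_link(migration_name: str, confidence: str, summary: str) -> dict:
    """Always return a prefilled URL; recommend opening it only at none/low confidence."""
    query = urlencode(
        {
            "title": f"Migration failure: {migration_name}",
            "body": summary,
        }
    )
    return {
        "should_open": confidence in LOW_CONFIDENCE,
        "prefill_url": f"{ISSUES_NEW_URL}?{query}",
    }


# High confidence: the link is present but not recommended.
print(build_issue_link("vm-1", "high", "disk conversion failed"))
```

Separating `should_open` from `prefill_url` lets the frontend show a prominent call to action for weak analyses and a quieter link for strong ones, using the same response shape.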

Backend may not have been rebuilt yet. Use prefill_url from response when available, otherwise fall back to a minimal prefilled title constructed from migrationName. Link now unconditionally visible.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…vels

Both the confidence=none section and the structured result section now use an outlined Button with endIcon instead of a plain text link, making the GitHub Issue action more discoverable and easier to click.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Makefile:
- Add AI_IMG variable (quay.io/platform9/vjailbreak-ai)
- Add `vjailbreak-ai` target: docker build vjailbreak-ai/

packer.yml:
- Add AI_IMG env var and ai_img output to determine-release
- Add build-vjailbreak-ai job (parallel to other builds)
- Wire build-vjailbreak-ai into push-images needs/conditions
- Wire build-vjailbreak-ai into post-build needs/conditions
- Download and push vjailbreak-ai artifact in push-images
- envsubst vjailbreak-ai manifest → image_builder/deploy/08vjailbreak-ai.yaml

download_images.sh:
- Pull and export vjailbreak-ai image tar for VM baking

vjailbreak-ai/deploy/vjailbreak-ai.yaml:
- Deployment + Service + PVC for vjailbreak-ai in migration-system
- Reads ANTHROPIC_API_KEY + ADMIN_API_KEY from vjailbreak-ai-secret
- Mounts /data PVC for ChromaDB and context.md persistence

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…et key names

install.sh: after applying yamls (migration-system namespace exists), generate a random admin-key with openssl rand -hex 32 and store in vjailbreak-ai-secret. No manual step required. ANTHROPIC_API_KEY is left empty; user sets it via Settings UI (ai_key_handler stores it as api-key in the same secret).

vjailbreak-ai.yaml: fix secretKeyRef key names to match what ai_key_handler.go writes: api-key (ANTHROPIC_API_KEY) and admin-key (ADMIN_API_KEY).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-06-05T10:47:42Z

🚨 Security Vulnerability Summary

Security posture degraded

📊 Overall Changes

Metric	Count
Total Added	3
Total Fixed	0
Net Change	+3

🔍 Detailed Breakdown

📦 Gosec (Static Analysis)

Current	Baseline	Added	Fixed	Method
0	0	0	0	artifact

📦 Trivy (Dependency Scan)

Current	Baseline	Added	Fixed	Method
32	29	3	0	artifact

📋 Baseline Methods

📦 artifact: Used stored report from main branch
🔄 live_scan: Scanned base branch in real-time
⚠️ no_baseline: No baseline available (all vulnerabilities treated as new)

🚨 Added Vulnerabilities

Trivy (Dependencies) - 3 Added

Target: vjailbreak-ai/requirements.txt
Package: python-multipart 0.0.12
Vulnerability: CVE-2024-53981
Severity: HIGH
Title: python-multipart: python-multipart has a DoS via deformation multipart/form-data boundary

Target: vjailbreak-ai/requirements.txt
Package: python-multipart 0.0.12
Vulnerability: CVE-2026-24486
Severity: HIGH
Title: python-multipart: Python-Multipart: Arbitrary file write via path traversal vulnerability

Target: vjailbreak-ai/requirements.txt
Package: python-multipart 0.0.12
Vulnerability: CVE-2026-42561
Severity: HIGH
Title: Python-Multipart is a streaming multipart parser for Python. Prior to ...

Only HIGH and CRITICAL severity vulnerabilities are tracked
Baseline: e10fa8b19e5f84f5ec23fbcf2a8382df95f8f142

devin-ai-integration

Devin Review found 4 potential issues.

View 7 additional findings in Devin Review.

devin-ai-integration · 2026-06-05T10:50:28Z

+    if not error_keywords or chroma_client is None:
+        return ""
+    try:
+        collection = chroma_client.get_collection("vjailbreak_docs")


🔴 ChromaDB collection name mismatch between server.py and analyzer.py causes RAG retrieval to always fail

The server creates a ChromaDB collection named "vjailbreak" at vjailbreak-ai/server.py:69, but analyzer.py:170 tries to retrieve from a collection named "vjailbreak_docs" via chroma_client.get_collection("vjailbreak_docs"). Since the collection vjailbreak_docs is never created, get_collection raises an exception, which is silently caught by the except Exception at line 180. This means the RAG pipeline — the key differentiator for providing relevant documentation context to the AI — silently produces empty results on every analysis call. The AI will never receive relevant documentation snippets (virt-v2v docs, troubleshooting guides, error catalog) during migration failure analysis.

Suggested change

-        collection = chroma_client.get_collection("vjailbreak_docs")
+        collection = chroma_client.get_collection("vjailbreak")
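Beyond the one-line rename, one way to keep the two modules from drifting again is to share a single constant and to log lookup failures instead of swallowing them. The sketch below assumes a client exposing `get_collection(name)` and a collection exposing `query(query_texts=..., n_results=...)`, as the Chroma client does. `COLLECTION_NAME`, `query_docs`, and the in-memory stand-in classes are hypothetical and exist only so the example runs without ChromaDB.

```python
import logging
from typing import Dict, List, Optional

logger = logging.getLogger("vjailbreak-ai")

# Hypothetical shared constant, imported by both the server and the analyzer.
COLLECTION_NAME = "vjailbreak"


class FakeCollection:
    """In-memory stand-in for a Chroma collection (illustration only)."""

    def __init__(self, docs: List[str]) -> None:
        self._docs = docs

    def query(self, query_texts: List[str], n_results: int) -> dict:
        return {"documents": [self._docs[:n_results]]}


class FakeClient:
    """In-memory stand-in for a Chroma client (illustration only)."""

    def __init__(self, collections: Dict[str, FakeCollection]) -> None:
        self._collections = collections

    def get_collection(self, name: str) -> FakeCollection:
        if name not in self._collections:
            raise ValueError(f"collection {name!r} does not exist")
        return self._collections[name]


def query_docs(chroma_client: Optional[FakeClient], error_keywords: List[str], n_results: int = 3) -> str:
    """Return matching documentation snippets, or "" when retrieval is unavailable."""
    if not error_keywords or chroma_client is None:
        return ""
    try:
        collection = chroma_client.get_collection(COLLECTION_NAME)
        results = collection.query(query_texts=[" ".join(error_keywords)], n_results=n_results)
    except Exception:
        # Log with a traceback so a wrong collection name is visible in the pod logs.
        logger.exception("RAG lookup failed for collection %r", COLLECTION_NAME)
        return ""
    documents = results.get("documents") or [[]]
    return "\n\n".join(documents[0])
```

With the failure logged, a name mismatch like the one reported here would show up on the first analysis call rather than silently producing empty context.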


devin-ai-integration · 2026-06-05T10:50:30Z

+    if not ANTHROPIC_API_KEY:
+        raise RuntimeError("ANTHROPIC_API_KEY is not set")
+    if not ADMIN_API_KEY:
+        raise RuntimeError("ADMIN_API_KEY is not set — generate one with: python -c \"import secrets; print(secrets.token_hex(32))\"")


🔴 vjailbreak-ai server crashes on startup when API keys are not yet configured, contradicting optional: true deployment

The lifespan function in server.py:61-64 raises RuntimeError if ANTHROPIC_API_KEY or ADMIN_API_KEY environment variables are empty. However, the Kubernetes deployment manifests (vjailbreak-ai/deploy/vjailbreak-ai.yaml:54,60 and deploy/vjailbreak-ai/deployment.yaml:40,46) mark both secret key references as optional: true, which means the env vars will be empty strings when the secret key hasn't been set yet. The install.sh:222-226 only creates the secret with admin-key (no api-key). So immediately after installation, the pod reads an empty ANTHROPIC_API_KEY and crash-loops. The design doc explicitly states the pod should start without the Secret, but this code prevents that.

Prompt for agents

The lifespan function raises RuntimeError when ANTHROPIC_API_KEY or ADMIN_API_KEY are empty, but the K8s deployment uses optional: true on the secret references, meaning the pod should start without them. The install.sh script only creates the secret with admin-key, not api-key. The fix should allow the server to start gracefully without these keys, logging a warning instead of crashing. The /health endpoint should still work. Analysis endpoints should return a clear error (e.g. 503 with 'API key not configured') when ANTHROPIC_API_KEY is missing. Admin endpoints should return 401 when ADMIN_API_KEY is missing. This matches the design doc's intent: 'The Deployment uses optional: true so it starts without the Secret, but AI analysis calls will fail until the key is configured.'
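The behaviour requested in the prompt above can be sketched without any web framework. The function names `load_keys`, `analysis_guard`, and `admin_guard` are hypothetical; a real handler would translate the returned status and detail into an HTTP error response.

```python
import hmac
import logging
from typing import Dict, Mapping, Optional, Tuple

logger = logging.getLogger("vjailbreak-ai")


def load_keys(env: Mapping[str, str]) -> Dict[str, str]:
    """Read both keys without raising; log a warning for each one that is missing."""
    keys = {name: env.get(name, "").strip() for name in ("ANTHROPIC_API_KEY", "ADMIN_API_KEY")}
    for name, value in keys.items():
        if not value:
            logger.warning("%s is not set; endpoints that need it will be rejected", name)
    return keys


def analysis_guard(keys: Mapping[str, str]) -> Optional[Tuple[int, str]]:
    """Return (status, detail) to reject an analysis call, or None to allow it."""
    if not keys.get("ANTHROPIC_API_KEY"):
        return 503, "API key not configured"
    return None


def admin_guard(keys: Mapping[str, str], presented: str) -> Optional[Tuple[int, str]]:
    """Return (status, detail) to reject an admin call, or None to allow it."""
    expected = keys.get("ADMIN_API_KEY", "")
    # An unset admin key rejects everything; otherwise compare in constant time.
    if not expected or not hmac.compare_digest(presented.encode(), expected.encode()):
        return 401, "Unauthorized"
    return None


# Right after installation only the admin key exists, so startup succeeds
# while analysis calls are rejected with 503.
keys = load_keys({"ADMIN_API_KEY": "abc"})
print(analysis_guard(keys))
```

Checking the keys per request rather than at startup is what lets the pod run with `optional: true` secret references and start serving analysis as soon as the key is written.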


devin-ai-integration · 2026-06-05T10:50:31Z

+  const handleAnalyse = useCallback(() => {
+    setResult(null)
+    setHistory([])
+    runAnalysis()
+  }, [runAnalysis])


🟡 Stale closure in handleAnalyse sends old conversation history when re-running analysis

handleAnalyse at AIAnalysisTab.tsx:98-102 calls setHistory([]) to reset history, then immediately calls runAnalysis(). However, runAnalysis (line 63) reads history from its closure, which still holds the previous value because React state updates are batched and not yet applied. On a re-analysis click, the old conversation history is sent to the backend instead of an empty array. This means the AI receives stale context from a prior analysis, potentially confusing the response.

Stale closure mechanism

setHistory([]) schedules a state update but doesn't change the history variable captured in runAnalysis's closure. Since runAnalysis depends on [migrationName, namespace, history], the version called by handleAnalyse still sees the old history.

Prompt for agents

In AIAnalysisTab.tsx, the handleAnalyse callback calls setHistory([]) then runAnalysis(), but runAnalysis captures the old history value from its closure. Fix this by either: (1) passing the empty history as a parameter to runAnalysis so it doesn't read from the stale closure, e.g. refactor runAnalysis to accept an optional historyOverride parameter and use it instead of the state variable, or (2) use a ref to track history so the latest value is always available. The simplest fix is approach (1): add a parameter like `historyToSend?: ConversationTurn[]` to runAnalysis, default to `history`, and in handleAnalyse pass `[]` explicitly.


devin-ai-integration · 2026-06-05T10:50:32Z

+httpx==0.27.2
+python-multipart==0.0.12
+pydantic==2.9.2
+pytest==8.3.2


🔴 Missing beautifulsoup4 dependency in requirements.txt causes /crawl endpoint to fail

crawler.py:10 imports from bs4 import BeautifulSoup, but beautifulsoup4 is not listed in vjailbreak-ai/requirements.txt. When the /crawl admin endpoint is called, it lazy-imports crawler.py (server.py:261), which will fail with ModuleNotFoundError: No module named 'bs4'. The Docker build installs only packages from requirements.txt, so the crawl functionality is completely broken in the container.

Suggested change

-pytest==8.3.2
+pytest==8.3.2
+beautifulsoup4==4.12.3


OmkarDeshpande7 and others added 30 commits June 5, 2026 16:15

docs: Add AI log analysis design spec

7827739

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(vjailbreak-ai): bootstrap service from vjailbreak-chat backend

9ca3d75

feat(deploy): add k8s manifests for vjailbreak-ai service with operat…

22b9a4b

…or context ConfigMap

feat(vpwned): add log line extractor for AI analysis

e2d7b33

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(vpwned): add AI analyze handler with log extraction and CR assembly

4fba8f4

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(vpwned): add AI key handler — stores Anthropic + admin keys in k…

b39b939

…8s Secret

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(vpwned): register /vpw/v1/ai/analyze and /vpw/v1/ai/key routes

2f09466

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): add vitest + @testing-library for component tests

3cf7ec0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): add AI analysis TypeScript types and API client

4f58f1f

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): add AIAnalysisTab component with follow-up chat and GitHub …

40b0413

…Issue fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): add optional AI Analysis tab slot to BaseLogsDrawer

5cf2cba

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): wire AI Analysis tab into PodLogsDrawer for failed migrations

36195ec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): add Anthropic API key configuration to global settings

7e80526

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(ui): add follow-up chat edge case tests for AIAnalysisTab

5f522de

test(vjailbreak-ai): add GitHub Issue URL encoding verification test

2cc3e1e

test(vpwned): add ConfigMap additional_context forwarding integration…

685a8bf

… tests (T051-T053)

docs: fix spec namespace/endpoint paths and add Secret prerequisite n…

3dfd688

…ote to deployment

chore: add graphify-out/ to .gitignore

a23a446

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(ai_handler): use Spec.PodRef instead of Status.AgentName for v2v …

f9b851f

…pod logs

Status.AgentName stores the node name (pod.Spec.NodeName), not the pod name. Spec.PodRef is the correct field holding the v2v-helper pod name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(ai_handler): fix debug logs URL to use UI service instead of loca…

e1bd5c9

…lhost

vpwned pod has no nginx on localhost:80. Debug logs served by vjailbreak-ui-service.migration-system.svc.cluster.local/debug-logs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(ai_handler): initialize fetchWarnings as empty slice to avoid nul…

8ec8b28

…l in JSON

Go nil slice marshals to null; Pydantic rejects null for list[str].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

OmkarDeshpande7 and others added 5 commits June 5, 2026 16:15

devin-ai-integration Bot reviewed Jun 5, 2026

OmkarDeshpande7 wants to merge 35 commits into main from hackathon/ai-analyse
