Skip to content

feat: sandboxed verification worker (closes #68)#73

Merged
tcconnally merged 1 commit into
mainfrom
feat/68-sandboxed-verification
Jun 18, 2026
Merged

feat: sandboxed verification worker (closes #68)#73
tcconnally merged 1 commit into
mainfrom
feat/68-sandboxed-verification

Conversation

@tcconnally

Copy link
Copy Markdown
Collaborator

Implements sandboxed verification (#68).

The SandboxVerifier clones the PR branch into a temp directory, applies all generated patches, detects the test framework (pytest, unittest, jest, cargo, go, npm, make), runs real tests, and reports evidence-backed pass/fail results.

When SANDBOX_ENABLED=true, the Verifier agent uses real sandbox verification first; falls back to LLM-only otherwise. VERIFIED_AUTO_APPROVE now gates actual GitHub APPROVE on sandbox evidence passing.

Wired through webhook → pr_context → AgentChain → Verifier.

Add SandboxVerifier that actually clones the PR branch, applies generated
patches, and runs the project's real test suite in an isolated temp
directory. Supports 7 test frameworks: pytest, unittest, jest, cargo,
go, npm, and make.

Dual-mode in Verifier agent: sandbox first when SANDBOX_ENABLED=true
(with github_token, repo_clone_url, branch_name in context), LLM-only
fallback otherwise. The VERIFIED_AUTO_APPROVE flag now gates actual
GitHub APPROVE on real sandbox evidence.
@tcconnally tcconnally merged commit 17976fc into main Jun 18, 2026
2 of 4 checks passed

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 PR Pilot Review

🔍 Reviewer (Agent 1)

This pull request introduces a sandboxed verification worker, a significant and valuable feature. However, it contains a critical security vulnerability that leaks repository access tokens in error logs and a high-severity bug in the patch application logic. These issues must be resolved before this can be merged.

📋 Issues Found

  • [CRITICAL] security — The _clone_repo method may leak the GitHub access token into the result details. If the git clone command fails, its stderr, which can contain the full clone URL with the embedded token, is captured and returned. This sensitive information could then be logged or exposed. (src/sandbox_verifier.py:210)
  • [HIGH] bug — The fallback logic for applying a patch when the patch command is not found is incorrect. It writes the patch content (a unified diff) directly to the target file, which will corrupt the file instead of patching it. This will lead to incorrect verification failures. (src/sandbox_verifier.py:307)
  • [MEDIUM] bug — The _run_tests method contains redundant logic for detecting test frameworks. It first calls _detect_test_framework, which already checks for package.json and Makefile, and then has a separate if/elif block to check for these files again as a fallback. This makes the code harder to maintain. (src/sandbox_verifier.py:353)
  • [LOW] style — The code uses asyncio.get_event_loop() to get the event loop for running the cleanup task. In modern Python (3.7+), asyncio.get_running_loop() is preferred as it's context-aware and safer. (src/agents/verifier.py:120)

🔧 Fixer (Agent 2)

Fixed a security vulnerability that could leak GitHub tokens, removed incorrect patch fallback logic, consolidated redundant test framework detection, and updated to modern asyncio usage.

🧪 Tester (Agent 3)

Tester failed: Unterminated string starting at: line 7 column 18 (char 141)

✅ Verifier (Agent 4)

Verification skipped.

👤 Decision: ESCALATE TO HUMAN

This PR requires human review — too complex or risky for autonomous handling.

Pipeline duration: ~356.3s


Review by PR Pilot — AI-native code quality service

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant