test(plugins): add live install verification scripts for each agent#184
Conversation
| claude --plugin-dir "$TARGET_DIR" \ | ||
| -p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \ | ||
| --safe-mode --print 2>&1 | cat |
There was a problem hiding this comment.
Bug: The --safe-mode flag in the verify-install.sh script disables all plugins, rendering the --plugin-dir loading test ineffective and potentially causing false positives.
Severity: MEDIUM
Suggested Fix
Remove the --safe-mode flag from the claude command within the verify-install.sh script. This will allow the plugin specified by --plugin-dir to be correctly loaded and verified, ensuring the test is effective.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: plugin-src/claude/verify-install.sh#L34-L36
Potential issue: The `verify-install.sh` script uses the `claude` command with both the
`--safe-mode` and `--plugin-dir` flags. According to the documentation, `--safe-mode`
disables all plugins, which conflicts with the script's intent to test plugin loading
via `--plugin-dir`. The test prompt asks Claude to list skills from the plugin, but with
plugins disabled, Claude will answer based on its general training data rather than the
loaded plugin's content. This can result in a false positive, where the test passes even
if the plugin is broken, potentially allowing a faulty plugin to be deployed.
Did we get this right? 👍 / 👎 to inform future reviews.
| echo "=== One-shot load test via --plugin-dir (exercises the loader) ===" | ||
| claude --plugin-dir "$TARGET_DIR" \ | ||
| -p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \ | ||
| --safe-mode --print 2>&1 | cat |
There was a problem hiding this comment.
Claude install path is skipped
Medium Severity
The Claude script adds the marketplace and then uses --plugin-dir, but never runs claude plugin install. A candidate with a broken marketplace install can still pass.
Reviewed by Cursor Bugbot for commit d808783. Configure here.
| echo "Sample skills present:" | ||
| ls "$TARGET_DIR/skills/" | head -8 | cat | ||
| echo "Sample commands present:" | ||
| ls "$TARGET_DIR/commands/" 2>/dev/null | cat || true |
There was a problem hiding this comment.
Cursor checks ignore failures
Medium Severity
The Cursor layout check suppresses ls failures with || true, so missing .cursor-plugin, mcp.json, or commands still reaches the success message.
Reviewed by Cursor Bugbot for commit d808783. Configure here.
| echo "=== Ensuring codex CLI is available ===" | ||
| if ! command -v codex >/dev/null 2>&1; then | ||
| echo "codex CLI not found; installing via official installer..." | ||
| CODEX_NON_INTERACTIVE=1 curl -fsSL https://chatgpt.com/codex/install.sh | sh |
There was a problem hiding this comment.
Codex installer flag is mis-scoped
Medium Severity
CODEX_NON_INTERACTIVE=1 is scoped only to curl, so the installer sh never sees it. Fresh CI runs can still prompt or hang.
Reviewed by Cursor Bugbot for commit d808783. Configure here.
Each `plugin-src/<agent>/verify-install.sh` installs the built distribution with that agent's real CLI and runs a basic load check, validating the exact tree shape published to the per-agent distribution repos. The deploy workflow runs these against the candidate tree before publishing.
d808783 to
35c22e7
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 35c22e7. Configure here.
| echo "=== One-shot load test via --plugin-dir (exercises the loader) ===" | ||
| claude --plugin-dir "$TARGET_DIR" \ | ||
| -p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \ | ||
| --safe-mode --print 2>&1 | cat |
There was a problem hiding this comment.
Claude prompt fails without API
High Severity
The deploy workflow runs this script with set -euo pipefail and does not provide Anthropic credentials, yet the one-shot claude --plugin-dir … -p … --print step has no fallback. A model/auth failure aborts the whole Claude deploy job even when marketplace add and static validation succeeded.
Reviewed by Cursor Bugbot for commit 35c22e7. Configure here.
| # into the harness (from the exact tree shape that will be published) | ||
| # and performs basic introspection. We do this on the runner before | ||
| # deciding whether to push the tree. | ||
| # |
There was a problem hiding this comment.
Unpinned curl|bash installers run with GH_TOKEN in deploy step (supply-chain exposure)
Each new plugin-src/<agent>/verify-install.sh pipes an unverified remote installer (e.g. curl -fsSL https://x.ai/cli/install.sh | bash) and is invoked from the deploy workflow's Build and deploy step, which exports GH_TOKEN (a GitHub App token with contents:write on the matching plugin repo). If any of the four vendor install endpoints is compromised, the malicious installer inherits GH_TOKEN and the tokenized git remote already on the runner, allowing pushes to getsentry/plugin-<agent>. Pin the CLI install to a verified version/checksum or run the install in a step that does not have the deploy token in scope.
Evidence
.github/workflows/deploy-plugins.ymlstepBuild and deploy plugin-${{ matrix.agent }}setsGH_TOKEN: ${{ steps.token.outputs.token }}(env block) and the minted app token is scopedcontents:writetoplugin-${{ matrix.agent }}.- The same step clones
https://x-access-token:${GH_TOKEN}@github.com/getsentry/${TARGET_REPO}.gitinto the worktree and then runsplugin-src/${AGENT}/verify-install.sh "$WORKTREE", so child processes inherit GH_TOKEN and the tokenized remote. - All four scripts pipe an external URL into a shell with no checksum/signature:
claude.ai/install.sh | bash,cursor.com/install | bash,chatgpt.com/codex/install.sh | sh,x.ai/cli/install.sh | bash. - Exploit requires one of these vendor endpoints to serve a malicious installer; impact is bounded to exfiltrating the short-lived token and pushing to a single plugin repo, hence medium rather than high.
Also found at 2 additional locations
plugin-src/cursor/verify-install.sh:28plugin-src/grok/verify-install.sh:21
Identified by Warden security-review · TJL-YS2


Each
plugin-src/<agent>/verify-install.shinstalls the built distribution with that agent's real CLI and runs a basic load check, validating the exact tree shape published to the per-agent distribution repos. The deploy workflow runs these against the candidate tree before publishing.