test(plugins): add live install verification scripts for each agent by evanpurkhiser · Pull Request #184 · getsentry/sentry-for-ai

evanpurkhiser · 2026-06-16T23:26:09Z

Each plugin-src/<agent>/verify-install.sh installs the built distribution with that agent's real CLI and runs a basic load check, validating the exact tree shape published to the per-agent distribution repos. The deploy workflow runs these against the candidate tree before publishing.

sentry · 2026-06-16T23:29:06Z

+claude --plugin-dir "$TARGET_DIR" \
+    -p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \
+    --safe-mode --print 2>&1 | cat


Bug: The --safe-mode flag in the verify-install.sh script disables all plugins, rendering the --plugin-dir loading test ineffective and potentially causing false positives.
_{Severity: MEDIUM}

Suggested Fix

Remove the --safe-mode flag from the claude command within the verify-install.sh script. This will allow the plugin specified by --plugin-dir to be correctly loaded and verified, ensuring the test is effective.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: plugin-src/claude/verify-install.sh#L34-L36 Potential issue: The `verify-install.sh` script uses the `claude` command with both the `--safe-mode` and `--plugin-dir` flags. According to the documentation, `--safe-mode` disables all plugins, which conflicts with the script's intent to test plugin loading via `--plugin-dir`. The test prompt asks Claude to list skills from the plugin, but with plugins disabled, Claude will answer based on its general training data rather than the loaded plugin's content. This can result in a false positive, where the test passes even if the plugin is broken, potentially allowing a faulty plugin to be deployed.

_{Did we get this right? 👍 / 👎 to inform future reviews.}

cursor · 2026-06-16T23:30:05Z

+echo "=== One-shot load test via --plugin-dir (exercises the loader) ==="
+claude --plugin-dir "$TARGET_DIR" \
+    -p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \
+    --safe-mode --print 2>&1 | cat


Claude install path is skipped

Medium Severity

The Claude script adds the marketplace and then uses --plugin-dir, but never runs claude plugin install. A candidate with a broken marketplace install can still pass.

^{Reviewed by Cursor Bugbot for commit d808783. Configure here.}

cursor · 2026-06-16T23:30:05Z

+echo "Sample skills present:"
+ls "$TARGET_DIR/skills/" | head -8 | cat
+echo "Sample commands present:"
+ls "$TARGET_DIR/commands/" 2>/dev/null | cat || true


Cursor checks ignore failures

Medium Severity

The Cursor layout check suppresses ls failures with || true, so missing .cursor-plugin, mcp.json, or commands still reaches the success message.

^{Reviewed by Cursor Bugbot for commit d808783. Configure here.}

cursor · 2026-06-16T23:30:05Z

+echo "=== Ensuring codex CLI is available ==="
+if ! command -v codex >/dev/null 2>&1; then
+    echo "codex CLI not found; installing via official installer..."
+    CODEX_NON_INTERACTIVE=1 curl -fsSL https://chatgpt.com/codex/install.sh | sh


Codex installer flag is mis-scoped

Medium Severity

CODEX_NON_INTERACTIVE=1 is scoped only to curl, so the installer sh never sees it. Fresh CI runs can still prompt or hang.

^{Reviewed by Cursor Bugbot for commit d808783. Configure here.}

Each `plugin-src/<agent>/verify-install.sh` installs the built distribution with that agent's real CLI and runs a basic load check, validating the exact tree shape published to the per-agent distribution repos. The deploy workflow runs these against the candidate tree before publishing.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 4 total unresolved issues (including 3 from previous reviews).

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 35c22e7. Configure here.}

cursor · 2026-06-17T15:01:24Z

+echo "=== One-shot load test via --plugin-dir (exercises the loader) ==="
+claude --plugin-dir "$TARGET_DIR" \
+    -p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \
+    --safe-mode --print 2>&1 | cat


Claude prompt fails without API

High Severity

The deploy workflow runs this script with set -euo pipefail and does not provide Anthropic credentials, yet the one-shot claude --plugin-dir … -p … --print step has no fallback. A model/auth failure aborts the whole Claude deploy job even when marketplace add and static validation succeeded.

^{Reviewed by Cursor Bugbot for commit 35c22e7. Configure here.}

sentry-warden · 2026-06-17T15:04:28Z

+          # into the harness (from the exact tree shape that will be published)
+          # and performs basic introspection. We do this on the runner before
+          # deciding whether to push the tree.
+          #


Unpinned curl|bash installers run with GH_TOKEN in deploy step (supply-chain exposure)

Each new plugin-src/<agent>/verify-install.sh pipes an unverified remote installer (e.g. curl -fsSL https://x.ai/cli/install.sh | bash) and is invoked from the deploy workflow's Build and deploy step, which exports GH_TOKEN (a GitHub App token with contents:write on the matching plugin repo). If any of the four vendor install endpoints is compromised, the malicious installer inherits GH_TOKEN and the tokenized git remote already on the runner, allowing pushes to getsentry/plugin-<agent>. Pin the CLI install to a verified version/checksum or run the install in a step that does not have the deploy token in scope.

Evidence

.github/workflows/deploy-plugins.yml step Build and deploy plugin-${{ matrix.agent }} sets GH_TOKEN: ${{ steps.token.outputs.token }} (env block) and the minted app token is scoped contents:write to plugin-${{ matrix.agent }}.

The same step clones https://x-access-token:${GH_TOKEN}@github.com/getsentry/${TARGET_REPO}.git into the worktree and then runs plugin-src/${AGENT}/verify-install.sh "$WORKTREE", so child processes inherit GH_TOKEN and the tokenized remote.

All four scripts pipe an external URL into a shell with no checksum/signature: claude.ai/install.sh | bash, cursor.com/install | bash, chatgpt.com/codex/install.sh | sh, x.ai/cli/install.sh | bash.

Exploit requires one of these vendor endpoints to serve a malicious installer; impact is bounded to exfiltrating the short-lived token and pushing to a single plugin repo, hence medium rather than high.

Also found at 2 additional locations

plugin-src/cursor/verify-install.sh:28

plugin-src/grok/verify-install.sh:21

_{Identified by Warden security-review · TJL-YS2}

sentry Bot reviewed Jun 16, 2026

View reviewed changes

cursor Bot reviewed Jun 16, 2026

View reviewed changes

evanpurkhiser force-pushed the evanpurkhiser/test-plugins-add-live-install-verification-scripts-for-each-agent branch from d808783 to 35c22e7 Compare June 17, 2026 15:00

cursor Bot reviewed Jun 17, 2026

View reviewed changes

sentry-warden Bot reviewed Jun 17, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test(plugins): add live install verification scripts for each agent#184

test(plugins): add live install verification scripts for each agent#184
evanpurkhiser wants to merge 1 commit into
mainfrom
evanpurkhiser/test-plugins-add-live-install-verification-scripts-for-each-agent

evanpurkhiser commented Jun 16, 2026

Uh oh!

sentry Bot Jun 16, 2026

Uh oh!

Uh oh!

cursor Bot Jun 16, 2026

Uh oh!

cursor Bot Jun 16, 2026

Uh oh!

cursor Bot Jun 16, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot Jun 17, 2026

Uh oh!

sentry-warden Bot Jun 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

evanpurkhiser commented Jun 16, 2026

Uh oh!

sentry Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cursor Bot Jun 16, 2026

Choose a reason for hiding this comment

Claude install path is skipped

Uh oh!

cursor Bot Jun 16, 2026

Choose a reason for hiding this comment

Cursor checks ignore failures

Uh oh!

cursor Bot Jun 16, 2026

Choose a reason for hiding this comment

Codex installer flag is mis-scoped

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot Jun 17, 2026

Choose a reason for hiding this comment

Claude prompt fails without API

Uh oh!

sentry-warden Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant