Skip to content

test(plugins): add live install verification scripts for each agent#184

Open
evanpurkhiser wants to merge 1 commit into
mainfrom
evanpurkhiser/test-plugins-add-live-install-verification-scripts-for-each-agent
Open

test(plugins): add live install verification scripts for each agent#184
evanpurkhiser wants to merge 1 commit into
mainfrom
evanpurkhiser/test-plugins-add-live-install-verification-scripts-for-each-agent

Conversation

@evanpurkhiser

Copy link
Copy Markdown
Member

Each plugin-src/<agent>/verify-install.sh installs the built distribution with that agent's real CLI and runs a basic load check, validating the exact tree shape published to the per-agent distribution repos. The deploy workflow runs these against the candidate tree before publishing.

Comment on lines +34 to +36
claude --plugin-dir "$TARGET_DIR" \
-p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \
--safe-mode --print 2>&1 | cat

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The --safe-mode flag in the verify-install.sh script disables all plugins, rendering the --plugin-dir loading test ineffective and potentially causing false positives.
Severity: MEDIUM

Suggested Fix

Remove the --safe-mode flag from the claude command within the verify-install.sh script. This will allow the plugin specified by --plugin-dir to be correctly loaded and verified, ensuring the test is effective.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: plugin-src/claude/verify-install.sh#L34-L36

Potential issue: The `verify-install.sh` script uses the `claude` command with both the
`--safe-mode` and `--plugin-dir` flags. According to the documentation, `--safe-mode`
disables all plugins, which conflicts with the script's intent to test plugin loading
via `--plugin-dir`. The test prompt asks Claude to list skills from the plugin, but with
plugins disabled, Claude will answer based on its general training data rather than the
loaded plugin's content. This can result in a false positive, where the test passes even
if the plugin is broken, potentially allowing a faulty plugin to be deployed.

Did we get this right? 👍 / 👎 to inform future reviews.

Comment thread plugin-src/claude/verify-install.sh
echo "=== One-shot load test via --plugin-dir (exercises the loader) ==="
claude --plugin-dir "$TARGET_DIR" \
-p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \
--safe-mode --print 2>&1 | cat

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude install path is skipped

Medium Severity

The Claude script adds the marketplace and then uses --plugin-dir, but never runs claude plugin install. A candidate with a broken marketplace install can still pass.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d808783. Configure here.

echo "Sample skills present:"
ls "$TARGET_DIR/skills/" | head -8 | cat
echo "Sample commands present:"
ls "$TARGET_DIR/commands/" 2>/dev/null | cat || true

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor checks ignore failures

Medium Severity

The Cursor layout check suppresses ls failures with || true, so missing .cursor-plugin, mcp.json, or commands still reaches the success message.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d808783. Configure here.

echo "=== Ensuring codex CLI is available ==="
if ! command -v codex >/dev/null 2>&1; then
echo "codex CLI not found; installing via official installer..."
CODEX_NON_INTERACTIVE=1 curl -fsSL https://chatgpt.com/codex/install.sh | sh

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex installer flag is mis-scoped

Medium Severity

CODEX_NON_INTERACTIVE=1 is scoped only to curl, so the installer sh never sees it. Fresh CI runs can still prompt or hang.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d808783. Configure here.

Each `plugin-src/<agent>/verify-install.sh` installs the built distribution
with that agent's real CLI and runs a basic load check, validating the exact
tree shape published to the per-agent distribution repos. The deploy workflow
runs these against the candidate tree before publishing.
@evanpurkhiser evanpurkhiser force-pushed the evanpurkhiser/test-plugins-add-live-install-verification-scripts-for-each-agent branch from d808783 to 35c22e7 Compare June 17, 2026 15:00

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 4 total unresolved issues (including 3 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 35c22e7. Configure here.

echo "=== One-shot load test via --plugin-dir (exercises the loader) ==="
claude --plugin-dir "$TARGET_DIR" \
-p 'List the names of the Sentry router skills only (sentry-sdk-setup, sentry-workflow, sentry-feature-setup). One line, comma separated.' \
--safe-mode --print 2>&1 | cat

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude prompt fails without API

High Severity

The deploy workflow runs this script with set -euo pipefail and does not provide Anthropic credentials, yet the one-shot claude --plugin-dir … -p … --print step has no fallback. A model/auth failure aborts the whole Claude deploy job even when marketplace add and static validation succeeded.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 35c22e7. Configure here.

# into the harness (from the exact tree shape that will be published)
# and performs basic introspection. We do this on the runner before
# deciding whether to push the tree.
#

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unpinned curl|bash installers run with GH_TOKEN in deploy step (supply-chain exposure)

Each new plugin-src/<agent>/verify-install.sh pipes an unverified remote installer (e.g. curl -fsSL https://x.ai/cli/install.sh | bash) and is invoked from the deploy workflow's Build and deploy step, which exports GH_TOKEN (a GitHub App token with contents:write on the matching plugin repo). If any of the four vendor install endpoints is compromised, the malicious installer inherits GH_TOKEN and the tokenized git remote already on the runner, allowing pushes to getsentry/plugin-<agent>. Pin the CLI install to a verified version/checksum or run the install in a step that does not have the deploy token in scope.

Evidence
  • .github/workflows/deploy-plugins.yml step Build and deploy plugin-${{ matrix.agent }} sets GH_TOKEN: ${{ steps.token.outputs.token }} (env block) and the minted app token is scoped contents:write to plugin-${{ matrix.agent }}.
  • The same step clones https://x-access-token:${GH_TOKEN}@github.com/getsentry/${TARGET_REPO}.git into the worktree and then runs plugin-src/${AGENT}/verify-install.sh "$WORKTREE", so child processes inherit GH_TOKEN and the tokenized remote.
  • All four scripts pipe an external URL into a shell with no checksum/signature: claude.ai/install.sh | bash, cursor.com/install | bash, chatgpt.com/codex/install.sh | sh, x.ai/cli/install.sh | bash.
  • Exploit requires one of these vendor endpoints to serve a malicious installer; impact is bounded to exfiltrating the short-lived token and pushing to a single plugin repo, hence medium rather than high.
Also found at 2 additional locations
  • plugin-src/cursor/verify-install.sh:28
  • plugin-src/grok/verify-install.sh:21

Identified by Warden security-review · TJL-YS2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant