Skip to content

verify: verify vendored node_modules against the npm registry#970

Open
potiuk wants to merge 2 commits into
mainfrom
feat/npm-registry-verify
Open

verify: verify vendored node_modules against the npm registry#970
potiuk wants to merge 2 commits into
mainfrom
feat/npm-registry-verify

Conversation

@potiuk

@potiuk potiuk commented Jun 27, 2026

Copy link
Copy Markdown
Member

What

Adds a deterministic, registry-anchored verification of vendored node_modules to verify-action-build, as a new Vendored npm registry check.

Why

Node actions that commit their entire node_modules tree were only checked by an npm ci rebuild + file diff. npm isn't byte-reproducible across versions, so the rebuild frequently doesn't match and the tool falls back to a weak "diff against the previously-approved version" prompt — which forces a manual eyeball every bump. Surfaced reviewing dawidd6/action-send-mail v17→v18 (#968), whose "DIFFERENCES DETECTED" was just a nodemailer 8→9 vendored refresh.

How

For each package in the committed node_modules/.package-lock.json (lockfileVersion 2/3):

  1. download the tarball from the lockfile resolved URL;
  2. confirm its digest matches the lockfile integrity → proves it's the published, untampered package;
  3. compare every file the tarball contains against the committed blob by git blob SHA — so committed blobs are never downloaded, the repo tree already carries them;
  4. flag any committed file inside a verified package directory that the tarball does not contain (injected code).

A pass means every vendored package matched a registry-published, integrity-verified tarball, with no extra files.

Safety / edge cases

  • Git/file/link deps (no integrity) → reported as skipped, never silently passed.
  • Truncated repo tree (too large to enumerate) → refuses to claim a pass.
  • Non-npmjs registries → flagged.
  • .bin/.cache/.package-lock.json npm-generated artifacts → ignored.
  • Tarballs cached by integrity digest to avoid re-downloads.
  • Needs network access to registry.npmjs.org (available in CI; local runs need it allowlisted).

Tests

New test_npm_registry_verify.py (13 cases) builds a real in-memory tarball and exercises the full path with only the network seams mocked: clean match, content mismatch, integrity mismatch, extra-file injection, .bin noise, git-dep skip, foreign registry, truncated tree, plus helper unit tests (git-blob-SHA known value, integrity matching, tarball prefix stripping).

uv run pytest utils/tests/verify_action_build/ green; prek clean.

Generated-by: Claude Opus 4.8 (1M context)

Node actions that commit their whole node_modules tree were only checked
by rebuilding with `npm ci` and diffing. npm is not byte-reproducible
across versions, so the rebuild frequently fails to match and the tool
falls back to a weak "diff against the previously approved version" review
prompt (seen most recently on dawidd6/action-send-mail v17→v18, which is
just a nodemailer 8→9 vendored refresh).

Add a deterministic, registry-anchored check. For every package in the
committed `node_modules/.package-lock.json` (lockfileVersion 2/3), it:

  1. downloads the tarball from the lockfile's `resolved` URL,
  2. confirms the tarball digest matches the lockfile `integrity` —
     proving it is the published, untampered package,
  3. compares every file the tarball contains against the committed blob
     by git blob SHA (so committed blobs are never downloaded — the repo
     tree already carries them),
  4. flags any committed file inside a verified package directory that the
     tarball does not contain (injected code).

Packages that cannot be registry-verified (git/file/link deps or a
missing integrity) are reported as skipped, never silently passed; a
truncated repo tree refuses to claim a pass; non-npmjs registries are
flagged. The check is wired into the verification summary as
"Vendored npm registry check" and folds into the overall verdict.

Needs network access to registry.npmjs.org.

Generated-by: Claude Opus 4.8 (1M context)
Comment thread utils/tests/verify_action_build/test_npm_registry_verify.py Fixed
Replace the `"npm.example.com" in f` substring check with an exact-equality
assertion on the foreign-registry entry. Stronger assertion, and clears
CodeQL's incomplete-URL-substring-sanitization alert (a false positive in
test context — production host comparison in npm_registry_verify.py is an
exact `!=` on the parsed host).

Generated-by: Claude Opus 4.8 (1M context)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants