Skip to content

test(ts): change-based selective integration testing#2921

Open
yonib05 wants to merge 12 commits into
strands-agents:mainfrom
yonib05:selective-integ-testing
Open

test(ts): change-based selective integration testing#2921
yonib05 wants to merge 12 commits into
strands-agents:mainfrom
yonib05:selective-integ-testing

Conversation

@yonib05

@yonib05 yonib05 commented Jun 24, 2026

Copy link
Copy Markdown
Member

What

Adds change-based selective integration testing for the TypeScript SDK. Instead of always running the full integ-node + integ-browser suite, CI and local runs now execute only the integration specs whose module graph depends on the files a change touched, with a fail-safe fallback to the full suite for structural changes.

Why

The full integration suite hits live AWS/Bedrock and is slow. For a localized source fix, most of that suite is irrelevant. This narrows the run to the covering specs using Vitest's native module-graph tracing (vitest related), which resolves the $/sdk path alias correctly. No new dependencies and no new orchestration tooling.

How it works

test-infra/scripts/run-selective-ts.sh classifies the diff against the base ref into three branches:

  1. Structural change (dependency manifests, lockfiles, any strands-ts tsconfig, vitest.config.ts, shared test/integ/__fixtures__/ and __resources__/, strandly/, the script itself, or TypeScript CI workflows) runs the full suite.
  2. No TypeScript source change skips the integration job.
  3. Otherwise passes the changed source files to vitest related --project integ-node --project integ-browser.

The same script backs both the local npm run test:integ:selective command and the CI step, so local and CI scope match for a given change. Locally it diffs the working tree against the merge-base (so uncommitted edits are included); in CI it diffs against the PR base SHA, falling back to the merge-queue base on merge_group events.

Fail-safe design

Every uncertain path runs the full suite rather than skipping: unresolvable base ref, failed diff, or any change the static graph cannot trace (binary assets imported via Vite ?url, tsconfig path-alias definitions, wasm/wit inputs) is routed to the structural fallback. The goal is never to skip a test that should run.

Scope

TypeScript only. Python selective testing is intentionally deferred to a follow-up because its options (pytest-testmon and friends) carry real trade-offs with nondeterministic AWS-backed tests and stateful coverage data that warrant separate evaluation.

Files

  • test-infra/scripts/run-selective-ts.sh (new): the selection logic
  • package.json: test:integ:selective script
  • .github/workflows/typescript-integration-test.yml: wires the script into the existing integration-test workflow (full git history for diffing; selective run with the PR/merge-queue base SHA)
  • CONTRIBUTING.md: local usage docs

Verification

All branch classifications were verified via a dry-run mode that prints the chosen branch without executing tests (so no live AWS calls): structural cases (tsconfigs, __resources__ assets, strandly/) route to the full suite, a normal source edit narrows to the selective set, and unrelated changes skip.

Note for reviewers

Integration tests still run against live AWS/Bedrock. This change reduces how many run for a localized change; it does not change per-test reliability.

yonib05 and others added 7 commits June 23, 2026 15:47
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- structural set now catches nested tsconfigs, __resources__ (?url assets),
  strandly/, and the orchestration script itself (all invisible to the graph)
- DRY_RUN guards the early full-suite fallbacks so verification never hits AWS
- word-split-safe file array (portable to bash 3.2)
- merge_group falls back to merge_group.base_sha, not origin/main
@github-actions github-actions Bot added size/m typescript Pull requests that update typescript code area-community Related to community and contributor health chore Maintenance tasks, dependency updates, CI changes, refactoring with no user-facing impact strands-running labels Jun 24, 2026
Comment thread test-infra/scripts/run-selective-ts.sh
Comment thread test-infra/scripts/run-selective-ts.sh Outdated
Comment thread test-infra/scripts/run-selective-ts.sh Outdated
@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Request Changes

Solid, well-documented approach — using Vitest's native module graph (no new deps) and failing safe to the full suite on structural/uncertain changes is the right design. The structural regex and base-ref fallbacks (including the merge_group reasoning) are sound. My concern is one correctness gap that undermines the core "never skip a test that should run" invariant.

Review themes
  • Graph traceability (Critical): Integ tests import the SDK via two specifiers ($/sdk → src, @strands-agents/sdk → dist). vitest related can't trace src/ edits to the 8 tests that import only the dist-backed package, so they'd be silently skipped. Needs an alias/export-condition so the package resolves to source under the integ projects.
  • Local diff scope (Important): Untracked new files aren't seen by git diff, contradicting the "includes uncommitted edits" docs; include them or document the limit.
  • Regression protection (Suggestion): The selection logic has no automated test; consider locking in branch classification.

Nice work narrowing a slow, live-AWS suite without adding orchestration dependencies — once the dist-import path is traceable this will be a clear win.

yonib05 added 2 commits June 24, 2026 00:26
- alias @strands-agents/sdk to ./src in both integ projects so vitest related
  traces src changes to tests importing via the package specifier (which
  otherwise resolves through exports to dist/ and was silently skipped)
- append untracked files to the changed set so brand-new local source files
  select their covering tests
Adds a SELECTIVE_CHANGED_FILES test seam and a pure-bash test that feeds
synthetic changed-file lists through the structural/skip/selective
classifier with no git state or network. Locks in the structural fallback
set so a future regex edit cannot silently start skipping tests.
@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Approve

All three points from the previous review are addressed and I verified each one against the updated branch (deps installed, run locally):

Verification results
  • Graph traceability (was Critical) ✅ — Confirmed both the bug and the fix empirically. With the @strands-agents/sdk → ./src alias removed, editing src/agent/agent.ts selected 0 dist-only importers; with it, the same edit selects the previously-missed agentic-context, summarizing-conversation-manager, and mcp-tasks specs. Subpath specifiers trace too (src/multiagent/index.ts → multiagent/mcp tests), and selectivity is preserved (a type-only leaf still selects 0, not the full suite).
  • Local diff scope (was Important) ✅ — Untracked files now included via git ls-files --others --exclude-standard; reduces to the committed diff on CI.
  • Regression protection (was Suggestion) ✅ — New classify.test.sh runs 21/21 passing, locking in the structural/skip/selective classification with a clean test seam.

Nice turnaround — the empirical before/after verification and the regression harness make this a solid, maintainable win for integ-test runtime.

- structural fallback expressed as a commented pattern array instead of a
  213-char single-line regex; each trigger is reviewable in isolation
- remove strands-wasm/ and wit/ handling (those dirs were removed upstream),
  so the source filter and prefix strip only deal with strands-ts/
- collapse the three identical full-suite fallback blocks (structural branch,
  no-base-found, cannot-diff) into a single run_full_suite helper so the
  fallback behaviour has one definition; status line now goes to stderr
- replace the duplicated @strands-agents/sdk alias comment in the integ-browser
  project with a back-reference to integ-node

No behaviour change: classify.test.sh passes 20/20; both git-path fallbacks
verified to print + exit 0 under dry-run.
Comment thread test-infra/scripts/run-selective-ts.sh
@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Approve

Reviewed independently and verified the core behaviors empirically (deps installed, vitest related exercised against the integ projects). The design is sound and genuinely fails safe — using Vitest's native module graph with no new orchestration deps, routing every uncertain path to the full suite.

What I verified
  • Safe-skip contract ✅ — vitest related exits 0 with "No test files found" when nothing covers the change, so a legitimate no-op narrows correctly instead of failing CI.
  • Traceability fix ✅ — The @strands-agents/sdk → ./src alias makes the 31 package-specifier importers traceable; editing src/agent/agent.ts selects its dependent integ specs.
  • Reverse-dep depth ✅ — Shared helpers outside __fixtures__/ (e.g. _interrupt-helpers.ts) correctly select their importing tests under a CI-like env; earlier "No test files" outputs were artifacts of the AWS global-setup throwing on absent keys, not selection misses.
  • Fail-safe plumbing ✅ — run_full_suite propagates non-zero exits under set -e; structural regex isn't over-broad; untracked files are included while dist/ stays gitignored so build output can't pollute the change set.
  • Regression harness ✅ — classify.test.sh passes 20/20, locking in structural/skip/selective classification with a clean, network-free seam.

One non-blocking suggestion inline about pinning the exit-0-on-empty assumption against future vitest majors. Nice win for integ-test runtime without sacrificing safety.

Document that selective testing's fail-safe invariant depends on vitest v4
exiting 0 (not non-zero) when no spec covers a change, so a future major bump
that changes this is caught at review rather than silently breaking selection.
@github-actions

Copy link
Copy Markdown
Contributor

Assessment: Approve

Re-reviewed after 42f26ef ("docs(ts): pin the vitest related exit-0-on-empty assumption"). My only open suggestion is now addressed: the load-bearing exit-0-on-empty contract is documented inline in Branch 3, with an explicit instruction to revisit and bump the package.json floor if a future vitest major changes it.

I confirmed the author made the right call declining the automated-smoke-test half of that suggestion — classify.test.sh is intentionally a no-git / no-network / no-live-AWS seam, and invoking real vitest related would break that property. The comment is the appropriate guardrail here.

Re-verified post-commit:

  • classify.test.sh → 20/20 passing (no regression).
  • shellcheck → only a single SC2001 style nit (line 118, sed vs parameter expansion); non-blocking.

No outstanding concerns. Clean win for integ-test runtime that fails safe by design. Nice turnaround on the feedback. 🚀

@yonib05 yonib05 enabled auto-merge (squash) June 24, 2026 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-community Related to community and contributor health chore Maintenance tasks, dependency updates, CI changes, refactoring with no user-facing impact size/m typescript Pull requests that update typescript code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant