Add TORCH_NIGHTLY flag for full CI run against torch nightly #382

Open
atalman wants to merge 1 commit into main from torch-nightly-full-run

atalman commented Jun 23, 2026

Collaborator

Purpose

Today NIGHTLY=1 only runs a curated subset of steps (those tagged mirror: {torch_nightly: {}}) against the torch-nightly image. There is no way to run the full CI suite against torch nightly.

This PR adds a new TORCH_NIGHTLY flag that builds and runs the entire test suite against torch nightly, in addition to a full run on the pinned torch — effectively a full "vLLM vs torch nightly" signal on demand.

Triggered by either:

  • the build env var TORCH_NIGHTLY=1, or
  • the ready-torch-nightly PR label.
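
The two triggers above can be sketched as a single predicate. This is only an illustration, not the code in pipeline_generator/global_config.py: the function name `torch_nightly_requested`, its parameters, and the way PR labels are passed in are all assumptions; only the TORCH_NIGHTLY variable and the ready-torch-nightly label come from this PR.

```python
import os
from typing import Iterable, Mapping, Optional

# Label name taken from the PR description.
TORCH_NIGHTLY_LABEL = "ready-torch-nightly"


def torch_nightly_requested(
    env: Optional[Mapping[str, str]] = None,
    pr_labels: Iterable[str] = (),
) -> bool:
    """Hypothetical helper: True if either trigger is present.

    `env` defaults to the process environment; `pr_labels` is assumed to be
    the list of label names attached to the pull request.
    """
    env = os.environ if env is None else env
    flag_set = env.get("TORCH_NIGHTLY", "0") == "1"
    label_set = TORCH_NIGHTLY_LABEL in set(pr_labels)
    return flag_set or label_set
```

Passing the environment and labels in as arguments, rather than reading them inside the function, keeps a check like this easy to unit test.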

Behavior

When TORCH_NIGHTLY=1:

  • Every command step is collected into the vLLM Against PyTorch Nightly group and auto-run (unblocked) against the torch-nightly image (image_build_torch_nightly.sh, CUDA 13, PYTORCH_NIGHTLY=1) — not just the mirror.torch_nightly-tagged subset.
  • run_all is forced so the full suite also runs on the pinned torch image.

| Flag | Coverage |
| --- | --- |
| NIGHTLY=1 | full pinned-torch suite + curated nightly subset |
| TORCH_NIGHTLY=1 | full pinned-torch suite + full nightly suite |
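
The difference in nightly coverage between the two flags can be sketched as follows. This is a simplified model under stated assumptions, not the logic in pipeline_generator/buildkite_step.py: the `Step` dataclass and `nightly_group_labels` are hypothetical names, and the sketch covers only which steps land in the nightly group, not the image build or the auto_run/unblock handling.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Group name taken from the PR description.
NIGHTLY_GROUP = "vLLM Against PyTorch Nightly"


@dataclass
class Step:
    """Hypothetical, minimal stand-in for a CI command step."""
    label: str
    command: str
    # Mirrors the `mirror: {torch_nightly: {}}` tag mentioned in the PR.
    mirror: Dict[str, dict] = field(default_factory=dict)


def nightly_group_labels(steps: List[Step], torch_nightly: bool) -> List[str]:
    """Return labels of the steps collected into the nightly group.

    With torch_nightly=True every step is collected; otherwise only the
    steps tagged with mirror.torch_nightly (the curated subset).
    """
    if torch_nightly:
        return [s.label for s in steps]
    return [s.label for s in steps if "torch_nightly" in s.mirror]


steps = [
    Step("tagged", "pytest tests/a", mirror={"torch_nightly": {}}),
    Step("untagged", "pytest tests/b"),
]
print(NIGHTLY_GROUP, nightly_group_labels(steps, torch_nightly=True))
```

In this model the untagged step appears in the group only when `torch_nightly` is true, which is the behavior the new unit test in this PR is described as asserting.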

Changes

  • pipeline_generator/global_config.py: read TORCH_NIGHTLY env var and ready-torch-nightly label; force run_all.
  • pipeline_generator/buildkite_step.py: collect all steps into the nightly group and set auto_run when the flag is set.
  • bootstrap-amd.sh / bootstrap-intel.sh: plumb the flag + label through.
  • pipeline_generator/README.md: document the new env var.
  • tests/pipeline_generator/test_step.py: new test asserting an untagged step runs unblocked in the nightly group under TORCH_NIGHTLY=1.

Test Plan

  • pytest tests/pipeline_generator/ — 6 passed.
  • Pairs with the vLLM-side PR that wires --torch-nightly into trigger-ci-build.sh and documents the flag.

Authored with AI assistance.

Today NIGHTLY=1 only runs the curated subset of steps tagged with
mirror.torch_nightly against the torch nightly image. This adds a new
TORCH_NIGHTLY flag (env var or the ready-torch-nightly PR label) that
builds and runs the *entire* test suite against torch nightly, in
addition to a full run on the pinned torch.

- global_config: read TORCH_NIGHTLY env var / ready-torch-nightly label;
  force run_all so the pinned-torch suite runs fully too.
- buildkite_step: when set, collect every command step into the
  'vLLM Against PyTorch Nightly' group and auto-run it (unblocked).
- bootstrap-amd/intel: plumb the flag and label through.
- docs + unit test for the full-run behavior.
atalman commented Jun 23, 2026

Collaborator Author

Companion vLLM-repo PR (wires --torch-nightly into trigger-ci-build.sh + docs): vllm-project/vllm#46501
