
[build] chore: revert "bump transformer-engine to release_v2.16.post (#4536)" #4600

Open
ko3n1g wants to merge 1 commit into main from ko3n1g/build/revert-te-2.16-post


ko3n1g commented Jun 30, 2026

Contributor

Background

This reverts commit 7c0968e (#4536), which bumped
transformer-engine to release_v2.16.post (b9d690e).

What changed

  • Restores the transformer-engine pin to the prior commit d64bc14
    (release_v2.16_post) in pyproject.toml.
  • Regenerates the matching embedded SHA in uv.lock.


ko3n1g requested a review from a team as a code owner on June 30, 2026 17:55

claude Bot commented Jun 30, 2026

Contributor

LGTM: clean revert of the TE pin from b9d690e back to d64bc14. The pyproject.toml and uv.lock changes are consistent and no other files are touched.

One note: Per project policy, this PR should have the needs-more-tests label applied (and likely full-test-suite as well, since this is a TE pin change that touches CUDA kernels). Currently no labels are set.

## Suggested test cases

No perf tests impacted.

yaoyu-33 added the following labels on Jun 30, 2026:

  • area:build: Dependencies, packaging, images, and environment setup
  • ci: CI, automation, test queue, or workflow infrastructure work
  • full-test-suite
  • needs-more-tests: Requires additional L0 and L1 test coverage before merge
  • needs-review: PR is ready for code review and waiting on a reviewer

ko3n1g changed the title from build: revert "bump transformer-engine to release_v2.16.post (#4536)" to [build] chore: revert "bump transformer-engine to release_v2.16.post (#4536)" on Jun 30, 2026

ko3n1g commented Jun 30, 2026

Contributor Author

CI note (not blocking on this PR's content): the red Lint check / Nemo_CICD_Test here is a pre-existing breakage on main, not introduced by this revert.

pre-commit runs ruff --all-files, which flags F821 (undefined name) for cu_seqlens, cu_seqlens_argmin, cu_seqlens_unpadded, and cu_seqlens_unpadded_argmin in src/megatron/bridge/training/gpt_step.py:415-418. Those names are referenced in the accumulate_flops_metadata(...) call inside _forward_step_common, but get_batch returns only (tokens, labels, loss_mask, attention_mask, position_ids, packed_seq_metadata), so they are never bound in scope. This was introduced by #4511 (current main HEAD) and affects every open PR until main is fixed.

This revert touches only pyproject.toml and uv.lock; it cannot turn lint green on its own.
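
For anyone unfamiliar with this failure mode, here is a minimal sketch of the pattern ruff is flagging. Everything in it is a hypothetical stand-in, not the actual gpt_step.py code: the function bodies, the single cu_seqlens keyword, and the idea that the value could be read from packed_seq_metadata are all assumptions made for illustration. The real fix on main may look quite different.

```python
# Hypothetical stand-ins for illustration only; not the real megatron.bridge code.

def get_batch():
    # Returns six values, mirroring the tuple shape described in the comment above.
    # The dict contents are invented for this sketch.
    packed_seq_metadata = {"cu_seqlens": [0, 4, 9]}
    return ("tokens", "labels", "loss_mask", "attention_mask",
            "position_ids", packed_seq_metadata)


def accumulate_flops_metadata(**kwargs):
    # Placeholder that simply echoes its keyword arguments.
    return kwargs


def forward_step_broken():
    (tokens, labels, loss_mask, attention_mask,
     position_ids, packed_seq_metadata) = get_batch()
    # cu_seqlens is never bound in this scope or at module level.
    # ruff reports F821 statically; Python raises NameError when this line runs.
    return accumulate_flops_metadata(cu_seqlens=cu_seqlens)  # noqa: F821


def forward_step_fixed():
    (tokens, labels, loss_mask, attention_mask,
     position_ids, packed_seq_metadata) = get_batch()
    # Assumption: the value can be derived from the metadata object.
    cu_seqlens = packed_seq_metadata.get("cu_seqlens")
    return accumulate_flops_metadata(cu_seqlens=cu_seqlens)
```

Calling forward_step_broken() raises NameError at runtime, which is the same defect F821 catches before the code ever runs. Binding the name before use, as in forward_step_fixed(), clears both.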

…4536)"

This reverts commit 7c0968e.

Signed-off-by: oliver könig <okoenig@nvidia.com>

ko3n1g commented Jun 30, 2026

Contributor Author

/ok to test a7caf94



2 participants