swiss#73

Open
xrsrke wants to merge 821 commits into swiss-ai:main from xrsrke:main

xrsrke commented May 23, 2025

No description provided.

ko3n1g and others added 30 commits April 15, 2025 14:17
ci: Fix publish notify job

See merge request ADLR/megatron-lm!3117
ci: Upload pipeline telemetrics

See merge request ADLR/megatron-lm!3106
Fix `post_training/test_get_gpt_modelopt_spec_interface`

See merge request ADLR/megatron-lm!3118
Co-authored-by: Shanmugam Ramasamy <shanmugamr@shanmugamr-mlt.client.nvidia.com>
Remove legacy bert tests

See merge request ADLR/megatron-lm!3023
Co-authored-by: Ali Taghibakhshi <ataghibakhsh@cw-dfw-cs-001-vscode-01.cm.cluster>
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Alit/config mamba head

See merge request ADLR/megatron-lm!2601
Update CODEOWNERS to make modelopt review only for QAT.

See merge request ADLR/megatron-lm!3125
Run nemo2 tests instead of nemo1

See merge request ADLR/megatron-lm!3119
…attn for dynamic batching.

Co-authored-by: Shanmugam Ramasamy <shanmugamr@shanmugamr-mlt.client.nvidia.com>
Co-authored-by: root <root@cw-dfw-h100-004-211-013.cm.cluster>
Co-authored-by: Vijay Korthikanti <vkorthikanti@cw-dfw-cs-001-login-01.cm.cluster>
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Co-authored-by: root <root@cw-dfw-h100-004-279-012.cm.cluster>
Co-authored-by: root <root@cw-dfw-h100-004-316-012.cm.cluster>
Co-authored-by: root <root@cw-dfw-h100-004-258-026.cm.cluster>
Co-authored-by: root <root@cw-dfw-h100-004-008-033.cm.cluster>
Co-authored-by: root <root@cw-dfw-h100-004-236-026.cm.cluster>
Co-authored-by: root <root@cw-dfw-h100-004-267-012.cm.cluster>
Integrating paged attention feature of flash_attn for dynamic batching.

See merge request ADLR/megatron-lm!2955
Co-authored-by: Mcore Bot <mcore-bot@nvidia.com>
Co-authored-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Chenhan Yu <chenhany@nvidia.com>
add l2 norm in torch_norm.py for LLAMA-4 support

See merge request ADLR/megatron-lm!2960
fix: Improvements to the auto-reminder bot

See merge request ADLR/megatron-lm!3126
Fix Gemma TRTLLM export

See merge request ADLR/megatron-lm!2475
Co-authored-by: Yuzhong Wang <yuzhongw@nvidia.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Fix MLA THD format support

See merge request ADLR/megatron-lm!2691
Dynamic inference example | Control checkpoint load strictness.

See merge request ADLR/megatron-lm!2914
patch for fp8 primary weight custom fsdp support

See merge request ADLR/megatron-lm!3057
ci: Track info about MR

See merge request ADLR/megatron-lm!3129
ko3n1g and others added 30 commits May 12, 2025 14:06
Adapt _write_item call to new signature with 'serialization_format'

See merge request ADLR/megatron-lm!3243
Co-authored-by: Russell Hewett <rhewett@nvidia.com>
Add in-process restart

See merge request ADLR/megatron-lm!2711
ci: Run on multiple clusters

See merge request ADLR/megatron-lm!3292
ci: Allow specific TE-ref

See merge request ADLR/megatron-lm!3302
ci(fix): Write logs to log_dir

See merge request ADLR/megatron-lm!3299
Address dist checkpointing PyT 24.08 failure

See merge request ADLR/megatron-lm!3253
ci(hotfix): Downstream pipeline

See merge request ADLR/megatron-lm!3307
…nal argparse flag to clear GPU...

Co-authored-by: Szymon Migacz <smigacz@nvidia.com>
MR feedback: added units for arguments, optional argparse flag to clear GPU...

See merge request ADLR/megatron-lm!3308
…mamba class constructor

Co-authored-by: Zhiyu Li <zhiyul@NVIDIA.com>
Allow process group as optional argument for mamba class constructor

See merge request ADLR/megatron-lm!2966
Add NVTX ranges to categorize execution

See merge request ADLR/megatron-lm!2588
Move fsdp 2 import from _composable to public

See merge request ADLR/megatron-lm!3116
ci: Add nemo-image to `ci-rebuild-mcore-nemo-image`

See merge request ADLR/megatron-lm!3321
ci: Re-enable tests that failed on memory

See merge request ADLR/megatron-lm!3197
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Shanmugam Ramasamy <shanmugamr@cw-dfw-cs-001-vscode-01.cm.cluster>
Co-authored-by: Shanmugam Ramasamy <shanmugamr@shanmugamr-mlt.client.nvidia.com>
Engine updates

See merge request ADLR/megatron-lm!3254