compat megatron dev by Jintao-Huang · Pull Request #87 · modelscope/mcore-bridge

Jintao-Huang · 2026-05-20T09:04:05Z

No description provided.

gemini-code-assist

Code Review

This pull request updates the configuration parser to keep num_query_groups, modifies _apply_rotary_pos_emb_bshd in GPTModel to support dynamic multi_latent_attention configuration, and adds logic to TransformerLayer for initializing offloading modules. A high-severity issue was identified in the GPTModel changes: the monkey-patched function captures the state of the first model instance, which will cause configuration conflicts in environments running multiple models.

gemini-code-assist · 2026-05-20T09:06:03Z

+            if multi_latent_attention is None:
+                multi_latent_attention = self.config.multi_latent_attention


The monkey-patched _apply_rotary_pos_emb_bshd function captures the self instance of the first GPTModel that triggers the patch. Since the patch is only applied once (due to the check at line 144), all subsequent GPTModel instances will use the multi_latent_attention configuration from the first instance, regardless of their own configuration. This will cause incorrect behavior in multi-model environments (e.g., different models in the same process or complex pipeline parallel setups).
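
The capture problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual mcore-bridge or Megatron code: `buggy_ns`, `fixed_ns`, `Config`, `BuggyModel`, `FixedModel`, and `_original_apply` are hypothetical stand-ins, and the `ContextVar`-based variant is only one possible way to avoid binding the patch to a single instance, assuming the model can set the active config around its forward pass.

```python
import contextvars
import types


def _original_apply(t, freqs, multi_latent_attention=False):
    # Hypothetical stand-in for the real function: report the flag it received.
    return multi_latent_attention


class Config:
    def __init__(self, multi_latent_attention):
        self.multi_latent_attention = multi_latent_attention


# Problem: the patch is installed once and its closure captures the first `self`.
buggy_ns = types.SimpleNamespace(apply=_original_apply, patched=False)


class BuggyModel:
    def __init__(self, config):
        self.config = config
        if not buggy_ns.patched:
            origin = buggy_ns.apply

            def patched(t, freqs, multi_latent_attention=None):
                if multi_latent_attention is None:
                    # `self` is whichever instance installed the patch.
                    multi_latent_attention = self.config.multi_latent_attention
                return origin(t, freqs, multi_latent_attention)

            buggy_ns.apply = patched
            buggy_ns.patched = True

    def forward(self, t=None, freqs=None):
        return buggy_ns.apply(t, freqs)


# One possible alternative (an assumption, not the PR's fix):
# the patched function reads the currently active config from a ContextVar.
fixed_ns = types.SimpleNamespace(apply=_original_apply, patched=False)
_active_config = contextvars.ContextVar("active_config")


class FixedModel:
    def __init__(self, config):
        self.config = config
        if not fixed_ns.patched:
            origin = fixed_ns.apply

            def patched(t, freqs, multi_latent_attention=None):
                if multi_latent_attention is None:
                    multi_latent_attention = _active_config.get().multi_latent_attention
                return origin(t, freqs, multi_latent_attention)

            fixed_ns.apply = patched
            fixed_ns.patched = True

    def forward(self, t=None, freqs=None):
        # Publish this instance's config only for the duration of the call.
        token = _active_config.set(self.config)
        try:
            return fixed_ns.apply(t, freqs)
        finally:
            _active_config.reset(token)


first, second = BuggyModel(Config(True)), BuggyModel(Config(False))
print(first.forward(), second.forward())  # second reuses the first model's flag

first, second = FixedModel(Config(True)), FixedModel(Config(False))
print(first.forward(), second.forward())  # each model sees its own flag
```

In the buggy variant the second model silently inherits the first model's `multi_latent_attention` value, which is the behavior the review flags. Whether a context variable, an explicit argument at the call site, or a per-instance patch fits best depends on how the real function is invoked.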

Jintao-Huang added 2 commits May 20, 2026 17:03

compat megatron dev

e0ec17a

lint pass

ed6b565

Jintao-Huang wants to merge 2 commits into modelscope:main from Jintao-Huang:compat_megatron_dev
