When training the Qwen3.5 MoE series, swift's moe_grouped_gemm must be true #38

@xtyXpastor

Checklist

  • I have searched existing issues, and this is a new feature request.

Feature Request Description

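When training the Qwen3.5 MoE series with mcore-bridge, swift's moe_grouped_gemm must be set to true.
This parameter already defaults to true, but if it is set to false, Megatron takes the SequentialMLP path, which is incompatible with mcore-bridge.
It would be helpful to note this in the README or the swift documentation.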
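
To illustrate the constraint described above, here is a minimal sketch of a pre-flight check. The function name `validate_moe_config`, the `use_mcore_bridge` flag, and the plain-dict config are all hypothetical and are not part of swift, Megatron, or mcore-bridge; only the `moe_grouped_gemm` parameter name and its default of true come from this issue.

```python
def validate_moe_config(config: dict, use_mcore_bridge: bool) -> dict:
    """Hypothetical guard: reject moe_grouped_gemm=False when mcore-bridge is used.

    This is an illustrative helper, not an API of swift or mcore-bridge.
    """
    # Per the issue, the parameter defaults to true, so a missing key is treated as True.
    grouped = config.get("moe_grouped_gemm", True)

    if use_mcore_bridge and not grouped:
        # Per the issue, false makes Megatron take the SequentialMLP path,
        # which is reported to be incompatible with mcore-bridge.
        raise ValueError(
            "moe_grouped_gemm must be true when using mcore-bridge "
            "(false selects the SequentialMLP path)"
        )

    # Return a copy with the effective value made explicit.
    return {**config, "moe_grouped_gemm": grouped}
```

A check along these lines could run before training starts, so that a misconfiguration fails early with a clear message. A note in the README or swift documentation, as requested, would serve the same purpose for readers.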

Pull Request

No response

Labels

enhancement (New feature or request)
