[fix] regression introduced by #45534 (#46456)

Merged: hmellor merged 5 commits into main from fix-tie-words-embeddings-regression, Jun 8, 2026

eustlb (Contributor) commented Jun 5, 2026

What does this PR do?

A continuation of #46400

Verified by running this script on this branch:

model                 lm_head   lm==embed   weights tied  config tie  match
---------------------------------------------------------------------------
qwen2_audio           yes       False       False         False       OK
voxtral               yes       False       False         False       OK
voxtral_realtime      no        n/a         True          True        OK
glmasr                yes       True        False         True        MISMATCH <<<
granite_speech        no        n/a         True          True        OK
granite_speech_plus   no        n/a         True          True        OK
audioflamingo3        yes       False       False         False       OK
musicflamingo         yes       False       False         False       OK
vibevoice_asr         yes       False       False         False       OK
  • lm_head — yes/no: whether the checkpoint has a separate lm_head.* tensor in its safetensors (has_lm_head).
  • lm==embed — True/False/n/a: when an lm_head exists, whether lm_head.weight is bitwise identical to embed_tokens.weight. n/a when there's no separate lm_head.
  • weights tied — True/False: what the actual checkpoint implies (not has_lm_head → no separate head means weights are tied).
  • config tie — True/False: what the config class resolves tie_word_embeddings to (fallback False).
  • match — OK if config tie == weights tied, else MISMATCH <<< (the regression check).
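
The column definitions above can be sketched in a few lines. This is a minimal illustration, not the script linked in the PR: `TieReport` and `tie_report` are hypothetical names, tensors are stood in for by raw `bytes` so the example runs without any dependencies, and matching keys by the suffixes `lm_head.weight` and `embed_tokens.weight` is an assumption about how checkpoint keys are named.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class TieReport:
    has_lm_head: bool                 # "lm_head" column
    lm_equals_embed: Optional[bool]   # "lm==embed" column; None stands for n/a
    weights_tied: bool                # "weights tied" column
    config_tie: bool                  # "config tie" column

    @property
    def match(self) -> str:
        # "match" column: the regression check compares config against checkpoint.
        return "OK" if self.config_tie == self.weights_tied else "MISMATCH <<<"


def tie_report(tensors: Mapping[str, bytes], config_tie: Optional[bool]) -> TieReport:
    # Assumed key suffixes; real checkpoints may use different prefixes.
    lm_head = next((v for k, v in tensors.items() if k.endswith("lm_head.weight")), None)
    embed = next((v for k, v in tensors.items() if k.endswith("embed_tokens.weight")), None)
    has_lm_head = lm_head is not None
    return TieReport(
        has_lm_head=has_lm_head,
        # Bitwise comparison of the raw buffers, only meaningful when a head exists.
        lm_equals_embed=(lm_head == embed) if has_lm_head else None,
        # No separate head in the checkpoint implies the weights are tied.
        weights_tied=not has_lm_head,
        # Fallback to False when the config does not set the flag.
        config_tie=bool(config_tie),
    )


# A glmasr-like case: a separate head exists but duplicates the embeddings.
glmasr_like = {"model.embed_tokens.weight": b"\x01\x02", "lm_head.weight": b"\x01\x02"}
print(tie_report(glmasr_like, config_tie=True).match)  # prints "MISMATCH <<<"
```

With these definitions, a checkpoint whose head is a bitwise copy of the embeddings is still reported as a mismatch when the config ties the weights, which is the situation flagged for glmasr in the table.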

For glmasr, the hub weights do have an lm_head, but it is bitwise identical to embed_tokens, so we keep weight tying.

@HuggingFaceDocBuilderDev


The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

hmellor added the "for patch" label (Tag issues / labels that should be included in the next patch) Jun 5, 2026
hmellor (Member) commented Jun 5, 2026

For the models with no LM head, why set tie_word_embeddings=True?

Comment thread src/transformers/models/glmasr/modular_glmasr.py
eustlb (Contributor, Author) commented Jun 6, 2026

"no" in the lm_head column of the table above means there is no lm_head in the hub weights, so tie_word_embeddings must be set to True.
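
The reasoning about checkpoints without an lm_head can be illustrated with a toy loader. This is a sketch under stated assumptions, not the loading code in transformers: `missing_after_load` is a hypothetical helper, and the two parameter names are simplified stand-ins for a model's embedding and output head.

```python
from typing import Iterable, Set


def missing_after_load(checkpoint_keys: Iterable[str], tie_word_embeddings: bool) -> Set[str]:
    """Toy model of loading: which expected parameters receive no value."""
    # Simplified, hypothetical parameter names for a model with an output head.
    expected = {"embed_tokens.weight", "lm_head.weight"}
    available = set(checkpoint_keys)
    if tie_word_embeddings and "embed_tokens.weight" in available:
        # Tying reuses the embedding matrix as the head, so no separate tensor is needed.
        available.add("lm_head.weight")
    return expected - available


# A checkpoint that ships only the embeddings:
print(missing_after_load(["embed_tokens.weight"], tie_word_embeddings=False))  # prints {'lm_head.weight'}
print(missing_after_load(["embed_tokens.weight"], tie_word_embeddings=True))   # prints set()
```

In this toy model, an untied config paired with a head-less checkpoint leaves the head without weights, which is why the flag has to be True for those models.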

eustlb (Contributor, Author) commented Jun 6, 2026

run-slow: audioflamingo3, glmasr, musicflamingo, qwen2_audio, vibevoice_asr, voxtral, voxtral_realtime

github-actions (Bot, Contributor) commented Jun 6, 2026

Workflow Run ⚙️💔 This comment contains run-slow, but unknown error occurred and the workflow run aborted!

eustlb (Contributor, Author) commented Jun 8, 2026

run-slow: audioflamingo3, glmasr, musicflamingo, qwen2_audio, vibevoice_asr, voxtral, voxtral_realtime

github-actions (Bot, Contributor) commented Jun 8, 2026

CI Dashboard: View test results in Grafana

github-actions (Bot, Contributor) commented Jun 8, 2026

Workflow Run ⚙️💔 This comment contains run-slow, but unknown error occurred and the workflow run aborted!

github-actions (Bot, Contributor) commented Jun 8, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: audioflamingo3, glmasr, musicflamingo, qwen2_audio, vibevoice_asr, voxtral, voxtral_realtime

hmellor added this pull request to the merge queue Jun 8, 2026
Merged via the queue into main with commit b6bad75 Jun 8, 2026
23 checks passed
hmellor deleted the fix-tie-words-embeddings-regression branch June 8, 2026 17:56
khushali9 pushed a commit to khushali9/transformers that referenced this pull request Jun 8, 2026
louzongzhi pushed a commit to louzongzhi/transformers that referenced this pull request Jun 10, 2026
louzongzhi pushed a commit to louzongzhi/transformers that referenced this pull request Jun 10, 2026
ArthurZucker pushed a commit that referenced this pull request Jun 11, 2026
* fix

* fix

* unnecessary and misleading
Labels

for patch Tag issues / labels that should be included in the next patch
