fix: include vocab sizes in EAGLE3 vocab mapping cache key #602
Code Review
This pull request refactors the cache key generation in scripts/train_eagle3.py by separating the dataset cache key from the vocabulary mapping cache key. It introduces a new vocab_cache_key that incorporates the draft and target model vocabulary sizes, ensuring that vocabulary mapping caches are correctly invalidated when vocabulary sizes change. There are no review comments to evaluate, and the changes look correct.
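The key separation summarized above can be sketched as follows. This is a minimal illustration, not the code in scripts/train_eagle3.py: the helper names, the params string format, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def make_dataset_cache_key(cache_params: str) -> str:
    """Hash the dataset cache params string (hypothetical helper)."""
    return hashlib.sha256(cache_params.encode("utf-8")).hexdigest()


def make_vocab_cache_key(
    cache_params: str, draft_vocab_size: int, vocab_size: int
) -> str:
    """Hash the dataset params plus both vocab sizes (hypothetical helper).

    The vocab mapping depends on the vocab dimensions, so they are appended
    to the params string before hashing. The dataset key is left untouched.
    """
    vocab_params = (
        f"{cache_params}"
        f"-draft_vocab_size={draft_vocab_size}"
        f"-vocab_size={vocab_size}"
    )
    return hashlib.sha256(vocab_params.encode("utf-8")).hexdigest()


# Illustrative params string; the real format is not shown in this PR page.
params = "train.jsonl-max_length=2048-chat_template=llama3"

dataset_key = make_dataset_cache_key(params)
vocab_key_small = make_vocab_cache_key(params, 32000, 128256)
vocab_key_large = make_vocab_cache_key(params, 64000, 128256)
```

With this split, changing only draft_vocab_size produces a new vocab mapping key while the dataset key stays the same, so the processed dataset cache can still be reused.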
Motivation
draft_model_config.draft_vocab_size and draft_model_config.vocab_size affect vocab mapping generation, but the current vocab mapping cache key only follows the dataset cache key. This can incorrectly reuse a stale mapping when dataset/tokenizer inputs are unchanged but vocab dimensions differ.
Modifications
Derive a separate vocab mapping cache key by appending draft_vocab_size and vocab_size to the existing dataset cache params string.
Checklist