
SmolLM3 HF-to-nanotron conversion #393

Open
loubnabnl wants to merge 1 commit into nouamane/smollm3-fix-conversion from loubna/smollm3-qwen-conversion

@loubnabnl loubnabnl commented Feb 11, 2026


Adapts nouamane's conversion script to support converting SmolLM3 from transformers to nanotron. The current script fails with AttributeError: 'LlamaConfig' object has no attribute 'rope_interleaved'. Changes:

  • Add convert_hf_to_nanotron_qwen.py: uses AutoModelForCausalLM and NanotronQwen2Config, warns when an HF attribute is missing and falls back to a default, and saves both JSON and YAML configs
  • Add no_rope_layer_interval -> no_rope_layer mapping in get_config_mapping()
  • Add test_smollm3_conversion.py: logits matching test for SmolLM3-3B-Base
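
For readers unfamiliar with the failure mode, here is a minimal sketch of the "warn on a missing HF attribute and fall back to a default" idea described above. It is not the code from convert_hf_to_nanotron_qwen.py: the CONFIG_MAPPING table, the map_hf_config helper, the default of False for rope_interleaved, and the sample config values are all assumptions made for illustration, and a SimpleNamespace stands in for the real HF config object.

```python
import warnings
from types import SimpleNamespace

# Hypothetical table: nanotron field -> (HF attribute name, default if HF lacks it).
# The real get_config_mapping() in the PR may be shaped differently.
CONFIG_MAPPING = {
    "hidden_size": ("hidden_size", None),
    "num_hidden_layers": ("num_hidden_layers", None),
    "rope_interleaved": ("rope_interleaved", False),  # assumed default
    "no_rope_layer": ("no_rope_layer_interval", None),
}


def map_hf_config(hf_config, mapping=CONFIG_MAPPING):
    """Copy attributes from an HF-style config, warning instead of raising
    AttributeError when an attribute is absent."""
    mapped = {}
    for nanotron_key, (hf_attr, default) in mapping.items():
        if hasattr(hf_config, hf_attr):
            mapped[nanotron_key] = getattr(hf_config, hf_attr)
        else:
            warnings.warn(
                f"HF config has no attribute {hf_attr!r}; "
                f"using default {default!r} for {nanotron_key!r}"
            )
            mapped[nanotron_key] = default
    return mapped


# Stand-in for an HF config that lacks rope_interleaved (illustrative values).
hf_config = SimpleNamespace(
    hidden_size=2048, num_hidden_layers=36, no_rope_layer_interval=4
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mapped = map_hf_config(hf_config)

print(mapped)
print(f"{len(caught)} warning(s) emitted")
```

In this sketch the missing rope_interleaved attribute produces one warning rather than a crash, and no_rope_layer_interval is carried over under the nanotron-side name no_rope_layer.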
=== Logits comparison (HuggingFaceTB/SmolLM3-3B-Base) ===
  Mean abs diff: 0.080078
  Max abs diff:  1.375000
  Elements > 0.03: 3054698/3847680 (79.39%)
  Argmax match:  93.3%

TEST RESULT: argmax match 93.3% 
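
The four numbers above can be reproduced with a few lines of code. The sketch below shows one plausible way to compute them; it assumes the 0.03 threshold is applied to the element-wise absolute difference and that argmax match is measured per token position. The compare_logits helper and the toy inputs are hypothetical and are not taken from test_smollm3_conversion.py, which presumably operates on tensors rather than nested lists.

```python
def compare_logits(ref, test, threshold=0.03):
    """Compare two logit matrices given as lists of rows (one row per token
    position, one float per vocabulary entry)."""
    # Element-wise absolute differences, flattened over all positions.
    diffs = [
        abs(a - b)
        for ref_row, test_row in zip(ref, test)
        for a, b in zip(ref_row, test_row)
    ]

    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)

    # Positions where both models predict the same top token.
    matches = sum(argmax(r) == argmax(t) for r, t in zip(ref, test))
    return {
        "mean_abs_diff": sum(diffs) / len(diffs),
        "max_abs_diff": max(diffs),
        "frac_above_threshold": sum(d > threshold for d in diffs) / len(diffs),
        "argmax_match": matches / len(ref),
    }


# Toy logits: 2 positions, vocabulary of 3 (illustrative only).
ref = [[0.0, 1.0, 2.0], [3.0, 1.0, 0.0]]
test = [[0.0, 1.5, 2.0], [1.0, 1.5, 0.0]]
stats = compare_logits(ref, test)
print(stats)
```

A high share of elements above a small threshold alongside a high argmax match, as reported here, would be consistent with small numerical drift across most of the vocabulary that rarely changes the top prediction, though the PR does not state the cause of the differences.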

- Add convert_hf_to_nanotron_qwen.py: standalone conversion script using
  AutoModelForCausalLM and NanotronQwen2Config with robust config mapping
  (warns on missing HF attrs with defaults), saves both JSON and YAML configs
- Add no_rope_layer_interval -> no_rope_layer mapping in get_config_mapping()
- Add test_smollm3_conversion.py: logits matching test for SmolLM3-3B-Base

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>