[Feature] Add VP drafter training mode for DFlash by catnanami · Pull Request #592 · sgl-project/SpecForge

catnanami · 2026-06-24T13:25:45Z

Motivation

This PR adds training support for the VP-Drafter used in D2SD (Dual Diffusion Draft Speculative Decoding). D2SD extends DFlash by using a first DFlash draft to estimate likely rejection boundaries, then training a second variable-prefix drafter to re-anchor at selected prefixes and generate alternative continuations.

The key training requirement is different from standard DFlash: the drafter must learn from variable-length visible prefixes instead of always seeing only the anchor token followed by masks. This PR implements that behavior as a DFlash training-mode branch, so the resulting model still uses the same DFlashDraftModel architecture and config format.
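
To make the difference concrete, here is a minimal sketch (not the code in this PR) of how one block's drafter input might differ between the two modes. `MASK_ID`, `build_block_input`, and the plain-list representation are hypothetical and used only for illustration; the sketch also assumes the prefix length counts the anchor token, which the description does not state.

```python
from typing import List, Tuple

MASK_ID = -1  # hypothetical placeholder id for a masked position


def build_block_input(block: List[int], prefix_len: int) -> Tuple[List[int], List[bool]]:
    """Return (input_ids, loss_mask) for one block.

    block[0] is the anchor token. prefix_len counts visible tokens including
    the anchor (an assumption), so prefix_len=1 gives the standard DFlash
    layout: the anchor followed only by masks.
    """
    if not 1 <= prefix_len <= len(block):
        raise ValueError("prefix_len must be in [1, len(block)]")
    num_masked = len(block) - prefix_len
    # Visible prefix keeps its real token ids; the suffix is replaced by masks.
    input_ids = block[:prefix_len] + [MASK_ID] * num_masked
    # Only masked suffix positions contribute to the loss.
    loss_mask = [False] * prefix_len + [True] * num_masked
    return input_ids, loss_mask


standard_ids, standard_mask = build_block_input([10, 11, 12, 13], prefix_len=1)
vp_ids, vp_mask = build_block_input([10, 11, 12, 13], prefix_len=3)
```

With `prefix_len=1` the drafter sees only the anchor, while `prefix_len=3` exposes two more real tokens and leaves a single masked position to predict.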

References:

Project: https://github.com/catnanami/D2-SD
Paper: https://arxiv.org/abs/2606.04446

Modifications

- Added a vp_drafter training mode to OnlineDFlashModel.
- The vp_drafter mode samples a variable visible prefix length per block.
- Prefix tokens are fed as real token embeddings; suffix positions remain masked and contribute to loss.
- Added exponential loss decay from the first masked suffix position, matching the VP-Drafter training recipe.
- Wired scripts/train_dflash.py to read dflash_config.training_mode / dflash_config.loss_type when --loss-type is not explicitly provided.
- Added prefix_weight_base support for variable-prefix sampling.
- Added a Qwen3-8B VP-Drafter config and example script:
  - configs/qwen3-8b-dta.json
  - examples/run_qwen3_8b_dta_online.sh
- Added formula-level unit coverage for VP-Drafter loss masking and decay behavior.
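
The sampling and loss-decay items above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the weighting rule `prefix_weight_base ** (k - 1)`, the decay form `exp(-j / gamma)`, the `gamma` parameter, and all function names are hypothetical, and prefix lengths are capped at `block_size - 1` so that at least one masked position remains.

```python
import math
import random
from typing import List, Optional


def prefix_length_weights(block_size: int, prefix_weight_base: float = 1.0) -> List[float]:
    """Unnormalized weights for prefix lengths 1..block_size-1.

    Assumed rule: length k gets weight base**(k-1); base=1.0 is uniform.
    """
    return [prefix_weight_base ** (k - 1) for k in range(1, block_size)]


def sample_prefix_length(
    block_size: int,
    prefix_weight_base: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Draw one visible prefix length for a block."""
    rng = rng if rng is not None else random.Random()
    lengths = list(range(1, block_size))
    weights = prefix_length_weights(block_size, prefix_weight_base)
    return rng.choices(lengths, weights=weights, k=1)[0]


def decayed_loss_weights(block_size: int, prefix_len: int, gamma: float) -> List[float]:
    """Per-position loss weights for one block.

    Visible prefix positions get 0. The j-th masked position (j=0 is the
    first masked one) gets exp(-j / gamma), so decay starts at the first
    masked suffix position rather than at the start of the block.
    """
    return [
        0.0 if i < prefix_len else math.exp(-(i - prefix_len) / gamma)
        for i in range(block_size)
    ]


weights = decayed_loss_weights(block_size=4, prefix_len=2, gamma=2.0)
```

Restarting the decay at the first masked position means the token immediately after the visible prefix always carries full weight, regardless of how long the prefix is.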

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://sgl-fru7574.slack.com/archives/C09784E3EN6 to discuss your PR.

gemini-code-assist

Code Review

This pull request introduces the vp_drafter training objective (variable-prefix drafter) to the DFlash framework, allowing for training with variable visible prefixes. It adds a configuration file for Qwen3-8B, an online training script, and updates the core DFlash model and training script to support prefix length sampling, VP noise embedding generation, and corresponding loss calculations. Unit tests are also added to validate the new loss implementation.

The review feedback highlights three key improvements:
- replacing torch.distributions.Categorical with torch.multinomial to prevent graph breaks under torch.compile;
- handling potential None values for prefix_weight_base to avoid a TypeError;
- using .reshape() instead of .view() on noise_ids to safely handle non-contiguous tensors.
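
The three suggestions can be illustrated with a small, self-contained PyTorch sketch. It is not the reviewed code: `resolve_prefix_weight_base`, `sample_prefix_lengths`, the default of 1.0, and the `base ** k` weighting are hypothetical stand-ins, and the toy `noise_ids` tensor only mimics a non-contiguous layout.

```python
from typing import Optional

import torch


def resolve_prefix_weight_base(value: Optional[float], default: float = 1.0) -> float:
    """Fall back to a default when the config leaves the value unset (None)."""
    return default if value is None else float(value)


def sample_prefix_lengths(
    num_blocks: int,
    max_prefix_len: int,
    prefix_weight_base: Optional[float] = None,
    generator: Optional[torch.Generator] = None,
) -> torch.Tensor:
    """Draw one prefix length in [1, max_prefix_len] per block."""
    base = resolve_prefix_weight_base(prefix_weight_base)
    exponents = torch.arange(max_prefix_len, dtype=torch.float32)
    # Unnormalized, non-negative weights; torch.multinomial normalizes them.
    weights = torch.pow(torch.tensor(base, dtype=torch.float32), exponents)
    idx = torch.multinomial(
        weights, num_samples=num_blocks, replacement=True, generator=generator
    )
    return idx + 1  # index 0 corresponds to prefix length 1


# A transposed tensor is non-contiguous: .view(-1) raises a RuntimeError here,
# while .reshape(-1) copies when needed and succeeds.
noise_ids = torch.arange(6).reshape(2, 3).t()
flat_ids = noise_ids.reshape(-1)
```

Passing `prefix_weight_base=None` yields uniform weights in this sketch, which avoids the `TypeError` that exponentiating `None` would otherwise raise.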


Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

catnanami · 2026-06-24T14:04:08Z

Hi @claude @jiapingW @wenqf11, could you please review this PR?

jiapingW · 2026-06-24T14:07:55Z

Great work! We'll check it soon.

Add VP drafter training mode for DFlash

0b948e7

catnanami requested review from FlamingoPg, FrankLeeeee, shuaills and sleepcoo as code owners June 24, 2026 13:25

catnanami changed the title ~~Add VP drafter training mode for DFlash~~ [Feature] Add VP drafter training mode for DFlash Jun 24, 2026

gemini-code-assist Bot reviewed Jun 24, 2026


Comment thread specforge/core/dflash.py

Comment thread specforge/core/dflash.py

Comment thread specforge/core/dflash.py Outdated

catnanami and others added 3 commits June 24, 2026 21:49

Update specforge/core/dflash.py

1a144ea

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update specforge/core/dflash.py

81a3554

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update specforge/core/dflash.py

8a0c9c7

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

fix formatting

e1cb99d

jiapingW self-requested a review June 24, 2026 14:29

root and others added 2 commits June 24, 2026 22:39

fix scripts with shebang

47006b5

Merge branch 'main' into feature/vp-drafter-training

784d2c3

catnanami wants to merge 7 commits into sgl-project:main from catnanami:feature/vp-drafter-training
