Skip to content

[NPU]: support use-rollout-routing-replay#292

Open
Windfeng8 wants to merge 2 commits into
vllm-project:ascendfrom
Windfeng8:fix/routing-replay-v2
Open

[NPU]: support use-rollout-routing-replay#292
Windfeng8 wants to merge 2 commits into
vllm-project:ascendfrom
Windfeng8:fix/routing-replay-v2

Conversation

@Windfeng8

Copy link
Copy Markdown

No description provided.

@read-the-docs-community

read-the-docs-community Bot commented Jun 23, 2026

Copy link
Copy Markdown

Documentation build overview

📚 vime | 🛠️ Build #33270785 | 📁 Comparing 5c4969a against latest (e62d44f)

  🔍 Preview build  

8 files changed · + 1 added · ± 6 modified · - 1 deleted

+ Added

± Modified

- Deleted

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new training script run-qwen3-30B-A3B-npu-use-routing-replay.sh for running Qwen3-30B-A3B training on NPUs. Feedback on the script highlights two main issues: first, there are duplicate arguments (--use-dynamic-batch-size and --max-tokens-per-gpu) with conflicting values (8192 vs 20480) that need to be cleaned up; second, a hardcoded user directory path is used for the prompt dataset, which should be replaced with a configurable environment variable to improve portability.

Comment on lines +80 to +81
--use-dynamic-batch-size \
--max-tokens-per-gpu 8192 \

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The arguments --use-dynamic-batch-size and --max-tokens-per-gpu are duplicated in this script (see lines 90-91). Additionally, --max-tokens-per-gpu is specified with conflicting values (8192 vs 20480). Please remove the duplicate entries and keep only the correct configuration (likely 20480 to match the max model length of 20k).

\
--hf-checkpoint /home/data/Qwen3-30B-A3B/ \
\
--prompt-data /home/w00893744/dataset/dapo-math-17k/dapo-math-17k.jsonl \

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The path /home/w00893744/dataset/dapo-math-17k/dapo-math-17k.jsonl contains a hardcoded user directory (w00893744), which prevents the script from being portable across different environments or users. Consider using a configurable environment variable with a fallback default.

Suggested change
--prompt-data /home/w00893744/dataset/dapo-math-17k/dapo-math-17k.jsonl \
--prompt-data "${PROMPT_DATA:-/home/data/dapo-math-17k/dapo-math-17k.jsonl}" \

@CalvinXKY

Copy link
Copy Markdown
Collaborator

I'd suggest putting the sh script in #270, and also using the ray submit submission format.

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants