[Bug] [Perf] [SIT] Llama-2-7b-chat-hf model performance has degraded #527

### Checklist

- [x] I searched related issues but found no solution.
- [x] The bug persists in the latest version.
- [x] Issues without environment info and a minimal reproducible demo are hard to resolve and may receive no feedback.
- [x] If this is not a bug report but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
- [x] Please use English. Otherwise, it will be closed.

### Describe the bug

[Version Information] main
[Version problem] New version issue
[Assignee] wangyao
[Issue Time] 2026/5/12
[Valid issue] Yes
[Description]
The test case python/sglang/multimodal_gen/test/server/ascend/test_server_1_npu.py fails because the measured performance does not meet the required standard.

### Reproduction

[Test Steps]
Execute the test case python/sglang/multimodal_gen/test/server/ascend/test_server_1_npu.py.
[Expected Result]
The test case passes.
[Actual Result]
The test case fails because performance is below the required standard.
[Opt] [TestCase Path]
python/sglang/multimodal_gen/test/server/ascend/test_server_1_npu.py
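
The test step above could be scripted roughly as follows. This is only a sketch: it assumes the test file is collected by pytest (the issue does not say which runner is used) and that the command is run from the repository root. `build_repro_command` and `run_repro` are hypothetical helper names, not part of sglang.

```python
import subprocess
import sys

# Test path taken verbatim from this issue.
TEST_PATH = "python/sglang/multimodal_gen/test/server/ascend/test_server_1_npu.py"


def build_repro_command(test_path: str = TEST_PATH) -> list[str]:
    """Build a pytest invocation for the failing test (assumed runner: pytest)."""
    # -x stops at the first failure; -s disables output capture so that
    # anything the test prints stays visible in the console.
    return [sys.executable, "-m", "pytest", "-x", "-s", test_path]


def run_repro(repo_root: str = ".") -> int:
    """Run the test from repo_root (assumed to be the checkout root) and return the exit code."""
    completed = subprocess.run(build_repro_command(), cwd=repo_root)
    return completed.returncode
```

Calling `run_repro()` from the repository root returns the pytest exit code; a nonzero value means the run did not pass.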

### Environment

Ascend A3, single node, PD (prefill/decode) co-deployed
