Fix ROCm backend release target mapping#2423
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6b8c2a4ad2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for isa in ("gfx1030", "gfx1031", "gfx1032", "gfx1033", | ||
| "gfx1034", "gfx1035", "gfx1036"): |
There was a problem hiding this comment.
Blacken the new ROCm release-target test
AGENTS.md says Python must be formatted with Black and that this is enforced in CI; in the checked commit, python3 -m black --check test/test_rocm_release_target.py reports this new file would be reformatted, starting with this tuple and later literals. When the formatting check runs, the PR will fail even though the test logic itself passes.
Useful? React with 👍 / 👎.
| // vllm-rocm user pins may be base versions or full per-target | ||
| // release tags; normalize either form to the base version so | ||
| // status matches the install path's current-target suffix. | ||
| expected_version = normalize_expected_version(user_pin); |
There was a problem hiding this comment.
Require the current vLLM ROCm target in status
When vllm.rocm_bin is set to a full per-target tag and a same-base install for a different target is already in the cache (for example, a cache copied from a gfx1151 machine to gfx120X), this strips the expected version down to the base. The versions_match helper below then accepts any {expected}-... suffix, so /system-info reports the wrong-target vLLM bundle as installed instead of prompting a reinstall, even though the install path would construct the current machine's target suffix.
Useful? React with 👍 / 👎.
|
This diff is more comments than code. Please see b46f3b6 and adjust the PR. |
|
Verified on gfx1201 (R9700), built from source.
One confirmation re the version bump: the older Heads-up, separate from this PR (it's the release artifact in
With (1) symlinked, vLLM gets all the way through V1 engine init ( |
Summary
gfx1201resolve to published family targets likegfx120Xwhile APU targets likegfx1151stay specific.Closes #2415.
Tests
python3 test/test_rocm_release_target.pypython3 test/test_device_family_matching.pypython3 test/test_cuda_arch_mapping.pyninja -C build lemond lemonadegfx120X, old brokengfx1201, andgfx1151regression cases