[debug-tools] Enable multi-arch CI #5977
Draft
lumachad wants to merge 8 commits into
The REPO_CONFIGS dictionary uses lowercase keys (rocm-libraries, rocm-systems). When the repo name is extracted from the full repository string (e.g. "ROCm/ROCgdb"), the split on "/" yields the original casing, causing lookup failures for mixed-case repo names.
Lowercase the extracted name so the lookup is case-insensitive.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
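A minimal sketch of the lookup fix described above. The helper name `lookup_repo_config` is hypothetical, and the dictionary holds a single illustrative entry rather than the real contents of REPO_CONFIGS; only the lowercase-key convention and the split on "/" come from the commit message.

```python
# Illustrative stand-in for REPO_CONFIGS; the real dictionary has more entries
# and possibly more fields per entry.
REPO_CONFIGS = {
    "rocgdb": {
        "cmake_source_var": "THEROCK_ROCGDB_SOURCE_DIR",
        "submodule_path": "debug-tools/rocgdb/source",
    },
}


def lookup_repo_config(full_repo_name: str) -> dict:
    """Hypothetical helper: resolve 'Owner/Name' to its REPO_CONFIGS entry."""
    # Take the part after the last "/" and lowercase it, so that
    # "ROCm/ROCgdb" matches the lowercase key "rocgdb".
    repo_name = full_repo_name.split("/")[-1].lower()
    return REPO_CONFIGS[repo_name]
```

Without the `.lower()` call, `"ROCm/ROCgdb"` would yield the key `"ROCgdb"` and raise `KeyError` against a dictionary keyed in lowercase.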
Add rocgdb to REPO_CONFIGS in detect_external_repo_config.py so it can be used as an external repository in TheRock's multi-arch CI pipeline.
- cmake_source_var: THEROCK_ROCGDB_SOURCE_DIR
- submodule_path: debug-tools/rocgdb/source
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Allow external repositories to pass additional CMake flags into TheRock builds via the external_repo JSON input, without requiring TheRock to hardcode repo-specific flags.
detect_external_repo_config.py reads extra_cmake_options from the incoming --external-repo-json argument (alongside the existing projects field) and forwards it through config_json to the build workflows.
The three artifact build workflows consume the new field by appending it to the cmake invocation immediately after the _SOURCE_DIR flag:
- multi_arch_build_portable_linux_artifacts.yml
- multi_arch_build_windows_artifacts.yml
- multi_arch_build_wsl_rocdxg_artifacts.yml
External repos that need no extra flags omit the field (or pass an empty string), and the appended options expand to nothing on the cmake line, preserving full backwards compatibility.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
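The forwarding step could look roughly like the sketch below. The function name `forward_extra_cmake_options` is hypothetical, and the real config_json carries more fields than shown; only the field name extra_cmake_options and the empty-string default come from the commit message.

```python
import json


def forward_extra_cmake_options(external_repo_json: str) -> str:
    """Hypothetical helper: build config_json from the caller's JSON string."""
    external = json.loads(external_repo_json) if external_repo_json else {}
    config = {
        # A missing field falls back to an empty string, so existing callers
        # see no change in the cmake command line.
        "extra_cmake_options": external.get("extra_cmake_options", ""),
    }
    return json.dumps(config)
```

Because the default is an empty string rather than `None`, a workflow can interpolate the value into the cmake command without a separate presence check.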
Cover the three fixes added to detect_external_repo_config.py:
- test_rocgdb_config: verifies the rocgdb REPO_CONFIGS entry has the correct cmake_source_var, submodule_path, and skip_submodules.
- test_external_repo_json_mixed_case_name: verifies that a full repo name like "ROCm/ROCgdb" is lowercased before the REPO_CONFIGS lookup, producing checkout_path "external-rocgdb" and resolving the correct cmake_source_var.
- test_extra_cmake_options_forwarded: verifies that extra_cmake_options supplied in the external_repo JSON is forwarded verbatim into config_json.
- test_extra_cmake_options_empty_by_default: verifies that omitting extra_cmake_options results in an empty string in config_json, preserving backwards compatibility.
- test_extra_cmake_options_multiple_flags: verifies that multiple space-separated flags are forwarded intact through the JSON parse/serialize round-trip.
- test_extra_cmake_options_embedded_quotes: verifies that embedded double quotes survive the JSON round-trip correctly; json.loads unescapes \" to ", then json.dumps re-escapes it back.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
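The embedded-quotes round-trip can be reproduced in isolation with the standard json module. This is an illustration of the behaviour the last test relies on, not the test itself; the flag name `-DEXAMPLE_FLAG` is made up for the example.

```python
import json

# Raw JSON text as a caller might send it; \" is an escaped quote inside JSON.
raw = r'{"extra_cmake_options": "-DEXAMPLE_FLAG=\"a b\""}'

parsed = json.loads(raw)
# After parsing, the Python string holds plain double quotes.
value = parsed["extra_cmake_options"]

# Serializing again re-escapes the quotes, reproducing the original text.
round_tripped = json.dumps(parsed)
```

Here `value` is `-DEXAMPLE_FLAG="a b"` and `round_tripped` is identical to `raw`, since `json.dumps` uses the same `": "` separator by default.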
ci:skip as this is still under validation against a ROCgdb counterpart: ROCm/ROCgdb#179
28b7ee4 to 65fae39
The test matrix in fetch_test_configurations.py uses rocgdb-cpu and rocgdb-gpu as job keys (to distinguish CPU-only vs GPU tests), but the TEST_SUBPROJECTS blocks for rocgdb and amd-dbgapi were not aligned with those keys. This caused determine_rocm_test_dependencies.py to return names that never matched the actual test job keys, so no rocgdb tests would run when filtering by project.
- rocgdb: add rocgdb-cpu and rocgdb-gpu to TEST_SUBPROJECTS
- amd-dbgapi: replace rocgdb with rocgdb-cpu and rocgdb-gpu (expansion is non-recursive, so rocgdb alone would not resolve further)
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
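The effect of non-recursive expansion can be shown with a small model. The table shapes and the helper `matching_jobs` are assumptions made for illustration; the real TEST_SUBPROJECTS structure and the logic in determine_rocm_test_dependencies.py may differ, and the "before" table models only the rocgdb entry mentioned in the commit message.

```python
# Job keys used by the test matrix, per the commit message.
TEST_JOB_KEYS = {"rocgdb-cpu", "rocgdb-gpu"}

# Hypothetical "before" and "after" tables for the amd-dbgapi project.
BEFORE = {"amd-dbgapi": ["rocgdb"]}
AFTER = {
    "rocgdb": ["rocgdb-cpu", "rocgdb-gpu"],
    "amd-dbgapi": ["rocgdb-cpu", "rocgdb-gpu"],
}


def matching_jobs(project: str, table: dict) -> list:
    """Expand a project one level only, then keep names that are real job keys."""
    expanded = table.get(project, [project])  # single lookup, no recursion
    return sorted(name for name in set(expanded) if name in TEST_JOB_KEYS)
```

With `BEFORE`, `matching_jobs("amd-dbgapi", BEFORE)` is empty because `rocgdb` is never expanded a second time; with `AFTER`, both job keys are returned.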
97f6cb5 to c94e568
When an external repo (e.g. ROCgdb) passes extra_cmake_options such as
-DTHEROCK_USE_EXTERNAL_ROCGDB=ON, those flags were previously injected
into every stage's configure step — even stages that have no knowledge of
the external component.
This commit scopes extra_cmake_options to the stages that actually own the
submodule, and skips stages whose artifacts are not needed at all.
stage_impact.py:
- Add StageImpactAnalyzer.required_stages_for_component(submodule_name)
that walks upstream: owning stage(s) + every stage whose artifacts
they depend on transitively. Uses topology.get_source_set_for_submodule()
and get_source_set_to_stages() directly; no new build_topology.py
methods needed.
detect_external_repo_config.py:
- Add _derive_build_stages(skip_submodules) which calls
required_stages_for_component() for each skipped submodule and
deduplicates. Populates build_stages (comma-separated) in config_json.
Empty string means no restriction (all stages run).
- Forward skip_packaging from external_repo JSON into config_json so
downstream steps can suppress packaging jobs without extra lookups.
- Add build_tools/ to sys.path so _therock_utils is importable when the
script is invoked as `python build_tools/github_actions/...` from root.
Build workflows:
- multi_arch_build_portable_linux_artifacts.yml,
multi_arch_build_windows_artifacts.yml,
multi_arch_build_wsl_rocdxg_artifacts.yml: gate extra_cmake_options on
build_stages — flags are only injected when build_stages is empty or
lists the current stage_name.
- multi_arch_build_portable_linux.yml: skip each non-compiler-runtime stage
when build_stages is non-empty and does not include that stage.
compiler-runtime is always built since every stage depends on it.
Example: ROCgdb derives ["compiler-runtime", "debug-tools"], so only those
two stages run and only debug-tools receives -DTHEROCK_USE_EXTERNAL_ROCGDB=ON.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
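The upstream walk can be sketched as a plain graph traversal. The two lookup tables below are hypothetical simplifications (the real code goes through source sets in the build topology), the `math-libs` stage is included only as an unrelated example, and the sorted output order is an assumption.

```python
# Hypothetical topology: stage -> stages whose artifacts it depends on.
STAGE_DEPS = {
    "compiler-runtime": [],
    "math-libs": ["compiler-runtime"],
    "debug-tools": ["compiler-runtime"],
}
# Hypothetical ownership map: submodule -> stage(s) that build it.
SUBMODULE_TO_STAGES = {"rocgdb": ["debug-tools"]}


def required_stages_for_component(submodule_name: str) -> list:
    """Owning stage(s) plus every stage they depend on transitively."""
    pending = list(SUBMODULE_TO_STAGES.get(submodule_name, []))
    seen = set()
    while pending:
        stage = pending.pop()
        if stage in seen:
            continue
        seen.add(stage)
        pending.extend(STAGE_DEPS.get(stage, []))
    return sorted(seen)


def derive_build_stages(skip_submodules: list) -> str:
    """Union over all submodules, deduplicated, as a comma-separated string."""
    stages = set()
    for name in skip_submodules:
        stages.update(required_stages_for_component(name))
    return ",".join(sorted(stages))
```

For `rocgdb` this yields `compiler-runtime,debug-tools`, and an empty submodule list yields an empty string, which the commit message defines as "no restriction".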
When a component like ROCgdb triggers multi-arch CI, DEB/RPM packages,
Python wheels, and PyTorch wheels are unnecessary overhead — only the
build and test stages matter.
External repos opt in by setting "skip_packaging": true in their
external_repo JSON. detect_external_repo_config.py forwards this field
into external_repo_config (added in the previous commit). This commit consumes it:
configure_multi_arch_ci.py:
- Add build_python_packages field to LinuxBuildConfig (dataclass).
- Read skip_packaging from EXTERNAL_REPO_JSON env var (the raw caller
JSON, available at configure time before the detect step runs) and
disable build_native_linux, build_python_packages, and build_pytorch
when true.
setup_multi_arch.yml:
- Pass EXTERNAL_REPO_JSON: ${{ inputs.external_repo }} to the configure
step so configure_multi_arch_ci.py can read it directly.
multi_arch_ci_linux.yml, multi_arch_ci_windows.yml:
- Guard build_python_packages job: add
fromJSON(inputs.build_config).build_python_packages == true to if:.
- Guard test_python_packages_per_family with the same condition.
Without this guard, a skipped build job still satisfied
!failure() && !cancelled(), causing the test job to run spuriously.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
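A rough sketch of the configure-time handling described above. The three flag names and the EXTERNAL_REPO_JSON variable come from the commit message; grouping all three flags on one dataclass, their defaults, and the helper name `apply_skip_packaging` are assumptions.

```python
import json
import os
from dataclasses import dataclass


@dataclass
class LinuxBuildConfig:
    # Field names follow the commit message; defaults and grouping are assumed.
    build_native_linux: bool = True
    build_python_packages: bool = True
    build_pytorch: bool = True


def apply_skip_packaging(config: LinuxBuildConfig, environ=None) -> LinuxBuildConfig:
    """Hypothetical helper: disable packaging when the caller opts in."""
    environ = os.environ if environ is None else environ
    raw = environ.get("EXTERNAL_REPO_JSON", "")
    external = json.loads(raw) if raw else {}
    if external.get("skip_packaging") is True:
        config.build_native_linux = False
        config.build_python_packages = False
        config.build_pytorch = False
    return config
```

Passing the environment as a parameter keeps the helper testable without touching the real process environment; an unset or empty variable leaves every flag at its default.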
Document the external repo integration feature for TheRock's multi-arch CI:
- external_repo JSON format and all supported fields
- Stage scoping: how build_stages is derived from BUILD_TOPOLOGY.toml via StageImpactAnalyzer.required_stages_for_component(), and how it controls which stages run and which stages receive extra_cmake_options
- Packaging suppression: skip_packaging: true opt-in
- Step-by-step guide for adding a new external repo (REPO_CONFIGS entry, CMake variable wiring, caller workflow setup)
- Reference links to all relevant source files
Cross-link from ci_overview.md Infrastructure section.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
c94e568 to 9b9a312
Enable ROCgdb multi-arch CI cross-triggers. Fix bugs and add unit tests.