[Do not merge]Enable building with cuda for el9_amd64_gcc15 #2785

Open
smuzaffar wants to merge 3 commits into master from smuzaffar-patch-5

@smuzaffar

Contributor

No description provided.

@cmsbuild

cmsbuild commented Jun 9, 2026

Contributor

A new Pull Request was created by @smuzaffar for branch master.

@akritkbehera, @cmsbuild, @iarspider, @raoatifshad, @smuzaffar can you please review it and eventually sign? Thanks.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@cmsbuild

cmsbuild commented Jun 9, 2026

Contributor

cms-bot internal usage

@smuzaffar

Contributor Author

please test with cms-sw/cmsdist#10613 for CMSSW_20_1_X/el9_amd64_gcc15

Let's see if the devel branch PR works.

@cmsbuild

cmsbuild commented Jun 9, 2026

Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/53784/summary.html
COMMIT: 900ea85
CMSSW: CMSSW_20_1_X_2026-06-09-1100/el9_amd64_gcc15
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2785/53784/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed External Build

I found compilation error when building:

cwd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el9_amd64_gcc15/external/py3-torch-cuda/2.11.0-11fd4a3df8667dc745e2a828865a9885/external_py3-torch-cuda_2.11.0-11fd4a3df8667dc745e2a828865a9885-1-build/cmsdist-pip-src/py3-torch-cuda-2.11.0
Building wheel for torch (pyproject.toml): finished with status 'error'
ERROR: Failed building wheel for torch
Failed to build torch
ERROR: Failed to build one or more wheels
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.ZlYxMk (%build)

RPM build warnings:
Macro expanded in comment on line 608: %{pkginstroot}/bin/*

Macro expanded in comment on line 613: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}


@cmsbuild

Contributor

Pull request #2785 was updated.

@smuzaffar

Contributor Author

please test for el9_amd64_gcc15

@smuzaffar

Contributor Author

please test with cms-sw/cmsdist#10637

@cmsbuild

Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/53971/summary.html
COMMIT: c5535e3
CMSSW: CMSSW_20_1_X_2026-06-13-1100/el9_amd64_gcc15
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2785/53971/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed External Build

I found compilation error when building:

./usr/lib/.build-id/e8
./usr/lib/.build-id/e8/a15e345c889841a4bfd473fb9fadbdbe79aab3
1209 blocks
+ cp /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/SOURCES/external/rocm/7.2.4-57ab736cd3fc6438ce9e1761b1f8cf53/c++config.h opt/rocm-7.2.4/llvm/lib/clang/20/include/cuda_wrappers/bits
cp: cannot create regular file 'opt/rocm-7.2.4/llvm/lib/clang/20/include/cuda_wrappers/bits': No such file or directory
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.OAUQIr (%build)

RPM build warnings:
Macro expanded in comment on line 492: %{pkginstroot}




@smuzaffar

Contributor Author

please test with cms-sw/cmsdist#10637, cms-sw/cmsdist#10639

@cmsbuild

Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/53977/summary.html
COMMIT: c5535e3
CMSSW: CMSSW_20_1_X_2026-06-13-1100/el9_amd64_gcc15
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2785/53977/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed External Build

I found compilation error when building:

cwd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el9_amd64_gcc15/external/py3-torch-cuda/2.11.0-c2e5ad28ee619d26da6b8fc54038615c/external_py3-torch-cuda_2.11.0-c2e5ad28ee619d26da6b8fc54038615c-1-build/cmsdist-pip-src/py3-torch-cuda-2.11.0
Building wheel for torch (pyproject.toml): finished with status 'error'
ERROR: Failed building wheel for torch
Failed to build torch
ERROR: Failed to build one or more wheels
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.FwTogz (%build)

RPM build warnings:
Macro expanded in comment on line 609: %{pkginstroot}/bin/*

Macro expanded in comment on line 614: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}


@smuzaffar

Contributor Author

@fwyzard, CUDA 13.3.0 fails with Error: Internal Compiler Error (codegen): "unsupported float variant!" for GCC 15/C++23 (see log: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/53977/externals/py3-torch-cuda/2.11.0-c2e5ad28ee619d26da6b8fc54038615c/log)

@fwyzard

fwyzard commented Jun 16, 2026

Contributor

Do we have a way to test with gcc15/c++20 or gcc14/c++23 ?

@smuzaffar changed the title from "Enable building with cuda for el9_amd64_gcc15" to "[Do not merge]Enable building with cuda for el9_amd64_gcc15" on Jun 17, 2026

@cmsbuild

Contributor

Pull request #2785 was updated.

@smuzaffar

Contributor Author

please test with cms-sw/cmsdist#10637 for el9_amd64_gcc15

@smuzaffar

Contributor Author

> Do we have a way to test with gcc15/c++20 or gcc14/c++23 ?

Sure, we can test these combinations. I have just started tests for gcc15/c++20.

@smuzaffar

Contributor Author

please test with cms-sw/cmsdist#10639 for CMSSW_20_1_CPP23_X

@cmsbuild

Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/54020/summary.html
COMMIT: cd46f4a
CMSSW: CMSSW_20_1_CPP23_X_2026-06-16-1100/el9_amd64_gcc14
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cms-bot/2785/54020/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed External Build

I found compilation error when building:

cwd: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el9_amd64_gcc14/external/py3-torch-cuda/2.11.0-1abefcb39578da1e3e593f091c6c2141/cmsdist-pip-src/py3-torch-cuda-2.11.0
Building wheel for torch (pyproject.toml): finished with status 'error'
ERROR: Failed building wheel for torch
Failed to build torch
ERROR: Failed to build one or more wheels
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.kLmSiT (%build)

RPM build warnings:
Macro expanded in comment on line 609: %{pkginstroot}/bin/*

Macro expanded in comment on line 614: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}


@smuzaffar

smuzaffar commented Jun 17, 2026

Contributor Author

@fwyzard, gcc14/c++23 also failed with Error: Internal Compiler Error (codegen): "unsupported float variant!", though gcc15/c++20 seems to work (at least the torch-cuda packages built).

@fwyzard

fwyzard commented Jun 17, 2026

Contributor

OK, so it's a c++23 thing.
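
The deduction can be spelled out with a small sketch. It records only the three compiler/standard combinations reported in this thread and checks which axis on its own accounts for the outcome. The `results` table and the helper `explains_outcome` are illustrative names, not part of cms-bot, and "success" here only means the torch-cuda packages built.

```python
# Build outcomes for py3-torch-cuda with CUDA 13.3.0, as reported in this thread.
# True means the external packages built; False means the nvcc internal
# compiler error ("unsupported float variant!") was hit.
results = {
    ("gcc15", "c++23"): False,
    ("gcc14", "c++23"): False,
    ("gcc15", "c++20"): True,
}


def explains_outcome(results, axis):
    """Return True if the value on one axis (0 = compiler, 1 = C++ standard)
    maps to a single outcome in every tested combination."""
    outcomes_by_value = {}
    for key, ok in results.items():
        outcomes_by_value.setdefault(key[axis], set()).add(ok)
    return all(len(outcomes) == 1 for outcomes in outcomes_by_value.values())


print("compiler alone explains the outcome:", explains_outcome(results, 0))
print("C++ standard alone explains the outcome:", explains_outcome(results, 1))
```

With only three data points this is consistent with the C++ standard being the trigger, but it does not rule out other factors that were not varied.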

@cmsbuild

Contributor

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/54017/summary.html
COMMIT: cd46f4a
CMSSW: CMSSW_20_1_X_2026-06-16-1100/el9_amd64_gcc15
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cms-bot/2785/54017/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/54017/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-48f289/54017/git-merge-result

Failed Build

I found compilation error when building:

/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/gcc/15.3.1-d0029c3359733fc60a20a68217f2ee30/bin/c++ -c -DCMS_MICRO_ARCH='x86-64-v3' -DGNU_GCC -D_GNU_SOURCE -DTBB_USE_GLIBCXX_VERSION=150301 -DTBB_SUPPRESS_DEPRECATED_MESSAGES -DTBB_PREVIEW_RESUMABLE_TASKS=1 -DTBB_PREVIEW_TASK_GROUP_EXTENSIONS=1 -DBOOST_SPIRIT_THREADSAFE -DPHOENIX_THREADSAFE -DBOOST_MATH_DISABLE_STD_FPCLASSIFY -DBOOST_UUID_RANDOM_PROVIDER_FORCE_POSIX -DBOOST_MPL_IGNORE_PARENTHESES_WARNING -DCMSSW_GIT_HASH='CMSSW_20_1_X_2026-06-16-1100' -DPROJECT_NAME='CMSSW' -DPROJECT_VERSION='CMSSW_20_1_X_2026-06-16-1100' -DCMSSW_CUDA_IS_AVAILABLE -Isrc -Ipoison -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02946/el9_amd64_gcc15/cms/cmssw/CMSSW_20_1_X_2026-06-16-1100/src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/onnxruntime/1.26.0-3582942831743366fda413a5764ad8a5/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/cudnn/9.23.0.39-5a144f08cf1280717420730aab3c4e36/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/pcre/8.43-2e7cd569f80863eb9dfcf87970bfdccd/include -isystem/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/boost/1.80.0-4f6e739d81a79ab1242ff9495ad34534/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/bz2lib/1.0.8-dacdc81794eb1c23244654b096d5e8f8/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/cppunit/1.15.x-081381717bd0b64e56504613a9ce28a8/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/libuuid/2.40-c2a3b3080a707cf8c78728fbf2ac665e/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/protobuf/3.21.9-aa4a6ab2525698657cc7aac4722ee586/include 
-isystem/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/lcg/root/6.36.13-64a378d191152fbe62c144f63b1b0660/include -isystem/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/tbb/v2022.3.0-6942b993edee2e6f4dd32e69e1eda274/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/xz/5.6.4-b14670493efeb6e4b6272b668fa44b59/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/zlib/1.3.2-992f3b833708f30789f7ac5c998d015e/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/cuda/13.3.0-f776659a3999974309daf53af5ff4914/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/md5/1.0.0-0be1dd40a21181c3fdb3ab5d2442cd2a/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/re2/2021-06-01-aede4cc69b27912315cf7e45d4afa5f4/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/tinyxml2/6.2.0-71c40642358afd1fc6abc610ea54c6f7/include -O3 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++20 -ftree-vectorize -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -Wno-error=array-bounds -Warray-bounds -fuse-ld=bfd -march=x86-64-v3 -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-unused-parameter -Wunused -Wparentheses -Werror=return-type -Werror=unused-value -Werror=unused-result -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor 
-Werror=strict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -flto=auto -fipa-icf -flto-odr-type-merging -fno-fat-lto-objects -Wodr -fPIC -MMD -MF tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/testRunner.cpp.d src/PhysicsTools/ONNXRuntime/test/testRunner.cpp -o tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/testRunner.cpp.o
>> Building binary testONNXRuntime
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/gcc/15.3.1-d0029c3359733fc60a20a68217f2ee30/bin/c++ -O3 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++20 -ftree-vectorize -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -Wno-error=array-bounds -Warray-bounds -fuse-ld=bfd -march=x86-64-v3 -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-unused-parameter -Wunused -Wparentheses -Werror=return-type -Werror=unused-value -Werror=unused-result -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -flto=auto -fipa-icf -flto-odr-type-merging -fno-fat-lto-objects -Wodr -fPIC  tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/testONNXRuntime.cc.o tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/testRunner.cpp.o -Wl,-E -Wl,--hash-style=gnu -Wl,--as-needed -Wl,-z,noexecstack -L/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_20_1_X_2026-06-16-1100/biglib/el9_amd64_gcc15 -L/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_20_1_X_2026-06-16-1100/lib/el9_amd64_gcc15 -L/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_20_1_X_2026-06-16-1100/external/el9_amd64_gcc15/lib 
-L/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02946/el9_amd64_gcc15/cms/cmssw/CMSSW_20_1_X_2026-06-16-1100/biglib/el9_amd64_gcc15 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02946/el9_amd64_gcc15/cms/cmssw/CMSSW_20_1_X_2026-06-16-1100/lib/el9_amd64_gcc15 -L/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/cuda/13.3.0-f776659a3999974309daf53af5ff4914/lib64/stubs -L/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/cuda/13.3.0-f776659a3999974309daf53af5ff4914/lib64 -L/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_20_1_X_2026-06-16-1100/static/el9_amd64_gcc15 -L/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02946/el9_amd64_gcc15/cms/cmssw/CMSSW_20_1_X_2026-06-16-1100/static/el9_amd64_gcc15 -lFWCoreParameterSet -lFWCoreMessageLogger -lDataFormatsProvenance -lFWCorePluginManager -lFWCoreReflection -lPhysicsToolsONNXRuntime -lFWCoreUtilities -lonnxruntime -lcudnn -lCore -lboost_thread -lboost_date_time -lcudart -lcudadevrt -lpcre -lbz2 -lcppunit -lcuda -luuid -lprotobuf -ltbb -llzma -lz -lcms-md5 -lre2 -lcrypt -ldl -lrt -ltinyxml2 -o tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/testONNXRuntime
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el9_amd64_gcc15/external/gcc/15.3.1-d0029c3359733fc60a20a68217f2ee30/x86_64-vendor-linux/bin/ld.bfd: tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/ccDopaqu.ltrans0.ltrans.o: in function `testONNXRuntime::checkGPU()':
:(.text+0x19fd): undefined reference to `cms::cudatest::testDevices()'
collect2: error: ld returned 1 exit status
>> Deleted: tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/testONNXRuntime
gmake: *** [tmp/el9_amd64_gcc15/src/PhysicsTools/ONNXRuntime/test/testONNXRuntime/testONNXRuntime] Error 1
>> Leaving Package PhysicsTools/ONNXRuntime
>> Package PhysicsTools/ONNXRuntime built
>> Entering Package CondFormats/DQMObjects


@smuzaffar

Contributor Author

please test with cms-sw/cmsdist#10637 using full cmssw for el9_amd64_gcc15

@fwyzard

fwyzard commented Jun 17, 2026

Contributor

please test for el9_amd64_gcc14

The CMSSW archive is no longer available.

@smuzaffar

Contributor Author

> The CMSSW archive is no longer available.

When the externals do not build, no CMSSW archive file is produced. gcc14 fails during the external build phase, which is why the archive is missing.

By the way, by default (i.e. a plain "please test"), cms-bot PRs do not build externals. You need to add a cmsdist PR so that the bot builds the externals.
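
As a rough illustration of that rule, the sketch below parses only the "please test" forms that appear in this thread and flags whether a request would trigger an externals build. The regular expression and the function `parse_test_request` are hypothetical and are not cms-bot's actual parser; the real command grammar is the one documented by cms-bot.

```python
import re

# Hypothetical pattern covering only the comment forms seen in this thread:
#   please test [with <PR>[, <PR>...]] [using full cmssw] [for <target>]
TEST_RE = re.compile(
    r"^please test"
    r"(?: with (?P<prs>[\w./#-]+(?:\s*,\s*[\w./#-]+)*))?"
    r"(?P<full> using full cmssw)?"
    r"(?: for (?P<target>\S+))?$"
)


def parse_test_request(comment):
    """Split a test request into its parts; return None if it does not match."""
    match = TEST_RE.match(comment.strip())
    if match is None:
        return None
    prs = [p.strip() for p in match.group("prs").split(",")] if match.group("prs") else []
    return {
        "prs": prs,
        "full_cmssw": match.group("full") is not None,
        "target": match.group("target"),
        # Assumption taken from the comment above: externals are rebuilt
        # only when a cmsdist PR is part of the request.
        "builds_externals": any(p.startswith("cms-sw/cmsdist#") for p in prs),
    }


print(parse_test_request("please test for el9_amd64_gcc14"))
print(parse_test_request("please test with cms-sw/cmsdist#10637 for el9_amd64_gcc15"))
```

Under this reading, "please test for el9_amd64_gcc14" reuses existing externals, so it cannot succeed while the gcc14 externals themselves fail to build.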

@fwyzard

fwyzard commented Jun 18, 2026

Contributor

Ah, ok.

I'll try to reproduce it locally, then.
