feat: Build NCCL from source for more up-to-date versions #35

Draft
Eta0 wants to merge 2 commits into master from eta/build-nccl

@Eta0 Eta0 commented Apr 8, 2024

Collaborator

Build NCCL from source

This change switches the installation of libnccl2 and libnccl-dev from apt-get to a source build.
The prebuilt NCCL packages available through apt-get are not updated consistently across all OS version × CUDA version pairs, so building from source ensures all of our images can pick up new NCCL releases as they come out.

This change additionally updates the NCCL version to v2.21.5 on all of our current build targets.

By default, this build targets the following (configurable) list of compute architectures: 7.0, 7.5, 8.0, 8.6, 8.9, 9.0+PTX (matching our ml-containers PyTorch builds). This list is slightly shorter than the default architecture support list, so the resulting binaries are also a bit smaller than the prebuilt distribution.
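
As a rough illustration of how an architecture list like the one above could be turned into compiler flags, the sketch below expands it into `-gencode` options of the kind NCCL's Makefile accepts through its `NVCC_GENCODE` variable. The helper name `gencode_flags` and the space-separated input format are assumptions made for this example; the PR description does not show how the list is actually passed to the build.

```python
def gencode_flags(arch_list: str) -> str:
    """Expand a list such as "7.0 7.5 9.0+PTX" into nvcc -gencode flags.

    Hypothetical helper for illustration only; the input format mirrors the
    architecture list quoted in the description, not a confirmed interface.
    """
    flags = []
    for entry in arch_list.replace(",", " ").split():
        want_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if want_ptx else entry
        major, minor = version.split(".")
        cc = f"{major}{minor}"
        # Native machine code (SASS) for this compute capability.
        flags.append(f"-gencode=arch=compute_{cc},code=sm_{cc}")
        if want_ptx:
            # Also embed PTX so newer GPUs can JIT-compile the kernels.
            flags.append(f"-gencode=arch=compute_{cc},code=compute_{cc}")
    return " ".join(flags)


if __name__ == "__main__":
    print(gencode_flags("7.0 7.5 8.0 8.6 8.9 9.0+PTX"))
```

The resulting string could then be supplied to a source build along the lines of `make src.build NVCC_GENCODE="..."`, which is the override documented in NCCL's README; whether this PR does exactly that is not stated here.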

Eta0 added the enhancement (New feature or request) label Apr 8, 2024
Eta0 requested a review from salanki April 8, 2024 18:52
Eta0 self-assigned this Apr 8, 2024

2 participants