Releases: NVIDIA/cuda-samples
CUDA Samples v13.3
CUDA Samples v13.2 update
CUDA 13.2 samples update
- Added Python samples for CUDA Python 1.0 release
- Renamed the top-level "Samples" directory to "cpp" to accommodate Python samples.
CUDA Samples v13.2
CUDA 13.2
- Added the MSVC compile flag "-Xcompiler=/Zc:preprocessor" in CMakeLists.txt to comply with the CUDA 13.2 CCCL. Previously, using the traditional preprocessor triggered the warning “MSVC/cl.exe with traditional preprocessor is used…”; it now causes a build error.
- Minor bug fixes
CUDA Samples v13.1
Updates and bug fixes for the CUDA Toolkit v13.1 release.
CUDA Samples v13.0
CUDA 13.0
- Updated the samples that used "cudaDeviceProp" fields deprecated and removed in CUDA 13.0, replacing those fields with their equivalents in "cudaDeviceGetAttribute":
  - Deprecated "cudaDeviceProp" fields:
      int clockRate;                    // Replaced with "cudaDevAttrClockRate"
      int deviceOverlap;                // Replaced with "cudaDevAttrGpuOverlap"
      int kernelExecTimeoutEnabled;     // Replaced with "cudaDevAttrKernelExecTimeout"
      int computeMode;                  // Replaced with "cudaDevAttrComputeMode"
      int memoryClockRate;              // Replaced with "cudaDevAttrMemoryClockRate"
      int cooperativeMultiDeviceLaunch; // Deprecated; "cudaLaunchCooperativeKernelMultiDevice" is deprecated
  - 0_Introduction: UnifiedMemoryStreams, simpleHyperQ, simpleIPC, simpleMultiCopy, systemWideAtomics
  - 1_Utilities: deviceQuery
  - 2_Concepts_and_Techniques: streamOrderedAllocationIPC
  - 4_CUDA_Libraries: simpleCUBLASXT
  - 5_Domain_Specific: simpleVulkan, vulkanImageCUDA
- Updated the samples that use the CUDA driver API "cuCtxCreate" to pass the added "CUctxCreateParams" parameter, as "cuCtxCreate" maps to "cuCtxCreate_v4" by default in CUDA 13.0:
  - Common: nvrtc_helper.h
  - 0_Introduction: UnifiedMemoryStreams, matrixMulDrv, simpleTextureDrv, vectorAddDrv, vectorAddMMAP
  - 2_Concepts_and_Techniques: EGLStream_CUDA_CrossGPU, EGLStream_CUDA_Interop, threadMigration
  - 3_CUDA_Features: graphMemoryFootprint, memMapIPCDrv
  - 4_CUDA_Libraries: jitLto
  - 7_libNVVM: cuda-c-linking, device-side-launch, simple, uvmlite
  - 8_Platform_Specific/Tegra: EGLSync_CUDAEvent_Interop
- Updated the sample that uses the CUDA APIs "cudaGraphAddNode"/"cudaStreamGetCaptureInfo" to pass the added "cudaGraphEdgeData" pointer parameter, as they map to "cudaGraphAddNode_v2"/"cudaStreamGetCaptureInfo_v3" by default in CUDA 13.0:
  - 3_CUDA_Features: graphConditionalNodes
- Updated the samples that use the CUDA APIs "cudaMemAdvise"/"cudaMemPrefetchAsync" to pass "cudaMemLocation location" instead of "int device", as they map to "cudaMemAdvise_v2"/"cudaMemPrefetchAsync_v2" by default in CUDA 13.0:
  - 4_CUDA_Libraries: conjugateGradientMultiDeviceCG
  - 6_Performance: UnifiedMemoryPerf
- Replaced "thrust::identity()" with "cuda::std::identity()", as the former is deprecated in CUDA 13.0:
  - 2_Concepts_and_Techniques: segmentationTreeThrust
- Updated the header files and samples for the CUFFT error code changes.
  - Deprecated CUFFT errors: CUFFT_INCOMPLETE_PARAMETER_LIST, CUFFT_PARSE_ERROR, CUFFT_LICENSE_ERROR
  - Newly added CUFFT errors: CUFFT_MISSING_DEPENDENCY, CUFFT_NVRTC_FAILURE, CUFFT_NVJITLINK_FAILURE, CUFFT_NVSHMEM_FAILURE
  - Header files and samples related to this change:
    - Common/helper_cuda.h
    - 4_CUDA_Libraries: simpleCUFFT, simpleCUFFT_2d_MGPU, simpleCUFFT_MGPU, simpleCUFFT_callback
- Updated toolchain for cross-compilation for Tegra QNX platforms.
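Two of the runtime API changes listed above for CUDA 13.0 can be illustrated with a short host-side sketch: reading former "cudaDeviceProp" fields through "cudaDeviceGetAttribute", and passing a "cudaMemLocation" to "cudaMemAdvise"/"cudaMemPrefetchAsync". This is not code taken from the samples. The helper names queryDeviceAttributes and prefetchToDevice are hypothetical, and the sketch assumes a CUDA 13.0 or newer toolkit (the "cudaMemLocation" overloads are not the default signatures in older toolkits).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reads values that used to live in cudaDeviceProp
// (clockRate, memoryClockRate, computeMode) through cudaDeviceGetAttribute.
// Returns the first error encountered, or cudaSuccess.
static cudaError_t queryDeviceAttributes(int device, int *clockKHz, int *memClockKHz, int *computeMode)
{
    cudaError_t err = cudaDeviceGetAttribute(clockKHz, cudaDevAttrClockRate, device);
    if (err != cudaSuccess) return err;
    err = cudaDeviceGetAttribute(memClockKHz, cudaDevAttrMemoryClockRate, device);
    if (err != cudaSuccess) return err;
    return cudaDeviceGetAttribute(computeMode, cudaDevAttrComputeMode, device);
}

// Hypothetical helper: advises and prefetches managed memory using the
// cudaMemLocation-based signatures instead of a plain "int device".
static cudaError_t prefetchToDevice(const void *ptr, size_t bytes, int device, cudaStream_t stream)
{
    cudaMemLocation location{};
    location.type = cudaMemLocationTypeDevice;
    location.id   = device;

    cudaError_t err = cudaMemAdvise(ptr, bytes, cudaMemAdviseSetPreferredLocation, location);
    if (err != cudaSuccess) return err;
    return cudaMemPrefetchAsync(ptr, bytes, location, 0 /* flags */, stream);
}

int main()
{
    const int device = 0;
    int clockKHz = 0, memClockKHz = 0, computeMode = 0;

    cudaError_t err = queryDeviceAttributes(device, &clockKHz, &memClockKHz, &computeMode);
    if (err != cudaSuccess) {
        fprintf(stderr, "Attribute query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("clock: %d kHz, memory clock: %d kHz, compute mode: %d\n", clockKHz, memClockKHz, computeMode);

    float       *data  = nullptr;
    const size_t bytes = 1 << 20;
    err = cudaMallocManaged(&data, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Advice and prefetch are not supported on every platform, so a failure
    // here is reported rather than treated as fatal.
    err = prefetchToDevice(data, bytes, device, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "Advise/prefetch not performed: %s\n", cudaGetErrorString(err));
    }

    cudaDeviceSynchronize();
    cudaFree(data);
    return 0;
}
```

Built with something like "nvcc example.cu -o example", the program prints the attribute values for device 0. The helpers return the raw "cudaError_t" so that callers can decide whether an unsupported advise or prefetch should be fatal.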
CUDA Samples v12.9
CUDA 12.9
- Updated toolchain for cross-compilation for Tegra Linux platforms.
- Added the run_tests.py utility to exercise all samples. See README.md for details.
- Repository has been updated with consistent code formatting across all samples
- Many small code tweaks and bug fixes (see commit history for details)
- Removed the following outdated samples:
  - 1_Utilities
    - bandwidthTest - this sample was out of date and did not produce accurate results. For bandwidth testing of NVIDIA GPU platforms, please refer to NVBandwidth.
CUDA Samples v12.8
CUDA Samples for release 12.8.
CUDA 12.8
- Updated build system across the repository to CMake. Removed Visual Studio project files and Makefiles.
- Removed the following outdated samples:
  - 0_Introduction
    - c++11_cuda - demonstrating CUDA and C++ 11 interoperability (reason: obsolete)
    - concurrentKernels - demonstrating the ability to run multiple kernels simultaneously (reason: obsolete)
    - cppIntegration - demonstrating calling between .cu and .cpp files (reason: obsolete)
    - cppOverload - demonstrating C++ function overloading (reason: obsolete)
    - simpleSeparateCompilation - demonstrating NVCC compilation to a static library (reason: trivial)
    - simpleTemplates_nvrtc - demonstrating NVRTC usage for the simpleTemplates sample (reason: redundant)
    - simpleVoteIntrinsics_nvrtc - demonstrating NVRTC usage for the simpleVoteIntrinsics sample (reason: redundant)
  - 2_Concepts_and_Techniques
    - cuHook - demonstrating dlsym hooks (reason: incompatible with modern glibc)
  - 4_CUDA_Libraries
    - batchedLabelMarkersAndLabelCompressionNPP - demonstrating NPP features (reason: some functionality removed from library)
  - 5_Domain_Specific
    - Legacy Direct3D 9 and 10 interoperability samples: fluidsD3D9, simpleD3D10, simpleD3D10RenderTarget, simpleD3D10Texture, simpleD3D9, simpleD3D9Texture, SLID3D10Texture, VFlockingD3D10
  - 8_Platform_Specific/Tegra
    - Temporarily removed the following two samples pending updates:
      - nbody_screen - demonstrating the nbody sample in QNX
      - simpleGLES_screen - demonstrating GLES interop in QNX
- Moved the following Tegra-specific samples to a dedicated subdirectory:
  - 8_Platform_Specific/Tegra: EGLSync_CUDAEvent_Interop, cuDLAErrorReporting, cuDLAHybridMode, cuDLALayerwiseStatsHybrid, cuDLALayerwiseStatsStandalone, cuDLAStandaloneMode, cudaNvSciNvMedia, fluidsGLES, nbody_opengles, simpleGLES, simpleGLES_EGLOutput
CUDA Samples v12.5
Updates CUDA Samples for 12.5
CUDA Samples v12.4.1
Minor Updates to CUDA Samples 12.4
CUDA Samples v12.4
Updating README with Confidential Computing notes