Problems with CUDA on RTX 5070 #467

Description

@dozeen

Segmentation fault on RTX 5070 (SM120) under WSL2 before model loading

Environment

  • OS: Ubuntu 24.04 running under WSL2
  • Host OS: Windows 11 Pro
  • CPU: AMD Ryzen 7 5700X
  • GPU: NVIDIA GeForce RTX 5070 (Compute Capability 12.0 / SM120)
  • NVIDIA Driver: 595.79
  • CUDA Toolkit: 13.2
  • nvcc: 13.2.51

Repository commit:

80ebbc396aee40eedc1d829222f3362d10fa4c6c

Problem

Both ds4 and ds4-server immediately crash with a segmentation fault when using the CUDA backend.

Example:

./ds4 --cuda -p "Hello"

Output:

ds4: Linux cuda backend set oom_score_adj=1000
Segmentation fault (core dumped)

The crash happens before the model is loaded.

The same happens with:

./ds4-server --cuda --ctx 4096

What I already verified

  • Fresh clone of the repository
  • Built from source successfully
  • Rebuilt after replacing targets/sbsa-linux/lib with targets/x86_64-linux/lib inside the Makefile.
  • Compiled using make cuda CUDA_ARCH=sm_120
  • Model path is correct.
  • CPU backend works correctly.
  • CUDA Toolkit installation appears correct.
  • cudaGetDeviceCount() and cudaSetDevice() both work in a standalone CUDA test program.

Example:

cudaGetDeviceCount(...)
cudaSetDevice(0)

Both calls return success.

GDB backtrace

The crash occurs during CUDA initialization.

cudaSetDevice()

↓

libcuda.so

↓

libnvidia-ptxjitcompiler.so

↓

SIGSEGV

Backtrace:

#0 libnvidia-ptxjitcompiler.so
#1 __cuda_CallJitEntryPoint
#2 libcuda.so
#3 cudaSetDevice
#4 ds4_gpu_init()
#5 ds4_engine_open()
#6 main()
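
In case a fuller backtrace is useful, the following is a minimal sketch of how one could be captured non-interactively. capture_bt is a hypothetical helper name and the GDB override variable is an assumption added for illustration; the gdb options used (-batch, -ex, --args) are standard ones.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command under gdb in batch mode and print a
# backtrace once the program stops (for example on SIGSEGV).
# The debugger binary can be overridden through the GDB variable (assumption).
capture_bt() {
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
}

# Usage (not executed here):
#   capture_bt ./ds4 --cuda -p "Hello" 2>&1 | tee ds4-bt.txt
```

Batch mode makes gdb exit after the listed commands, so the output can be redirected to a file and attached to the issue.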

strace

The last operation before the crash is the loading of

/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.580.159.03

Immediately afterwards:

SIGSEGV si_addr=0x30
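
One detail that may be relevant: the version suffix of the JIT compiler library seen in strace (580.159.03) differs from the driver version listed under Environment (595.79). The sketch below is a quick way to flag that kind of mismatch. lib_version and same_major are hypothetical helper names, and comparing only the major number is an assumed heuristic, not a rule documented by NVIDIA.

```shell
#!/bin/sh
# Hypothetical helper: print the version suffix of a versioned shared library,
# e.g. ".../libfoo.so.1.2.3" -> "1.2.3". Prints nothing if there is no suffix.
lib_version() {
    basename "$1" | sed -n 's/^.*\.so\.\([0-9][0-9.]*\)$/\1/p'
}

# Hypothetical helper: succeed only if both versions share the same major number.
same_major() {
    [ "${1%%.*}" = "${2%%.*}" ]
}

# Path taken from the strace output; driver version taken from Environment.
jit_lib=/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.580.159.03
driver=595.79

jit_version=$(lib_version "$jit_lib")
if same_major "$jit_version" "$driver"; then
    echo "match: JIT compiler $jit_version, driver $driver"
else
    echo "mismatch: JIT compiler $jit_version, driver $driver"
fi
```

On WSL2 the driver's user-mode libraries are normally provided under /usr/lib/wsl/lib, so a copy resolved from /lib/x86_64-linux-gnu may come from a separately installed Linux driver package. Whether that is the cause of this crash is unconfirmed.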

Additional information

The crash happens on two independent clones of the repository.

The crash also occurs before loading the GGUF model, so it does not appear to be model-related.

Since a standalone CUDA program works correctly, I suspect this may be:

  • a CUDA JIT issue affecting SM120 / RTX 5070,
  • or an interaction between DS4 CUDA initialization and the CUDA 13.2 runtime under WSL2.

If there is anything else I can test or any additional logging you would like, I'd be happy to help.
