tf.image.sobel_edges crashes with cuDNN status 1002 (CUDNN_BACKEND_TENSOR_DESCRIPTOR finalize failure) on GPU #43449

@jasminetrail

Description

Calling tf.image.sobel_edges on the output of tf.raw_ops.LogSoftmax raises an UnknownError with cuDNN status 1002 during eager execution on GPU. The error originates from DepthwiseConv2dNative (which sobel_edges uses internally) and indicates a failure to finalize a CUDNN_BACKEND_TENSOR_DESCRIPTOR.

Minimal code to reproduce

import tensorflow as tf

tf.random.set_seed(10)
t160 = tf.random.uniform([4, 8, 8, 1], dtype=tf.float32)

def model_10(t160):
    t161 = tf.raw_ops.LogSoftmax(logits=t160)
    t162 = tf.image.sobel_edges(t161)
    return t162

model_10(t160)
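
A note that may help triage: LogSoftmax normalizes over the last axis, which has size 1 in the [4, 8, 8, 1] input above. Every output element is therefore log(1) = 0, and sobel_edges receives an all-zero tensor. This suggests the failure may not depend on the LogSoftmax values at all, although that has not been verified on the affected GPU. The arithmetic can be sketched in plain Python; `log_softmax` below is a hypothetical helper written for illustration, not a TensorFlow API:

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of a 1-D list of floats (illustrative helper)."""
    m = max(row)
    log_sum = math.log(sum(math.exp(v - m) for v in row))
    return [v - m - log_sum for v in row]

# Last axis of size 1, as in the [4, 8, 8, 1] reproducer input:
# the single element is normalized against itself.
single = log_softmax([0.37])

# For contrast, a last axis of size 3 gives non-trivial log-probabilities.
triple = log_softmax([0.1, 0.2, 0.3])

print(single)
print(triple)
```

If this holds on the GPU as well, replacing t161 with a zero tensor of the same shape and dtype might be a smaller reproducer.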

Error Log

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1780085730.685744  516322 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
I0000 00:00:1780085730.713858  516322 cpu_feature_guard.cc:227] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1780085731.309838  516322 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
I0000 00:00:1780085731.533995  516322 gpu_device.cc:2043] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 21960 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:01:00.0, compute capability: 8.9
I0000 00:00:1780085731.709029  516322 cuda_dnn.cc:461] Loaded cuDNN version 92000
W0000 00:00:1780085731.717507  516322 op_kernel.cc:1858] OP_REQUIRES failed at conv_ops_impl.h:1204 : UNKNOWN: <unknown cudnn status: 1002>
in external/xla/xla/stream_executor/cuda/cuda_dnn.cc(2859): 'tensor' CUDNN_BACKEND_TENSOR_DESCRIPTOR cudnnFinalize failedptrDesc->finalize()
W0000 00:00:1780085731.717532  516322 local_rendezvous.cc:412] Local rendezvous is aborting with status: UNKNOWN: <unknown cudnn status: 1002>
in external/xla/xla/stream_executor/cuda/cuda_dnn.cc(2859): 'tensor' CUDNN_BACKEND_TENSOR_DESCRIPTOR cudnnFinalize failedptrDesc->finalize()
Traceback (most recent call last):
  File "/home/test.py", line 89, in <module>
    model_10(t160)
  File "/home/test.py", line 86, in model_10
    t162 = tf.image.sobel_edges(t161)
  File "/home/.venv-tf/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 167, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/.venv-tf/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 6027, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.UnknownError: {{function_node __wrapped__DepthwiseConv2dNative_device_/job:localhost/replica:0/task:0/device:GPU:0}} <unknown cudnn status: 1002>
in external/xla/xla/stream_executor/cuda/cuda_dnn.cc(2859): 'tensor' CUDNN_BACKEND_TENSOR_DESCRIPTOR cudnnFinalize failedptrDesc->finalize() [Op:DepthwiseConv2dNative] name: 
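
The traceback places the failure in the GPU DepthwiseConv2dNative kernel. A possible way to check whether it is specific to the cuDNN path is to run the same pipeline pinned to the CPU. The following is only a sketch of that baseline: the `cpu_edges` name is introduced here for illustration, and the outcome on the affected machine has not been confirmed.

```python
import tensorflow as tf

tf.random.set_seed(10)

# Pin every op to the CPU so the cuDNN depthwise-convolution path is not used.
with tf.device("/CPU:0"):
    t160 = tf.random.uniform([4, 8, 8, 1], dtype=tf.float32)
    t161 = tf.raw_ops.LogSoftmax(logits=t160)
    cpu_edges = tf.image.sobel_edges(t161)

# sobel_edges appends a trailing axis of size 2 holding the two gradient components.
print(cpu_edges.shape)
print(cpu_edges.device)
```

If this runs cleanly while the unpinned version fails, that would point at the GPU convolution path rather than at sobel_edges itself.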

Versions:

  • TensorFlow: 2.21.0
  • GPU: NVIDIA GeForce RTX 4090
  • CUDA / cuDNN: CUDA 13.0, cuDNN 9.2.0
  • Python: 3.10
  • Ubuntu 22.04.5 LTS

Metadata

Labels

  • GPU (XLA on GPU)
  • NVIDIA-GPU (XLA on Nvidia GPU)
