tf.raw_ops.Where with tf.experimental.numpy.compress output causes XLA runtime error: "Buffers have different size"

#### Description
When `tf.raw_ops.Where` is applied to the output of `tf.experimental.numpy.compress` inside a function wrapped with `tf.function(..., jit_compile=True)`, XLA compilation succeeds but execution fails at runtime with `InternalError: Buffers have different size at runtime`. Eager execution completes without error.
The likely cause is a shape-inference mismatch. Both ops have data-dependent output shapes: `tf.experimental.numpy.compress` keeps only the elements selected by the condition, and `tf.raw_ops.Where` returns one row of indices per nonzero element, so neither output size is known until runtime. XLA sizes output buffers from shapes determined at compile time, so if the shape it infers for either op disagrees with the actual runtime shape, a buffer size mismatch would surface at execution.
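
To make the data dependence concrete, here is a minimal sketch using NumPy as a stand-in. It assumes `np.compress` and `np.argwhere` mirror the shape behavior of `tf.experimental.numpy.compress` and `tf.raw_ops.Where`; it does not exercise TensorFlow or XLA, and the helper name `compress_then_where` is hypothetical.

```python
import numpy as np

# Same shape as the repro input; values start at 1 so the first element is nonzero.
a = np.arange(1, 2 * 16 * 16 * 8 + 1, dtype=np.int32).reshape(2, 16, 16, 8)


def compress_then_where(condition, values):
    # axis=None: `values` is flattened, and the result has at most
    # len(condition) elements (one per true entry in `condition`).
    kept = np.compress(condition, values)
    # One row of indices per nonzero element of `kept`.
    indices = np.argwhere(kept)
    return kept, indices


cases = [
    ("condition true, value nonzero", np.array([True]), a),
    ("condition false", np.array([False]), a),
    ("condition true, value zero", np.array([True]), np.zeros_like(a)),
]
for label, cond, vals in cases:
    kept, indices = compress_then_where(cond, vals)
    print(f"{label}: compress -> {kept.shape}, where -> {indices.shape}")
```

The input shapes are identical in all three cases, yet the output shapes of both steps differ, which is the kind of information a compiler cannot fix from input shapes alone.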

#### Minimal Code to reproduce
```python
import tensorflow as tf

tf.random.set_seed(9399)
t101937 = tf.random.uniform([1], dtype=tf.float64)
t101935 = tf.random.uniform([2, 16, 16, 8], minval=0, maxval=8, dtype=tf.int32)

def model_9399(t101937, t101935):
    t101938 = tf.experimental.numpy.compress(t101937, t101935)
    t101939 = tf.raw_ops.Where(condition=t101938)
    return t101939

jit_fn = tf.function(model_9399, jit_compile=True)
jit_fn(t101937, t101935)
```
#### Error Log
```console
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1780086739.472410  517462 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
I0000 00:00:1780086739.500487  517462 cpu_feature_guard.cc:227] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1780086740.096920  517462 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
I0000 00:00:1780086740.317470  517462 gpu_device.cc:2043] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 21960 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:01:00.0, compute capability: 8.9
I0000 00:00:1780086740.525868  517462 service.cc:153] XLA service 0x64cb68207050 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1780086740.525885  517462 service.cc:161]   StreamExecutor [0]: NVIDIA GeForce RTX 4090, Compute Capability 8.9 (Driver: 13.0.0; Runtime: 12.9.0; Toolkit: 12.5.0; DNN: 9.20.0)
I0000 00:00:1780086740.569211  517462 cuda_dnn.cc:461] Loaded cuDNN version 92000
I0000 00:00:1780086742.103596  517462 device_compiler.h:208] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
Traceback (most recent call last):
  File "/home/test.py", line 92, in <module>
    jit_fn(t101937, t101935)
  File "/home/.venv-tf/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 167, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/.venv-tf/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 53, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: Buffers have different size at runtime [Op:__inference_model_9399_47]
```
#### Versions
- TensorFlow: 2.21.0
- GPU: NVIDIA GeForce RTX 4090
- CUDA / cuDNN: CUDA 13.0 (driver), 12.9 (runtime); cuDNN 9.20.0
- Python: 3.10
- OS: Ubuntu 22.04.5 LTS

tf.raw_ops.Where with tf.experimental.numpy.compress output causes XLA runtime error: "Buffers have different size" #43451

Description

Minimal Code to reproduce

Error Log

Versions

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

tf.raw_ops.Where with tf.experimental.numpy.compress output causes XLA runtime error: "Buffers have different size" #43451

Description

Description

Minimal Code to reproduce

Error Log

Versions

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions