[Bug Report] NaN value Issue


### Describe the bug
In IsaacLab 3.0.0 beta2, training fails with the following error when using Newton physics:
```
ValueError: The observation group 'policy' returned by the environment contains NaN values. This usually indicates a bug in the environment's step() or reset() function.
```
Is Newton compatibility still a work in progress, or is this a bug? Thank you for your help!

### Steps to reproduce

Run `./isaaclab.sh train --rl_library rsl_rl --task=Isaac-Velocity-Rough-Anymal-C-v0 physics=newton_mjwarp`. Training aborts with the following traceback:


```
Traceback (most recent call last):
  File "/home/wsy/env_isaaclab/IsaacLab/source/isaaclab_tasks/isaaclab_tasks/utils/sim_launcher.py", line 499, in launch_simulation
    yield
  File "/home/wsy/env_isaaclab/IsaacLab/scripts/reinforcement_learning/rsl_rl/train_rsl_rl.py", line 179, in run
    runner.learn(num_learning_iterations=agent_cfg.max_iterations, init_at_random_ep_len=True)
  File "/home/wsy/env_isaaclab/lib/python3.12/site-packages/rsl_rl/runners/on_policy_runner.py", line 90, in learn
    check_nan(obs, rewards, dones)
  File "/home/wsy/env_isaaclab/lib/python3.12/site-packages/rsl_rl/utils/utils.py", line 279, in check_nan
    raise ValueError(
ValueError: The observation group 'policy' returned by the environment contains NaN values. This usually indicates a bug in the environment's step() or reset() function.
Traceback (most recent call last):
  File "/home/wsy/env_isaaclab/IsaacLab/scripts/reinforcement_learning/train.py", line 37, in <module>
    raise SystemExit(main())
                     ^^^^^^
  File "/home/wsy/env_isaaclab/IsaacLab/scripts/reinforcement_learning/train.py", line 27, in main
    return dispatch_library_entrypoint(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wsy/env_isaaclab/IsaacLab/scripts/reinforcement_learning/common.py", line 79, in dispatch_library_entrypoint
    module.run(library_args)
  File "/home/wsy/env_isaaclab/IsaacLab/scripts/reinforcement_learning/rsl_rl/train_rsl_rl.py", line 179, in run
    runner.learn(num_learning_iterations=agent_cfg.max_iterations, init_at_random_ep_len=True)
  File "/home/wsy/env_isaaclab/lib/python3.12/site-packages/rsl_rl/runners/on_policy_runner.py", line 90, in learn
    check_nan(obs, rewards, dones)
  File "/home/wsy/env_isaaclab/lib/python3.12/site-packages/rsl_rl/utils/utils.py", line 279, in check_nan
    raise ValueError(
ValueError: The observation group 'policy' returned by the environment contains NaN values. This usually indicates a bug in the environment's step() or reset() function.

```
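To narrow down which environments and observation dimensions carry the NaN, a check like the one below could be run on the `policy` observation group before it reaches `check_nan`. This is only a minimal sketch in plain Python: the `find_nan_entries` helper and the nested-list input are hypothetical stand-ins, not part of IsaacLab or rsl_rl. With the real tensor observations, the same idea could be expressed with `torch.isnan`.

```python
import math


def find_nan_entries(obs_rows):
    """Map each environment index to the observation columns that are NaN.

    `obs_rows` is a hypothetical stand-in for one observation group:
    one row per environment, one float per observation dimension.
    """
    bad = {}
    for env_idx, row in enumerate(obs_rows):
        # Collect the column indices in this row whose value is NaN.
        cols = [col for col, value in enumerate(row) if math.isnan(value)]
        if cols:
            bad[env_idx] = cols
    return bad


# Toy data: environment 1 has a NaN in observation column 2.
obs = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, float("nan")],
]
print(find_nan_entries(obs))  # {1: [2]}
```

Knowing whether the NaN columns first show up right after `reset()` or only after a `step()` would indicate which of the two paths named in the error message to inspect.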



### System Info



- Commit: not provided
- Isaac Sim Version: 6.0.0.1
- OS: Ubuntu 24.04 and 26.04
- GPU: RTX 5080 and RTX 4080 SUPER
- CUDA: 12.8
- GPU Driver: 580.159

### Additional context


### Checklist

- [x] I have checked that there is no similar issue in the repo (**required**)
- [ ] I have checked that the issue is not in running Isaac Sim itself and is related to the repo

### Acceptance Criteria


- [ ] Similar tasks in IsaacLab run without raising this `ValueError`.

[Bug Report] NaN value Issue #6184