Skip to content

Add unified training benchmark dispatcher and RL adapters (benchmark refactor, Part 3/5)#6199

Open
AntoineRichard wants to merge 3 commits into
isaac-sim:developfrom
AntoineRichard:antoiner/benchmark-training
Open

Add unified training benchmark dispatcher and RL adapters (benchmark refactor, Part 3/5)#6199
AntoineRichard wants to merge 3 commits into
isaac-sim:developfrom
AntoineRichard:antoiner/benchmark-training

Conversation

@AntoineRichard

@AntoineRichard AntoineRichard commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Description

Part 3 of 5 of the benchmark refactor series — the unified training dispatcher + RL adapters.

Stacked on Parts 1–2 (#6197, #6198). The diff against develop below also includes Parts 1–2 until they merge. For the incremental Part 3 changes only, view:
AntoineRichard/IsaacLab@antoiner/benchmark-runtime-startup...antoiner/benchmark-training

Series: Part 1/5 core (#6197) → Part 2/5 runtime + startup (#6198) → Part 3/5 training (this PR) → Part 4/5 play (#6201) → Part 5/5 cleanup.

This PR is purely additive — it adds training.py and the per-backend adapters alongside the existing benchmark_rsl_rl.py / benchmark_rlgames.py / run_training_benchmarks.sh, which keep working unchanged. Removal of the legacy scripts and utils.py is deferred to Part 5/5.

Adds:

  • scripts/benchmarks/training.py — dispatcher selecting the RL library with --rl_library {rsl_rl, rl_games, skrl, sb3} (mirrors scripts/reinforcement_learning/train.py).
  • Per-backend adapters (rsl_rl/bench_rsl_rl.py, rl_games/bench_rl_games.py, skrl/bench_skrl.py, sb3/bench_sb3.py) that run real training under BenchmarkMonitor and emit a TrainingBundle via the shared core.
  • Smoke tests for all four backends.

Also repoints the shared early_stop.py at the core SuccessRateTracker. This is a behavior-preserving change: the public wrappers/observers/CLI helpers are unchanged, and the legacy benchmark_rsl_rl.py / benchmark_rlgames.py (which import only those preserved symbols) keep working against it.

Docs: updates the RL-training sections of the benchmarking / performance / warp-environments / visualization pages. (The 3.0 migration-guide "Benchmark Scripts" section, which documents the legacy-script removal, lands with Part 5/5.)

Validated on develop (Newton/MJWarp): all four training smokes pass (rsl_rl / skrl / sb3 at 16 envs, rl_games at 512).

Fixes # (n/a)

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added a changelog fragment under source/<pkg>/changelog.d/ for every touched package
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team labels Jun 16, 2026
@AntoineRichard AntoineRichard force-pushed the antoiner/benchmark-training branch from 0e66178 to cc2b7a0 Compare June 16, 2026 12:22
@AntoineRichard AntoineRichard marked this pull request as ready for review June 16, 2026 12:33
@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

Part 3 of the benchmark refactor series adds a unified training.py dispatcher and four per-library RL adapters (rsl_rl, rl_games, skrl, sb3), each running real training under BenchmarkMonitor and emitting a typed TrainingBundle JSON via the new schema backend. It also refactors early_stop.py to reuse SuccessRateTracker from the shared core and removes the superseded monolithic benchmark scripts.

  • New dispatcher + adapters: training.py selects the RL library via --rl_library and delegates to <backend>/bench_<backend>.py; each adapter captures per-iteration timing/reward/episode-length independently of TensorBoard where possible.
  • Core library changes: BaseIsaacLabBenchmark now accepts a list of backend types, gains attach_bundle(), and dispatches to each backend's finalize() with suffix-disambiguated filenames.
  • Test coverage: Four subprocess smoke tests validate end-to-end bundle emission; _find_bundle is copy-pasted identically into all four smoke test files and could be consolidated in conftest.py.

Confidence Score: 4/5

The dispatcher and all four RL adapters work correctly end-to-end; the only issues are localized quality concerns in the skrl adapter and duplicated test helpers.

The skrl adapter has two concerns that would silently drop metrics if skrl internals change, but neither causes a crash or data corruption. All other adapters are clean.

scripts/benchmarks/skrl/bench_skrl.py

Important Files Changed

Filename Overview
scripts/benchmarks/training.py New unified dispatcher selecting per-library adapter via --rl_library.
scripts/benchmarks/rsl_rl/bench_rsl_rl.py New RSL-RL adapter; redundant if/else branches and private _finalize_impl() call.
scripts/benchmarks/rl_games/bench_rl_games.py New RL-Games adapter; flushes TensorBoard writer before parsing, handles missing log data gracefully.
scripts/benchmarks/skrl/bench_skrl.py New SKRL adapter; private runner._trainer access and heuristic episode-length key may silently produce zero series.
scripts/benchmarks/sb3/bench_sb3.py New SB3 adapter; BenchmarkCallback captures per-rollout timing/reward; NaN filtering for incomplete rollouts is correct.
scripts/benchmarks/_common.py New shared helpers: backend normalization, multi-backend splitting, preset extraction, safe importlib loading.
scripts/benchmarks/early_stop.py Re-exports SuccessRateTracker from core; adds defensive hasattr guard for after_clear_stats.
source/isaaclab/isaaclab/test/benchmark/benchmark_core.py Adds multi-backend support, attach_bundle(), and per-backend suffixed output filenames.
source/isaaclab/isaaclab/test/benchmark/backends.py Adds SchemaBundleFile backend serializing typed bundles; no-ops when no bundle attached.
source/isaaclab/isaaclab/test/benchmark/builders.py New pure assembly module converting raw series into schema dataclasses.
scripts/benchmarks/test/test_training_rsl_rl_smoke.py Smoke test for rsl_rl; _find_bundle duplicated across all four smoke test files.
scripts/benchmarks/test/conftest.py New require_isaacsim fixture that skips smoke tests when Isaac Sim is unavailable.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    CLI["training.py --rl_library lib"] --> DISP["dispatch_library_entrypoint()"]
    DISP --> RSL["bench_rsl_rl.py run()"]
    DISP --> RLG["bench_rl_games.py run()"]
    DISP --> SKRL["bench_skrl.py run()"]
    DISP --> SB3["bench_sb3.py run()"]
    RSL & RLG & SKRL & SB3 --> LAUNCH["launch_simulation()"]
    LAUNCH --> TRAIN["Framework training loop"]
    TRAIN --> BUILD["builders.build_training_bundle()"]
    BUILD --> ATTACH["benchmark.attach_bundle()"]
    ATTACH --> FI["benchmark._finalize_impl()"]
    FI --> SCHEMA["SchemaBundleFile TrainingBundle .json"]
    FI --> OTHER["omniperf / json / osmo KPI .json"]
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    CLI["training.py --rl_library lib"] --> DISP["dispatch_library_entrypoint()"]
    DISP --> RSL["bench_rsl_rl.py run()"]
    DISP --> RLG["bench_rl_games.py run()"]
    DISP --> SKRL["bench_skrl.py run()"]
    DISP --> SB3["bench_sb3.py run()"]
    RSL & RLG & SKRL & SB3 --> LAUNCH["launch_simulation()"]
    LAUNCH --> TRAIN["Framework training loop"]
    TRAIN --> BUILD["builders.build_training_bundle()"]
    BUILD --> ATTACH["benchmark.attach_bundle()"]
    ATTACH --> FI["benchmark._finalize_impl()"]
    FI --> SCHEMA["SchemaBundleFile TrainingBundle .json"]
    FI --> OTHER["omniperf / json / osmo KPI .json"]
Loading

Comments Outside Diff (3)

  1. scripts/benchmarks/rsl_rl/bench_rsl_rl.py, line 781-791 (link)

    P2 Redundant if/else with identical bodies

    Both branches produce exactly the same OnPolicyRunner(...) constructor call; the only observable difference is the warning print. The condition could be restructured so that the runner creation happens once unconditionally, with only the warning gated behind the else.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  2. scripts/benchmarks/skrl/bench_skrl.py, line 799-800 (link)

    P2 Private attribute access on skrl Runner

    runner._trainer is a private attribute of skrl's Runner class. If skrl renames this field, all per-iteration timing, reward, and episode-length series will be empty in the emitted bundle with no error surfaced. Consider exposing the trainer via a property on _BenchmarkRunner instead.

  3. scripts/benchmarks/skrl/bench_skrl.py, line 550-553 (link)

    P2 Episode-length key heuristic may silently produce all-zero series

    The search requires both "episode" and "timestep" in the key name. If skrl uses "Episode / Length" instead, ep_len_key is always None and iter_ep_lengths will be all zeros without any warning.

Reviews (1): Last reviewed commit: "Add unified training benchmark dispatche..." | Re-trigger Greptile

Comment on lines +14 to +26
ROOT = Path(__file__).resolve().parents[3]

_TASK = "Isaac-Cartpole-Direct"

# Top-level keys that identify a schema TrainingBundle (runtime bundle plus ``learning``).
_TRAINING_BUNDLE_KEYS = {"run", "versions", "hardware", "runtime", "resources", "learning"}


def _find_bundle(out_dir: Path, expected_keys: set[str]) -> dict:
"""Return the parsed JSON whose top-level keys cover ``expected_keys``.

The schema backend names its file from a timestamped prefix, so the smoke
tests glob the output directory rather than hardcode the filename.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 _find_bundle duplicated verbatim across all four smoke-test files

The helper is copy-pasted identically into all four smoke test files. Moving it to conftest.py would eliminate the duplication.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!


benchmark.attach_bundle(bundle)

benchmark._finalize_impl()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 All adapters call _finalize_impl() as external callers

All four adapters reach into BaseIsaacLabBenchmark's private API. Consider promoting _finalize_impl to a public method or documenting the external-call contract explicitly.

Introduce the capture, metrics, builders, stepping, profiling, and
backend_descriptor submodules for assembling the schema-v1 benchmark
bundles, add a schema output backend, and let BaseIsaacLabBenchmark emit
several backends in one run via a new attach_bundle hook. Unit tests
cover each submodule plus the schema backend and multi-backend finalize.

Part 1 of a series splitting the oversized benchmark refactor
(core -> runtime/startup -> training -> play).
@AntoineRichard AntoineRichard force-pushed the antoiner/benchmark-training branch from cc2b7a0 to 45e2299 Compare June 17, 2026 07:43
Add backend-agnostic runtime.py (random-action stepping, emits a
RuntimeBundle) and startup.py (cProfile startup-phase profiling, emits a
StartupBundle), wired to develop's launch API (launch_simulation and
add_launcher_args from isaaclab.app; preset tokens forwarded to Hydra
without folding). Remove the legacy benchmark_non_rl.py and
benchmark_startup.py scripts plus the run_non_rl_benchmarks.sh and
run_physx_benchmarks.sh runner shells; repoint benchmark_hydra_resolve
at _common.get_backend_type.

Part 2 of the benchmark refactor series (core -> runtime/startup ->
training -> play); stacked on Part 1 (isaac-sim#6197).
Add training.py dispatching over --rl_library {rsl_rl, rl_games, skrl,
sb3}; each adapter runs real training under BenchmarkMonitor and emits a
TrainingBundle via the shared core, with an optional success-metric early
stop. Scripts use develop's launch API (launch_simulation from
isaaclab.app; preset tokens forwarded without folding). Remove the legacy
benchmark_rsl_rl.py / benchmark_rlgames.py scripts, the
run_training_benchmarks.sh runner shell, and the obsolete utils.py helper.

Part 3 of the benchmark refactor series (core -> runtime/startup ->
training -> play); stacked on Parts 1-2 (isaac-sim#6197, isaac-sim#6198).
@AntoineRichard AntoineRichard force-pushed the antoiner/benchmark-training branch from 45e2299 to a2bc539 Compare June 17, 2026 08:54
@AntoineRichard AntoineRichard changed the title Add unified training benchmark dispatcher and RL adapters (benchmark refactor, Part 3/4) Add unified training benchmark dispatcher and RL adapters (benchmark refactor, Part 3/5) Jun 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant