Add unified training benchmark dispatcher and RL adapters (benchmark refactor, Part 3/5) #6199
0e66178 to cc2b7a0
Greptile Summary
Part 3 of the benchmark refactor series adds a unified
Confidence Score: 4/5
The dispatcher and all four RL adapters work correctly end-to-end; the only issues are localized quality concerns in the skrl adapter and duplicated test helpers. The skrl adapter has two concerns that would silently drop metrics if skrl internals change, but neither causes a crash or data corruption. All other adapters are clean.
scripts/benchmarks/skrl/bench_skrl.py
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
CLI["training.py --rl_library lib"] --> DISP["dispatch_library_entrypoint()"]
DISP --> RSL["bench_rsl_rl.py run()"]
DISP --> RLG["bench_rl_games.py run()"]
DISP --> SKRL["bench_skrl.py run()"]
DISP --> SB3["bench_sb3.py run()"]
RSL & RLG & SKRL & SB3 --> LAUNCH["launch_simulation()"]
LAUNCH --> TRAIN["Framework training loop"]
TRAIN --> BUILD["builders.build_training_bundle()"]
BUILD --> ATTACH["benchmark.attach_bundle()"]
ATTACH --> FI["benchmark._finalize_impl()"]
FI --> SCHEMA["SchemaBundleFile TrainingBundle .json"]
FI --> OTHER["omniperf / json / osmo KPI .json"]
ROOT = Path(__file__).resolve().parents[3]

_TASK = "Isaac-Cartpole-Direct"

# Top-level keys that identify a schema TrainingBundle (runtime bundle plus ``learning``).
_TRAINING_BUNDLE_KEYS = {"run", "versions", "hardware", "runtime", "resources", "learning"}


def _find_bundle(out_dir: Path, expected_keys: set[str]) -> dict:
    """Return the parsed JSON whose top-level keys cover ``expected_keys``.

    The schema backend names its file from a timestamped prefix, so the smoke
    tests glob the output directory rather than hardcode the filename.
_find_bundle duplicated verbatim across all four smoke-test files
The helper is copy-pasted identically into all four smoke test files. Moving it to conftest.py would eliminate the duplication.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
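A minimal sketch of what a single shared helper could look like. The body of `_find_bundle` is cut off in the excerpt above, so the implementation here is an assumption reconstructed from its docstring, and the name `find_bundle` and its placement in a shared module (or behind a `conftest.py` fixture) are hypothetical.

```python
import json
from pathlib import Path

# Copied from the excerpt above: top-level keys of a schema TrainingBundle.
TRAINING_BUNDLE_KEYS = {"run", "versions", "hardware", "runtime", "resources", "learning"}


def find_bundle(out_dir: Path, expected_keys: set[str]) -> dict:
    """Return the first parsed JSON in ``out_dir`` whose top-level keys cover ``expected_keys``.

    Assumed behavior: glob every ``*.json`` file, because the schema backend
    derives its filename from a timestamped prefix.
    """
    for path in sorted(out_dir.glob("*.json")):
        try:
            data = json.loads(path.read_text())
        except json.JSONDecodeError:
            continue  # ignore files that are not valid JSON
        # Only dict payloads can carry the expected top-level keys.
        if isinstance(data, dict) and expected_keys.issubset(data.keys()):
            return data
    raise FileNotFoundError(f"no JSON file in {out_dir} has keys {sorted(expected_keys)}")
```

Each smoke test would then call the shared helper instead of carrying its own copy.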
benchmark.attach_bundle(bundle)

benchmark._finalize_impl()
Introduce the capture, metrics, builders, stepping, profiling, and backend_descriptor submodules for assembling the schema-v1 benchmark bundles, add a schema output backend, and let BaseIsaacLabBenchmark emit several backends in one run via a new attach_bundle hook. Unit tests cover each submodule plus the schema backend and multi-backend finalize. Part 1 of a series splitting the oversized benchmark refactor (core -> runtime/startup -> training -> play).
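The multi-backend idea in this commit message can be illustrated with a toy class. This is a sketch under stated assumptions, not the real `BaseIsaacLabBenchmark`: only the method names `attach_bundle` and `_finalize_impl` come from the PR, while `ToyBenchmark`, the `Backend` callable type, and the one-file-per-backend naming are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Optional

# Hypothetical backend type: maps the attached bundle to a JSON-serializable payload.
Backend = Callable[[dict], dict]


class ToyBenchmark:
    """Illustrative stand-in for a benchmark that writes several backends in one run."""

    def __init__(self, out_dir: Path, backends: dict[str, Backend]):
        self._out_dir = out_dir
        self._backends = backends
        self._bundle: Optional[dict] = None

    def attach_bundle(self, bundle: dict) -> None:
        # Store the assembled bundle; nothing is written until finalize.
        self._bundle = bundle

    def _finalize_impl(self) -> list[Path]:
        if self._bundle is None:
            raise RuntimeError("attach_bundle() must be called before finalize")
        written = []
        # Every configured backend serializes the same bundle into its own file.
        for name, serialize in self._backends.items():
            path = self._out_dir / f"{name}.json"
            path.write_text(json.dumps(serialize(self._bundle), indent=2))
            written.append(path)
        return written
```

Keeping the bundle separate from the backends means a new output format only needs a new serializer, not changes to the code that assembles the bundle.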
cc2b7a0 to 45e2299
Add backend-agnostic runtime.py (random-action stepping, emits a RuntimeBundle) and startup.py (cProfile startup-phase profiling, emits a StartupBundle), wired to develop's launch API (launch_simulation and add_launcher_args from isaaclab.app; preset tokens forwarded to Hydra without folding). Remove the legacy benchmark_non_rl.py and benchmark_startup.py scripts plus the run_non_rl_benchmarks.sh and run_physx_benchmarks.sh runner shells; repoint benchmark_hydra_resolve at _common.get_backend_type. Part 2 of the benchmark refactor series (core -> runtime/startup -> training -> play); stacked on Part 1 (isaac-sim#6197).
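For readers unfamiliar with cProfile-based phase profiling, here is a minimal standard-library sketch. It is not the code in `startup.py`: the function name `profile_phase` and the shape of the returned dictionary are assumptions, and the real `StartupBundle` fields are not shown in this PR page.

```python
import cProfile
import pstats
import time


def profile_phase(fn, top_n: int = 5) -> dict:
    """Run ``fn`` under cProfile and summarize wall time plus the hottest functions."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    try:
        fn()
    finally:
        profiler.disable()
    wall = time.perf_counter() - start

    # get_stats_profile() (Python 3.9+) exposes per-function timings by function name.
    profile = pstats.Stats(profiler).get_stats_profile()
    hottest = sorted(
        profile.func_profiles.items(), key=lambda item: item[1].cumtime, reverse=True
    )[:top_n]
    return {
        "wall_time_s": wall,
        "top_functions": [{"name": name, "cumtime_s": fp.cumtime} for name, fp in hottest],
    }
```

A caller would wrap one startup phase at a time, for example `profile_phase(lambda: build_env())` with a hypothetical `build_env`, and collect the resulting dictionaries.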
Add training.py dispatching over --rl_library {rsl_rl, rl_games, skrl,
sb3}; each adapter runs real training under BenchmarkMonitor and emits a
TrainingBundle via the shared core, with an optional success-metric early
stop. Scripts use develop's launch API (launch_simulation from
isaaclab.app; preset tokens forwarded without folding). Remove the legacy
benchmark_rsl_rl.py / benchmark_rlgames.py scripts, the
run_training_benchmarks.sh runner shell, and the obsolete utils.py helper.
Part 3 of the benchmark refactor series (core -> runtime/startup ->
training -> play); stacked on Parts 1-2 (isaac-sim#6197, isaac-sim#6198).
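The dispatch step described here can be sketched roughly as follows. The `--rl_library` choices, the adapter file names, and the `run()` entry point come from this PR; the module import names, the signature of `dispatch_library_entrypoint`, and the `loader` parameter are assumptions made so the sketch runs on its own.

```python
import argparse
import importlib

# Library name -> adapter module. The bare module names are an assumption
# about how the adapter scripts would be importable.
ADAPTER_MODULES = {
    "rsl_rl": "bench_rsl_rl",
    "rl_games": "bench_rl_games",
    "skrl": "bench_skrl",
    "sb3": "bench_sb3",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Unified training benchmark (sketch)")
    parser.add_argument("--rl_library", choices=sorted(ADAPTER_MODULES), required=True)
    return parser


def dispatch_library_entrypoint(rl_library, remaining_args, loader=importlib.import_module):
    """Import the adapter for ``rl_library`` and hand it the remaining CLI arguments."""
    module = loader(ADAPTER_MODULES[rl_library])
    return module.run(remaining_args)
```

Using `build_parser().parse_known_args()` in the caller lets the dispatcher consume only `--rl_library` and pass everything else through to the selected adapter untouched.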
45e2299 to a2bc539
Description
Part 3 of 5 of the benchmark refactor series — the unified training dispatcher + RL adapters.
Series: Part 1/5 core (#6197) → Part 2/5 runtime + startup (#6198) → Part 3/5 training (this PR) → Part 4/5 play (#6201) → Part 5/5 cleanup.
This PR is purely additive: it adds training.py and the per-backend adapters alongside the existing benchmark_rsl_rl.py / benchmark_rlgames.py / run_training_benchmarks.sh, which keep working unchanged. Removal of the legacy scripts and utils.py is deferred to Part 5/5.
Adds:
scripts/benchmarks/training.py: dispatcher selecting the RL library with --rl_library {rsl_rl, rl_games, skrl, sb3} (mirrors scripts/reinforcement_learning/train.py).
Per-backend adapters (rsl_rl/bench_rsl_rl.py, rl_games/bench_rl_games.py, skrl/bench_skrl.py, sb3/bench_sb3.py) that run real training under BenchmarkMonitor and emit a TrainingBundle via the shared core.
Also repoints the shared early_stop.py at the core SuccessRateTracker. This is a behavior-preserving change: the public wrappers/observers/CLI helpers are unchanged, and the legacy benchmark_rsl_rl.py / benchmark_rlgames.py (which import only those preserved symbols) keep working against it.
Docs: updates the RL-training sections of the benchmarking / performance / warp-environments / visualization pages. (The 3.0 migration-guide "Benchmark Scripts" section, which documents the legacy-script removal, lands with Part 5/5.)
Validated on develop (Newton/MJWarp): all four training smokes pass (rsl_rl / skrl / sb3 at 16 envs, rl_games at 512).
Fixes # (n/a)
Type of change
Checklist
pre-commit checks with ./isaaclab.sh --format
source/<pkg>/changelog.d/ for every touched package
CONTRIBUTORS.md or my name already exists there