Rust phase 4: native streaming recalculate_backtests_in_directory #525

@MDUYN

Goal

Replace the Python ProcessPoolExecutor-driven implementation of recalculate_backtests_in_directory with a Rust-driven worker pool that streams bundles from disk, recomputes metrics in native code, and writes the updated bundles back, all without crossing the Python boundary per row. Parent process memory should stay flat, and throughput should improve substantially (target: ≥ 10× wall-clock, per the benchmark task below).

Tasks

  • Native worker in Rust using rayon for parallelism. Bound concurrency by workers parameter (mirroring the current Python signature).
  • Each worker: discover bundle → read (Phase 3) → recompute metrics (Phase 2) → write back → update progress channel.
  • PyO3 wrapper recalculate_backtests_in_directory_native(src_dir, dst_dir, *, risk_free_rate, metrics, workers, show_progress, include_ohlcv, max_tasks_per_child, update_index) that mirrors the Python signature exactly.
  • Progress callback: Rust → Python via pyo3::Python::with_gil only for tqdm updates (cheap, infrequent).
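  • Stop honoring max_tasks_per_child in the native path if it is no longer needed (the Rust pool runs threads, so there are no child processes to recycle and no fork-bomb risk); document the deprecation but keep the parameter to preserve the public signature.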
  • Index rewrite: keep update_index=True parity. Either reuse the existing Python index.parquet writer or port it.
  • End-to-end parity test: run native and pure-Python paths on the same tests/resources/backtest_databases/ directory, assert summaries are equal.
  • Benchmark on a directory of 1,000 bundles: target ≥ 10× wall-clock improvement and flat parent RSS.
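
A minimal sketch of the comparison the end-to-end parity test could use. It assumes each path returns a summary made of nested dicts, lists, and numbers, which the issue does not specify. The helper name summaries_match and the float tolerances are hypothetical: the task asks for equality, and a small tolerance is only an assumption for the case where the Rust and Python kernels differ in the last few bits.

```python
import math


def summaries_match(expected, actual, rel_tol=1e-9, abs_tol=1e-12):
    """Recursively compare two summaries, tolerating tiny float differences.

    Hypothetical helper: the summary shape (nested dicts/lists of scalars)
    and the tolerances are assumptions, not taken from the framework.
    """
    if isinstance(expected, dict) and isinstance(actual, dict):
        # Same key set, and every value matches recursively.
        return expected.keys() == actual.keys() and all(
            summaries_match(expected[k], actual[k], rel_tol, abs_tol)
            for k in expected
        )
    if isinstance(expected, (list, tuple)) and isinstance(actual, (list, tuple)):
        # Same length, and elements match position by position.
        return len(expected) == len(actual) and all(
            summaries_match(e, a, rel_tol, abs_tol)
            for e, a in zip(expected, actual)
        )
    if isinstance(expected, float) or isinstance(actual, float):
        # A float on one side requires a number on the other side.
        if not (
            isinstance(expected, (int, float)) and isinstance(actual, (int, float))
        ):
            return False
        # Treat NaN as equal to NaN so undefined metrics do not fail parity.
        if math.isnan(expected) and math.isnan(actual):
            return True
        return math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol)
    # Everything else (str, int, None, bool) must be exactly equal.
    return expected == actual
```

In a test, both paths would run on separate copies of tests/resources/backtest_databases/ and the results would be checked with `assert summaries_match(python_summary, native_summary)`. Setting both tolerances to 0 turns this into a strict equality check that still treats NaN as equal to NaN.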

Acceptance criteria

  • recalculate_backtests_in_directory(...) in investing_algorithm_framework/services/metrics/generate.py dispatches to the native implementation when IAF_NATIVE is enabled and the native module is importable; otherwise falls back to the current Python implementation.
  • Public Python signature unchanged.
  • Parity test green; benchmark report committed.
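
The dispatch rule in the first criterion could look roughly like the sketch below. Several names are assumptions: the module path investing_algorithm_framework._native is hypothetical (the issue does not say where the PyO3 extension lives), the set of truthy values for IAF_NATIVE is a guess at the convention, and pick_implementation is an illustrative helper rather than an existing function.

```python
import importlib
import os

# Hypothetical location of the PyO3 extension; not specified in the issue.
NATIVE_MODULE = "investing_algorithm_framework._native"
# Assumed convention for what counts as "enabled".
_TRUTHY = {"1", "true", "yes", "on"}


def native_enabled(environ=None):
    """Return True when IAF_NATIVE is set to a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get("IAF_NATIVE", "").strip().lower() in _TRUTHY


def load_native(module_name=NATIVE_MODULE):
    """Return the native entry point, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # A module that imports but lacks the function also counts as missing.
    return getattr(module, "recalculate_backtests_in_directory_native", None)


def pick_implementation(python_impl, environ=None, module_name=NATIVE_MODULE):
    """Choose the native path when enabled and importable, else the fallback."""
    if native_enabled(environ):
        native_impl = load_native(module_name)
        if native_impl is not None:
            return native_impl
    return python_impl
```

Under these assumptions, recalculate_backtests_in_directory(...) would keep its current signature and forward its arguments unchanged to whatever pick_implementation returns, so callers see no difference beyond speed and memory use.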

Dependencies

  • Phase 2 (metrics kernel)
  • Phase 3 (bundle I/O)

Part of the Epic: hybrid Python + Rust acceleration.

Metadata

Assignees

No one assigned

    Labels

    performance (Performance / memory optimization), release:backlog (Not yet targeted), rust (Rust port / native acceleration), type:feature (New capability)
