Epic: tiered backtest storage (rewrite the storage layer) #540

@MDUYN

Epic: Tiered backtest storage (rewrite the storage layer)

Epic. Replace the single-file .iafbt storage primitive with a tiered backend (relational index + columnar Parquet bulk + content-addressed chunks). Demote .iafbt to a deterministic export format. Keep the OSS path local-first and self-contained.

Full design: docs/design/tiered-backtest-storage.md
Companion specs: docs/design/bundle-format-v2.md, docs/design/ohlcv-dedup-protocol.md

Why now

Empirical measurements on a real production-shape archive (12,500 bundles ≈ 64 GB, 10 May 2026) showed that per-file format work has hit its ceiling:

| Configuration | Per bundle | Total | Notes |
|---|---|---|---|
| v1, zstd 7 (pre-v8.9) | 569 KB | 64.0 GB | baseline |
| v2 + zstd 19 (v8.9, shipped) | 489 KB | ~55 GB | per-file ceiling |
| Tiered store + content-addressed dedup | n/a (decomposed) | < 20 GB projected | this epic |

Two structural problems remain after v8.9:

  1. Per-file compression has hit its ceiling. zstd at level 22 saturates at the same size as level 19, so there is no remaining headroom within a single .iafbt.
  2. The .iafbt is the wrong primitive for two of the three real workloads — listing/ranking and cross-run analytics. Decoding 12,500 zstd payloads to read 50 scalar metrics each is a multi-minute loop instead of a 50 ms SQL query.

Cross-bundle redundancy (strategy params, symbol metadata, OHLCV slices, recurring trade patterns) is the dominant unexploited source of size, and zstd cannot see across files. The fix lives one layer up.
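
As a rough illustration of why dedup has to happen above the file level, here is a minimal sketch of a content-addressed chunk store. The `ChunkStore` class, its two-character fan-out directory layout, and the sample payload are assumptions made for this example, not the layout specified in the design doc. Storing the same OHLCV slice from 100 runs costs one blob on disk, which per-file zstd cannot achieve.

```python
import hashlib
import tempfile
from pathlib import Path


class ChunkStore:
    """Immutable blobs keyed by the SHA-256 of their content (hypothetical layout)."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, payload: bytes) -> str:
        digest = hashlib.sha256(payload).hexdigest()
        path = self.root / digest[:2] / digest
        if not path.exists():  # identical content is written only once
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(payload)
        return digest

    def get(self, digest: str) -> bytes:
        return (self.root / digest[:2] / digest).read_bytes()

    def stored_bytes(self) -> int:
        # Total size of all blobs currently on disk.
        return sum(p.stat().st_size for p in self.root.rglob("*") if p.is_file())


def demo() -> tuple[int, int]:
    """Store one OHLCV slice on behalf of 100 runs; return (logical, physical) bytes."""
    ohlcv_slice = b"BTC-EUR,2h,2023-01-01..2023-12-31" * 1000  # made-up payload
    with tempfile.TemporaryDirectory() as tmp:
        store = ChunkStore(Path(tmp))
        refs = [store.put(ohlcv_slice) for _ in range(100)]
        assert len(set(refs)) == 1  # every run points at the same chunk
        assert store.get(refs[0]) == ohlcv_slice
        return len(ohlcv_slice) * 100, store.stored_bytes()
```

Calling `demo()` returns the logical size (what 100 independent files would hold) next to the physical size actually written; in this contrived case the second is exactly one hundredth of the first.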

Architecture (target)

┌─────────────────────────────────────────────────────────────────┐
│ Tier 1 — Index   (SQLite locally, Postgres remotely)            │
│   One row per backtest run. Scalar metrics + provenance + refs. │
│   Indexed for ranking, filtering, sweep navigation.             │
├─────────────────────────────────────────────────────────────────┤
│ Tier 2 — Columnar bulk   (Parquet on local disk or object store)│
│   Per-project Parquet datasets, partitioned by run_id:          │
│     portfolio_snapshots/   trades/   orders/   metric_series/   │
├─────────────────────────────────────────────────────────────────┤
│ Tier 3 — Content-addressed chunks                               │
│   SHA-256-keyed, immutable:                                     │
│     ohlcv/  code/  params/  symbols/  metric-series-blobs/      │
└─────────────────────────────────────────────────────────────────┘

.iafbt v2 stays — but as backtest.export("x.iafbt") / Backtest.import_("x.iafbt"), deterministic and round-trippable through any store. It is no longer the storage primitive.

Phases

This epic ships in three phases, each independently mergeable. Each phase is purely additive until phase 3, which deprecates (does not remove) the directory-of-.iafbt default.

Phase 1 — BacktestSummary DTO + scalar-summary read path

Goal: make "list / rank by Sharpe over many bundles" cheap on the existing on-disk shape, without writing anything new.

  • Define BacktestSummary dataclass — frozen schema for the ~50 scalar metrics + provenance + config columns from design doc §3.1.
  • Backtest.scalar_summary() -> BacktestSummary — readable from a v2 bundle without decoding Parquet metric blobs (already supported by summary_only=True; this wraps it in a typed return).
  • Tests: round-trip from v2 bundle → BacktestSummary → SQL row payload.
  • Document the BacktestSummary schema as the authoritative Tier 1 row contract.
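
A minimal sketch of what the typed summary could look like, assuming a handful of illustrative fields. The field names below are hypothetical placeholders; the authoritative ~50-column schema lives in design doc §3.1. The `to_row` / `from_row` helpers are likewise assumptions, included only to show the "summary → SQL row payload" round-trip that the tests would exercise.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BacktestSummary:
    # Hypothetical subset of columns; see design doc §3.1 for the real schema.
    bundle_id: str
    strategy_name: str
    sharpe: float
    sortino: float
    max_drawdown: float
    total_trades: int

    def to_row(self) -> tuple:
        """Flatten to a tuple in declaration order, ready for a SQL INSERT."""
        return tuple(getattr(self, f.name) for f in fields(self))

    @classmethod
    def from_row(cls, row: tuple) -> "BacktestSummary":
        """Rebuild a summary from a row produced by to_row()."""
        return cls(*row)
```

Because the dataclass is frozen, a summary cannot be mutated after it is read from a bundle, which keeps the Tier 1 row contract stable between the reader and the index writer.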

Risk: low. Pure additive read path. No write changes.

Phase 2 — iaf index CLI + SQLite index

Goal: turn a folder of .iafbt files into a queryable archive without changing how they're written.

  • iaf index <dir> walks .iafbt files (recursive), populates <dir>/index.sqlite with one row per run from scalar_summary().
  • Idempotent (re-runs only ingest changed bundles, keyed by bundle_id + mtime).
  • iaf list <index.sqlite> --sort sharpe --limit 20
  • iaf rank <index.sqlite> --by sortino --filter "tag LIKE 'sweep_%'"
  • Tolerate partial corruption (skip + log, never crash).
  • Acceptance: iaf index builds a SQLite from examples/batch_one/ in < 5 s; iaf list --sort sharpe --limit 20 returns in < 100 ms over a 12,500-row index.
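
The idempotent ingest and the ranked listing could be sketched with the standard-library `sqlite3` module as follows. The table name, column set, and function names are assumptions for illustration; in the real command the row would come from `scalar_summary()` rather than from loose arguments.

```python
import sqlite3

# Hypothetical schema: a tiny subset of the Tier 1 columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    bundle_id TEXT PRIMARY KEY,
    mtime     REAL NOT NULL,
    tag       TEXT,
    sharpe    REAL,
    sortino   REAL
);
CREATE INDEX IF NOT EXISTS runs_sharpe ON runs (sharpe);
"""


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def ingest(conn: sqlite3.Connection, bundle_id: str, mtime: float,
           tag: str, sharpe: float, sortino: float) -> bool:
    """Insert or refresh one run; return False when the bundle is unchanged."""
    row = conn.execute(
        "SELECT mtime FROM runs WHERE bundle_id = ?", (bundle_id,)
    ).fetchone()
    if row is not None and row[0] == mtime:
        return False  # same bundle_id and mtime: skip re-ingest
    conn.execute(
        "INSERT OR REPLACE INTO runs VALUES (?, ?, ?, ?, ?)",
        (bundle_id, mtime, tag, sharpe, sortino),
    )
    conn.commit()
    return True


def top_by_sharpe(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Rough equivalent of `iaf list --sort sharpe --limit N`."""
    return conn.execute(
        "SELECT bundle_id, sharpe FROM runs ORDER BY sharpe DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Since the index is a derived view, dropping the file and re-running the ingest over the same bundles should reproduce the same rows.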

Risk: low. SQLite is a derived view — delete and rebuild is always safe.

Phase 3 — BacktestStore abstraction + LocalTieredStore

Goal: introduce the store interface so the framework can write into multiple backends; ship the local tiered backend; demote .iafbt to export format.

  • BacktestStore Protocol with: put, get, list, delete, export, import_.
  • LocalDirStore — wraps current save_bundle/open_bundle. Default for backwards compatibility.
  • LocalTieredStore:
    • Tier 1: index.sqlite (schema = BacktestSummary from phase 1)
    • Tier 2: per-project Parquet datasets per design doc §3.2
    • Tier 3: chunks/ directory per design doc §3.3
  • Backtest service constructors accept a store= kwarg, default LocalDirStore.
  • iaf migrate-store --from local-dir --to local-tiered <src> <dst> one-shot migrator.
  • backtest.export("x.iafbt") reassembles a v2 bundle deterministically from any store.
  • Backtest.import_("x.iafbt") decomposes into the configured store (idempotent on bundle_id).
  • All existing backtest tests pass against both stores via a parameterised fixture.
  • Acceptance: on a 12,500-bundle workload, LocalTieredStore.list(sort='sharpe', limit=20) returns in < 100 ms; the round-trip LocalDirStore → export → LocalTieredStore.import_ → export is byte-identical (modulo writer timestamp).
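
One possible shape for the store interface, sketched with `typing.Protocol`. Only the six method names come from the list above; the parameter lists, the opaque `bytes` payload, the `InMemoryStore` stand-in, and the `migrate` helper are all assumptions made so the example runs on its own.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class BacktestStore(Protocol):
    # Method names from the epic; signatures are hypothetical.
    def put(self, bundle_id: str, payload: bytes) -> None: ...
    def get(self, bundle_id: str) -> bytes: ...
    def list(self) -> List[str]: ...
    def delete(self, bundle_id: str) -> None: ...
    def export(self, bundle_id: str) -> bytes: ...
    def import_(self, bundle_id: str, data: bytes) -> None: ...


class InMemoryStore:
    """Toy backend used only to exercise the interface."""

    def __init__(self) -> None:
        self._runs: Dict[str, bytes] = {}

    def put(self, bundle_id: str, payload: bytes) -> None:
        self._runs[bundle_id] = payload

    def get(self, bundle_id: str) -> bytes:
        return self._runs[bundle_id]

    def list(self) -> List[str]:
        return sorted(self._runs)

    def delete(self, bundle_id: str) -> None:
        self._runs.pop(bundle_id, None)

    def export(self, bundle_id: str) -> bytes:
        return self._runs[bundle_id]

    def import_(self, bundle_id: str, data: bytes) -> None:
        # Idempotent on bundle_id: a second import of the same id is a no-op.
        self._runs.setdefault(bundle_id, data)


def migrate(src: BacktestStore, dst: BacktestStore) -> int:
    """Copy every run from src to dst via export/import_; return the count."""
    moved = 0
    for bundle_id in src.list():
        dst.import_(bundle_id, src.export(bundle_id))
        moved += 1
    return moved
```

Structural typing means an existing backend does not have to inherit from anything to qualify as a store, which suits a kwarg like `store=` that should accept both the directory-based and the tiered implementation.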

Risk: medium. Touches every backtest service constructor. Mitigated by keeping LocalDirStore the default for at least one minor release with a deprecation warning when the old kwargs are used.

Wire format note

The .iafbt v2 produced by phase 0 (already shipped in v8.9) is the only wire format the framework reads or writes for the archive-as-file use case. v1 is readable indefinitely; the v8.9 writer never emits v1.

A remote store implementation (Finterion-side, closed-source, not part of this epic) will negotiate-then-upload chunks via the protocol in docs/design/ohlcv-dedup-protocol.md generalized to all chunk types. The BacktestStore interface from phase 3 is the OSS-side contract that makes that possible.

Non-goals

  • Per-bundle Parquet for everything (measured net-negative on real data — see design doc §9)
  • Custom binary column format (Parquet is solved; leverage it)
  • Lossy snapshot/trade compression (user's data, hands off)
  • Schema-on-read JSON anywhere (the original (value, ISO-string) mistake)
  • Removing LocalDirStore (stays as default for at least one minor cycle after phase 3)
  • Cross-tenant or cross-project dedup at the OSS layer (a remote store's concern)
  • Remote / cloud store implementation (closed-source, separate)

Tracking

Replaces (closed as superseded):

Why one epic, not three issues: the three phases share schemas, naming, and tests, and shipping them as a single coherent epic makes the deprecation path for .iafbt-as-storage one decision rather than three. Each phase is still independently mergeable as separate PRs against this epic.

Headline

Today's design treats a backtest as a file. The future design treats a backtest as a row that points to chunks, with the file as one of several possible views.

That single shift turns the 64 GB problem into the 20 GB problem, makes "list 12,500 backtests sorted by Sharpe" a 50 ms query instead of a 30-minute decode loop, and unlocks DuckDB/Polars analytics over the entire archive without writing any new code.
