Add normalized dataset store and offline Record3D support #88

Draft
JanDuchscherer104 wants to merge 15 commits into main from codex/record3d-only

JanDuchscherer104 commented Jun 8, 2026

Owner

Summary

This branch replaces the earlier Record3D-only extraction with a clean, reviewable replay. It adds a normalized dataset store for dataset-backed sources, introduces offline Record3D .r3d archives as a dataset package, and wires the minimal CLI, source, pipeline, and app surfaces needed to inspect and replay those normalized datasets. ADVIO and TUM RGB-D keep their raw local source defaults, and offline Record3D support stays separate from live Record3D transport. The branch intentionally excludes W&B, sweeps/run orchestration, run-bundle import/export, pipeline sink/Ray runtime drift, utils/geometry.py, uv.lock, and committed graphify-out churn.

Focused verification completed:

  • PYTHONDONTWRITEBYTECODE=1 prml-vslam-worktree-env run pytest tests/test_record3d_dataset.py tests/test_tum_rgbd.py tests/test_advio.py tests/test_advio_page.py tests/test_app.py tests/test_cli.py tests/test_main.py tests/test_source_runtime.py
  • make ci
  • make graphify-rebuild
  • git diff --check origin/feature/extend-trajectory-evaluation..HEAD
  • Branch-scope forbidden path/content scans for uv.lock, graphify-out, src/prml_vslam/utils/geometry.py, W&B, sweeps/run orchestration, run-bundle import/export, pipeline sinks/Ray runtime, and unrelated trajectory-eval service renames
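
The branch-scope path scan in the last bullet can be pictured roughly as follows. This is a minimal sketch, not the script used for this PR: the function name and the pattern tuple are hypothetical, and only the three concrete paths named in the summary are encoded as patterns.

```python
from fnmatch import fnmatch

# Hypothetical pattern list, taken from the paths named in the PR summary.
FORBIDDEN_PATTERNS = (
    "uv.lock",
    "graphify-out/*",
    "src/prml_vslam/utils/geometry.py",
)


def find_forbidden_paths(changed_paths, patterns=FORBIDDEN_PATTERNS):
    """Return the changed paths that match any forbidden pattern, in input order."""
    return [
        path
        for path in changed_paths
        if any(fnmatch(path, pattern) for pattern in patterns)
    ]
```

The input list could come from `git diff --name-only origin/feature/extend-trajectory-evaluation..HEAD`; an empty result means no forbidden path was touched.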

Work Packages

| WP | Scope | Primary surfaces | Status |
|---|---|---|---|
| WP1 | Normalized dataset store | sources/datasets/normalized_store.py, normalized_tables.py, batch_normalization.py, dataset_query.py | resolved |
| WP2 | Offline Record3D dataset package | sources/datasets/record3d/*, tests/test_record3d_dataset.py | resolved |
| WP3 | Source, pipeline, and CLI integration | sources/config.py, pipeline/config.py, main.py, eval/stage_cloud_alignment/config.py | resolved |
| WP4 | Streamlit dataset UI | app/pages/datasets.py, app/models.py, app/services.py, app/bootstrap.py | resolved |
| WP5 | Compatibility, docs, and final gates | tests/test_advio.py, tests/test_tum_rgbd.py, docs/README surfaces | resolved |

WP1 — Normalized Dataset Store

Scope

  • Adds a dataset-owned normalized store under .data/<dataset>/.normalized/<sequence>/<profile-key>/ with manifests, observation tables, depth/stat sidecars, benchmark inputs, and queryable metadata.
  • Adds batch normalization and dataset query services so CLI and UI callers can inspect existing normalized entries without recomputing them in Streamlit.
  • Preserves external benchmark observation sources and source-frame provenance while supporting frame selection at replay/read time.
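
To illustrate the last bullet, here is a minimal sketch of frame selection at read time that keeps source-frame provenance intact. The `ObservationRow` type, its fields, and `select_frames` are hypothetical names for illustration; the actual table schema in normalized_tables.py may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservationRow:
    """Hypothetical row: one normalized observation with its provenance."""

    source_frame_index: int  # index of the frame in the original recording
    timestamp_ns: int


def select_frames(rows, stride=1, max_frames=None):
    """Subsample stored rows at read time without renumbering them."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    selected = list(rows)[::stride]
    if max_frames is not None:
        selected = selected[:max_frames]
    return selected
```

Because selection only filters rows, every returned row still carries the index it had in the source recording, so the stored entry does not have to be rewritten for each frame budget.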

Key files

  • src/prml_vslam/sources/datasets/normalized_store.py
  • src/prml_vslam/sources/datasets/normalized_tables.py
  • src/prml_vslam/sources/datasets/batch_normalization.py
  • src/prml_vslam/sources/dataset_query.py
  • .configs/datasets/normalize-benchmark.toml

Initial success criteria resolution

  • normalized dataset store core is replayed without unrelated orchestration: resolved
  • store entries are queryable and tamper-checked: resolved
  • ADVIO/TUM timestamp and provenance behavior remains compatible: resolved
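
One common way to make store entries tamper-checked is to record a digest per file in the manifest and re-verify it on read. The sketch below assumes a manifest that maps relative file names to SHA-256 hex digests; that manifest shape and both function names are assumptions, not the format used in normalized_store.py.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large depth sidecars need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_tampered_files(entry_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest entries whose file is missing or whose digest differs."""
    tampered = []
    for relative_name, expected in sorted(manifest.items()):
        candidate = entry_dir / relative_name
        if not candidate.is_file() or sha256_file(candidate) != expected:
            tampered.append(relative_name)
    return tampered
```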

API/path mapping

  • New normalized-store path: .data/<dataset>/.normalized/<sequence>/<profile-key>/.
  • New dataset CLI surfaces include normalized summary, stats, and batch normalization commands under prml-vslam dataset ....
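
The store layout above can be expressed as a small path helper. This is a sketch only: `normalized_entry_dir` is a hypothetical name, and the validation rules are an assumption added to keep path components from escaping the store.

```python
from pathlib import Path


def normalized_entry_dir(data_root, dataset, sequence, profile_key):
    """Build <data_root>/<dataset>/.normalized/<sequence>/<profile-key>."""
    for part in (dataset, sequence, profile_key):
        # Reject empty components and anything that could change directories.
        if not part or "/" in part or part in {".", ".."}:
            raise ValueError(f"invalid path component: {part!r}")
    return Path(data_root) / dataset / ".normalized" / sequence / profile_key
```

For example, `normalized_entry_dir(".data", "advio", "advio-01", "abc123")` yields `.data/advio/.normalized/advio-01/abc123`, where the sequence and profile key are made-up values.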

WP2 — Offline Record3D Dataset Package

Scope

  • Adds an offline Record3D dataset package for .r3d archive cataloging, local archive discovery, checksum-aware downloads, metadata parsing, RGB/depth/confidence decoding, ARKit trajectory extraction, and reference-cloud materialization.
  • Keeps offline archive handling under sources/datasets/record3d/; live Record3D capture remains in the existing live transport source package.
  • Makes pyliblzfse optional for synthetic tests while preserving an explicit runtime error for real .r3d depth/confidence decoding when the package is absent.
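
Local archive discovery, mentioned in the first bullet, might look like the sketch below. Deriving the sequence id from the file stem and rejecting duplicate stems are assumptions made for illustration; the cataloging rules in record3d_service.py may be different.

```python
from __future__ import annotations

from pathlib import Path


def discover_r3d_archives(root: Path) -> dict[str, Path]:
    """Map a sequence id (here: the file stem) to each .r3d archive under root."""
    archives: dict[str, Path] = {}
    for path in sorted(root.rglob("*.r3d")):
        if not path.is_file():
            continue
        if path.stem in archives:
            raise ValueError(
                f"duplicate sequence id {path.stem!r}: {archives[path.stem]} and {path}"
            )
        archives[path.stem] = path
    return archives
```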

Key files

  • src/prml_vslam/sources/datasets/record3d/record3d_loading.py
  • src/prml_vslam/sources/datasets/record3d/record3d_sequence.py
  • src/prml_vslam/sources/datasets/record3d/record3d_service.py
  • src/prml_vslam/sources/datasets/record3d/record3d_download.py
  • tests/test_record3d_dataset.py

Initial success criteria resolution

  • Record3D dataset package is replayed from salvage patch: resolved
  • offline Record3D stays distinct from live Record3D transport: resolved
  • missing real LZFSE decoder is reported explicitly: resolved
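
The optional-decoder behavior can be sketched with a lazy import that fails loudly only when real decoding is requested. This assumes the pyliblzfse distribution installs an importable module named `liblzfse` with a `decompress` function; the helper name and the `module_name` parameter are hypothetical and exist mainly to make the failure path testable.

```python
import importlib


def decompress_lzfse(payload: bytes, *, module_name: str = "liblzfse") -> bytes:
    """Decompress an LZFSE buffer, importing the decoder only when needed."""
    try:
        decoder = importlib.import_module(module_name)
    except ImportError as exc:
        # Explicit runtime error instead of a bare ImportError at package import.
        raise RuntimeError(
            "Decoding real .r3d depth/confidence data requires the optional "
            "pyliblzfse package, which is not installed."
        ) from exc
    return decoder.decompress(payload)
```

Because the import happens inside the function, tests that use synthetic, uncompressed data never touch the decoder and run without the package installed.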

API/path mapping

  • New source discriminator: source_id = "record3d_dataset".
  • New dataset id: record3d_dataset.
  • New reference-cloud source: record3d_lidar.
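
A discriminator such as `source_id` lets a factory pick the right config type from raw input. The sketch below uses stand-in dataclasses: only the value `"record3d_dataset"` comes from this PR, while the class names, the `archive_path` and `device_address` fields, and the live discriminator value `"record3d_live_sketch"` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OfflineRecord3DSketch:
    """Stand-in for the offline archive source config."""

    archive_path: str  # hypothetical field
    source_id: str = "record3d_dataset"


@dataclass(frozen=True)
class LiveRecord3DSketch:
    """Stand-in for the live transport source config."""

    device_address: str  # hypothetical field
    source_id: str = "record3d_live_sketch"  # assumed value, not from the PR


def parse_source_config(raw: dict):
    """Dispatch on the source_id discriminator and build the matching config."""
    source_id = raw.get("source_id")
    if source_id == "record3d_dataset":
        return OfflineRecord3DSketch(archive_path=raw["archive_path"])
    if source_id == "record3d_live_sketch":
        return LiveRecord3DSketch(device_address=raw["device_address"])
    raise ValueError(f"unknown source_id: {source_id!r}")
```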

WP3 — Source, Pipeline, and CLI Integration

Scope

  • Wires normalized dataset sources into the source factory, source protocols, pipeline planning, source-stage materialization, and trajectory/cloud-alignment availability without moving ownership into pipeline sinks or Ray runtime code.
  • Keeps ADVIO/TUM setup_target() defaults raw-local and makes normalized-store replay explicit through normalized dataset workflows.
  • Adds CLI commands for Record3D download/inspection and normalized dataset summary/stats/batch operations.

Key files

  • src/prml_vslam/sources/config.py
  • src/prml_vslam/sources/datasets/sources.py
  • src/prml_vslam/pipeline/config.py
  • src/prml_vslam/main.py
  • src/prml_vslam/eval/stage_cloud_alignment/config.py

Initial success criteria resolution

  • minimal source/pipeline integration only: resolved
  • ADVIO/TUM behavior-compatible: resolved
  • no W&B, sweeps, run orchestration, run-bundle, sink/Ray, or geometry drift: resolved

Symbol mapping

  • OfflineSequenceSource.label is now a read-only property contract, matching dataset and streaming wrappers.
  • Record3DDatasetSourceConfig is the offline archive source config; Record3DSourceConfig remains the live transport source config.
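
A read-only property contract in a protocol can be sketched as follows. `HasLabel`, `DatasetSequenceSketch`, and `describe` are hypothetical names; the real `OfflineSequenceSource` protocol has more members than shown here.

```python
from typing import Protocol


class HasLabel(Protocol):
    """Protocol that only promises a readable label, not a writable one."""

    @property
    def label(self) -> str: ...


class DatasetSequenceSketch:
    """Hypothetical wrapper that computes its label instead of storing it."""

    def __init__(self, name: str) -> None:
        self._name = name

    @property
    def label(self) -> str:
        return f"dataset:{self._name}"


def describe(source: HasLabel) -> str:
    """Accept any object whose label can be read."""
    return source.label
```

Declaring `label` as a property rather than a plain `label: str` attribute matters for static checkers: a plain protocol attribute is treated as settable, so wrappers that expose `label` through a getter only would not satisfy it.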

WP4 — Streamlit Dataset UI

Scope

  • Adds Record3D to the datasets page with dataset-scoped download state, local archive rows, sequence details, loop preview controls, and normalized-store status/characterization panels.
  • Keeps Streamlit read-only over normalized-store query results; normalization remains a CLI/data operation rather than a page-side compute path.
  • Preserves ADVIO/TUM download preset and modality controls and restores the persisted state fields used by the existing metrics page.

Key files

  • src/prml_vslam/app/pages/datasets.py
  • src/prml_vslam/app/models.py
  • src/prml_vslam/app/services.py
  • src/prml_vslam/app/bootstrap.py
  • src/prml_vslam/app/advio_controller.py

Initial success criteria resolution

  • Streamlit dataset UI supports Record3D and normalized store: resolved
  • ADVIO/TUM form behavior stays compatible: resolved
  • metrics page state remains compatible: resolved

WP5 — Compatibility, Docs, and Final Gates

Scope

  • Adds regression coverage for raw-local ADVIO/TUM source defaults, Record3D normalized-store behavior, protocol property compatibility, metrics state compatibility, dataset CLI surfaces, and app dataset UI controls.
  • Updates source/dataset/app docs to describe the normalized-store boundary and offline Record3D dataset ownership.
  • Rebuilds Graphify during verification and intentionally leaves generated graphify-out diffs uncommitted.

Key files

  • tests/test_advio.py
  • tests/test_tum_rgbd.py
  • tests/test_source_runtime.py
  • tests/test_app.py
  • src/prml_vslam/sources/README.md
  • src/prml_vslam/sources/datasets/README.md

Initial success criteria resolution

  • docs/tests replayed after salvage: resolved
  • final branch scope excludes forbidden surfaces: resolved
  • local final gate is green: resolved

Remaining follow-ups

  • GitHub Actions CI must pass on the pushed PR head before merge readiness is claimed.

JanDuchscherer104 changed the title from "Add offline Record3D archive dataset support" to "Add normalized dataset store and offline Record3D support" on Jun 11, 2026