DreamACT is a lightweight research prototype for offline robot control. It trains an ACT-style behavior cloning policy, trains a compact scalar world model, and uses that world model to rerank candidate action chunks at inference time.
This repository is intentionally small and runnable on CPU. The default environment is a deterministic 2D PushT-lite simulator that can generate offline demonstrations without external robotics dependencies. The code is structured so the toy data loader can later be replaced by LeRobot/PushT or Robomimic.
- Offline demonstration generation for a PushT-lite continuous-control task.
- ACT-lite / behavior cloning policy that predicts action chunks.
- Scalar progress/risk world model.
- Gaussian candidate action generation.
- World-model reranking.
- Baseline vs reranked rollout evaluation.
- GIF rollout export.
- Smoke tests for dataset, model, and reranking flow.
python -m venv .venv
.venv\Scripts\activate
pip install -e .[dev]

Run the full CPU smoke pipeline:
python src/train_policy.py --quick --output experiments/policy.pt
python src/train_world_model.py --quick --output experiments/world_model.pt
python src/eval_policy.py --policy experiments/policy.pt --episodes 20
python src/eval_rerank.py --policy experiments/policy.pt --world-model experiments/world_model.pt --episodes 20 --make-gif
python -m pytest

The reranking script writes a JSON summary to results/tables/rerank_metrics.json and, when --make-gif is used, a rollout GIF to results/videos/.
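
To inspect the summary after a run, a small helper along these lines may be convenient. This is only a sketch: the JSON schema is not described in this README, so the code makes no assumption about field names and simply prints whatever the file contains. The helper name `load_summary` is hypothetical and not part of the package.

```python
import json
from pathlib import Path


def load_summary(path="results/tables/rerank_metrics.json"):
    """Return the parsed summary, or None if the file does not exist yet."""
    summary_path = Path(path)
    if not summary_path.exists():
        return None
    with summary_path.open(encoding="utf-8") as f:
        return json.load(f)


summary = load_summary()
if summary is None:
    print("No summary found; run src/eval_rerank.py first.")
else:
    # The schema is not documented here, so print the stored content as-is.
    print(json.dumps(summary, indent=2, sort_keys=True))
```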
At each control step:
- The policy predicts a short action chunk.
- The candidate generator creates noisy variants of that chunk.
- The world model predicts progress and risk for each candidate.
- DreamACT executes the first action from the highest-scoring chunk.
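
The candidate-generation step above can be illustrated with a minimal, dependency-free sketch. It is not the package's implementation: the function name `make_candidates`, its parameters, the 2D `(dx, dy)` action layout, and the choice to keep the unperturbed chunk as the first candidate are all assumptions made for illustration.

```python
import random


def make_candidates(chunk, num_candidates=8, noise_std=0.05, seed=None):
    """Return the original chunk plus Gaussian-perturbed copies of it.

    chunk: list of (dx, dy) actions (assumed layout).
    All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    # Assumption: candidate 0 is the unmodified policy output, so reranking
    # can always fall back to the baseline chunk.
    candidates = [list(chunk)]
    for _ in range(num_candidates - 1):
        noisy = [
            tuple(value + rng.gauss(0.0, noise_std) for value in action)
            for action in chunk
        ]
        candidates.append(noisy)
    return candidates


chunk = [(0.1, 0.0), (0.1, 0.05), (0.0, 0.1)]
candidates = make_candidates(chunk, num_candidates=4, noise_std=0.02, seed=0)
print(len(candidates), "candidates of", len(candidates[0]), "actions each")
```

Keeping the baseline chunk in the candidate set is one possible design choice: if it is always present, the reranked policy can score no worse than the baseline under the world model's own predictions, although real rollout performance still depends on how accurate the world model is.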
The score is:
score = predicted_progress - risk_weight * predicted_risk
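
As a worked example of this formula, the sketch below scores three candidates and picks the best one. The prediction values are made up for illustration, and the names `rerank_score` and `select_best` are hypothetical rather than taken from the package.

```python
def rerank_score(predicted_progress, predicted_risk, risk_weight=1.0):
    """score = predicted_progress - risk_weight * predicted_risk"""
    return predicted_progress - risk_weight * predicted_risk


def select_best(predictions, risk_weight=1.0):
    """predictions: list of (progress, risk) pairs, one per candidate chunk.

    Returns the index of the highest-scoring candidate and all scores.
    """
    scores = [rerank_score(p, r, risk_weight) for p, r in predictions]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


# Illustrative (made-up) world-model outputs for three candidates.
predictions = [(0.60, 0.10), (0.70, 0.40), (0.55, 0.00)]

best, scores = select_best(predictions, risk_weight=1.0)
print("best candidate with risk_weight=1.0:", best)

best_no_risk, _ = select_best(predictions, risk_weight=0.0)
print("best candidate with risk_weight=0.0:", best_no_risk)
```

With `risk_weight=1.0` the third candidate (index 2) wins because it carries no predicted risk. With `risk_weight=0.0` the score reduces to predicted progress alone, so the second candidate (index 1) wins instead.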
- configs/: YAML defaults for policy, world model, and reranking
- src/dreamact/: Reusable package code
- src/train_policy.py: Train behavior cloning policy
- src/train_world_model.py: Train scalar progress/risk world model
- src/eval_policy.py: Evaluate baseline policy
- src/eval_rerank.py: Evaluate policy + world model reranking
- tests/: CPU smoke tests
- report/: Short project report
This is not a full VLA model and does not demonstrate broad language generalization. The default task is a CPU-friendly PushT-lite proxy. The main research value is the end-to-end reranking interface and measurement scaffold.