NguyenNgocMinh30012005/DreamACT

DreamACT

DreamACT is a lightweight research prototype for offline robot control. It trains an ACT-style behavior cloning policy, trains a compact scalar world model, and uses that world model to rerank candidate action chunks at inference time.

This repository is intentionally small and runnable on CPU. The default environment is a deterministic 2D PushT-lite simulator that can generate offline demonstrations without external robotics dependencies. The code is structured so the toy data loader can later be replaced by LeRobot/PushT or Robomimic.

What Is Implemented

  • Offline demonstration generation for a PushT-lite continuous-control task.
  • ACT-lite / behavior cloning policy that predicts action chunks.
  • Scalar progress/risk world model.
  • Gaussian candidate action generation.
  • World-model reranking.
  • Baseline vs reranked rollout evaluation.
  • GIF rollout export.
  • Smoke tests for dataset, model, and reranking flow.
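To make "predicts action chunks" concrete, here is a minimal sketch of how an offline trajectory could be sliced into (observation, action chunk) training pairs for behavior cloning. The function name `make_chunk_pairs`, the `chunk_len` parameter, and the toy data are illustrative assumptions, not names taken from this repository.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def make_chunk_pairs(
    observations: Sequence[Vector],
    actions: Sequence[Vector],
    chunk_len: int,
) -> List[Tuple[Vector, List[Vector]]]:
    """Pair each observation with the next `chunk_len` actions.

    Hypothetical helper: steps too close to the end of the episode to
    fill a whole chunk are dropped rather than padded.
    """
    if len(observations) != len(actions):
        raise ValueError("observations and actions must have equal length")
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    pairs = []
    for t in range(len(actions) - chunk_len + 1):
        pairs.append((observations[t], list(actions[t:t + chunk_len])))
    return pairs


# Toy 4-step trajectory with 2D observations and 2D actions (made-up values).
obs = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [0.3, 0.1]]
acts = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
pairs = make_chunk_pairs(obs, acts, chunk_len=2)
```

Dropping incomplete chunks keeps the sketch short; padding the tail of each episode is an equally plausible choice.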

Quickstart

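Create and activate a virtual environment, then install the package in editable mode with its dev extras. The activation command shown is for Windows; on Linux or macOS, use source .venv/bin/activate instead.

python -m venv .venv
.venv\Scripts\activate
pip install -e .[dev]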

Run the full CPU smoke pipeline:

python src/train_policy.py --quick --output experiments/policy.pt
python src/train_world_model.py --quick --output experiments/world_model.pt
python src/eval_policy.py --policy experiments/policy.pt --episodes 20
python src/eval_rerank.py --policy experiments/policy.pt --world-model experiments/world_model.pt --episodes 20 --make-gif
python -m pytest

The reranking script writes a JSON summary to results/tables/rerank_metrics.json and, when --make-gif is used, a rollout GIF to results/videos/.

Method

At each control step:

  1. The policy predicts a short action chunk.
  2. The candidate generator creates noisy variants of that chunk.
  3. The world model predicts progress and risk for each candidate.
  4. DreamACT executes the first action from the highest-scoring chunk.
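The four steps above can be sketched as a single function. This is an illustration under stated assumptions, not the repository's implementation: `make_candidates`, `rerank_step`, `toy_policy`, and `toy_score` are hypothetical names, the policy and scorer are simple stand-ins for trained models, and keeping the unperturbed chunk as one of the candidates is an assumption.

```python
import random
from typing import Callable, List, Sequence, Tuple

Chunk = List[List[float]]  # chunk_len x action_dim


def make_candidates(base: Chunk, num_candidates: int, noise_std: float,
                    rng: random.Random) -> List[Chunk]:
    """Return the base chunk plus Gaussian-perturbed copies of it."""
    candidates = [[list(a) for a in base]]  # assumption: keep the original
    for _ in range(num_candidates - 1):
        candidates.append(
            [[x + rng.gauss(0.0, noise_std) for x in a] for a in base]
        )
    return candidates


def rerank_step(
    obs: Sequence[float],
    policy: Callable[[Sequence[float]], Chunk],
    score_fn: Callable[[Sequence[float], Chunk], float],
    num_candidates: int = 8,
    noise_std: float = 0.1,
    rng: random.Random = None,
) -> Tuple[List[float], float]:
    """One control step: propose, perturb, score, execute the first action."""
    rng = rng or random.Random(0)
    base = policy(obs)                                             # step 1
    candidates = make_candidates(base, num_candidates, noise_std, rng)  # step 2
    scores = [score_fn(obs, c) for c in candidates]                # step 3
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best][0], scores[best]                       # step 4


def toy_policy(obs: Sequence[float]) -> Chunk:
    """Stand-in policy: always push right for two steps."""
    return [[0.5, 0.0], [0.5, 0.0]]


def toy_score(obs: Sequence[float], chunk: Chunk) -> float:
    """Stand-in scorer: negative squared distance to a goal at (1.0, 0.2)."""
    x, y = obs
    for dx, dy in chunk:
        x, y = x + dx, y + dy
    return -((x - 1.0) ** 2 + (y - 0.2) ** 2)


action, best_score = rerank_step([0.0, 0.0], toy_policy, toy_score,
                                 rng=random.Random(0))
```

Because the unperturbed chunk stays in the candidate set, the selected chunk never scores lower than the policy's own proposal under the scorer being used.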

The score is:

score = predicted_progress - risk_weight * predicted_risk
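A small worked example may help show how risk_weight trades progress against risk. The two candidates and their predicted values below are made-up numbers chosen for illustration, and `rerank_score` and `pick` are hypothetical helper names.

```python
def rerank_score(predicted_progress: float, predicted_risk: float,
                 risk_weight: float) -> float:
    """score = predicted_progress - risk_weight * predicted_risk"""
    return predicted_progress - risk_weight * predicted_risk


# (predicted_progress, predicted_risk) for two hypothetical candidate chunks.
candidates = {
    "cautious": (0.50, 0.10),
    "aggressive": (0.70, 0.60),
}


def pick(risk_weight: float) -> str:
    """Return the name of the highest-scoring candidate."""
    return max(
        candidates,
        key=lambda name: rerank_score(*candidates[name], risk_weight),
    )


low_penalty_choice = pick(0.1)   # aggressive: 0.64 vs cautious: 0.49
high_penalty_choice = pick(1.0)  # cautious: 0.40 vs aggressive: 0.10
```

With these numbers the preferred candidate flips at risk_weight = 0.4, where both scores equal 0.46.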

Repository Layout

configs/                 YAML defaults for policy, world model, reranking
src/dreamact/            Reusable package code
src/train_policy.py      Train behavior cloning policy
src/train_world_model.py Train scalar progress/risk world model
src/eval_policy.py       Evaluate baseline policy
src/eval_rerank.py       Evaluate policy + world model reranking
tests/                   CPU smoke tests
report/                  Short project report

Honest Limitations

This is not a full VLA model and does not demonstrate broad language generalization. The default task is a CPU-friendly PushT-lite proxy. The main research value is the end-to-end reranking interface and measurement scaffold.

About

World-model assisted VLA-lite prototype for offline robot control with action-chunk reranking.
