This is the official implementation of the ICLR 2026 paper Primary-Fine Decoupling for Action Generation in Robotic Imitation.
Important
Status: Early Access
This codebase is at an early stage: it has not been verified to run end to end, and the relevant policies have not yet been fully deployed.
We are continuously updating this repository. Please stay tuned for stable releases.
PF-DAG is a two-stage imitation learning framework that decouples coarse action consistency from fine-grained variations:
- Primary Mode Policy: Compresses action chunks into a small set of discrete modes and selects consistent coarse modes
- Mode-Conditioned MeanFlow Policy: Generates high-fidelity continuous actions conditioned on the selected mode
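The two stages can be pictured with a toy sketch. Everything below is illustrative and not taken from the repository: `MODE_PROTOTYPES`, `select_mode`, `toy_average_velocity`, and `generate_action` are hypothetical names, and the learned networks are replaced by closed-form stand-ins. The second stage applies the MeanFlow sampling rule z_r = z_t - (t - r) * u(z_t, r, t) in a single step (r = 0, t = 1); whether PF-DAG samples in exactly one step is an assumption.

```python
import random

# Hypothetical stand-in for the learned discrete modes: each mode index maps
# to one coarse action prototype (a 2-D action here, for brevity).
MODE_PROTOTYPES = {0: [0.0, 1.0], 1: [1.0, 0.0], 2: [-1.0, 0.0]}


def select_mode(logits):
    """Stage 1 (toy): pick the highest-scoring discrete mode."""
    return max(range(len(logits)), key=lambda i: logits[i])


def toy_average_velocity(z_t, r, t, mode):
    """Stand-in for a mode-conditioned network u(z_t, r, t | mode).

    If the data for a mode is a single point x*, the linear path
    z_t = (1 - t) * x* + t * noise has constant velocity (z_t - x*) / t,
    which is therefore also the exact average velocity over [r, t].
    """
    target = MODE_PROTOTYPES[mode]
    return [(z - x) / t for z, x in zip(z_t, target)]


def generate_action(logits, rng):
    """Stage 2 (toy): one-step sampling z_0 = z_1 - (1 - 0) * u(z_1, 0, 1)."""
    mode = select_mode(logits)
    noise = [rng.gauss(0.0, 1.0) for _ in MODE_PROTOTYPES[mode]]
    u = toy_average_velocity(noise, r=0.0, t=1.0, mode=mode)
    action = [z - (1.0 - 0.0) * v for z, v in zip(noise, u)]
    return mode, action


mode, action = generate_action([0.1, 2.3, -0.5], random.Random(0))
```

In this toy setting the sampled action lands on the prototype of the selected mode regardless of the noise draw, which is the sense in which the coarse choice is fixed first and the continuous generator only fills in detail around it.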
# Clone the repository
git clone https://github.com/XiaohanLei/PF-DAG.git
cd PF-DAG
# Install the package in editable mode
pip install -e .
The code supports two hardware configurations:
- xArm + Two-Finger Gripper: With GELLO teleoperation
- xArm + XHand: With Meta Quest 3 teleoperation
Hardware components:
- xArm 7 manipulator
- Intel RealSense L515 LiDAR camera
- (Optional) GELLO demonstration arm
- (Optional) Meta Quest 3 headset
- (Optional) XHand dexterous hand
Calibrate the camera-to-robot transformation using an ArUco marker:
python scripts/calibrate.py
This will save the extrinsic calibration to bc_data/extrinsic.npy.
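An extrinsic calibration is typically used to move camera-frame points into the robot base frame. The sketch below assumes the saved array is a 4x4 homogeneous camera-to-robot transform; the repository does not confirm that layout here, so check the array's shape before relying on it. `camera_to_robot` is a hypothetical helper, and the matrix is synthetic.

```python
import numpy as np


def camera_to_robot(points_cam, extrinsic):
    """Map (N, 3) camera-frame points into the robot base frame.

    Assumes `extrinsic` is a 4x4 homogeneous transform (rotation + translation).
    """
    points_cam = np.asarray(points_cam, dtype=float)
    extrinsic = np.asarray(extrinsic, dtype=float)
    if extrinsic.shape != (4, 4):
        raise ValueError(f"expected a 4x4 transform, got {extrinsic.shape}")
    rotation, translation = extrinsic[:3, :3], extrinsic[:3, 3]
    # Row-vector form of R @ p + t, applied to every point at once.
    return points_cam @ rotation.T + translation


# Synthetic example: rotate 90 degrees about z, then shift 0.5 m along x.
example_extrinsic = np.array([
    [0.0, -1.0, 0.0, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
points_robot = camera_to_robot([[1.0, 0.0, 0.0]], example_extrinsic)
```

With a real calibration, `np.load("bc_data/extrinsic.npy")` would take the place of the synthetic matrix, provided the stored array matches the assumed 4x4 layout.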
Collect expert demonstrations using teleoperation:
# For xArm + Gripper with GELLO
python scripts/run_env.py --agent gello
# For xArm + XHand with Quest 3
python scripts/run_env_xhand.py --agent quest
Convert collected demonstrations into training format:
python scripts/convert_dataset_vq.py
First, train the VQ-VAE to learn the discrete primary modes:
python scripts/train_vq.py
This will save the VQ-VAE model to bc_data/vq_model.pth.
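The quantization step at the core of a VQ-VAE assigns each encoded action chunk to its nearest codebook entry, and the index of that entry serves as the discrete mode. The following is a minimal pure-Python sketch of that lookup only. The codebook values, the 2-D latents, and the `quantize` helper are illustrative assumptions; the repository's encoder, decoder, and training losses are not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latent, codebook):
    """Return (index, entry) of the codebook vector closest to `latent`."""
    index = min(
        range(len(codebook)),
        key=lambda k: squared_distance(latent, codebook[k]),
    )
    return index, codebook[index]


# Hypothetical codebook with three modes and three encoded action chunks.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
latents = [[0.9, 1.2], [-0.8, 0.7], [0.1, -0.2]]

modes = [quantize(z, codebook)[0] for z in latents]
print(modes)  # [1, 2, 0]
```

A small codebook keeps the set of primary modes compact, which is what makes it feasible for the first-stage policy to choose among them.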
Then train the full PF-DAG policy:
python scripts/train.py
Deploy and evaluate the trained policy:
python scripts/vis_policy.py
PF-DAG/
├── pf_dag/ # Robot control and teleoperation code
│ ├── agents/ # Teleoperation agents (GELLO, Quest, Policy)
│ ├── cameras/ # Camera interfaces
│ ├── robots/ # Robot interfaces
│ ├── utils/ # Utility functions
│ └── zmq_core/ # ZMQ communication nodes
├── pf_dag_policy/ # Policy and learning code
│ ├── policy/ # Policy implementations (PF-DAG, DP3, etc.)
│ ├── model/ # Model architectures
│ ├── dataset/ # Dataset loaders
│ └── config/ # Hydra config files
├── scripts/ # Execution scripts
├── assets/ # Robot URDFs and meshes
└── resources/ # Paper PDF
If you find this code useful, please consider citing our paper:
@inproceedings{lei2026pfdag,
title={Primary-Fine Decoupling for Action Generation in Robotic Imitation},
author={Lei, Xiaohan and Wang, Min and Zhou, Wengang and Lu, Xingyu and Li, Houqiang},
booktitle={International Conference on Learning Representations (ICLR)},
year={2026}
}
This project is licensed under the MIT License; see the LICENSE file for details.
This work builds upon several excellent open-source projects: