XiaohanLei/PF-DAG

PF-DAG: Primary-Fine Decoupling for Action Generation in Robotic Imitation

This is the official implementation of the ICLR 2026 paper Primary-Fine Decoupling for Action Generation in Robotic Imitation.

Important

Status: Early Access

This codebase is at an early stage. It has not yet been verified to run end to end, and the relevant policies have not yet been fully deployed.

We are continuously updating this repository. Please stay tuned for stable releases.

Overview

PF-DAG is a two-stage imitation learning framework that decouples coarse action consistency from fine-grained variations:

  1. Primary Mode Policy: Compresses action chunks into a small set of discrete modes and selects consistent coarse modes
  2. Mode-Conditioned MeanFlow Policy: Generates high-fidelity continuous actions conditioned on the selected mode
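The decoupling above can be illustrated with a toy sketch. It is not the repository's implementation: the codebook, the function names, and the uniform-noise refinement are all hypothetical stand-ins, and the learned MeanFlow generator is replaced by a small random offset purely to show the data flow from discrete mode to continuous action.

```python
import random

# Hypothetical codebook: each discrete mode maps to a coarse 2-D action prototype.
CODEBOOK = {0: [0.0, 0.1], 1: [0.1, 0.0], 2: [-0.1, 0.0]}

def select_primary_mode(mode_scores):
    """Stage 1 stand-in: pick the highest-scoring discrete mode."""
    return max(range(len(mode_scores)), key=lambda k: mode_scores[k])

def generate_fine_action(mode, rng, scale=0.01):
    """Stage 2 stand-in: refine the mode prototype with a small continuous offset.

    A real generator would be a learned, mode-conditioned model; the uniform
    offset here is only a placeholder for that fine-grained variation.
    """
    return [c + rng.uniform(-scale, scale) for c in CODEBOOK[mode]]

rng = random.Random(0)
mode = select_primary_mode([0.2, 1.5, -0.3])  # scores are made-up example values
action = generate_fine_action(mode, rng)
```

The point of the split is that the coarse choice (which mode) is made once and discretely, while the continuous stage only has to model variation around that choice.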

Installation

# Clone the repository
git clone https://github.com/XiaohanLei/PF-DAG.git
cd PF-DAG

# Install the package in editable mode
pip install -e .

Hardware Setup

The code supports two hardware configurations:

  1. xArm + Two-Finger Gripper: With GELLO teleoperation
  2. xArm + XHand: With Meta Quest 3 teleoperation

Required Hardware

  • xArm 7 manipulator
  • Intel RealSense L515 LiDAR camera
  • (Optional) GELLO demonstration arm
  • (Optional) Meta Quest 3 headset
  • (Optional) XHand dexterous hand

Execution Flow

1. Calibration

Calibrate the camera-to-robot transformation using an ArUco marker:

python scripts/calibrate.py

This will save the extrinsic calibration to bc_data/extrinsic.npy.
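As a rough sketch of how such a calibration might be consumed downstream, the snippet below applies an extrinsic matrix to camera-frame points. It assumes the saved array is a 4x4 homogeneous camera-to-robot transform, which this README does not confirm; the `camera_to_robot` helper and the example matrix are hypothetical, and a temporary file stands in for bc_data/extrinsic.npy.

```python
import os
import tempfile

import numpy as np

def camera_to_robot(points_cam, extrinsic):
    """Map (N, 3) camera-frame points into the robot base frame.

    Assumes `extrinsic` is a 4x4 homogeneous transform (an assumption,
    not confirmed by the repository).
    """
    points_cam = np.asarray(points_cam, dtype=float)
    homogeneous = np.hstack([points_cam, np.ones((len(points_cam), 1))])
    return (homogeneous @ extrinsic.T)[:, :3]

# Stand-in extrinsic: 90-degree rotation about z plus a translation.
extrinsic = np.array(
    [
        [0.0, -1.0, 0.0, 0.5],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.2],
        [0.0, 0.0, 0.0, 1.0],
    ]
)

# Round-trip through a .npy file, mirroring how a saved calibration could be loaded.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "extrinsic.npy")
    np.save(path, extrinsic)
    loaded = np.load(path)

points_robot = camera_to_robot([[1.0, 0.0, 0.0]], loaded)
```

If the stored matrix turns out to be robot-to-camera instead, the same helper applies after inverting it with `np.linalg.inv`.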

2. Teleoperation & Data Collection

Collect expert demonstrations using teleoperation:

# For xArm + Gripper with GELLO
python scripts/run_env.py --agent gello

# For xArm + XHand with Quest 3
python scripts/run_env_xhand.py --agent quest

3. Data Conversion

Convert collected demonstrations into training format:

python scripts/convert_dataset_vq.py

4. Training

Step 4.1: Train VQ-VAE for Primary Modes

First, train the VQ-VAE to learn the discrete primary modes:

python scripts/train_vq.py

This will save the VQ-VAE model to bc_data/vq_model.pth.
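For intuition about what "discrete primary modes" means, here is a minimal sketch of the nearest-neighbor lookup used in standard vector quantization. It assumes the VQ-VAE here follows that standard scheme; the `quantize` function, the tiny codebook, and the 2-D chunks are illustrative only and are not taken from scripts/train_vq.py.

```python
import numpy as np

def quantize(chunks, codebook):
    """Return the index of the nearest codebook entry for each flattened chunk."""
    chunks = np.asarray(chunks, dtype=float)
    # Broadcast (N, 1, D) against (1, K, D) to get (N, K) squared distances.
    d2 = ((chunks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

# Hypothetical codebook with K=3 modes over D=2 dimensional chunk embeddings.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
chunks = [[0.1, -0.1], [0.9, 1.2], [-0.8, 0.7]]
modes = quantize(chunks, codebook)
```

Each chunk is mapped to the index of its closest code, so many slightly different chunks share one mode label. In a trained VQ-VAE the lookup would typically operate on encoder outputs rather than raw actions.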

Step 4.2: Train PF-DAG Policy

Then train the full PF-DAG policy:

python scripts/train.py

5. Policy Evaluation

Deploy and evaluate the trained policy:

python scripts/vis_policy.py

Project Structure

PF-DAG/
├── pf_dag/                  # Robot control and teleoperation code
│   ├── agents/              # Teleoperation agents (GELLO, Quest, Policy)
│   ├── cameras/             # Camera interfaces
│   ├── robots/              # Robot interfaces
│   ├── utils/               # Utility functions
│   └── zmq_core/            # ZMQ communication nodes
├── pf_dag_policy/           # Policy and learning code
│   ├── policy/              # Policy implementations (PF-DAG, DP3, etc.)
│   ├── model/               # Model architectures
│   ├── dataset/             # Dataset loaders
│   └── config/              # Hydra config files
├── scripts/                 # Execution scripts
├── assets/                  # Robot URDFs and meshes
└── resources/               # Paper PDF

Citation

If you find this code useful, please consider citing our paper:

@inproceedings{lei2026pfdag,
  title={Primary-Fine Decoupling for Action Generation in Robotic Imitation},
  author={Lei, Xiaohan and Wang, Min and Zhou, Wengang and Lu, Xingyu and Li, Houqiang},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}

License

This project is licensed under the MIT License; see the LICENSE file for details.

Acknowledgments

This work builds upon several excellent open-source projects:
