
DiffOP

Paper: "DiffOP: Reinforcement Learning of Optimization-Based Control Policies via Implicit Policy Gradients", accepted at AAAI 2026.

Experimental results are available on WandB (see the DiffOP Results link).

Framework

DiffOP Framework

Installation

  1. Clone this repository:
git clone https://github.com/alwaysbyx/DiffOP.git
cd DiffOP
  2. Create a conda environment (recommended):
conda create -n diffop python=3.9
conda activate diffop
  3. Install the package:
pip install -e .
  4. Install additional dependencies (if needed):
pip install gymnasium wandb scipy pandas matplotlib

Running Experiments

Nonlinear Control Experiments

To run experiments on nonlinear control environments (Cartpole, Robotarm, Quadrotor):

./run_nonlinear_experiments.sh [max_parallel_jobs]

This script runs experiments on all three environments with multiple seeds. You can cap the number of parallel jobs with the optional argument (default: 4).

Example:

./run_nonlinear_experiments.sh 8  # Run with up to 8 parallel jobs
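The contents of `run_nonlinear_experiments.sh` are not shown here, but the parallel-job cap it describes can be implemented with standard bash job control. The sketch below is a hypothetical reconstruction (environment/seed lists and the `run_one` body are illustrative placeholders, not the actual script):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of capping parallel jobs, in the spirit of
# run_nonlinear_experiments.sh. Requires bash >= 4.3 for `wait -n`.
max_jobs=${1:-4}          # first argument caps parallel jobs; default 4
log=$(mktemp)             # collect completion records from background jobs

run_one() {
  local env=$1 seed=$2
  # Placeholder for the real work, e.g.:
  #   python train_diffop.py --env "$env" --seed "$seed" ...
  sleep 0.1
  echo "$env seed $seed done" >> "$log"
}

for env in Cartpole-v0 Robotarm-v0 Quadrotor-v0; do
  for seed in 0 1 2; do      # seed list is illustrative
    run_one "$env" "$seed" &
    # Block until a slot frees up whenever the cap is reached.
    while [ "$(jobs -r | wc -l)" -ge "$max_jobs" ]; do
      wait -n
    done
  done
done
wait                        # drain remaining background jobs
count=$(wc -l < "$log" | tr -d ' ')
echo "completed $count runs"
```

The `jobs -r | wc -l` check counts running background jobs, and `wait -n` returns as soon as any one of them finishes, so at most `max_jobs` trainings run concurrently.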

Voltage Control Experiments

To run experiments on the Voltage-v0 environment:

./run_voltage_experiments.sh

Manual Training

You can also run individual training scripts directly:

cd diffop/experiments
python train_diffop.py --env Voltage-v0 --seed 0 --lr 0.5 --std 0.01 --horizon 6 --apply_horizon 1 --wandb_log

Available environments:

  • Cartpole-v0
  • Robotarm-v0
  • Quadrotor-v0
  • Voltage-v0

Key parameters:

  • --env: Environment name
  • --seed: Random seed
  • --lr: Learning rate
  • --std: Noise standard deviation
  • --horizon: Planning horizon
  • --apply_horizon: Horizon for applying actions
  • --wandb_log: Enable WandB logging
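The parameter list above suggests an argparse-style CLI in `train_diffop.py`. The sketch below is a hypothetical reconstruction of those flags (defaults here are illustrative, taken from the example command, and may differ from the script's real defaults):

```python
import argparse

def build_parser():
    # Hypothetical reconstruction of train_diffop.py's CLI flags.
    p = argparse.ArgumentParser(description="DiffOP training (sketch)")
    p.add_argument("--env", type=str, default="Cartpole-v0",
                   help="environment name (e.g. Voltage-v0)")
    p.add_argument("--seed", type=int, default=0, help="random seed")
    p.add_argument("--lr", type=float, default=0.5, help="learning rate")
    p.add_argument("--std", type=float, default=0.01,
                   help="exploration noise standard deviation")
    p.add_argument("--horizon", type=int, default=6,
                   help="planning horizon")
    p.add_argument("--apply_horizon", type=int, default=1,
                   help="how many planned actions are actually applied")
    p.add_argument("--wandb_log", action="store_true",
                   help="enable WandB logging")
    return p

# Parse the example command from the "Manual Training" section above.
args = build_parser().parse_args(
    "--env Voltage-v0 --seed 0 --lr 0.5 --std 0.01 "
    "--horizon 6 --apply_horizon 1 --wandb_log".split()
)
print(args.env, args.lr, args.apply_horizon, args.wandb_log)
```

Note that `--wandb_log` is a boolean switch (`store_true`): passing it enables logging, omitting it leaves logging off.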

Visualization

To visualize results, open the Jupyter notebook:

cd diffop/experiments
jupyter notebook visualize_results.ipynb

Make sure the results/ folder contains the necessary data files for visualization.
