VERL WebUI is a user-friendly graphical interface designed for VERL (Volcano Engine Reinforcement Learning). It simplifies the configuration and command generation for RLHF training of Large Language Models.
This WebUI provides an intuitive way to configure various components of VERL, including PPO/GRPO algorithms, model parameters (Actor, Critic, Reward, Reference), and data settings. It streamlines the process of generating complex training commands for large-scale RLHF experiments without needing to manually write lengthy shell scripts.
- VERL GitHub Repository: https://github.com/volcengine/verl
- VERL Documentation: https://verl.readthedocs.io/en/latest/index.html
We provide a user-friendly Web Interface (WebUI) to generate training configurations and commands easily.
To launch the WebUI, ensure you have the required dependencies installed (including gradio).
pip install gradio
You can start the WebUI directly using Python. By default, it runs on port 7860.
python webui.py
Specify a custom port:
python webui.py --port 8888
Enable public sharing:
python webui.py --share
For Windows users, we provide a convenient PowerShell script, run_webui.ps1.
# Basic usage
.\run_webui.ps1
# Specify port
.\run_webui.ps1 -Port 7862
# Enable public link
.\run_webui.ps1 -Port 7862 -Share
The WebUI allows you to configure the following modules:
- Data Configuration: Set up training and validation datasets, batch sizes, and prompt lengths.
- Actor Configuration:
- Reference Model: Enable or disable the reference model and choose how the KL term is applied (use_kl_loss or use_kl_in_reward).
- Critic Model: Configure the value function model.
- Reward Model: Set up reward model parameters and reward managers (Naive, Prime, DAPO).
- Algorithm: Choose among GAE, GRPO, Reinforce++, and others.
- Trainer: Manage experiment names, logging (WandB, TensorBoard, etc.), and checkpointing.
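
To make the command-generation step concrete, here is a minimal sketch of how per-module settings could be flattened into a training command. It is not the WebUI's actual implementation: the helper names `to_overrides` and `build_command` are hypothetical, and the entry point `verl.trainer.main_ppo` and the dotted keys (for example `actor_rollout_ref.actor.use_kl_loss`) are assumed from VERL's example scripts, so check them against the VERL documentation before relying on them.

```python
import shlex


def to_overrides(config, prefix=""):
    """Flatten a nested dict into dotted key=value overrides (hypothetical helper)."""
    overrides = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-sections such as actor_rollout_ref.actor
            overrides.extend(to_overrides(value, path))
        else:
            overrides.append(f"{path}={value}")
    return overrides


def build_command(config, entry_point="verl.trainer.main_ppo"):
    """Join the overrides into a multi-line shell command (entry point is an assumption)."""
    lines = [f"python3 -m {entry_point}"]
    lines += [shlex.quote(item) for item in to_overrides(config)]
    return " \\\n    ".join(lines)


# Illustrative values only; the keys are assumed, not taken from the WebUI source.
example = {
    "data": {"train_batch_size": 1024, "max_prompt_length": 512},
    "algorithm": {"adv_estimator": "grpo", "use_kl_in_reward": False},
    "actor_rollout_ref": {"actor": {"use_kl_loss": True}},
    "trainer": {"experiment_name": "demo_run"},
}

if __name__ == "__main__":
    print(build_command(example))
```

Keeping the settings in a nested dict mirrors the module grouping in the list above, and quoting each override with `shlex.quote` keeps values containing spaces or brackets safe to paste into a shell.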

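The `--port` and `--share` flags described earlier could be wired up roughly as follows. This is a sketch under assumptions, not the contents of webui.py: the function names `parse_args` and `launch_kwargs` are hypothetical, and the only Gradio detail relied on is that `launch()` accepts `server_port` and `share` keyword arguments.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags documented above (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Launch the WebUI (illustrative sketch)")
    parser.add_argument("--port", type=int, default=7860, help="port to serve on")
    parser.add_argument("--share", action="store_true", help="create a public link")
    return parser.parse_args(argv)


def launch_kwargs(args):
    """Map the parsed flags onto keyword arguments for Gradio's launch()."""
    return {"server_port": args.port, "share": args.share}


# With a Gradio app object named `demo` (assumed), the call would look like:
#     demo.launch(**launch_kwargs(parse_args()))
```

Separating argument parsing from the launch call keeps the flag handling testable without starting a server.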
