zyk42/verl-webui

VERL WebUI

VERL WebUI is a user-friendly graphical interface designed for VERL (Volcano Engine Reinforcement Learning). It simplifies the configuration and command generation for RLHF training of Large Language Models.

Introduction

This WebUI provides an intuitive way to configure various components of VERL, including PPO/GRPO algorithms, model parameters (Actor, Critic, Reward, Reference), and data settings. It streamlines the process of generating complex training commands for large-scale RLHF experiments without needing to manually write lengthy shell scripts.

🔗 References

✨ WebUI Usage

We provide a user-friendly Web Interface (WebUI) to generate training configurations and commands easily.

WebUI Screenshot Placeholder

🚀 Quick Start

Before launching the WebUI, install the required dependencies, including gradio:

pip install gradio

Method 1: Python Command (Recommended)

You can start the WebUI directly using Python. By default, it runs on port 7860.

python webui.py

Specify a custom port:

python webui.py --port 8888

Enable public sharing:

python webui.py --share
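The source of webui.py is not shown here, so the following is only a minimal sketch of how a Gradio launcher could expose the `--port` and `--share` flags described above. The function names `build_parser` and `main` and the placeholder UI are hypothetical; only the flags and the default port 7860 come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser mirroring the flags documented above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Launch a Gradio WebUI (sketch)")
    # Default port matches the README; everything else here is an assumption.
    parser.add_argument("--port", type=int, default=7860, help="port to serve on")
    parser.add_argument("--share", action="store_true", help="create a public Gradio link")
    return parser


def main(argv=None) -> None:
    """Parse arguments and start a placeholder Gradio app."""
    args = build_parser().parse_args(argv)
    import gradio as gr  # imported lazily so the parser can be used without gradio

    # Hypothetical placeholder UI; the real WebUI defines its own components.
    with gr.Blocks(title="VERL WebUI") as demo:
        gr.Markdown("Configuration panels would go here.")
    demo.launch(server_port=args.port, share=args.share)
```

In a script, calling `main()` at the entry point would make `python webui.py --port 8888 --share` serve on port 8888 and request a public link.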

Method 2: PowerShell Script (Windows)

For Windows users, we provide a convenient PowerShell script run_webui.ps1.

# Basic usage
.\run_webui.ps1

# Specify port
.\run_webui.ps1 -Port 7862

# Enable public link
.\run_webui.ps1 -Port 7862 -Share

🛠 Features & Configuration

The WebUI allows you to configure the following modules:

Data Configuration: Set up training and validation datasets, batch sizes, and prompt lengths.

Actor Configuration

Reference Model: Enable or disable reference models, with a choice of KL implementation (use_kl_loss or use_kl_in_reward).

Critic Model: Configure value function models.

Reward Model: Set up reward model parameters and reward managers (Naive, Prime, DAPO).

Algorithm: Choose between GAE, GRPO, Reinforce++, and more.

Trainer: Manage experiment names, logging (WandB, Tensorboard, etc.), and checkpointing.
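To illustrate how module settings like these might be turned into a training command, here is a rough sketch that joins key=value overrides into one shell command. It is not the WebUI's actual implementation: the helper `build_command` is hypothetical, and the entry point `verl.trainer.main_ppo` and the dotted keys in the example are assumptions about VERL's configuration layout that should be checked against the VERL documentation.

```python
import shlex


def build_command(overrides: dict, entry_point: str = "verl.trainer.main_ppo") -> str:
    """Join config overrides into a multi-line shell command (illustrative only)."""
    lines = [f"python3 -m {entry_point}"]
    for key, value in overrides.items():
        # Quote each override so values with spaces or brackets survive the shell.
        lines.append(shlex.quote(f"{key}={value}"))
    # Backslash-newline keeps the long command readable when pasted into a terminal.
    return " \\\n    ".join(lines)


# Example settings; the key names are assumed, not taken from this README.
example_command = build_command(
    {
        "algorithm.adv_estimator": "grpo",
        "data.train_batch_size": 256,
        "actor_rollout_ref.actor.use_kl_loss": True,
        "trainer.experiment_name": "demo_run",
    }
)
```

Keeping the settings in a plain dictionary means each UI field can map to one key, and the command string is rebuilt whenever a field changes.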

About

VERL WebUI is a user-friendly graphical interface designed for VERL and based on Gradio. It simplifies the configuration and command generation for RLHF training of Large Language Models.
