deepbiolab/minimal-rl-vlm


Applying Reinforcement Learning to Vision-Language Models for Mathematical Reasoning

License: MIT

A minimal implementation of PPO-based Reinforcement Learning for vision-language models, focusing on mathematical reasoning tasks.
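For readers new to PPO, the clipped surrogate objective at its core can be sketched in a few lines of plain Python. This is a generic illustration of the standard PPO loss, not code from this repository: the function name `ppo_clipped_loss`, its argument names, and the default `clip_eps=0.2` are assumptions chosen for the example, and a real vision-language model setup would compute these quantities per token on tensors.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss over a batch of samples.

    All names here are hypothetical; inputs are plain lists of floats.
    """
    if not (len(new_logprobs) == len(old_logprobs) == len(advantages)):
        raise ValueError("inputs must have equal length")
    if not new_logprobs:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the updated policy and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)

    # Negate so that minimizing the loss maximizes the surrogate objective.
    return -total / len(new_logprobs)
```

When the new and old log-probabilities are equal the ratio is 1 and the loss reduces to the negative mean advantage; clipping only takes effect once the policy moves far enough from the one that generated the samples.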

🚧 Work in progress...

🙏 Acknowledgements

This implementation is inspired by the MAYE framework described in "Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme".

@article{ma2025maye,
  title={Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme},
  author={Ma, Yan and Chern, Steffi and Shen, Xuyang and Zhong, Yiran and Liu, Pengfei},
  journal={arXiv preprint arXiv:2504.02587},
  year={2025},
}

About

A minimal implementation of PPO-based RL for vision-language models, designed to run on a single A100 GPU.
