[ICCV 2025] VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation
Official implementation of VEGGIE, a unified video generative model that handles both video concept grounding and video editing from user instructions.
Adobe Research, University of Michigan, University of North Carolina at Chapel Hill
Release Item / Timeline
Data Generation Pipeline
VEGGIE Model Training Code
Evaluation Code
Instructional Video Editing Examples
*: Non-Instructional methods utilize paired video captions for editing.
Each example compares the following outputs, left to right: Input Video, VEGGIE, VidToMe*, TokenFlow*, Flatten*, InstructDiff, LGVI, InsV2V.
Instruction: Make it on the beach.
Instruction: Please add a ball in the given video frames.
Instruction: Make it Chinese ink style.
Instruction: Could you label the bear in these video frames with red color masks?
Instruction: Replace the cup with a bottle of flower.
Instruction: Please remove the man in black in given video frames.
Instruction: Make the swan white.
Instruction: What can be used for heating food? Highlight your answer with red masks.
Reference
Please cite our paper if you use our models in your work:
@article{yu2025veggie,
title={VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation},
author={Shoubin Yu and Difan Liu and Ziqiao Ma and Yicong Hong and Yang Zhou and Hao Tan and Joyce Chai and Mohit Bansal},
year={2025},
journal={arXiv:2503.14350},
}