This repository was archived by the owner on May 11, 2026. It is now read-only.

Feature Request: Visual Prompt Tuning Support #307

Description

@hwilner

Summary

I would like to propose adding support for Visual Prompt Tuning to the prompt-tuning library. This technique extends prompt tuning to vision and vision-language models, enabling parameter-efficient adaptation of visual encoders and multimodal models.

Motivation

Visual prompt tuning brings the benefits of parameter-efficient fine-tuning to computer vision and multimodal domains. This is particularly valuable for adapting large vision-language models (like CLIP) to downstream tasks: the pretrained weights stay frozen and only a small set of prompt parameters is trained, instead of fine-tuning the entire model.

Proposed Implementation

The implementation would include:

  1. Pixel-space prompts: Add trainable prompts as image borders, corners, or patches
  2. Patch-level prompts: Insert prompts at the patch embedding level for Vision Transformers
  3. Deep visual prompts: Add prompts at multiple layers of the visual encoder
  4. Vision-language coordination: Coordinate visual and textual prompts for multimodal models
  5. Adaptive visual prompts: Condition prompts on image content
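
To make the first three strategies concrete, here is a minimal, framework-agnostic NumPy sketch. It is not taken from the prototype or from the prompt-tuning library; the function names (`add_border_prompt`, `prepend_patch_prompts`, `run_with_deep_prompts`) and the idea of passing encoder blocks as plain callables are assumptions made for illustration. A real integration would presumably express the prompts as trainable Flax parameters.

```python
import numpy as np


def add_border_prompt(image, border, width):
    """Pixel-space prompt: replace a frame of `width` pixels with trainable values.

    image:  (H, W, C) input image.
    border: (H, W, C) trainable array; only its outer frame is used.
    """
    h, w, _ = image.shape
    mask = np.ones((h, w, 1), dtype=image.dtype)
    mask[width:h - width, width:w - width, :] = 0.0  # 0 in the interior, 1 on the frame
    return image * (1.0 - mask) + border * mask


def prepend_patch_prompts(patch_embeddings, prompts):
    """Patch-level prompt: put trainable tokens in front of the patch tokens.

    patch_embeddings: (batch, num_patches, dim)
    prompts:          (num_prompts, dim), shared across the batch.
    """
    batch = patch_embeddings.shape[0]
    tiled = np.broadcast_to(prompts, (batch,) + prompts.shape)
    return np.concatenate([tiled, patch_embeddings], axis=1)


def run_with_deep_prompts(patch_embeddings, layer_prompts, layers):
    """Deep prompts: each layer gets its own prompts in the prompt slots.

    layer_prompts: one (num_prompts, dim) array per layer.
    layers:        hypothetical frozen encoder blocks, given as callables that
                   map (batch, seq, dim) to (batch, seq, dim).
    """
    tokens = patch_embeddings
    num_prompts = 0  # no prompt slots exist before the first layer
    for prompts, layer in zip(layer_prompts, layers):
        # Drop the previous layer's prompt outputs, then insert fresh prompts.
        tokens = prepend_patch_prompts(tokens[:, num_prompts:], prompts)
        num_prompts = prompts.shape[0]
        tokens = layer(tokens)
    return tokens
```

In this sketch only `border`, `prompts`, and `layer_prompts` would be trained; the image encoder itself stays frozen. Passing a single prompt array and a single layer to `run_with_deep_prompts` reduces it to the shallow, patch-level case.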

Key Features

  • Multiple visual prompt insertion strategies
  • Support for Vision Transformers (ViT)
  • Coordination with textual prompts for VL models
  • Adaptive prompts based on image features
  • Integration with existing T5X/Flaxformer infrastructure
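
As one possible reading of "adaptive prompts based on image features", the sketch below shifts a shared set of prompts by a projection of the mean-pooled patch features, so each image gets its own prompts. This is an assumption about the design, not a description of the prototype; `adaptive_prompts` and the single linear `projection` are hypothetical names chosen for illustration.

```python
import numpy as np


def adaptive_prompts(patch_embeddings, base_prompts, projection):
    """Condition prompts on image content (illustrative sketch).

    patch_embeddings: (batch, num_patches, dim) features of each image.
    base_prompts:     (num_prompts, dim) trainable prompts shared by all images.
    projection:       (dim, dim) trainable matrix mapping pooled features
                      to a per-image offset.
    Returns per-image prompts of shape (batch, num_prompts, dim).
    """
    pooled = patch_embeddings.mean(axis=1)  # (batch, dim) summary of each image
    offset = pooled @ projection            # (batch, dim) image-specific shift
    # Broadcast: the same offset is added to every prompt of a given image.
    return base_prompts[None, :, :] + offset[:, None, :]
```

With `projection` set to zeros this falls back to ordinary, image-independent prompts, which makes it easy to compare the adaptive and static variants under the same code path.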

Additional Context

I have implemented a prototype of this technique that follows the library's design patterns and coding standards. The implementation is available in my fork at https://github.com/hwilner/prompt-tuning

Would the maintainers be interested in this enhancement? I'm happy to discuss the design and implementation details further.
