I would like to propose adding support for P-Tuning v2 (also known as Deep Prompt Tuning) to the prompt-tuning library. This technique extends standard prompt tuning by adding trainable prompts at every layer of the transformer, not just the input layer.
Motivation
Liu et al. (2022) report that P-Tuning v2 achieves performance comparable to full fine-tuning across model scales and tasks, while keeping the parameter efficiency of prompt tuning. Because prompts are inserted at every layer rather than only at the input, they act more directly on intermediate representations.
Proposed Implementation
The implementation would include:
Layer-specific prompts: Trainable prompt parameters for each transformer layer
Optional shared prompts: Parameter-efficient variant where prompts are shared across layers
MLP-based prompt encoder: Generate layer-specific prompts from a shared representation
Compatibility: Works with both encoder-decoder and decoder-only models
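To make the first three components concrete, here is a minimal sketch in plain JAX. It is not taken from the prototype in the fork and does not use the T5X/Flaxformer API. All function names, parameter names, and the 0.02 init scale are hypothetical and chosen only for illustration.

```python
import jax
import jax.numpy as jnp


def init_deep_prompts(key, num_layers, prompt_length, hidden_dim, shared=False):
    """Directly parameterized prompts.

    Returns an array of shape [num_layers, prompt_length, hidden_dim], or
    [1, prompt_length, hidden_dim] for the shared variant.
    """
    leading = 1 if shared else num_layers
    return 0.02 * jax.random.normal(key, (leading, prompt_length, hidden_dim))


def layer_prompt(prompts, layer_index):
    """Selects the prompt for one layer; a shared prompt is reused everywhere."""
    return prompts[0] if prompts.shape[0] == 1 else prompts[layer_index]


def init_prompt_encoder(key, num_layers, prompt_length, embed_dim,
                        bottleneck_dim, hidden_dim):
    """Parameters for an MLP that maps one shared embedding to all layers."""
    k_embed, k_w1, k_w2 = jax.random.split(key, 3)
    return {
        "embedding": 0.02 * jax.random.normal(k_embed, (prompt_length, embed_dim)),
        "w1": 0.02 * jax.random.normal(k_w1, (embed_dim, bottleneck_dim)),
        "b1": jnp.zeros((bottleneck_dim,)),
        "w2": 0.02 * jax.random.normal(k_w2, (bottleneck_dim, num_layers * hidden_dim)),
        "b2": jnp.zeros((num_layers * hidden_dim,)),
    }


def encode_prompts(params, num_layers):
    """Generates prompts of shape [num_layers, prompt_length, hidden_dim]."""
    hidden = jnp.tanh(params["embedding"] @ params["w1"] + params["b1"])
    flat = hidden @ params["w2"] + params["b2"]  # [prompt_length, num_layers * hidden_dim]
    per_layer = flat.reshape(flat.shape[0], num_layers, -1)
    return jnp.transpose(per_layer, (1, 0, 2))
```

With these shapes, the direct, shared, and MLP-encoded variants all yield something `layer_prompt` can index, so a model could switch between them through a single config option. Whether that matches how the library would want to expose the choice is an open design question.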
Key Features
Prompts at every transformer layer (not just input)
Configurable prompt length per layer
Optional parameter sharing across layers
Integration with existing T5X/Flaxformer infrastructure
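The sketch below illustrates the first two features on a toy single-head attention stack in plain JAX. It assumes the prompts enter each layer as extra key/value prefixes, which is one common way to realize deep prompts. The prototype in the fork and the eventual T5X/Flaxformer integration may differ. Every name here is hypothetical.

```python
import jax
import jax.numpy as jnp


def init_frozen_layers(key, num_layers, hidden_dim):
    """Toy stand-in for pretrained attention weights (kept frozen)."""
    layers = []
    for layer_key in jax.random.split(key, num_layers):
        kq, kk, kv = jax.random.split(layer_key, 3)
        shape = (hidden_dim, hidden_dim)
        layers.append({
            "wq": 0.1 * jax.random.normal(kq, shape),
            "wk": 0.1 * jax.random.normal(kk, shape),
            "wv": 0.1 * jax.random.normal(kv, shape),
        })
    return layers


def init_prefixes(key, prompt_lengths, hidden_dim):
    """One trainable key/value prefix per layer; lengths may differ per layer."""
    prefixes = []
    for layer_key, length in zip(jax.random.split(key, len(prompt_lengths)),
                                 prompt_lengths):
        k_key, v_key = jax.random.split(layer_key)
        prefixes.append({
            "k": 0.02 * jax.random.normal(k_key, (length, hidden_dim)),
            "v": 0.02 * jax.random.normal(v_key, (length, hidden_dim)),
        })
    return prefixes


def attention_with_prefix(x, layer, prefix):
    """Single-head attention where tokens also attend to the layer's prefix."""
    q = x @ layer["wq"]                                        # [seq, d]
    k = jnp.concatenate([prefix["k"], x @ layer["wk"]], axis=0)  # [p + seq, d]
    v = jnp.concatenate([prefix["v"], x @ layer["wv"]], axis=0)  # [p + seq, d]
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v                 # [seq, d]


def forward(prefixes, frozen_layers, x):
    """Applies every layer with its own prefix and a residual connection."""
    for layer, prefix in zip(frozen_layers, prefixes):
        x = x + attention_with_prefix(x, layer, prefix)
    return x


def loss_fn(prefixes, frozen_layers, x, target):
    return jnp.mean((forward(prefixes, frozen_layers, x) - target) ** 2)


# Differentiating with respect to the first argument only means the
# prefixes receive gradients while the layer weights stay frozen.
prefix_grad_fn = jax.grad(loss_fn)
```

Calling `init_prefixes(key, [4, 2, 0], hidden_dim)` gives a three-layer setup in which the last layer has no prompt at all, so per-layer prompt length (including switching a layer off) reduces to a list in the config.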
Reference
Liu et al. (2022). "P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks." arXiv:2110.07602
Additional Context
I have implemented a prototype of this technique that follows the library's design patterns and coding standards. The implementation is available in my fork at https://github.com/hwilner/prompt-tuning
Would the maintainers be interested in this enhancement? I'm happy to discuss the design and implementation details further.