This model fine-tunes meta-llama/Llama-3.2-3B using a diffusion-style denoising approach on instruction-following datasets. It applies structured noise to assistant responses and learns to iteratively recover clean text. Fine-tuning is performed using LoRA adapters for efficient training.
This model is designed for research on:
- Diffusion-based natural language generation
- Instruction tuning and prompt-response modeling
- Lightweight fine-tuning with LoRA
Not intended for direct deployment in high-stakes or safety-critical applications.
- Base: meta-llama/Llama-3.2-3B
- Modified: Attention mask options (`causal`, `bidirectional`, `bidirectional_masked`)
- LoRA layers: `q_proj`, `k_proj`, `v_proj`, `o_proj`
- Training style: corrupt → denoise mapping of assistant response
- Corruption types:
  - Masking (MASK tokens)
  - Swaps (token reordering)
  - Duplication (token copying)
  - Span shifts (local rearrangements)
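
The four corruption types listed above can be illustrated with a minimal sketch. It is not the training code for this model: the `<MASK>` string, the corruption rates, the span length, and all function names are hypothetical, and each operation is assumed to keep the sequence length unchanged.

```python
import random

MASK = "<MASK>"  # hypothetical placeholder; the actual mask token is not stated here


def mask_tokens(tokens, rate, rng):
    """Replace each token with MASK independently with probability `rate`."""
    return [MASK if rng.random() < rate else t for t in tokens]


def swap_tokens(tokens, n_swaps, rng):
    """Swap `n_swaps` randomly chosen adjacent pairs (token reordering)."""
    out = list(tokens)
    for _ in range(n_swaps):
        if len(out) < 2:
            break
        i = rng.randrange(len(out) - 1)
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def duplicate_tokens(tokens, rate, rng):
    """Overwrite a token with a copy of its original left neighbour."""
    out = list(tokens)
    for i in range(1, len(out)):
        if rng.random() < rate:
            out[i] = tokens[i - 1]
    return out


def shift_span(tokens, span_len, rng):
    """Rotate one random span of `span_len` tokens left by one position."""
    out = list(tokens)
    if span_len < 2 or len(out) < span_len:
        return out
    start = rng.randrange(len(out) - span_len + 1)
    span = out[start:start + span_len]
    out[start:start + span_len] = span[1:] + span[:1]
    return out


def corrupt(tokens, rng):
    """Apply all four corruptions in sequence (rates are illustrative assumptions)."""
    out = mask_tokens(tokens, 0.15, rng)
    out = swap_tokens(out, 1, rng)
    out = duplicate_tokens(out, 0.05, rng)
    return shift_span(out, 3, rng)


rng = random.Random(0)
clean = "the quick brown fox jumps over the lazy dog".split()
noisy = corrupt(clean, rng)  # the (noisy, clean) pair forms one denoising example
print(noisy)
```

Keeping the length fixed makes the corrupt → denoise target a position-aligned mapping, which is one simple design choice; a variant that inserts or deletes tokens would need a different alignment.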
- 🐣 `tatsu-lab/alpaca`
- 🧠 `vicgalle/alpaca-gpt4`
- 🧼 `crumb/Clean-Instruct-3M` (filtered + streamed)
- LoRA rank: `r=512`, `α=512`
- Batch size: 8 (gradient accumulation = 1)
- Epochs: 3
- Optimizer: AdamW
- Scheduler: Cosine with warmup
- Learning rate: 1e-6
- Noise sampling: token distribution from training set
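
To make two of the settings above concrete: in standard LoRA the adapter update is scaled by α / r, so r=512 with α=512 gives a scale of 1.0, and a cosine schedule with warmup ramps the learning rate up linearly before decaying it along a half cosine. The sketch below shows one common form of that schedule. The warmup length and total step count are not stated in this card, so `WARMUP` and `TOTAL` are hypothetical values, and decay to exactly zero is an assumption.

```python
import math

PEAK_LR = 1e-6                     # learning rate listed above
LORA_R, LORA_ALPHA = 512, 512      # LoRA rank and alpha listed above
LORA_SCALE = LORA_ALPHA / LORA_R   # standard LoRA scales the update by alpha / r


def lr_at(step, total_steps, warmup_steps, peak_lr=PEAK_LR):
    """Linear warmup to `peak_lr`, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# Hypothetical step counts, chosen only for illustration.
TOTAL, WARMUP = 1000, 100
for s in (0, 50, 100, 550, 1000):
    print(s, lr_at(s, TOTAL, WARMUP))
```

With these illustrative values the rate reaches its peak at step 100, is at half the peak midway through the decay (step 550), and approaches zero at step 1000.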
The model generates predictions on corrupted assistant responses. Periodic sampling is used to inspect recovery quality. No formal benchmark scores have been reported yet.
- `config.json`: Custom attention + LoRA config
- `pytorch_model.bin`: LoRA-weighted model head
- `tokenizer_config.json`: Tokenizer settings
If you use this model or its components, please cite: (WIP)