🧠 Diffusion-Style Instruction-Tuned Llama 3.2 3B (LoRA)

This model fine-tunes meta-llama/Llama-3.2-3B using a diffusion-style denoising approach on instruction-following datasets. It applies structured noise to assistant responses and learns to iteratively recover clean text. Fine-tuning is performed using LoRA adapters for efficient training.


🧪 Intended Use

This model is designed for research on:

  • Diffusion-based natural language generation
  • Instruction tuning and prompt-response modeling
  • Lightweight fine-tuning with LoRA

Not intended for direct deployment in high-stakes or safety-critical applications.


🧬 Model Architecture

  • Base: meta-llama/Llama-3.2-3B
  • Modified: Attention mask options (causal, bidirectional, bidirectional_masked)
  • LoRA layers: q_proj, k_proj, v_proj, o_proj
  • Training style: corrupt → denoise mapping of assistant response
  • Corruption types:
    • Masking (MASK tokens)
    • Swaps (token reordering)
    • Duplication (token copying)
    • Span shifts (local rearrangements)
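The four corruption types listed above can be illustrated with a minimal sketch over a list of string tokens. This is not the training code from this repository: the `MASK` symbol, the function names, and all rates and span sizes are hypothetical choices made for illustration only.

```python
import random

MASK = "<MASK>"  # hypothetical mask symbol; the actual mask token is not stated here


def mask_tokens(tokens, rate, rng):
    """Replace each token with MASK independently with probability `rate`."""
    return [MASK if rng.random() < rate else t for t in tokens]


def swap_tokens(tokens, n_swaps, rng):
    """Swap `n_swaps` randomly chosen adjacent pairs (token reordering)."""
    out = list(tokens)
    for _ in range(n_swaps):
        if len(out) < 2:
            break
        i = rng.randrange(len(out) - 1)
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def duplicate_tokens(tokens, rate, rng):
    """Emit each token once, and a second time with probability `rate`."""
    out = []
    for t in tokens:
        out.append(t)
        if rng.random() < rate:
            out.append(t)
    return out


def shift_span(tokens, span_len, max_shift, rng):
    """Cut one span of `span_len` tokens and reinsert it at most `max_shift` positions away."""
    tokens = list(tokens)
    if len(tokens) <= span_len:
        return tokens
    start = rng.randrange(len(tokens) - span_len + 1)
    span = tokens[start:start + span_len]
    rest = tokens[:start] + tokens[start + span_len:]
    offset = rng.randint(-max_shift, max_shift)
    new_start = min(max(start + offset, 0), len(rest))
    return rest[:new_start] + span + rest[new_start:]


def corrupt(tokens, rng):
    """Apply one randomly chosen corruption (illustrative rates) to a response."""
    ops = [
        lambda t: mask_tokens(t, 0.3, rng),
        lambda t: swap_tokens(t, 2, rng),
        lambda t: duplicate_tokens(t, 0.2, rng),
        lambda t: shift_span(t, 2, 3, rng),
    ]
    return rng.choice(ops)(tokens)


clean = "the capital of France is Paris".split()
noisy = corrupt(clean, random.Random(0))
print(noisy)  # the model would be trained on the pair (noisy -> clean)
```

In this sketch only the assistant response is corrupted; the instruction prompt would be left untouched so the model can condition on it while denoising.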

🗃️ Datasets Used

  • 🐣 tatsu-lab/alpaca
  • 🧠 vicgalle/alpaca-gpt4
  • 🧼 crumb/Clean-Instruct-3M (filtered + streamed)

⚙️ Training Details

  • LoRA rank: r=512, α=512
  • Batch size: 8 (gradient accumulation = 1)
  • Epochs: 3
  • Optimizer: AdamW
  • Scheduler: Cosine with warmup
  • Learning rate: 1e-6
  • Noise sampling: token distribution from training set
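One plausible reading of the noise sampling entry is that random tokens are drawn in proportion to how often each token occurs in the training set, rather than uniformly over the vocabulary. The sketch below assumes that interpretation; `build_noise_sampler` and the toy corpus are hypothetical and are not taken from this repository.

```python
import random
from collections import Counter


def build_noise_sampler(token_sequences, seed=0):
    """Return a function that draws tokens in proportion to their corpus frequency."""
    # Count every token across all training sequences (empirical unigram distribution).
    counts = Counter(tok for seq in token_sequences for tok in seq)
    vocab = list(counts)
    weights = [counts[t] for t in vocab]
    rng = random.Random(seed)

    def sample(k=1):
        # random.Random.choices samples with replacement according to `weights`.
        return rng.choices(vocab, weights=weights, k=k)

    return sample


# Toy corpus standing in for tokenized training responses (assumption).
corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "end"]]
sample_noise = build_noise_sampler(corpus)
print(sample_noise(5))  # frequent tokens such as "the" are drawn more often
```

Sampling from the empirical distribution keeps the injected noise statistically similar to real text, which may make the denoising task harder than it would be with uniformly random tokens.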

📊 Evaluation

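Evaluation is currently qualitative: the model generates predictions from corrupted assistant responses, and outputs are sampled periodically during training to inspect how well the clean text is recovered. No formal benchmark scores have been reported yet.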
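As an illustration of how recovery quality could be quantified, the sketch below computes a simple position-wise token match between a recovered response and the clean reference. This metric is an assumption made for illustration, not something this model card reports; `recovery_accuracy` is a hypothetical helper.

```python
def recovery_accuracy(predicted, clean):
    """Fraction of positions where the predicted token equals the clean token.

    The denominator is the longer of the two sequences, so extra or missing
    tokens in the prediction count as errors.
    """
    n = max(len(predicted), len(clean))
    if n == 0:
        return 1.0  # two empty sequences match trivially
    matches = sum(p == c for p, c in zip(predicted, clean))
    return matches / n


clean = "the capital of France is Paris".split()
recovered = "the capital of France is London".split()
print(recovery_accuracy(recovered, clean))
```

A position-wise score like this is strict about insertions and deletions, so an alignment-based metric such as edit distance may be a better fit for the duplication and span-shift corruptions.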


📁 Files

  • config.json: Custom attention + LoRA config
  • pytorch_model.bin: LoRA-weighted model head
  • tokenizer_config.json: Tokenizer settings

✍️ Citation

If you use this model or its components, please cite: (WIP)