Synthetic-document fine-tuning on Qwen2.5-7B: a controlled study of whether SDF installs sandbagging, finding a layered recognition/generation/behavior dissociation.
Updated May 25, 2026 - Python
Proving that the neural network is honest about its lack of capabilities
Reproducible red-team findings for openai/gpt-oss-20b: five minimal harnesses with checks, zips & manifest (v0.9.3).