Script “eval_qwen3_4b_aime_gpqa.sh”: Intended for Evaluation but Functions as a Training Script

I want to use the script "recipe/demystify/eval/eval_qwen3_4b_aime_gpqa.sh" to evaluate models. On closer inspection, however, it appears to be a training script: it uses gradients, an optimizer, and reward models, which belong to a training run rather than an evaluation-only setup.
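For anyone who wants to reproduce the observation, here is a minimal sketch of how one might scan the script for training-related settings. The keyword list and the helper name `find_training_markers` are assumptions made for illustration; they are not taken from the script or from the repository, so adjust them to whatever option names the script actually uses.

```python
import sys
from pathlib import Path

# Hypothetical substrings that would suggest a training setup.
# These are illustrative guesses, not option names confirmed from the script.
TRAINING_MARKERS = ("optim", "grad", "reward_model", "lr=")


def find_training_markers(script_text, markers=TRAINING_MARKERS):
    """Return (line_number, line) pairs for non-comment lines containing any marker."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and shell comments (including the shebang line).
        if not stripped or stripped.startswith("#"):
            continue
        lowered = stripped.lower()
        if any(marker in lowered for marker in markers):
            hits.append((number, stripped))
    return hits


if __name__ == "__main__":
    # Usage: python scan_script.py recipe/demystify/eval/eval_qwen3_4b_aime_gpqa.sh
    text = Path(sys.argv[1]).read_text()
    for number, line in find_training_markers(text):
        print(f"{number}: {line}")
```

A match only shows that a training-related option is mentioned; whether the run actually updates model weights still depends on how the launched entry point interprets those options.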