- Singapore
Popular repositories
qwen3.5-4b-grpo-gsm8k (Public)
GRPO reasoning fine-tune of Qwen3.5-4B-Base on GSM8K with verifiable rewards (LoRA, vLLM rollouts). +7.0 pp pass@1 over base.
Python 1
