Training and Inference Scripts

This directory contains GitHub-ready versions of the local training and inference scripts. Private paths and API keys have been removed; set paths and credentials through environment variables instead.

Files

sft.sh: supervised fine-tuning with swift sft.
RL.sh: GRPO/RLHF training with an external rollout server.
plugin_reward.py: custom reward functions used by RL.sh.
1.infer_task_no_prompt.py: task-level inference without injecting the prompt tags.
1.infer_task_prompt_tags.py: task-level inference with the built-in GeoGebra prompt.
.env.example: example configuration values. Do not commit real API keys.

Quick Start

cp .env.example .env
# Edit .env, then:
source .env
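
A caveat on the step above: a plain `source .env` only sets variables in the current shell. If the file contains bare `KEY=value` lines without `export`, child processes such as `bash sft.sh` will not see them. Whether `.env.example` uses `export` is not stated here, so the following is a minimal sketch under that assumption. The helper name `load_env` is hypothetical and not part of the repository.

```shell
# Hypothetical helper (not part of this repository): source a KEY=value file
# and export every variable it assigns, so child processes inherit them.
load_env() {
  local env_file="${1:-.env}"

  # Prefix bare file names with ./ so the shell does not search PATH for them.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac

  if [ ! -f "$env_file" ]; then
    echo "load_env: $env_file not found" >&2
    return 1
  fi

  set -a          # auto-export every variable assigned from here on
  . "$env_file"
  set +a          # restore the default behaviour
}
```

With this defined in the current shell, `load_env` (or `load_env path/to/file`) can stand in for `source .env` before running the commands below.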

Run SFT:

OUTPUT_DIR=./outputs/sft bash sft.sh

Run RL training:

OUTPUT_DIR=./outputs/rl_both bash RL.sh

Run evaluation:

CHECKPOINT=/path/to/checkpoint TASK_DATASET_DIR=./task_datasets python 1.infer_task_prompt_tags.py
CHECKPOINT=/path/to/checkpoint TASK_DATASET_DIR=./task_datasets WORLD_SIZE=8 python 1.infer_task_no_prompt.py

Notes

SWANLAB_API_KEY is intentionally not set in the scripts. Export it locally only when needed.
REWARD_PLUGIN, SYSTEM_PROMPT_FILE, model checkpoints, and datasets are expected to be provided by the user or the surrounding project.
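
Because these values come from outside the scripts, it can help to check them before launching a long job. The sketch below is one possible pre-flight check; `require_vars` is a hypothetical helper, and the scripts themselves may already validate their inputs differently. It uses `printenv`, so it only accepts variables that are exported and non-empty, which is what a child process would actually receive.

```shell
# Hypothetical pre-flight check (not part of this repository): fail early if
# any named variable is missing from the environment or is empty.
require_vars() {
  local missing=0 name
  for name in "$@"; do
    # printenv only sees exported variables, matching what child scripts get.
    if [ -z "$(printenv "$name")" ]; then
      echo "require_vars: $name is not set or not exported" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_vars CHECKPOINT TASK_DATASET_DIR && python 1.infer_task_prompt_tags.py` would run inference only when both variables are available.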
