Expose response_schema= argument on add_response_schema and GRPOTrainer/GRPOConfig #5724

@rycerzes

Feature request

Add an optional response_schema= parameter to add_response_schema and GRPOConfig/GRPOTrainer. When it is passed, TRL skips template fingerprinting and assigns the schema directly onto the inner tokenizer. This is what the manual workaround does today, made first-class:

GRPOTrainer(..., response_schema=qwen3_5_schema)
# or
add_response_schema(tokenizer, response_schema=qwen3_5_schema)
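The intended control flow could look roughly like the sketch below. It is not TRL's implementation: `add_response_schema_sketch`, `VENDORED_SCHEMAS`, and the placeholder template string are hypothetical names made up for illustration, and the tokenizer is a plain stand-in object so the snippet runs without downloading anything.

```python
from types import SimpleNamespace

# Hypothetical stand-in for TRL's mapping from vendored template text to schema.
VENDORED_SCHEMAS = {
    "<vendored qwen3.5 template>": {"type": "object", "properties": {}},
}


def add_response_schema_sketch(tokenizer, response_schema=None):
    """Attach a response schema, preferring an explicitly passed one."""
    # Assumption: processors expose the wrapped tokenizer as `.tokenizer`.
    inner = getattr(tokenizer, "tokenizer", tokenizer)

    if response_schema is not None:
        # Explicit schema: skip template fingerprinting entirely.
        inner.response_schema = response_schema
        return tokenizer

    # Fallback: exact-match lookup, as described in this issue.
    schema = VENDORED_SCHEMAS.get(inner.chat_template)
    if schema is None:
        raise ValueError("Unrecognized chat template, failed to add response schema")
    inner.response_schema = schema
    return tokenizer


# A fork whose template was edited no longer matches the vendored string...
qwen3_5_schema = VENDORED_SCHEMAS["<vendored qwen3.5 template>"]
fork = SimpleNamespace(chat_template="<vendored qwen3.5 template> with edits")

# ...but passing the schema explicitly sidesteps the lookup.
add_response_schema_sketch(fork, response_schema=qwen3_5_schema)
```

With this shape, existing callers that pass no schema keep the current behavior, and the new argument is purely additive.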

Motivation

add_response_schema currently identifies tokenizers by exact string equality against TRL's vendored .jinja files. Any fork of a supported model with legitimate edits to the template falls outside TRL's recognition and crashes:

ValueError: Unrecognized chat template, failed to add response schema...

Concretely, Qwen/Qwen3.5-4B works because its chat_template matches TRL's vendored file. unsloth/Qwen3.5-4B ships a modified template, so it fails even though the schema that should be applied is the same.

import json

from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer
from trl.chat_template_utils import add_response_schema

# The upstream template matches TRL's vendored .jinja file exactly.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B")
add_response_schema(tokenizer)  # OK

# Swap in the modified chat template shipped by the unsloth fork.
config_path = hf_hub_download("unsloth/Qwen3.5-4B", "tokenizer_config.json")
with open(config_path) as f:
    tokenizer.chat_template = json.load(f)["chat_template"]

add_response_schema(tokenizer)  # ValueError
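To see what a fork actually changed, the two template strings can be diffed with the standard library. The snippet below is a minimal sketch using toy templates, not the real Qwen or unsloth templates; `template_diff` is a hypothetical helper name.

```python
import difflib


def template_diff(upstream: str, fork: str) -> list[str]:
    """Return unified-diff lines between two chat template strings."""
    return list(
        difflib.unified_diff(
            upstream.splitlines(),
            fork.splitlines(),
            fromfile="upstream",
            tofile="fork",
            lineterm="",
        )
    )


# Toy templates standing in for the real .jinja contents.
upstream_template = "{% for m in messages %}\n{{ m.content }}\n{% endfor %}"
fork_template = "{% for m in messages %}\n{{ m.content | trim }}\n{% endfor %}"

# Exact string equality fails on even a small edit.
print(upstream_template == fork_template)

# The diff shows which lines of the template differ.
for line in template_diff(upstream_template, fork_template):
    print(line)
```

In practice the inputs would be the `chat_template` strings of the two tokenizers from the reproduction above.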

The workaround is to set tokenizer.response_schema = qwen3_5_schema before constructing the trainer, but this is undiscoverable: nothing in the error message points to it, and it leaks trainer internals to the caller.

Your contribution

Happy to submit a PR for this :)
