Feature request
Add an optional response_schema= parameter to add_response_schema and GRPOConfig/GRPOTrainer. When it is passed, TRL skips template fingerprinting and assigns the schema directly onto the inner tokenizer, making first-class what the workaround already does manually today:
GRPOTrainer(..., response_schema=qwen3_5_schema)
# or
add_response_schema(tokenizer, response_schema=qwen3_5_schema)
Motivation
add_response_schema currently identifies tokenizers by exact string equality against TRL's vendored .jinja files. Any fork of a supported model with legitimate edits to the template falls outside that recognition, and the call crashes:
ValueError: Unrecognized chat template, failed to add response schema...
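For context, here is a rough sketch of what that lookup amounts to as described above; the directory path, registry, and helper name are illustrative assumptions, not TRL's actual internals:
from pathlib import Path

# Hypothetical layout, for illustration only: the vendored template directory and
# the template-name -> schema registry are assumptions, not TRL's real code.
VENDORED_DIR = Path("trl/templates")
SCHEMAS = {"qwen3_5": {"type": "object"}}  # placeholder registry

def _lookup_response_schema(tokenizer):
    # Exact string comparison against each vendored .jinja file: any edit to the
    # template, however harmless, means nothing matches and we raise.
    for jinja_file in VENDORED_DIR.glob("*.jinja"):
        if tokenizer.chat_template == jinja_file.read_text():
            return SCHEMAS[jinja_file.stem]
    raise ValueError("Unrecognized chat template, failed to add response schema...")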
Concretely, Qwen/Qwen3.5-4B works because its chat_template matches TRL's vendored file. unsloth/Qwen3.5-4B ships a modified template, so it fails, even though the schema that should be applied is the same.
import json

from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

from trl.chat_template_utils import add_response_schema

# Works: the stock Qwen template matches TRL's vendored .jinja file.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B")
add_response_schema(tokenizer)  # OK

# Fails: swap in the Unsloth fork's edited template and the exact-match lookup misses.
config_path = hf_hub_download("unsloth/Qwen3.5-4B", "tokenizer_config.json")
with open(config_path) as f:
    tokenizer.chat_template = json.load(f)["chat_template"]
add_response_schema(tokenizer)  # ValueError
The workaround is to set tokenizer.response_schema = qwen3_5_schema before constructing the trainer, but this is undiscoverable (nothing in the error message points to it) and it leaks trainer internals to the caller.
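Spelled out, the workaround looks like this; qwen3_5_schema stands in for whatever schema object TRL expects here, the remaining trainer arguments are elided, and the exact call is a sketch rather than a prescribed API:
# Assign the schema by hand before the trainer ever sees the tokenizer.
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3.5-4B")
tokenizer.response_schema = qwen3_5_schema  # what TRL's fingerprinting would have done
trainer = GRPOTrainer(..., processing_class=tokenizer)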
Your contribution
Happy to submit a PR for this :)