feat(client): per-request reasoning_effort passthrough #57

Merged
mudler merged 1 commit into main from feat/reasoning-effort on Jun 5, 2026

@localai-bot

Collaborator

What

Adds OpenAIOptions.ReasoningEffort, attached to every chat-completion request as the OpenAI-standard reasoning_effort field (in Ask, CreateChatCompletion, and CreateChatCompletionStream). Mirrors the existing Metadata passthrough.

Why

Metadata ({"enable_thinking":"false"}) only disables thinking when the model's chat template honors that flag. Reasoning models whose GGUF chat template has no enable_thinking toggle (e.g. LFM2.5) ignore it and keep emitting <think> blocks. Those same models do honor reasoning_effort: "none", which LocalAI/llama.cpp translate into an empty-think prefill. This gives callers a portable, OpenAI-standard lever to silence reasoning regardless of the template.

Verified against a live LocalAI + LFM2.5: metadata.enable_thinking:false → still thinking; reasoning_effort:"none" → thinking off.
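To make the two levers concrete, here is a minimal sketch of the request bodies being compared. The chatRequest and message types are hypothetical, trimmed-down stand-ins rather than the client's or go-openai's real types, and the model name "lfm2.5" is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a hypothetical, minimal chat message.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// chatRequest is a hypothetical stand-in for a chat-completion request body.
// Only the fields discussed in this PR are modeled.
type chatRequest struct {
	Model           string            `json:"model"`
	Messages        []message         `json:"messages"`
	Metadata        map[string]string `json:"metadata,omitempty"`
	ReasoningEffort string            `json:"reasoning_effort,omitempty"`
}

// encode marshals the request to its JSON wire form.
func encode(r chatRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	msgs := []message{{Role: "user", Content: "hi"}}

	// Template-dependent lever: only works if the chat template reads the flag.
	viaMetadata := chatRequest{
		Model:    "lfm2.5", // placeholder model name
		Messages: msgs,
		Metadata: map[string]string{"enable_thinking": "false"},
	}

	// Portable lever: the reasoning_effort field added by this PR.
	viaEffort := chatRequest{
		Model:           "lfm2.5", // placeholder model name
		Messages:        msgs,
		ReasoningEffort: "none",
	}

	fmt.Println(encode(viaMetadata))
	fmt.Println(encode(viaEffort))
}
```

The first body carries the flag inside metadata, the second as a top-level reasoning_effort field; whether either one actually silences thinking depends on the server and model, as described above.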

Behavior

  • Empty ReasoningEffort → field omitted, no change to existing requests.
  • go-openai already has the native ReasoningEffort field, so this is a one-line set at each request site.
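The behavior above can be sketched as follows. This is an illustration under assumptions, not the PR's actual code: Options, request, apply, and wire are hypothetical names, and the sketch assumes the underlying request field is serialized with omitempty, which is what makes an unset value disappear from the wire.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Options is a hypothetical stand-in for the client's OpenAIOptions.
type Options struct {
	Metadata        map[string]string
	ReasoningEffort string
}

// request is a hypothetical, reduced request type. The omitempty tags are an
// assumption about how the real request type is serialized.
type request struct {
	Model           string            `json:"model"`
	Metadata        map[string]string `json:"metadata,omitempty"`
	ReasoningEffort string            `json:"reasoning_effort,omitempty"`
}

// apply copies the per-request options onto the outgoing request, the
// "one-line set" done at each request site.
func apply(o Options, r request) request {
	r.Metadata = o.Metadata
	r.ReasoningEffort = o.ReasoningEffort
	return r
}

// wire returns the JSON form of the request.
func wire(r request) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	base := request{Model: "example-model"}

	// Unset option: the field is omitted, so existing requests are unchanged.
	fmt.Println(wire(apply(Options{}, base)))

	// Set option: the field appears as reasoning_effort.
	fmt.Println(wire(apply(Options{ReasoningEffort: "none"}, base)))
}
```

Because the zero value of a Go string is the empty string, callers that never set the option produce the same bytes as before the change.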

Tests

  • ReasoningEffort reaches the wire as reasoning_effort (real httptest server)
  • omitted when unset
  • existing metadata tests still pass
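For readers unfamiliar with the wire-level testing style mentioned above, here is a rough sketch of the idea using only the standard library. It is not the PR's test code: payload and captureBody are hypothetical names, and a plain http.Post stands in for the real client.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// payload is a hypothetical, reduced request body.
type payload struct {
	Model           string `json:"model"`
	ReasoningEffort string `json:"reasoning_effort,omitempty"`
}

// captureBody posts p to a throwaway httptest server and returns the JSON
// object that the server actually received.
func captureBody(p payload) (map[string]any, error) {
	bodies := make(chan map[string]any, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var m map[string]any
		_ = json.NewDecoder(r.Body).Decode(&m)
		bodies <- m // hand the decoded body back to the caller
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	raw, err := json.Marshal(p)
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(srv.URL, "application/json", bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	resp.Body.Close()
	return <-bodies, nil
}

func main() {
	set, err := captureBody(payload{Model: "example-model", ReasoningEffort: "none"})
	if err != nil {
		panic(err)
	}
	unset, err := captureBody(payload{Model: "example-model"})
	if err != nil {
		panic(err)
	}

	_, present := unset["reasoning_effort"]
	fmt.Println("set value:", set["reasoning_effort"])
	fmt.Println("present when unset:", present)
}
```

Asserting on the decoded body at the server side checks what actually crosses the wire, rather than what the client struct holds in memory.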

🤖 Generated with Claude Code

Add OpenAIOptions.ReasoningEffort, attached to every chat-completion request as
the OpenAI-standard "reasoning_effort" field (Ask, CreateChatCompletion,
CreateChatCompletionStream). Mirrors the existing Metadata passthrough.

Why: Metadata ({"enable_thinking":"false"}) only disables thinking when the
model's chat template honors that flag. Reasoning models whose GGUF template has
no enable_thinking toggle (e.g. LFM2.5) ignore it and keep reasoning — but they
DO honor reasoning_effort:"none". This gives callers the portable lever to
silence reasoning regardless of the template. Unset = field omitted (no behavior
change).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mudler merged commit 7c13c87 into main on Jun 5, 2026
2 of 3 checks passed