
feat(config): reasoning_effort to disable thinking on any reasoning model #22

Merged
mudler merged 1 commit into master from feat/reasoning-effort on Jun 5, 2026


@localai-bot

Collaborator

What

Adds a global reasoning_effort config, sent as the OpenAI-standard reasoning_effort field on every request (main session + per-agent factory), via cogito's new OpenAIOptions.ReasoningEffort.

# disable a reasoning model's thinking, reliably:
reasoning_effort: "none"   # or "low" / "medium" / "high"
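As a rough illustration of how such a value could end up on the wire, here is a minimal sketch. The `chatRequest` type and `buildRequestBody` helper are hypothetical and are not the code from this PR or from cogito; only the `reasoning_effort` field name comes from the description above. The `omitempty` tag is one simple way to get the behavior the commit message describes, where an empty value leaves the field unset.

```go
package main

import "encoding/json"

// chatRequest is a hypothetical, trimmed-down chat-completions payload.
// Only the reasoning_effort field name is taken from the PR; the rest is illustrative.
type chatRequest struct {
	Model           string `json:"model"`
	ReasoningEffort string `json:"reasoning_effort,omitempty"`
}

// buildRequestBody serializes a request for the given model and configured effort.
// An empty effort is dropped from the JSON entirely, so existing setups see no change.
func buildRequestBody(model, effort string) (string, error) {
	b, err := json.Marshal(chatRequest{Model: model, ReasoningEffort: effort})
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

With `effort` set to `"none"` the body carries `"reasoning_effort":"none"`; with an empty string the key is absent.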

Why

metadata.enable_thinking:false only silences models whose chat template honors that flag. Reasoning models whose GGUF template has no enable_thinking toggle — e.g. LFM2.5 — ignore it and keep emitting <think> blocks. Those models do honor reasoning_effort:"none".

Verified against a live LocalAI + LFM2.5:

| Sent | Result |
| --- | --- |
| metadata.enable_thinking:false | still thinking |
| reasoning_budget:0 / thinking:false | still thinking |
| reasoning_effort:"none" | thinking off ✅ |

Changes

  • types.Config.ReasoningEffort (yaml: reasoning_effort), wired into the main session and the per-agent LLM factory.
  • Bumps github.com/mudler/cogito to the commit adding OpenAIOptions.ReasoningEffort.
  • Test: the configured reasoning_effort reaches the wire (fake OpenAI server, stream + non-stream). The README documents the option next to metadata.
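The "reaches the wire" check can be sketched roughly as follows. This is not the test added in this PR: `sendAndCapture` is a hypothetical helper, the request is posted with plain `net/http` instead of going through cogito, and only the non-streaming path is covered. It shows the general shape of a fake OpenAI-style server that records the `reasoning_effort` value it receives.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/http/httptest"
)

// sendAndCapture starts a fake OpenAI-style server, posts a chat request carrying
// the given effort, and returns the reasoning_effort value the server observed.
func sendAndCapture(effort string) (string, error) {
	seen := make(chan string, 1) // buffered so the handler never blocks

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var payload map[string]any
		got := ""
		if err := json.NewDecoder(r.Body).Decode(&payload); err == nil {
			got, _ = payload["reasoning_effort"].(string)
		}
		seen <- got
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write([]byte(`{"choices":[]}`)) // placeholder response body
	}))
	defer srv.Close()

	body, err := json.Marshal(map[string]any{
		"model":            "test-model", // illustrative model name
		"reasoning_effort": effort,
	})
	if err != nil {
		return "", err
	}

	resp, err := http.Post(srv.URL+"/v1/chat/completions", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	return <-seen, nil
}
```

A real test would drive the session or agent factory with `reasoning_effort` configured and assert on the captured value for both streaming and non-streaming requests.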

Depends on

mudler/cogito#57 (adds OpenAIOptions.ReasoningEffort). The go.mod here pins the cogito branch commit; re-pin to the merged commit once #57 lands.

🤖 Generated with Claude Code

…odel

Add a global `reasoning_effort` config (and Session plumbing) sent as the
OpenAI-standard "reasoning_effort" field on every request, via cogito's new
OpenAIOptions.ReasoningEffort.

Why: `metadata.enable_thinking:false` only silences models whose chat template
honors that flag. Reasoning models whose GGUF template has no enable_thinking
toggle (e.g. LFM2.5) ignore it and keep emitting <think> blocks — but they DO
honor `reasoning_effort:"none"`. This gives a reliable, model-agnostic way to
turn thinking off. Empty leaves it unset (no behavior change).

- types.Config.ReasoningEffort (yaml: reasoning_effort), wired into the main
  session and the per-agent LLM factory.
- bumps cogito to the commit adding OpenAIOptions.ReasoningEffort.
- test: the configured reasoning_effort reaches the wire; README documented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mudler mudler merged commit c810370 into master Jun 5, 2026
2 checks passed