feat: add extra_body parameter to LLM for provider-specific options#130
Conversation
Zhipu GLM-4.7 enables thinking/reasoning mode by default, which wastes
~10x tokens on internal reasoning before producing actual content. The
library only reads delta.content from streaming chunks, so when the
token budget is exhausted during reasoning, the response can be empty.
Adding extra_body pass-through to the OpenAI client allows users to
send provider-specific parameters like {"enable_thinking": false} to
disable reasoning mode and get direct responses.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: Organization UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
Summary by CodeRabbitRelease Notes
WalkthroughThis pull request introduces an optional Estimated code review effort🎯 1 (Trivial) | ⏱️ ~4 minutes 🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary
extra_bodyparameter toLLMandLLMExecutor, passed through toclient.chat.completions.create(){"enable_thinking": False}for Zhipu GLM-4.7)Problem
Zhipu's GLM-4.7 is a thinking/reasoning model that enables reasoning mode by default. This causes two issues:
reasoning_content) before producing actual content (content), wasting API creditsdelta.contentcan be empty, causing translation failuresSolution
Pass-through
extra_bodydict to the OpenAI client so users can disable thinking mode:This is a generic solution that works for any provider-specific parameter, not just Zhipu.
Changes
epub_translator/llm/core.py: Addextra_bodyparameter toLLM.__init__epub_translator/llm/executor.py: Addextra_bodytoLLMExecutor.__init__and pass tochat.completions.create()