fix: propagate context reset to LLM service#276
Conversation
Sets run_llm=True on LLMMessagesUpdateFrame when resetting context. This ensures the aggregator pushes the fresh context upstream to the LLM, preventing the LLM from using its internal cached history.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adjusts how LLM context frames are queued during node updates, distinguishing between update and append frame types with different constructor arguments.
Changes:
- Replaces the conditional frame type selection with an explicit if/else branch.
- Passes
run_llm=TrueforLLMMessagesUpdateFrameand omits it forLLMMessagesAppendFrame.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Updates both LLMMessagesUpdateFrame and LLMMessagesAppendFrame to include run_llm=True. This ensures that any context change (reset or append) is immediately propagated to the LLM service, maintaining consistency across all flow transitions.
0ddde2a to
6510a0e
Compare
|
@markbackman Can you review? Thanks |
|
Thanks for the PR! I don't think this change is needed, and I think it would cause a regression. The reset context already reaches the LLM today. After For edge functions, the function result is intentionally sent with I ran an example and confirmed |
|
Thanks for the detailed explanation and for taking the time to review! You are right that for standard transitions with However, the specific failure mode I'm encountering happens when Here is the exact blind spot in
For standard turn-based text bots using stateless models (like For streaming realtime services, the bug is critical. Audio frames ( Proposed Compromise We could pass # manager.py (in _update_llm_context)
frame_type = LLMMessagesUpdateFrame if is_reset else LLMMessagesAppendFrame
frames.append(frame_type(messages=messages, run_llm=respond_immediately))
# Remove the separate LLMRunFrame push in _set_nodeThis ensures the LLM receives exactly one state update immediately, preventing the double-inference issue while keeping streaming models in sync and respecting the |
|
Thanks for digging into this, and for confirming the double-inference point. On the In practice If you have a concrete realtime repro where a node transition leaves the session stale, please open a separate issue with a minimal example and we'll take a look. For this PR I'll keep it closed. Thanks again for the thoughtful investigation! |
When using
ContextStrategy.RESET,FlowManagersends anLLMMessagesUpdateFrame. However, it currently omits therun_llm=Trueflag. In the Pipecat pipeline, theLLMContextAggregatoronly pushes the updated context downstream if this flag is set. Consequently, the LLM service continues using its internal cached history, making the reset ineffective.This PR sets
run_llm=Trueon the update frame to ensure the cleared context is properly synchronized with the LLM service.