fix: propagate context reset to LLM service by bhavik-mangla · Pull Request #276 · pipecat-ai/pipecat-flows

bhavik-mangla · 2026-05-29T21:14:17Z

When using ContextStrategy.RESET, FlowManager sends an LLMMessagesUpdateFrame. However, it currently omits the run_llm=True flag. In the Pipecat pipeline, the LLMContextAggregator only pushes the updated context downstream if this flag is set. Consequently, the LLM service continues using its internal cached history, making the reset ineffective.

This PR sets run_llm=True on the update frame to ensure the cleared context is properly synchronized with the LLM service.

Sets run_llm=True on LLMMessagesUpdateFrame when resetting context. This ensures the aggregator pushes the fresh context upstream to the LLM, preventing the LLM from using its internal cached history.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adjusts how LLM context frames are queued during node updates, distinguishing between update and append frame types with different constructor arguments.

Changes:

Replaces the conditional frame type selection with an explicit if/else branch.
Passes run_llm=True for LLMMessagesUpdateFrame and omits it for LLMMessagesAppendFrame.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Updates both LLMMessagesUpdateFrame and LLMMessagesAppendFrame to include run_llm=True. This ensures that any context change (reset or append) is immediately propagated to the LLM service, maintaining consistency across all flow transitions.

bhavik-mangla · 2026-05-30T12:53:37Z

@markbackman Can you review? Thanks

markbackman · 2026-06-03T14:23:33Z

Thanks for the PR! I don't think this change is needed, and I think it would cause a regression.

The reset context already reaches the LLM today. After _update_llm_context() replaces the context via set_messages(), _set_node() queues an LLMRunFrame() that pushes the cleared context downstream, resulting in one inference.

For edge functions, the function result is intentionally sent with run_llm=False so inference is deferred until the transition completes. That gives exactly one inference per transition, not two. Setting run_llm=True on the update/append frame would push the context immediately, and then the existing LLMRunFrame would push it again, causing a double inference.

I ran an example and confirmed ContextStrategy.RESET clears context correctly as-is. If you have a concrete repro where it doesn't, please share it. Otherwise I'd suggest holding off on this change.

bhavik-mangla · 2026-06-03T16:48:18Z

@markbackman

Thanks for the detailed explanation and for taking the time to review! You are right that for standard transitions with respond_immediately=True, the subsequent LLMRunFrame already pushes the LLMContextFrame downstream, which properly triggers the single inference for stateless models. I missed that secondary trigger in my initial diagnosis, and I agree this PR as written would cause a double-inference regression.

However, the specific failure mode I'm encountering happens when respond_immediately: False is used (which is a supported option in NodeConfig), and the impact is especially noticeable when using stateful, streaming WebRTC models (like OpenAIRealtimeLLMService or GeminiMultimodalLiveService), though it affects stateless models like GoogleLLMService too.

Here is the exact blind spot in _set_node (around line 757):

When respond_immediately: False, FlowManager queues the LLMMessagesUpdateFrame(run_llm=False) but intentionally skips queueing the LLMRunFrame.
The LLMContextAggregator receives the update frame, modifies its internal memory, but pushes nothing downstream (because run_llm is False and there is no LLMRunFrame to flush it).
The Result: The LLM service in the pipeline receives zero frames indicating a context change.

For standard turn-based text bots using stateless models (like GoogleLLMService), this seems fine—the aggregator will eventually push the new context when the user next finishes speaking. But it creates a race condition where the LLM's internal state hasn't actually updated yet.

For streaming realtime services, the bug is critical. Audio frames (InputAudioRawFrame) flow continuously to the LLM. Because the LLM service was never notified of the node transition, it continues processing incoming user audio using its old, "dirty" server-side session and old system prompt until something else happens to trigger a context sync.

Proposed Compromise
To guarantee exactly one synchronization event—and to ensure models are updated immediately even if we defer inference—would you be open to a surgical fix in manager.py?

We could pass run_llm=respond_immediately directly on the update frame, and skip queueing the separate LLMRunFrame.

# manager.py (in _update_llm_context)
frame_type = LLMMessagesUpdateFrame if is_reset else LLMMessagesAppendFrame
frames.append(frame_type(messages=messages, run_llm=respond_immediately))

# Remove the separate LLMRunFrame push in _set_node

This ensures the LLM receives exactly one state update immediately, preventing the double-inference issue while keeping streaming models in sync and respecting the respond_immediately flag. Let me know what you think, and I'd be happy to update the PR to reflect this!

markbackman · 2026-06-03T20:01:57Z

Thanks for digging into this, and for confirming the double-inference point.

On the respond_immediately=False case: that behavior is by design. The whole purpose of respond_immediately=False is to skip inference for that turn, so the LLM intentionally doesn't run and the context isn't pushed. Setting run_llm=True and queueing LLMRunFrame() are really the same trigger, and we deliberately do neither here. The new context isn't lost either, since set_messages() has already updated the aggregator, so it syncs on the next user turn.

In practice respond_immediately=False is meant for the first turn of a conversation, where you want the bot to wait for the user to speak first. I can't think of a mid-conversation case where it applies, and even there the behavior above is correct rather than a bug.

If you have a concrete realtime repro where a node transition leaves the session stale, please open a separate issue with a minimal example and we'll take a look. For this PR I'll keep it closed. Thanks again for the thoughtful investigation!

fix: propagate context reset to LLM service

c545eea

Sets run_llm=True on LLMMessagesUpdateFrame when resetting context. This ensures the aggregator pushes the fresh context upstream to the LLM, preventing the LLM from using its internal cached history.

Copilot AI review requested due to automatic review settings May 29, 2026 21:14

Copilot AI reviewed May 29, 2026

View reviewed changes

Comment thread src/pipecat_flows/manager.py Outdated

bhavik-mangla force-pushed the fix/context-reset-propagation branch from 0ddde2a to 6510a0e Compare May 29, 2026 21:37

markbackman closed this Jun 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: propagate context reset to LLM service#276

fix: propagate context reset to LLM service#276
bhavik-mangla wants to merge 2 commits into
pipecat-ai:mainfrom
bhavik-mangla:fix/context-reset-propagation

bhavik-mangla commented May 29, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

bhavik-mangla commented May 30, 2026

Uh oh!

markbackman commented Jun 3, 2026

Uh oh!

bhavik-mangla commented Jun 3, 2026 •

edited

Loading

Uh oh!

markbackman commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

bhavik-mangla commented May 29, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

bhavik-mangla commented May 30, 2026

Uh oh!

markbackman commented Jun 3, 2026

Uh oh!

bhavik-mangla commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

markbackman commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

bhavik-mangla commented Jun 3, 2026 •

edited

Loading