mirror of
https://github.com/nesquena/hermes-webui.git
synced 2026-05-24 10:40:16 +00:00
stage-350: apply Opus SHOULD-FIX — tighten _partial_already_present dedup scope
Opus flagged that PR #2151's cancel-handler partial-dedup loop used a substring check that was too broad: any short prior assistant reply ('OK', 'Here is the answer:') would dedup a longer new partial containing it, silently dropping the partial and resurrecting the #893 data-loss bug. Tightened to only dedup against actual prior _partial=True markers with exact (whitespace-stripped) content match. Three new regression tests added (short-non-partial-prefix-does-not-dedup, exact-partial-match-still- dedups, same-content-non-partial-does-not-dedup). 10/10 partial-cancel tests pass after the fix. Also updated CHANGELOG with the conflict-resolution notes for #2151 vs #2136 and the #2178 test-fix.
This commit is contained in:
@@ -20,6 +20,14 @@
|
||||
|
||||
- **PR #2203** by @dobby-d-elf — Animates the "Activity: X tools" composer footer text while the LLM is using tools — subtle shimmer gradient that stops when tool-calling completes. Highlight color follows the active theme. Reduced-motion and mask-support fallbacks render plain muted Activity text unchanged in unsupported or `prefers-reduced-motion` environments. Also fixes a small flickering/unclickable first "Thinking" block when the user clicks it while the model is still streaming reasoning into it (unrelated to the animation but right next to it on screen).
|
||||
|
||||
### Stage-350 maintainer fixes
|
||||
|
||||
- **`api/streaming.py:_partial_already_present` dedup scope tightening** — Opus SHOULD-FIX-pre-merge on PR #2151. The dedup loop that prevents double-writing a `_partial` marker on `cancel_stream` re-entry used a substring check (`_stripped in _existing or _existing in _stripped`) against any prior assistant message — too broad. Any short prior assistant reply like "OK" or "Here is the answer:" would be a substring of many later partial bodies and could silently drop the new partial, resurrecting the #893 data-loss bug on long sessions. Tightened to: only dedup against actual prior `_partial=True` markers, with exact (whitespace-stripped) content match. Three new regression tests added: (a) short prior non-partial reply does NOT dedup a longer new partial that contains it, (b) exact-content match against a prior `_partial` marker DOES still dedup (re-entry safety), (c) prior assistant message with same content but NOT marked `_partial` does NOT dedup (it's from a completed earlier turn). 10/10 partial-cancel tests pass after the fix.
|
||||
|
||||
- **`api/streaming.py` cancel-handler conflict resolution between #2151 and the already-shipped #2136** — Resolved a semantic merge conflict on the cancel handler. Both PRs added stale-stream ownership guards at the same site. Kept #2136's `_stream_writeback_is_current()` check as the strictly-stronger condition (it also catches the case where the stream rotated to a new stream with a new pending_user_message — #2151's standalone check would have let that case fall through). Adopted #2151's `_emit_cancel_event = False` semantic on the same path so the terminal cancel SSE event isn't emitted in addition to skipping the writeback (otherwise a successful done payload already delivered to the client would be contradicted by a late cancel event). 55/55 tests across both PR suites pass after the resolution.
|
||||
|
||||
- **`tests/test_ollama_model_chip_label_regression.py` updated to match PR #2178's new `allowOllamaFormat` guard** — The existing static-source test asserted on the pre-PR string and was failing CI. Updated the assertion to require the new `allowOllamaFormat &&` guard prefix, with an extended docstring explaining the bug class (`Qwen3.6-35B-A3B`-shaped bare custom-provider model IDs had hyphens stripped to spaces + last letter lowercased by the ollama formatter pre-fix).
|
||||
|
||||
## [v0.51.56] — 2026-05-13 — Release AF (stage-349 — Tier 1 safe slice — reasoning_content whitelist + fork-from-here absolute index + Firefox sidebar scroll + provisional session titles)
|
||||
|
||||
### Added
|
||||
|
||||
+11
-3
@@ -4669,10 +4669,18 @@ def cancel_stream(stream_id: str) -> bool:
|
||||
_partial_already_present = False
|
||||
if _stripped:
|
||||
for _m in _cs.messages:
|
||||
if not isinstance(_m, dict) or _m.get('role') != 'assistant' or _m.get('_error'):
|
||||
# Stage-350 Opus SHOULD-FIX (#2151): only dedup
|
||||
# against actual prior _partial markers from the
|
||||
# same stream, with exact content match. The original
|
||||
# substring check (`_stripped in _existing or
|
||||
# _existing in _stripped`) was too broad — any short
|
||||
# prior assistant reply (e.g. "OK", "Here is the
|
||||
# answer:") becomes a substring of many later partial
|
||||
# bodies and could silently drop the new partial,
|
||||
# resurrecting the #893 data-loss bug on long sessions.
|
||||
if not isinstance(_m, dict) or not _m.get('_partial'):
|
||||
continue
|
||||
_existing = str(_m.get('content') or '').strip()
|
||||
if _existing and (_stripped in _existing or _existing in _stripped):
|
||||
if str(_m.get('content') or '').strip() == _stripped:
|
||||
_partial_already_present = True
|
||||
break
|
||||
if (_stripped or _has_reasoning or _has_tools) and not _partial_already_present:
|
||||
|
||||
@@ -295,3 +295,115 @@ class TestPartialMessageInContext:
|
||||
assert not any('Task cancelled' in c for c in contents), (
|
||||
"Cancel marker with _error=True must be stripped from API context"
|
||||
)
|
||||
|
||||
def test_short_prior_assistant_reply_does_not_dedup_new_partial(self):
|
||||
'''Stage-350 Opus SHOULD-FIX (#2151 follow-up): the partial-dedup loop
|
||||
in cancel_stream must only dedup against actual prior _partial markers
|
||||
with exact content match, not via substring containment against any
|
||||
prior assistant reply.
|
||||
|
||||
The original substring check (`_stripped in _existing or _existing in
|
||||
_stripped`) was too broad — a short prior assistant reply like "OK" or
|
||||
"Here is the answer:" would be a substring of many later partial bodies
|
||||
and silently drop the new partial, resurrecting the #893 data-loss bug.
|
||||
'''
|
||||
from api.models import Session
|
||||
|
||||
# Build a session that already has a short prior assistant reply
|
||||
s = Session(session_id='sess_short_prior', title='Test')
|
||||
s.messages = [
|
||||
{'role': 'user', 'content': 'Question one'},
|
||||
{'role': 'assistant', 'content': 'OK'}, # short reply, NOT _partial
|
||||
{'role': 'user', 'content': 'Question two — please answer fully'},
|
||||
]
|
||||
|
||||
# Simulate what cancel_stream does for the dedup check.
|
||||
# The new partial would be "OK, let me think about this..."
|
||||
# — "OK" appears as a substring of this. Under the OLD substring
|
||||
# check, this would have set _partial_already_present=True and
|
||||
# dropped the new partial. Under the NEW exact-match-against-_partial
|
||||
# check, no prior _partial exists, so the loop should NOT short-circuit.
|
||||
|
||||
new_partial_content = 'OK, let me think about this carefully...'
|
||||
_stripped = new_partial_content.strip()
|
||||
|
||||
# Inline the new dedup logic (matches api/streaming.py:4669-4685):
|
||||
_partial_already_present = False
|
||||
if _stripped:
|
||||
for _m in s.messages:
|
||||
if not isinstance(_m, dict) or not _m.get('_partial'):
|
||||
continue
|
||||
if str(_m.get('content') or '').strip() == _stripped:
|
||||
_partial_already_present = True
|
||||
break
|
||||
|
||||
assert _partial_already_present is False, (
|
||||
"Tightened dedup must NOT consider 'OK' (a non-partial prior "
|
||||
"assistant reply) as deduping a longer new partial that contains it. "
|
||||
"Without the tightening, the substring check `_stripped in _existing "
|
||||
"or _existing in _stripped` would have falsely matched."
|
||||
)
|
||||
|
||||
def test_exact_partial_match_still_dedups(self):
|
||||
'''Stage-350 Opus SHOULD-FIX (#2151 follow-up): the tighter dedup
|
||||
still correctly deduplicates a partial that is being persisted twice
|
||||
with exactly the same content (e.g. cancel_stream re-entered for the
|
||||
same stream id after STREAMS_LOCK is released).
|
||||
'''
|
||||
from api.models import Session
|
||||
|
||||
s = Session(session_id='sess_exact_dedup', title='Test')
|
||||
s.messages = [
|
||||
{'role': 'user', 'content': 'Hello'},
|
||||
# Prior _partial marker with exact same content as the incoming one
|
||||
{'role': 'assistant', 'content': 'Partial reply text', '_partial': True},
|
||||
]
|
||||
|
||||
_stripped = 'Partial reply text'
|
||||
|
||||
_partial_already_present = False
|
||||
if _stripped:
|
||||
for _m in s.messages:
|
||||
if not isinstance(_m, dict) or not _m.get('_partial'):
|
||||
continue
|
||||
if str(_m.get('content') or '').strip() == _stripped:
|
||||
_partial_already_present = True
|
||||
break
|
||||
|
||||
assert _partial_already_present is True, (
|
||||
"Exact-content match against a prior _partial marker must still "
|
||||
"dedup so cancel_stream re-entry doesn't double-write the partial."
|
||||
)
|
||||
|
||||
def test_non_partial_assistant_with_same_content_does_not_dedup(self):
|
||||
'''Stage-350 Opus SHOULD-FIX (#2151 follow-up): even if a prior
|
||||
assistant message has exactly the same content, if it isn't marked
|
||||
_partial, it does NOT dedup the new partial. This is correct: the
|
||||
prior message was a completed turn from an earlier conversation,
|
||||
and the new _partial belongs to the current cancelled stream.
|
||||
'''
|
||||
from api.models import Session
|
||||
|
||||
s = Session(session_id='sess_nondiluted', title='Test')
|
||||
s.messages = [
|
||||
{'role': 'user', 'content': 'Hello'},
|
||||
# Same content but NOT _partial — this is a completed prior turn
|
||||
{'role': 'assistant', 'content': 'Hi there'},
|
||||
]
|
||||
|
||||
_stripped = 'Hi there'
|
||||
|
||||
_partial_already_present = False
|
||||
if _stripped:
|
||||
for _m in s.messages:
|
||||
if not isinstance(_m, dict) or not _m.get('_partial'):
|
||||
continue
|
||||
if str(_m.get('content') or '').strip() == _stripped:
|
||||
_partial_already_present = True
|
||||
break
|
||||
|
||||
assert _partial_already_present is False, (
|
||||
"A prior assistant message with same content but NOT _partial "
|
||||
"must not dedup the new partial — it's from a completed earlier turn."
|
||||
)
|
||||
|
||||
|
||||
Reference in New Issue
Block a user