hermes-webui/tests/test_745_code_block_newlines.py
nesquena-hermes 498b51bfc6 v0.50.218: chat bubble overflow, project color picker, blockquote renderer (#1085)
* fix(css): add overflow-wrap:anywhere to chat bubbles — prevents long URL overflow (#1080)

* fix(projects): rename now works via dblclick timer guard + right-click color picker (#1078)

* fix(renderer): block-level constructs inside blockquotes now render

Fenced code blocks, headings, horizontal rules, and ordered lists inside
blockquotes now render correctly. Six related bugs documented in
blockquote-rendering-bugs.md were collapsed into one architectural fix
in renderMd().

Bugs fixed (all 6):

1. Fenced code blocks inside blockquotes -- > prefixes leaked into the
   <pre> body and the blockquote got fragmented around the rendered
   code, sometimes leaving raw <pre>/<div class="pre-header"> as
   visible text.
2. Blank > continuation lines fragmented multi-paragraph blockquotes
   into separate <blockquote> elements with literal > between them.
3. ## headings inside blockquotes rendered as literal "##" text.
4. Numbered lists inside blockquotes rendered as plain prose.
5. Complex blockquote (mixed headings + code + list + inline code)
   collapsed into a monospace blob with raw markdown syntax leaking
   everywhere.
6. Horizontal rules (---) inside blockquotes rendered as literal text.

Root cause:

The per-line passes for fenced code, headings, hr, ordered lists all ran
BEFORE the blockquote handler and could not match lines that started
with >, so by the time blockquote stripping ran those constructs had
already been mishandled.
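
The ordering problem can be reproduced in miniature. The sketch below is
illustrative Python, not the JavaScript in renderMd(); `HEADING` and
`render_headings` are hypothetical stand-ins for a line-anchored heading
pass, assumed here only to show why a leading > defeats the match.

```python
import re

# Hypothetical stand-in for a per-line heading pass anchored at line start.
HEADING = re.compile(r"^(#{1,6}) +(.+)$", re.MULTILINE)


def render_headings(text: str) -> str:
    """Replace Markdown headings with <hN> tags, one line at a time."""
    return HEADING.sub(
        lambda m: f"<h{len(m.group(1))}>{m.group(2)}</h{len(m.group(1))}>",
        text,
    )


print(render_headings("## Title"))    # <h2>Title</h2>
print(render_headings("> ## Title"))  # > ## Title  (unchanged: "^" never reaches "##")
```

Because the pattern is anchored at the start of the line, the quoted
heading passes through untouched and is later emitted as literal text.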

Fix:

A new blockquote pre-pass at the top of renderMd():

- Walks lines fence-aware so > -prefixed lines inside non-blockquote
  code fences (e.g. shell prompts in bash code blocks) are not
  miscaptured as a blockquote.
- Groups consecutive > -prefixed lines, strips the > prefix, and
  recursively calls renderMd() on the stripped content. The recursive
  call handles all block-level constructs (fenced code, headings, hr,
  ordered/unordered lists, nested blockquotes) using the same pipeline.
- Wraps the rendered HTML in <blockquote> and stashes it with a \x00Q
  token. Restored at the very end of renderMd() so no later pass can
  mangle the inner HTML.
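
The three steps above can be sketched as follows. This is a toy Python
model, not the code in ui.js: `render_md`, the `\x00Q<n>\x00` token
layout, and the single paragraph pass standing in for "all later passes"
are assumptions made for illustration.

```python
import re

FENCE = re.compile(r"^\s*(`{3}|~{3})")  # opening or closing code fence
QUOTE = re.compile(r"^>\s?")            # one ">" plus at most one space


def render_md(text: str) -> str:
    """Toy renderer: blockquote pre-pass, then a trivial paragraph pass."""
    stash: list[str] = []
    out: list[str] = []
    group: list[str] = []
    in_fence = False

    def flush() -> None:
        # Render the stripped group recursively, then hide it behind a token.
        if group:
            inner = render_md("\n".join(group))
            stash.append(f"<blockquote>{inner}</blockquote>")
            out.append(f"\x00Q{len(stash) - 1}\x00")
            group.clear()

    for line in text.split("\n"):
        if not in_fence and QUOTE.match(line):
            group.append(QUOTE.sub("", line, count=1))
            continue
        flush()
        if FENCE.match(line):
            in_fence = not in_fence  # ">" inside a top-level fence is code
        out.append(line)
    flush()

    # Stand-in for the remaining passes: wrap non-token blocks in <p>.
    parts = [
        p if p.startswith("\x00Q") else f"<p>{p}</p>"
        for p in "\n".join(out).split("\n\n")
        if p.strip()
    ]
    html = "\n".join(parts)
    # Restore stashed blockquotes last so no later pass can touch them.
    return re.sub(r"\x00Q(\d+)\x00", lambda m: stash[int(m.group(1))], html)


print(render_md("> hello"))  # <blockquote><p>hello</p></blockquote>
```

In this model a blank > continuation line becomes an empty line inside
the group, so a multi-paragraph quote stays in one <blockquote>, and a
> line inside a top-level fence is never grouped.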

The old _applyBlockquotes regex-replace is removed entirely along with
its limited inline branches for nested blockquotes and unordered lists.

Behaviour change:

Blockquotes now produce CommonMark-compliant <p> wrapping for text
content (was: bare text directly inside <blockquote>). The visual
output is the same in browsers but the HTML structure is now standard.

Tests:

- 14 new behavioural tests in tests/test_renderer_js_behaviour.py
  drive the actual renderMd() via node and lock all 6 bug fixes.
- .local-review/test_blockquote_bugs.js -- node harness covering the
  same scenarios, runnable manually for fast iteration.
- 2407/2408 tests pass (1 pre-existing macOS-only failure deselected).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(renderer): entity decode before blockquote pre-pass + CSS margin fix

- Move the &gt;/&lt;/&amp; entity-decode to run at the very top of
  renderMd(), before the blockquote pre-pass. Previously decode() ran
  at line 756 (after the pre-pass at line 697), so LLM output containing
  &gt;-encoded blockquotes was never matched by the pre-pass.

- Add .msg-body blockquote p{margin:0} and .preview-md blockquote p{margin:0}
  so the new CommonMark-compliant <p> wrapping inside blockquotes doesn't
  add extra vertical spacing. Prior shape (bare text) had no default p-margins.

- Add Node-driven tests: TestBlockquoteEntityEncodedInput covers &gt; prefix
  and &gt;-encoded fenced code inside blockquotes.

- Add struct test: TestBlockquotePrePassOrdering::test_entity_decode_runs_before_blockquote_pre_pass
  locks decode < _bq_stash ordering in ui.js.
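
Why the decode has to come first, and how an ordering check of this kind
can work, is sketched below in Python. `decode_entities`,
`count_quote_lines`, and `appears_before` are hypothetical helpers, and
decoding &amp; last is an assumption of this sketch, not a statement
about the order used in ui.js.

```python
import re

QUOTE_LINE = re.compile(r"^>", re.MULTILINE)


def decode_entities(text: str) -> str:
    # &amp; is decoded last so "&amp;gt;" becomes "&gt;", not ">".
    return text.replace("&gt;", ">").replace("&lt;", "<").replace("&amp;", "&")


def count_quote_lines(text: str) -> int:
    """Count lines a ">"-anchored pre-pass would treat as blockquote lines."""
    return len(QUOTE_LINE.findall(text))


def appears_before(src: str, first: str, second: str) -> bool:
    """Structural check: does `first` occur earlier in `src` than `second`?"""
    return src.index(first) < src.index(second)


encoded = "&gt; quoted\n&gt; ~~~\n&gt; code\n&gt; ~~~"
print(count_quote_lines(encoded))                   # 0
print(count_quote_lines(decode_entities(encoded)))  # 4
```

A structural test then only needs to compare two source offsets, for
example `appears_before(src, "<decode marker>", "_bq_stash")`, where the
decode marker is whatever literal identifies the decode step.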

Fixes found during Opus independent review of #1083.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: v0.50.218 release notes, test count 2458, roadmap update

---------

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:08:59 -07:00

109 lines
4.5 KiB
Python

"""
Tests for #745: code blocks losing newlines when not preceded by double blank line.
Root cause: the paragraph-splitter in renderMd() replaced \n with <br> inside
<pre><code> blocks when they were not separated by a double newline from surrounding
text. The fix stashes <pre> blocks (and pre-header divs, mermaid, katex) before
the paragraph split and restores them afterwards.
"""
import re
import subprocess
import sys
import os

UI_JS = os.path.join(os.path.dirname(__file__), '..', 'static', 'ui.js')


def get_ui_js():
    # Read the whole file and close the handle promptly.
    with open(UI_JS, encoding='utf-8') as f:
        return f.read()


class TestCodeBlockNewlinePreservation:

    def test_pre_stash_present(self):
        """The _pre_stash variable must exist in ui.js."""
        src = get_ui_js()
        assert '_pre_stash' in src, "_pre_stash not found in ui.js"

    def test_pre_stash_token_E_used(self):
        """Stash token \\x00E must be used for pre-block stashing."""
        src = get_ui_js()
        assert r'\x00E' in src, r"\x00E stash token not found in ui.js"

    def test_stash_before_paragraph_split(self):
        """_pre_stash must be populated BEFORE the parts=s.split line."""
        src = get_ui_js()
        pre_stash_pos = src.index('_pre_stash=[]')
        split_pos = src.index('const parts=s.split(/\\n{2,}/)')
        assert pre_stash_pos < split_pos, \
            "_pre_stash must be initialised before the paragraph split"

    def test_restore_after_paragraph_split(self):
        """_pre_stash restore must happen AFTER the paragraph map/join line."""
        src = get_ui_js()
        restore_pos = src.index('_pre_stash[+i]')
        split_pos = src.index("}).join('\\n');", src.index('const parts=s.split'))
        assert restore_pos > split_pos, \
            "_pre_stash must be restored after the paragraph split/join"

    def test_paragraph_split_bypasses_stash_tokens(self):
        """The paragraph map must bypass lines that start with \\x00E (pre stash).

        Also accepts a character class like \\x00[EQ] when other stash tokens
        share the same bypass (e.g. \\x00Q for blockquote stash)."""
        src = get_ui_js()
        # The map line must check for \x00E in its bypass condition
        map_line = next(
            l for l in src.splitlines()
            if 'parts.map' in l and '<br>' in l
        )
        assert r'\x00E' in map_line or r'\x00[E' in map_line, (
            r"paragraph map must bypass \x00E stash tokens (literally or as "
            r"part of a character class like \x00[EQ])"
        )

    def test_pre_regex_covers_pre_header_div(self):
        """The stash regex must match <div class=\"pre-header\"> before <pre>."""
        src = get_ui_js()
        # Find the replacement regex used to populate _pre_stash
        stash_block_idx = src.index('_pre_stash=[]')
        stash_block = src[stash_block_idx:stash_block_idx + 400]
        assert 'pre-header' in stash_block, \
            "pre-stash regex must match <div class=\"pre-header\"> wrappers"

    def test_mermaid_covered_by_stash(self):
        """The stash regex must also cover mermaid-block divs."""
        src = get_ui_js()
        stash_block_idx = src.index('_pre_stash=[]')
        stash_block = src[stash_block_idx:stash_block_idx + 400]
        assert 'mermaid-block' in stash_block, \
            "pre-stash regex must cover mermaid-block divs"

    def test_katex_covered_by_stash(self):
        """The stash regex must also cover katex-block divs."""
        src = get_ui_js()
        stash_block_idx = src.index('_pre_stash=[]')
        stash_block = src[stash_block_idx:stash_block_idx + 400]
        assert 'katex-block' in stash_block, \
            "pre-stash regex must cover katex-block divs"

    def test_js_syntax_valid(self):
        """ui.js must pass node --check after the fix."""
        result = subprocess.run(
            ['node', '--check', UI_JS],
            capture_output=True, text=True
        )
        assert result.returncode == 0, \
            f"node --check failed:\n{result.stderr}"

    def test_stash_token_e_not_used_elsewhere(self):
        """\\x00E must appear at least twice in ui.js (push token + restore regex)."""
        src = get_ui_js()
        # r'\x00' is the 4-character literal as written in the JS source,
        # so this finds every position where the text \x00E occurs.
        occurrences = [
            i for i in range(len(src))
            if src[i:i+4] == r'\x00' and i + 4 < len(src) and src[i+4] == 'E'
        ]
        # Allow 2 occurrences: the push token and the restore regex
        # (may be 3 if there's also a comment mentioning it)
        assert len(occurrences) >= 2, \
            r"Expected at least 2 uses of \x00E (push + restore)"