Opus pre-release follow-ups for PR #1445

REQUIRED:
- _fully_unquote_path range(3) -> range(10) — defense-in-depth so quadruple-
  encoded .. is rejected by validator instead of slipping through (not
  exploitable but contract violation)
- docs/EXTENSIONS.md trust-model callout moved to top of file with explicit
  'don't enable in untrusted env / don't point at user-writable dir' guidance

NICE-TO-HAVE (taken since Nathan asked for all fixes big and small):
- URL list cap at _MAX_URL_LIST=32 to avoid pathological rendering
- One-shot WARNING log for rejected URLs (silent drop now visible to admin)
- One-shot WARNING log for URL list truncation
- MIME map: ttf (font/ttf), otf (font/otf), wasm (application/wasm)

5 regression tests in tests/test_pr1445_opus_followups.py pin all invariants.
This commit is contained in:
nesquena-hermes
2026-05-02 03:49:40 +00:00
parent 73cb3c1948
commit 4ee9368464
4 changed files with 237 additions and 5 deletions
+6 -2
View File
@@ -5,8 +5,12 @@
### Added
- **Opt-in WebUI extension hooks** (#1445) — adds a deliberately-small, self-hosted extension surface for administrators who want to inject local CSS/JS into the WebUI shell without forking the core repo. Disabled by default; activates only when `HERMES_WEBUI_EXTENSION_DIR` points to an existing directory. Three env vars expose the surface: `HERMES_WEBUI_EXTENSION_DIR` (filesystem root for served assets), `HERMES_WEBUI_EXTENSION_SCRIPT_URLS` (comma-separated same-origin script URLs to inject before `</body>`), `HERMES_WEBUI_EXTENSION_STYLESHEET_URLS` (same-origin stylesheet URLs to inject before `</head>`). New `/extensions/...` static route is auth-gated (NOT in `PUBLIC_PATHS`, unlike `/static/...`) so administrator-supplied code only runs for authenticated sessions. URL validation rejects external schemes, protocol-relative URLs, fragments, traversal (raw + percent-encoded + double-encoded), control characters, quotes, and angle brackets. Filesystem serving sandboxes paths under the configured root via `Path.resolve()` + `relative_to()`, rejects dotfiles, dot-directories, encoded backslashes, and symlink escapes. CSP unchanged — extensions live at same origin so existing `'self'` directive covers them. 7 regression tests in `tests/test_extension_hooks.py` pin the disabled-by-default contract, URL validation against external/protocol-relative/javascript:/data:/API/encoded-traversal, HTML escaping during injection, the auth-gate vs public-static distinction, sandboxed static serving, fail-closed when disabled or unreadable, and symlink-escape rejection. Documentation in `docs/EXTENSIONS.md` (204 lines) covers extension authoring guidance for SPA-style additions, including avoiding destructive DOM mutations like replacing `main.innerHTML`. **Trust model**: extensions are intentionally administrator-controlled — JS injected this way runs in the WebUI origin and can call any authenticated API the logged-in browser session can. The PR explicitly does NOT introduce remote extension loading, a plugin marketplace, Python plugin execution, manifests, a browser-facing config endpoint, or new dependencies. (`api/extensions.py`, `api/routes.py`, `docs/EXTENSIONS.md`, `tests/test_extension_hooks.py`, `README.md`) @ryansombraio — PR #1445
### Fixed (stage-merge)
- Test-only fix: `tests/test_extension_hooks.py::test_extension_route_remains_behind_webui_auth` was using `SimpleNamespace(path=...)` without a `.query` attribute, but `api/auth.py:check_auth` (since v0.50.258's multi-param `?next=` encoding fix) accesses `parsed.query` when constructing the redirect Location. Added `query=""` to the test's namespace and updated the expected Location to include the encoded `?next=` parameter. No behavior change.
### Fixed (Opus pre-release advisor)
- **`_fully_unquote_path` iteration cap raised from 3 to 10** — Opus advisor noted that quadruple-encoded `..` (`%2525252e%2525252e`) collapsed to `%2e%2e` after 3 iterations and slipped through the URL-injection validator. Not exploitable in practice (downstream `Path` doesn't decode `%2e` either, so the literal directory `%2e%2e` won't exist) but the validator's documented contract is "URLs must point to `/extensions/` or `/static/`," and a malformed URL that's neither cleanly that nor cleanly rejected violates the contract. Iteration cap is now 10 (URL strings stabilize in <5 iterations in practice; the cap is defensive). (`api/extensions.py`)
- **Trust-model callout at top of `docs/EXTENSIONS.md`** — moved the strongest trust-model warning ("extensions execute with full WebUI session authority") from the middle of the doc to a blockquote callout at the top, right after the lead paragraph. A casual operator skimming for "should I enable this?" now sees the hard truth before the friendly intro. Also adds explicit "do not point `HERMES_WEBUI_EXTENSION_DIR` at a user-writable directory" guidance. (`docs/EXTENSIONS.md`)
- **URL list cap (32 entries) + reject-URL logging** — caps configured URL lists at 32 entries to avoid pathological page rendering when a misconfigured env var ships thousands of URLs. Also logs a one-shot warning per process for each rejected URL (e.g. when an admin typos `https://...` and the validator drops it as external) so the silent-failure mode of "extension just doesn't load" produces a log signal an admin can find. (`api/extensions.py`)
- **MIME map expansion** — adds `ttf` (`font/ttf`), `otf` (`font/otf`), and `wasm` (`application/wasm`) to the served-MIME table. `.wasm` specifically would fail to instantiate in Chrome served as `text/plain`; the others are ergonomic for older font formats. (`api/extensions.py`)
- **5 regression tests** in `tests/test_pr1445_opus_followups.py` pin the new invariants: quadruple-encoded `..` collapses correctly, the same URL is now rejected by the validator, URL list caps at the configured max with a warning log, rejected URLs log exactly once per process, and the expanded MIME map serves `.ttf`/`.otf`/`.wasm` with the correct Content-Type without charset suffixes for binary types. (`tests/test_pr1445_opus_followups.py`)
## [v0.50.264] — 2026-05-02
+47 -3
View File
@@ -6,6 +6,7 @@ It is disabled by default and never executes or fetches third-party URLs.
"""
import html
import logging
import os
from pathlib import Path
from typing import Dict, List, Optional
@@ -13,6 +14,18 @@ from urllib.parse import unquote, urlsplit
from api.helpers import _security_headers, j
_log = logging.getLogger(__name__)
# Sane bound on configured URLs — real extensions ship 1-3 files. Higher values
# typically indicate a misconfiguration (one giant unsplit string, or a runaway
# generator script that wrote an env-var template without filtering). Capping
# avoids rendering tens of thousands of <script> tags into every page load.
_MAX_URL_LIST = 32
# Tracks rejected URL strings we've already warned about so a misconfigured env
# var doesn't spam the log on every request that re-reads it.
_warned_urls: set = set()
EXTENSION_ROUTE_PREFIX = "/extensions/"
_EXTENSION_DIR_ENV = "HERMES_WEBUI_EXTENSION_DIR"
_EXTENSION_SCRIPT_URLS_ENV = "HERMES_WEBUI_EXTENSION_SCRIPT_URLS"
@@ -32,6 +45,9 @@ _EXTENSION_MIME = {
"webp": "image/webp",
"woff": "font/woff",
"woff2": "font/woff2",
"ttf": "font/ttf",
"otf": "font/otf",
"wasm": "application/wasm",
}
_TEXT_MIME_TYPES = {"text/css", "application/javascript", "text/html", "image/svg+xml", "text/plain"}
@@ -52,9 +68,15 @@ def _extension_root() -> Optional[Path]:
def _fully_unquote_path(path: str) -> str:
"""Decode percent-encoding until stable so encoded dot-segments cannot hide."""
"""Decode percent-encoding until stable so encoded dot-segments cannot hide.
Iterates up to 10 times so even quadruple-encoded inputs like
``%2525252e%2525252e`` collapse to literal ``..`` and are rejected by
the segment-level safety check downstream. URL strings stabilize in
fewer than 5 iterations in practice; the cap is defensive.
"""
previous = path
for _ in range(3):
for _ in range(10):
current = unquote(previous)
if current == previous:
return current
@@ -90,8 +112,30 @@ def _read_url_list(env_name: str) -> List[str]:
urls = []
for item in raw.split(","):
value = item.strip()
if value and _is_safe_asset_url(value):
if not value:
continue
if _is_safe_asset_url(value):
urls.append(value)
if len(urls) >= _MAX_URL_LIST:
# Stop accumulating after the cap. Anything past this point
# would be silently dropped anyway; logging once makes the
# truncation visible to a confused operator.
if env_name not in _warned_urls:
_warned_urls.add(env_name)
_log.warning(
"Extension URL list %s truncated at %d entries",
env_name, _MAX_URL_LIST,
)
break
elif value not in _warned_urls:
# First-time-seen invalid URL: log once per process so a typo
# in HERMES_WEBUI_EXTENSION_*_URLS doesn't disappear silently.
_warned_urls.add(value)
_log.warning(
"Rejected extension URL %r from %s (not a same-origin "
"/extensions/ or /static/ path, or contains unsafe chars)",
value, env_name,
)
return urls
+8
View File
@@ -4,6 +4,14 @@ Hermes WebUI supports a small, opt-in extension surface for self-hosted installs
It lets an administrator serve local static assets and inject same-origin CSS or
JavaScript into the app shell without editing the WebUI source tree.
> **Trust model — read this first.** Extensions execute with full WebUI session
> authority. An extension JS file can call any API the logged-in user can call,
> including reading conversation history, sending messages, modifying settings,
> and triggering tool actions. **Only enable extensions you wrote yourself or
> from sources you trust as much as the WebUI source itself.** If your WebUI is
> shared with users you do not fully trust, do not enable extensions.
> Do not point `HERMES_WEBUI_EXTENSION_DIR` at a user-writable directory.
This is intentionally not a plugin marketplace or dependency system. It is a
safe escape hatch for local dashboards, internal tooling, and workflow-specific
panels that should not live in core Hermes WebUI.
+176
View File
@@ -0,0 +1,176 @@
"""Opus pre-release follow-up tests for stage-265 (PR #1445 extension hooks).
These tests pin the defense-in-depth additions from the Opus advisor review:
- `_fully_unquote_path` iterates up to 10 times (catches quadruple-encoded ..)
- `_read_url_list` caps at `_MAX_URL_LIST` (32) entries
- `_read_url_list` logs once per rejected URL
- MIME map covers `ttf`, `otf`, `wasm` for modern font/wasm assets
"""
import logging
from pathlib import Path
from types import SimpleNamespace
class FakeHandler:
def __init__(self):
self.status = None
self.headers = {}
self.sent_headers = []
self.body = bytearray()
self.wfile = self
def send_response(self, status):
self.status = status
def send_header(self, name, value):
self.sent_headers.append((name, value))
def end_headers(self):
pass
def write(self, data):
self.body.extend(data)
def header(self, name):
for key, value in self.sent_headers:
if key.lower() == name.lower():
return value
return None
def test_fully_unquote_handles_quadruple_encoded(monkeypatch):
"""Quadruple-encoded `..` (`%2525252e%2525252e`) must collapse to literal
`..` so the segment-level safety check rejects it. The original 3-iteration
cap stopped at `%2e%2e` and would have accepted the URL into the validator.
"""
from api.extensions import _fully_unquote_path
# Plain percent-encoding stops at `..` after 1 unquote
assert _fully_unquote_path("/extensions/%2e%2e/api/session") == "/extensions/../api/session"
# Double-encoded after 2 unquotes
assert _fully_unquote_path("/extensions/%252e%252e/api/session") == "/extensions/../api/session"
# Triple-encoded after 3 unquotes
assert _fully_unquote_path("/extensions/%25252e%25252e/api/session") == "/extensions/../api/session"
# Quadruple-encoded after 4 unquotes — the case that slipped through the
# original `range(3)` and reached the validator unchanged
assert _fully_unquote_path("/extensions/%2525252e%2525252e/api/session") == "/extensions/../api/session"
def test_quadruple_encoded_traversal_url_now_rejected(tmp_path, monkeypatch):
"""End-to-end: quadruple-encoded `..` in a configured URL is rejected
by the validator instead of slipping through. Pre-Opus this passed.
"""
root = tmp_path / "extensions"
root.mkdir()
monkeypatch.setenv("HERMES_WEBUI_EXTENSION_DIR", str(root))
monkeypatch.setenv(
"HERMES_WEBUI_EXTENSION_SCRIPT_URLS",
"/extensions/%2525252e%2525252e/api/session, /extensions/legit.js",
)
from api.extensions import get_extension_config
config = get_extension_config()
# Only the legit URL should pass validation; the quadruple-encoded
# traversal must be filtered out
assert config["script_urls"] == ["/extensions/legit.js"]
def test_url_list_caps_at_max(tmp_path, monkeypatch):
"""Configured URL lists cap at _MAX_URL_LIST entries to avoid pathological
rendering when a misconfigured env var ships thousands of URLs.
"""
root = tmp_path / "extensions"
root.mkdir()
monkeypatch.setenv("HERMES_WEBUI_EXTENSION_DIR", str(root))
# Build 100 valid URLs
urls = ", ".join(f"/extensions/script{i}.js" for i in range(100))
monkeypatch.setenv("HERMES_WEBUI_EXTENSION_SCRIPT_URLS", urls)
from api.extensions import get_extension_config, _MAX_URL_LIST
config = get_extension_config()
assert len(config["script_urls"]) == _MAX_URL_LIST
# First N kept (insertion order)
assert config["script_urls"][0] == "/extensions/script0.js"
assert config["script_urls"][-1] == f"/extensions/script{_MAX_URL_LIST - 1}.js"
def test_url_list_logs_rejected_urls_once(tmp_path, monkeypatch, caplog):
"""A misconfigured URL must produce a one-shot warning so an admin who
typos `https://...` (rejected as external) sees a signal in logs instead
of just a silently-not-loading extension.
"""
root = tmp_path / "extensions"
root.mkdir()
monkeypatch.setenv("HERMES_WEBUI_EXTENSION_DIR", str(root))
monkeypatch.setenv(
"HERMES_WEBUI_EXTENSION_SCRIPT_URLS",
"https://evil.example.com/x.js, /extensions/legit.js",
)
# Reset the per-process warning cache so the test doesn't accidentally
# depend on state from other tests in the same run
from api.extensions import _warned_urls
_warned_urls.clear()
caplog.set_level(logging.WARNING, logger="api.extensions")
from api.extensions import get_extension_config
config = get_extension_config()
assert config["script_urls"] == ["/extensions/legit.js"]
# The external URL must surface as a warning in the log
assert any(
"Rejected extension URL" in record.message
and "evil.example.com" in record.message
for record in caplog.records
)
# Second call within the same process must not re-log (one-shot)
caplog.clear()
config2 = get_extension_config()
assert config2["script_urls"] == ["/extensions/legit.js"]
rejection_records = [
r for r in caplog.records
if "Rejected extension URL" in r.message and "evil.example.com" in r.message
]
assert rejection_records == [], (
"Repeated invalid URL should not re-log on every config read"
)
def test_expanded_mime_map_serves_fonts_and_wasm(tmp_path, monkeypatch):
"""`ttf`, `otf`, and `wasm` extensions must serve with the right
Content-Type so browsers don't reject (especially `.wasm`, which Chrome
refuses to instantiate when served as `text/plain`).
"""
root = tmp_path / "extensions"
root.mkdir()
(root / "font.ttf").write_bytes(b"fake ttf binary")
(root / "font.otf").write_bytes(b"fake otf binary")
(root / "module.wasm").write_bytes(b"\x00asm" + b"\x01" * 8)
monkeypatch.setenv("HERMES_WEBUI_EXTENSION_DIR", str(root))
from api.extensions import serve_extension_static
ttf = FakeHandler()
assert serve_extension_static(ttf, SimpleNamespace(path="/extensions/font.ttf")) is True
assert ttf.status == 200
assert ttf.header("Content-Type") == "font/ttf"
otf = FakeHandler()
assert serve_extension_static(otf, SimpleNamespace(path="/extensions/font.otf")) is True
assert otf.status == 200
assert otf.header("Content-Type") == "font/otf"
wasm = FakeHandler()
assert serve_extension_static(wasm, SimpleNamespace(path="/extensions/module.wasm")) is True
assert wasm.status == 200
assert wasm.header("Content-Type") == "application/wasm"
# Binary types must NOT have a charset suffix
assert "charset" not in wasm.header("Content-Type")