CI's pytest invocation imports conftest twice (once via the standard
tests/ discovery, once via repo-root rootdir discovery), producing two
distinct function objects with the same __qualname__ but different `is`
identity. The strict identity assertion failed because each import
created a fresh closure. Switch to a __qualname__ substring check, which
gives the same guarantee (the default-on state has the wrapper installed;
the fixture restores the real one) without the multi-import sensitivity.
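
A minimal sketch of why the identity check breaks under a double import
while the __qualname__ check holds. The factory below is hypothetical: it
only stands in for "conftest.py executed once", and `wrapper_installed` is
an illustrative helper name, not something taken from the repo.

```python
def make_blocked_create_connection():
    """Hypothetical stand-in for one execution of the conftest module body."""
    def _hermes_blocked_create_connection(address, *args, **kwargs):
        raise OSError(f"outbound to {address!r} blocked")
    return _hermes_blocked_create_connection


# Two "imports" of the same module body yield two distinct function objects.
first_import = make_blocked_create_connection()
second_import = make_blocked_create_connection()


def wrapper_installed(func):
    # Substring check: survives re-imports and any enclosing-scope prefix
    # that Python adds to __qualname__.
    qualname = getattr(func, "__qualname__", "")
    return "_hermes_blocked_create_connection" in qualname


same_object = first_import is second_import
same_qualname = first_import.__qualname__ == second_import.__qualname__
```

Here `same_object` is False while `same_qualname` is True, so an assertion
written against `is` depends on which import installed the wrapper, and
the name-based check does not.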
CI on Python 3.11 still failed test_allow_outbound_network_fixture_*
because the previous module-global toggle (_ALLOW_OUTBOUND=True/False)
was unreliable on the runner — the wrapper's global lookup at call time
sometimes saw False even after the fixture's True assignment.
Switch to monkeypatch-based fixture: instead of toggling a global that
the wrapper checks, restore socket.create_connection and
socket.socket.connect to their REAL captured implementations for the
duration of the test. Pytest's monkeypatch fixture handles teardown so
the wrappers are reinstalled automatically.
Rewrote the two paired tests to check function identity
(socket.create_connection is _hermes_blocked_create_connection vs. is
_REAL_CREATE_CONNECTION) instead of attempting a live outbound connection to
8.8.8.8:53. A direct identity check is hermetic and doesn't depend on
whether the CI runner has any outbound network access at all.
Two low-severity follow-ups from Opus regrounding review:
1. The IPv6 unique-local fc00::/7 check was `h.startswith('fc') or
h.startswith('fd')`, which is too loose. It would also classify hostnames
like 'fcdn.example.com' or 'fdsa.test' as 'local' and silently let
them through the block. Tightened to a regex match for canonical
IPv6 syntax (`f[cd][0-9a-f]{0,2}:`) so only actual IPv6 addresses
match. Same fix in both tests/conftest.py and server.py.
2. test_allow_outbound_network_fixture_unblocks was technically
self-passing: it tried to connect to a *.invalid hostname, which is
in the allow-list, so the real socket.create_connection would run
regardless of whether the fixture toggled the block. Replaced with
a public-IP-based test that actually proves the toggle works, plus
a paired test_block_is_active_outside_the_fixture sanity test that
proves the block is on without the fixture.
Both follow-ups were flagged by the Opus advisor as 'defer-OK', but the
fixes are trivial, so they land in this batch.
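
The difference between the prefix check and the regex from follow-up 1
can be sketched as below. The pattern is the one quoted above; anchoring
it with `re.match`, lowercasing the input, and the function names are
assumptions made for illustration.

```python
import re

# Pattern quoted in the commit message; re.match anchors it at the start.
_ULA_RE = re.compile(r"f[cd][0-9a-f]{0,2}:")


def loose_is_ula(h):
    # Old behaviour: any string starting with 'fc' or 'fd' counts as local.
    return h.startswith("fc") or h.startswith("fd")


def strict_is_ula(h):
    # New behaviour: require hex digits (at most two more) and then a colon,
    # which a DNS hostname cannot provide.
    return _ULA_RE.match(h.lower()) is not None
```

With these definitions, `loose_is_ula("fdsa.test")` is True while
`strict_is_ula("fdsa.test")` is False, and both accept `fc00::1` and
`fd12:3456:789a::1`.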
Tests should not reach the public internet. Before this commit, an
accidentally-leaking outbound socket from the test_server fixture (real
TLS handshakes to Anthropic / Amazon / OpenRouter, sometimes triggered
by SDK-init paths that found a credential the credential-strip allowlist
missed) was adding 60+s of wall-time to a 100s test run and creating a
class of flaky failures.
This installs a default-deny socket-block at two layers:
1. Pytest process, via tests/conftest.py module-level monkey-patch on
socket.create_connection + socket.socket.connect. Loopback / RFC1918
private / link-local / RFC2606 reserved-TLD destinations pass through;
anything else raises OSError("hermes test network isolation: outbound
to ... blocked"). Tests that legitimately need real outbound opt back
in via the new `allow_outbound_network` fixture (no current callers).
2. Test_server subprocess (server.py), via a HERMES_WEBUI_TEST_NETWORK_BLOCK=1
environment-variable-gated guard at the top of server.py. tests/conftest.py
sets the env var on every test_server spawn. Without this, the subprocess
could make outbound connections that the pytest-side block can't see (which
is exactly what was happening: verified via `ss -tnp` showing the server.py
child with ESTAB sockets to [2607:6bc0::10]:443).
In production the env var is unset, so the guard is a no-op.
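
A rough sketch of the two layers described above, under stated
assumptions: the explicit network list, the reserved-TLD tuple, the exact
error text after "outbound to", and the helper names `is_allowed_host`,
`make_blocked_create_connection` and `install_block_if_requested` are
illustrative and not copied from tests/conftest.py or server.py. Only
socket.create_connection is wrapped here; socket.socket.connect would get
the same treatment.

```python
import ipaddress
import os
import socket

# Assumed allow-list: loopback, RFC1918 private, link-local, IPv6 unique-local.
_ALLOWED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "169.254.0.0/16", "::1/128", "fe80::/10", "fc00::/7",
)]
_RESERVED_TLDS = (".test", ".example", ".invalid", ".localhost")  # RFC 2606


def is_allowed_host(host):
    host = host.strip("[]").lower()
    if host == "localhost" or host.endswith(_RESERVED_TLDS):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # default-deny: any other hostname counts as public
    return any(ip in net for net in _ALLOWED_NETWORKS)


def make_blocked_create_connection(real_create_connection):
    def _hermes_blocked_create_connection(address, *args, **kwargs):
        host = address[0]
        if not is_allowed_host(host):
            raise OSError(
                f"hermes test network isolation: outbound to {host} blocked")
        return real_create_connection(address, *args, **kwargs)
    return _hermes_blocked_create_connection


def install_block_if_requested(environ=os.environ):
    # Env-var gate: a no-op when the variable is unset, as in production.
    if environ.get("HERMES_WEBUI_TEST_NETWORK_BLOCK") != "1":
        return False
    socket.create_connection = make_blocked_create_connection(
        socket.create_connection)
    return True
```

Passing the real function into the wrapper factory keeps the sketch
testable with a fake, so the allow/deny decision can be exercised without
opening a socket.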
Companion changes:
- test_dns_resolution_failure refactored to mock socket.getaddrinfo
raising gaierror, instead of relying on a real DNS lookup of a
*.invalid hostname. The test was the one outlier that genuinely
exercised real DNS; mocking matches what every other probe-error test
in the same file already does.
- New tests/test_conftest_network_isolation.py with 9 adversarial
tests proving the block fires for public IPs (including the exact
Anthropic IPv6 and Amazon IPv4 destinations we observed leaking),
the allow-list passes loopback / RFC1918 / link-local / reserved-TLDs,
and the opt-in fixture re-enables real outbound when needed.
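
The getaddrinfo mocking approach from the first companion change might
look like the sketch below. `probe_host`, its return labels, and
`run_dns_failure_case` are hypothetical stand-ins; the actual probe
function and the body of test_dns_resolution_failure are not shown in
this message.

```python
import socket
from unittest import mock


def probe_host(hostname, port=443):
    """Hypothetical probe: resolves the name and reports a coarse status."""
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        return "dns_resolution_failure"
    return "ok"


def run_dns_failure_case():
    # Force the resolver to fail without touching real DNS.
    with mock.patch("socket.getaddrinfo",
                    side_effect=socket.gaierror("mocked DNS failure")) as fake:
        result = probe_host("unreachable.invalid")
    return result, fake.call_count
```

Because the failure is injected at socket.getaddrinfo, the outcome is the
same on a runner with no resolver configured at all.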
Test suite: 5,120 → 5,192 (+72 net new from this commit + the regression
tests in the companion commits). Wall time: 161s → 95s on the same
hardware. No remaining outbound from any test path.