test(conftest): block AWS IMDS probing + expand credential-strip allowlist

Two test-infrastructure fixes surfaced while running the full suite on
this branch. Both prevent accidental outbound network calls from the
pytest process. Such calls don't show up as test failures, but they
distort timing, can leak credentials, and caused a recently observed
10× slowdown.

## 1. AWS_EC2_METADATA_DISABLED for the whole pytest session

When hermes-agent's bedrock_adapter / botocore credential chain is
imported during tests (e.g. via api/config.py provider-catalog imports),
botocore probes the EC2 Instance Metadata Service at 169.254.169.254
looking for an instance role. On VPS hosts where IMDS is reachable but
rate-limited (HTTP 429) or non-responsive, those probes dominate wall
time — a 161s test run was observed extending to 600+s.

Set `AWS_EC2_METADATA_DISABLED=true` at module load (before any test-file
imports trigger botocore initialisation). This is the documented AWS-
supported way to silence the probe and matches the guard the agent's own
`hermes_cli/doctor.py` already uses inside its parallel-probe block.

Also explicitly re-set the var on the spawned test-server env so it
can't be accidentally cleared by a later `env.update(...)`.
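
A minimal sketch of the two steps described above, assuming a conftest-style module. The helper names `build_child_env` and `child_sees_guard` are hypothetical and do not appear in the repository. The sketch applies the guard after the overrides are merged, which is one way to ensure no override can clear it.

```python
import os
import subprocess
import sys

# Module level, as in a conftest.py: this runs at import time, before test
# modules (and anything that imports botocore) are collected.
os.environ.setdefault("AWS_EC2_METADATA_DISABLED", "true")


def build_child_env(overrides):
    """Return an env for a spawned process with the IMDS guard applied last.

    Hypothetical helper for illustration only.
    """
    env = os.environ.copy()
    env.update(overrides)
    # Re-set after the update so an override cannot clear or flip the guard.
    env["AWS_EC2_METADATA_DISABLED"] = "true"
    return env


def child_sees_guard(env):
    """Spawn a throwaway interpreter and report the value it inherits."""
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('AWS_EC2_METADATA_DISABLED'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Note that `os.environ.setdefault` leaves an existing value untouched, so a value already exported by the runner is not overridden in the pytest process itself; the explicit assignment in the child env has no such gap.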

## 2. Expanded credential-strip list

The original strip list covered 6 providers (OpenRouter, OpenAI,
Anthropic, Google, DeepSeek, Xiaomi). Several others leaked through
into the test server subprocess:

- `MEM0_API_KEY`, `XAI_API_KEY`, `MISTRAL_API_KEY`, `OLLAMA_API_KEY`,
  `GROQ_API_KEY`, `TOGETHER_API_KEY`, …
- AWS credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`,
  `AWS_SESSION_TOKEN`, `AWS_PROFILE`, `AWS_BEARER_TOKEN_BEDROCK`)
- Messaging bot tokens (`TELEGRAM_BOT_TOKEN`, `DISCORD_BOT_TOKEN`,
  `SLACK_BOT_TOKEN`, `SIGNAL_API_TOKEN`, `WHATSAPP_API_TOKEN`)
- Memory providers (`HONCHO_API_KEY`, `SUPERMEMORY_API_KEY`)
- Search / browser / image-gen (`FIRECRAWL_API_KEY`, `FAL_KEY`,
  `TAVILY_API_KEY`, `SERPER_API_KEY`, `BRAVE_API_KEY`)
- GitHub tokens (`GH_TOKEN`, `GITHUB_TOKEN`)
- Azure OpenAI (`AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`)

A real outbound TLS connection to a provider's IPv6 endpoint was
observed during a test run on this host before the strip was expanded.
The test server uses a mock config and has no business making real API
calls.
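
The prefix-based strip can be sketched as follows. This is an illustration under assumptions, not the code in `conftest.py`: `strip_credentials` is a hypothetical helper, and `CRED_ENV_PREFIXES` here is a short subset of the names listed above.

```python
# Hypothetical subset of the prefixes named above; the real tuple is longer.
CRED_ENV_PREFIXES = (
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
    "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN",
    "GH_TOKEN", "GITHUB_TOKEN",
)


def strip_credentials(env, prefixes=CRED_ENV_PREFIXES):
    """Return a copy of env without any variable whose name starts with a prefix."""
    # str.startswith accepts a tuple, so one call checks every prefix.
    prefixes = tuple(prefixes)
    return {k: v for k, v in env.items() if not k.startswith(prefixes)}
```

Because matching is by prefix, a variant such as `OPENAI_API_KEY_BACKUP` is stripped too, while unrelated variables like `AWS_REGION` pass through unchanged.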

## Test status

5,151 passed / 11 skipped / 1 xfailed / 2 xpassed / 0 regressions in
139s on Python 3.11, down from 147s before the fixes (and from
intermittent 10× slowdowns on IMDS-rate-limited hosts). All API/feature
contracts unchanged.

## Security audit of remaining test-suite host references

Every IP / URL / hostname referenced in `tests/**/*.py` was classified:
- Loopback and unspecified (127.0.0.1, localhost, ::1, 0.0.0.0)
- RFC 1918 private (10.*, 172.16-31.*, 192.168.*)
- RFC 5737 TEST-NET-3 documentation (203.0.113.*)
- RFC 2606 reserved documentation domains (*.example.com,
  *.example.test), plus *.example.local (`.local` is not covered by
  RFC 2606; RFC 6762 reserves it for mDNS)
- Security-attack input strings used only as parser/validator input
  (evil.com, attacker, evil.example.com — never resolved or contacted)
- Real provider/CDN endpoints used only as `base_url` config strings
  or CSP-allowlist assertions — never actually fetched
- 8.8.8.8 used only as a "non-loopback example" in `_is_local_from_handler()`
  unit tests

No suspicious egress destinations.
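
A classification along the lines of the categories above could be sketched with the standard-library `ipaddress` module. The function name `classify_host` and the category labels are hypothetical; the audit itself may have been done differently.

```python
import ipaddress

# Explicit networks rather than ip.is_private, which covers more than RFC 1918.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
TEST_NET_3 = ipaddress.ip_network("203.0.113.0/24")  # RFC 5737
DOC_SUFFIXES = (".example.com", ".example.test", ".example.local")


def classify_host(host):
    """Map a host string to one of the audit categories (hypothetical labels)."""
    host = host.strip().lower()
    if host == "localhost":
        return "loopback"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a hostname.
        if host.endswith(DOC_SUFFIXES):
            return "docs-domain"
        return "other-hostname"
    if ip.is_loopback:
        return "loopback"
    if ip.is_unspecified:
        return "unspecified"
    if ip.version == 4 and any(ip in net for net in RFC1918):
        return "rfc1918"
    if ip.version == 4 and ip in TEST_NET_3:
        return "test-net-3"
    return "public"
```

Anything that lands in `public` or `other-hostname` (for example 8.8.8.8 or evil.com) still needs a manual check that the test only parses the string and never contacts it.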
nesquena-hermes
2026-05-11 04:49:46 +00:00
parent 2dbee503c2
commit 1a2cf2812c
+59 -6
@@ -153,6 +153,25 @@ def pytest_configure(config):
     config.addinivalue_line("markers", "requires_agent_modules: skip when hermes-agent Python modules are not importable")
+# ── Disable AWS IMDS probing for the pytest session ────────────────────────
+# Background: when hermes-agent's bedrock_adapter / botocore credential chain
+# runs during test execution (e.g. provider catalog enumeration triggered by
+# api/config.py imports), botocore probes the EC2 Instance Metadata Service at
+# 169.254.169.254 looking for an instance role. On VPS hosts where IMDS is
+# reachable but rate-limited (HTTP 429) or non-responsive, this dominates wall
+# time and turns a 161s test run into 600+s.
+#
+# Tests have no legitimate reason to call IMDS — the bedrock-related tests use
+# explicit mocks or env-var creds. Setting AWS_EC2_METADATA_DISABLED before
+# anything imports botocore is the supported way to silence the probe (matches
+# the guard the hermes_cli/doctor.py command already uses in its parallel-probe
+# block).
+#
+# Setting this here instead of in a fixture so it lands BEFORE any test-file
+# imports trigger botocore initialisation.
+os.environ.setdefault("AWS_EC2_METADATA_DISABLED", "true")
 # ── Environment isolation for tests ────────────────────────────────────────
 # HERMES_WEBUI_SKIP_ONBOARDING is set by hosting providers (e.g. Agent37) and
 # by some isolated test harnesses to short-circuit the onboarding wizard.
@@ -304,14 +323,48 @@ def test_server():
     # os.environ already set at module level above; no-op here.
     env = os.environ.copy()
-    # Strip real provider keys so test subprocess never inherits production credentials.
-    # The test server uses a mock/isolated config — no real API calls are made.
+    # Strip ANY real credential env var so the test subprocess never inherits
+    # production creds. The test server uses a mock/isolated config — no real
+    # API calls are made, no real OAuth flow runs, no real cloud SDK should
+    # ever be initialised with usable credentials.
+    #
+    # Without this strip, a stray credential left in the runner's env was
+    # observed making outbound TLS to a real provider during test runs.
+    # See investigation notes in pytest-pitfalls SKILL §B.3.
+    _CRED_ENV_PREFIXES = (
+        # LLM providers
+        'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'OPENAI_BASE_URL',
+        'ANTHROPIC_API_KEY', 'ANTHROPIC_AUTH_TOKEN',
+        'GOOGLE_API_KEY', 'GOOGLE_APPLICATION_CREDENTIALS',
+        'DEEPSEEK_API_KEY', 'XIAOMI_API_KEY',
+        'XAI_API_KEY', 'MISTRAL_API_KEY', 'OLLAMA_API_KEY',
+        'GROQ_API_KEY', 'TOGETHER_API_KEY', 'PERPLEXITY_API_KEY',
+        'CEREBRAS_API_KEY', 'COHERE_API_KEY', 'FIREWORKS_API_KEY',
+        'NOUS_API_KEY', 'NOVITA_API_KEY', 'TENCENT_API_KEY',
+        'BIGMODEL_API_KEY', 'GLM_API_KEY', 'STEPFUN_API_KEY',
+        'MINIMAX_API_KEY', 'LM_API_KEY', 'LMSTUDIO_API_KEY',
+        'AZURE_OPENAI_API_KEY', 'AZURE_OPENAI_ENDPOINT',
+        # AWS — must be stripped or botocore probes IMDS / picks up real creds
+        'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'AWS_SESSION_TOKEN',
+        'AWS_PROFILE', 'AWS_BEARER_TOKEN_BEDROCK',
+        # Memory providers, telemetry, dashboards
+        'MEM0_API_KEY', 'HONCHO_API_KEY', 'SUPERMEMORY_API_KEY',
+        # Messaging / gateway
+        'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN', 'SLACK_BOT_TOKEN',
+        'SIGNAL_API_TOKEN', 'WHATSAPP_API_TOKEN',
+        # Browser / image-gen / search
+        'FIRECRAWL_API_KEY', 'FAL_KEY', 'TAVILY_API_KEY',
+        'SERPER_API_KEY', 'BRAVE_API_KEY',
+        # Github tokens (PR/issue tools shouldn't be exercised in tests)
+        'GH_TOKEN', 'GITHUB_TOKEN',
+    )
     for _k in list(env):
-        if any(_k.startswith(p) for p in (
-            'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'ANTHROPIC_API_KEY',
-            'GOOGLE_API_KEY', 'DEEPSEEK_API_KEY', 'XIAOMI_API_KEY',
-        )):
+        if any(_k.startswith(p) for p in _CRED_ENV_PREFIXES):
             del env[_k]
+    # Belt-and-suspenders: keep IMDS disabled in the spawn env too (we set it
+    # at module level above for the pytest process, but make it explicit here
+    # so it's never accidentally cleared by an env.update later).
+    env["AWS_EC2_METADATA_DISABLED"] = "true"
     env.update({
         "HERMES_WEBUI_PORT": str(TEST_PORT),
         "HERMES_WEBUI_HOST": "127.0.0.1",