fix(bootstrap): add --foreground mode for process supervisors (#1458 Bug #1)

Issue #1458 reports persistent-host crashes (≥1/day) when running the WebUI
under launchd KeepAlive on macOS. Root cause: `bootstrap.py` calls
`subprocess.Popen([python, "server.py"], start_new_session=True)`, probes
/health, then exits 0. Under any process supervisor (launchd, systemd,
supervisord, runit, s6), the supervisor sees its tracked PID exit, marks
the program as "completed," and respawns it. The new bootstrap fails to
bind port 8787 (orphaned server still has it), exits non-zero, supervisor
respawns again — loop until the orphan crashes for some other reason and
the next respawn finds the port free.

This PR addresses Bug #1 of the three failure modes tracked in #1458:
the `bootstrap.py` double-fork breaking process supervisors. Bug #2
(state.db FD leak) and Bug #3 (HTTP-unhealthy wedge) remain open under
the same issue — they need diagnosis data before a fix can land.

Changes
-------

1. `bootstrap.py`:
   - New `--foreground` argparse flag with help text mentioning launchd /
     systemd / supervisord.
   - New `_detect_supervisor()` that returns the env var name for any
     supervisor it detects: `INVOCATION_ID` / `JOURNAL_STREAM` /
     `NOTIFY_SOCKET` (systemd, s6), `XPC_SERVICE_NAME` (launchd),
     `SUPERVISOR_ENABLED` (supervisord), or `HERMES_WEBUI_FOREGROUND` for
     the explicit user opt-in. Truthy values for the explicit opt-in:
     `1` / `true` / `yes` / `on` (case-insensitive).
   - `main()` branches on `args.foreground or _detect_supervisor()`:
     - **Foreground path:** chdir to `agent_dir or REPO_ROOT`, then
       `os.execv(python, [python, server_path])` to replace the bootstrap
       process image with the server. The supervisor sees the long-lived
       server as the original child. No `wait_for_health` probe — the
       supervisor's KeepAlive / Restart=on-failure handles liveness.
     - **Default path:** unchanged. Spawn server as detached child via
       `Popen + start_new_session=True`, probe /health, return 0. This
       still works for interactive `bash start.sh` invocations.
   - Resolved env vars (HOST/PORT/STATE_DIR/AGENT_DIR) are now mutated on
     `os.environ` directly instead of into a local `env` copy so they
     are inherited across `os.execv`.

2. `docs/supervisor.md` (new): runnable launchd plist, systemd .service,
   and supervisord conf examples + a diagnostic recipe (`lsof` + ppid
   chain) for catching the orphan-loop in production.

3. `.gitignore`: allowlist `docs/supervisor.md` (the directory uses an
   opt-in pattern; matches the existing `!docs/docker.md` precedent).

4. `tests/test_bootstrap_foreground.py` (new): 35 regression tests
   covering the argparse flag, `_detect_supervisor()` behavior across all
   five supervisor env vars, the explicit opt-in's truthy/falsy values,
   and `main()`'s execv-vs-Popen routing decision under each input
   combination. `os.execv` is monkeypatched in the routing tests — we
   pin the structural choice (which call is made, with which args, in
   which cwd, with which env) not the post-exec behavior.

Why this scope and no more
--------------------------

Bug #2 (state.db FD leak) lists 5 candidate paths and asks the reporter
for `lsof -p <pid> | sort | uniq -c | sort -rn | head -20` output to
disambiguate. Until that data lands, any "fix" would be speculative —
explicitly out of scope per the contributor-pickup comment on the issue.

Bug #3 (launchd-running, port-listening, HTTP-unhealthy) was added in
@stefanpieter's reply comment. Diagnosis is in flight; no concrete fix
shape yet. Also out of scope.

Running locally end-to-end verifies the behavior:

```
[bootstrap] Starting Hermes Web UI on http://127.0.0.1:8789 (foreground mode: --foreground)
$ pgrep -af 'server.py'
2997632 /home/.../python /tmp/wt-fix-1458/server.py
$ ps -o ppid -p 2997632
2997581   ← bash that ran bootstrap.py — same PID as the original bootstrap
$ ps -p 2997581 -o cmd
... bootstrap.py ...   ← but exec'd into server.py
```

The same PID that bash forked for `bootstrap.py` is now `server.py`.
A supervisor watching that PID would correctly observe the long-lived
server. No double-fork.

Verification
------------

- 3811 tests pass (`pytest tests/` — full suite, +51 from this PR plus
  master-merge-in)
- All 35 new bootstrap-foreground tests pass
- `bash scripts/run-browser-tests.sh` PASS (HTTP API checks against worktree)
- `bash scripts/webui_qa_agent.sh 8789` PASS (23/23 visual QA)
- Live verified: server starts cleanly under both `--foreground` and
  `HERMES_WEBUI_FOREGROUND=1`; PID lineage confirms no double-fork

Closes #1458 (Bug #1 only). Bugs #2 and #3 remain tracked under the
issue.
This commit is contained in:
Hermes Bot
2026-05-02 17:37:54 +00:00
parent 5650d1107a
commit f84b6a4e2f
4 changed files with 648 additions and 9 deletions
+1
View File
@@ -41,6 +41,7 @@ docs/*
!docs/ui-ux/
!docs/ui-ux/**
!docs/docker.md
!docs/supervisor.md
# Local-only PR review harness: rendering drivers, sample bank, fixtures.
# Used by Claude during deep reviews; never shared in the repo.
+82 -9
View File
@@ -208,9 +208,54 @@ def parse_args() -> argparse.Namespace:
action="store_true",
help="Fail instead of attempting the official Hermes installer.",
)
parser.add_argument(
"--foreground",
action="store_true",
help=(
"Run server.py in this process (via os.execv) instead of spawning a "
"child. Use this under launchd / systemd / supervisord so the "
"supervisor sees the long-lived server as the original child. "
"Implies --no-browser. Skips the post-launch health probe — the "
"supervisor's own KeepAlive / Restart=on-failure handles liveness."
),
)
return parser.parse_args()
# Env vars whose presence indicates this process was launched by a supervisor
# that wants to manage the server's lifecycle (KeepAlive, Restart=always, etc.).
# When any is set, we auto-promote to --foreground so we don't double-fork.
#
# - INVOCATION_ID systemd (set on every service activation)
# - JOURNAL_STREAM systemd (set when stdio is wired to the journal)
# - NOTIFY_SOCKET systemd Type=notify, s6 sd_notify-style
# - XPC_SERVICE_NAME launchd (set to the Label of the running plist)
# - SUPERVISOR_ENABLED supervisord
# - HERMES_WEBUI_FOREGROUND explicit user opt-in (=1 / true / yes / on)
_SUPERVISOR_ENV_VARS = (
"INVOCATION_ID",
"JOURNAL_STREAM",
"NOTIFY_SOCKET",
"XPC_SERVICE_NAME",
"SUPERVISOR_ENABLED",
)
def _detect_supervisor() -> str | None:
"""Return the name of the detected supervisor env var, or None.
Pure inspection of os.environ — no side effects. Returned name is the env
var that triggered detection, useful for log messages and for tests.
"""
explicit = os.environ.get("HERMES_WEBUI_FOREGROUND", "").strip().lower()
if explicit in ("1", "true", "yes", "on"):
return "HERMES_WEBUI_FOREGROUND"
for name in _SUPERVISOR_ENV_VARS:
if os.environ.get(name):
return name
return None
def main() -> int:
args = parse_args()
ensure_supported_platform()
@@ -229,21 +274,49 @@ def main() -> int:
os.getenv("HERMES_WEBUI_STATE_DIR", str(Path.home() / ".hermes" / "webui"))
).expanduser()
state_dir.mkdir(parents=True, exist_ok=True)
log_path = state_dir / f"bootstrap-{args.port}.log"
env = os.environ.copy()
env["HERMES_WEBUI_HOST"] = args.host
env["HERMES_WEBUI_PORT"] = str(args.port)
env.setdefault("HERMES_WEBUI_STATE_DIR", str(state_dir))
# Mutate os.environ so child (or post-execv) inherits the resolved values.
os.environ["HERMES_WEBUI_HOST"] = args.host
os.environ["HERMES_WEBUI_PORT"] = str(args.port)
os.environ.setdefault("HERMES_WEBUI_STATE_DIR", str(state_dir))
if agent_dir:
env["HERMES_WEBUI_AGENT_DIR"] = str(agent_dir)
os.environ["HERMES_WEBUI_AGENT_DIR"] = str(agent_dir)
server_cwd = str(agent_dir or REPO_ROOT)
server_path = str(REPO_ROOT / "server.py")
# --foreground (or auto-detected supervisor): replace this process with the
# server. The supervisor sees the long-lived server as the original child,
# so KeepAlive / Restart=always / autorestart=true work correctly. No
# health probe — the supervisor's own restart-on-exit handles liveness.
foreground_reason = "--foreground" if args.foreground else _detect_supervisor()
if foreground_reason:
info(
f"Starting Hermes Web UI on http://{args.host}:{args.port} "
f"(foreground mode: {foreground_reason})"
)
try:
os.chdir(server_cwd)
except OSError as exc:
raise RuntimeError(
f"Could not chdir to {server_cwd!r} before exec: {exc}"
) from exc
# os.execv replaces the current process image. Anything after this line
# only runs if execv itself fails (it raises OSError on failure).
os.execv(python_exe, [python_exe, server_path])
# Unreachable — execv either replaces the process or raises.
raise RuntimeError("os.execv returned unexpectedly")
# Default (legacy) path: spawn the server as a detached child, probe
# /health, then return. Suitable for an interactive `bash start.sh` run.
log_path = state_dir / f"bootstrap-{args.port}.log"
info(f"Starting Hermes Web UI on http://{args.host}:{args.port}")
with log_path.open("ab") as log_file:
proc = subprocess.Popen(
[python_exe, str(REPO_ROOT / "server.py")],
cwd=str(agent_dir or REPO_ROOT),
env=env,
[python_exe, server_path],
cwd=server_cwd,
env=os.environ.copy(),
stdout=log_file,
stderr=subprocess.STDOUT,
start_new_session=True,
+207
View File
@@ -0,0 +1,207 @@
# Running Hermes Web UI under a process supervisor
Use a process supervisor (launchd, systemd, supervisord, runit, s6) when you
want the Web UI to start at boot, restart on crash, or be managed alongside
other services.
## TL;DR
Pass ``--foreground`` to ``bootstrap.py`` (or ``bash start.sh``):
```bash
bash start.sh --foreground
```
Or set ``HERMES_WEBUI_FOREGROUND=1`` in the environment. The Web UI will
auto-detect launchd / systemd / supervisord even without the flag, but being
explicit is safer.
## Why ``--foreground`` matters
Without it, ``bootstrap.py`` does this:
1. Spawn ``server.py`` as a detached subprocess (``start_new_session=True``)
2. Probe ``/health`` until the server is up
3. Exit 0
That works for an interactive shell run (``./start.sh`` returns to your
prompt with the server alive in the background). It is **broken** under any
process supervisor: the supervisor sees its tracked PID exit, marks the job
as completed, and respawns ``bootstrap.py``. The respawn fails to bind port
8787 (the orphaned server still has it), exits non-zero, supervisor
respawns again — loop.
In foreground mode, ``bootstrap.py`` does its setup work and then calls
``os.execv`` to replace its own process with ``server.py``. The supervisor
sees the long-lived server as the original child. ``KeepAlive=true`` /
``Restart=always`` work correctly.
## launchd (macOS)
``~/Library/LaunchAgents/com.example.hermes-webui.plist``:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.example.hermes-webui</string>
<key>ProgramArguments</key>
<array>
<string>/bin/bash</string>
<string>/Users/yourname/hermes-webui/start.sh</string>
<string>--foreground</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/yourname/hermes-webui</string>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/Users/yourname/.hermes/webui/launchd-stdout.log</string>
<key>StandardErrorPath</key>
<string>/Users/yourname/.hermes/webui/launchd-stderr.log</string>
<key>EnvironmentVariables</key>
<dict>
<key>HOME</key>
<string>/Users/yourname</string>
<key>PATH</key>
<string>/usr/local/bin:/usr/bin:/bin</string>
</dict>
</dict>
</plist>
```
Load:
```bash
launchctl load ~/Library/LaunchAgents/com.example.hermes-webui.plist
launchctl print gui/$(id -u)/com.example.hermes-webui # check state
```
Reload after editing the plist:
```bash
launchctl unload ~/Library/LaunchAgents/com.example.hermes-webui.plist
launchctl load ~/Library/LaunchAgents/com.example.hermes-webui.plist
```
launchd sets ``XPC_SERVICE_NAME`` automatically, so even without the
``--foreground`` argument the Web UI will auto-promote to foreground mode.
The flag is still recommended as documentation of intent.
## systemd (Linux)
``~/.config/systemd/user/hermes-webui.service``:
```ini
[Unit]
Description=Hermes Web UI
After=network.target
[Service]
Type=simple
WorkingDirectory=%h/hermes-webui
ExecStart=/bin/bash %h/hermes-webui/start.sh --foreground
Restart=on-failure
RestartSec=5
# Optional: route stdout/stderr to journald instead of files
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=default.target
```
Enable + start:
```bash
systemctl --user daemon-reload
systemctl --user enable --now hermes-webui.service
journalctl --user -u hermes-webui.service -f
```
systemd sets ``INVOCATION_ID`` and ``JOURNAL_STREAM`` (when stdio is wired to
the journal), both of which auto-promote to foreground mode.
## supervisord (cross-platform)
``/etc/supervisor/conf.d/hermes-webui.conf``:
```ini
[program:hermes-webui]
command=/bin/bash /home/youruser/hermes-webui/start.sh --foreground
directory=/home/youruser/hermes-webui
user=youruser
autostart=true
autorestart=true
stopsignal=TERM
stopwaitsecs=10
stdout_logfile=/var/log/hermes-webui.out.log
stderr_logfile=/var/log/hermes-webui.err.log
environment=HOME="/home/youruser",PATH="/usr/local/bin:/usr/bin:/bin"
```
Reload + start:
```bash
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl status hermes-webui
```
supervisord sets ``SUPERVISOR_ENABLED``, which auto-promotes to foreground
mode.
## Auto-detected env vars (full list)
These trigger ``--foreground`` behavior even when the flag is not passed:
| Env var | Set by | Notes |
|---|---|---|
| ``INVOCATION_ID`` | systemd | Set on every service activation |
| ``JOURNAL_STREAM`` | systemd | Set when stdio is wired to journald |
| ``NOTIFY_SOCKET`` | systemd ``Type=notify`` / s6 | sd_notify-style notification socket |
| ``XPC_SERVICE_NAME`` | launchd | Set to the plist Label |
| ``SUPERVISOR_ENABLED`` | supervisord | Always set under supervisord |
| ``HERMES_WEBUI_FOREGROUND`` | you | Explicit opt-in; accepts ``1`` / ``true`` / ``yes`` / ``on`` |
If you're running under a supervisor that is not in the list and your tracked
PID keeps exiting, set ``HERMES_WEBUI_FOREGROUND=1`` in the service
environment.
## Diagnostic recipe
If the Web UI keeps getting respawned and you suspect the double-fork loop:
```bash
# Check the running PID for the server
lsof -iTCP:8787 -sTCP:LISTEN
# Get its parent — should be the supervisor itself, NOT init (PID 1)
PID=$(lsof -tiTCP:8787 -sTCP:LISTEN)
ps -p "$PID" -o pid,ppid,cmd
ps -p "$(ps -o ppid= -p "$PID" | tr -d ' ')" -o pid,cmd
```
A healthy foreground-mode setup looks like:
```
PID PPID CMD
12345 6789 /path/to/python /path/to/server.py
6789 1 /sbin/launchd # or /usr/lib/systemd/systemd, etc.
```
If PPID is ``1`` (init) when it should be the supervisor, the orphan-server
loop is happening — re-check that ``--foreground`` (or one of the env vars)
is reaching the process.
+358
View File
@@ -0,0 +1,358 @@
"""
Tests for bootstrap.py --foreground / supervisor auto-detect (issue #1458, Bug #1).
Background
----------
Issue #1458 reports: under launchd / systemd / supervisord with KeepAlive=true
or Restart=always, bootstrap.py exits after spawning the server child via
``Popen + wait_for_health``. The supervisor sees the parent exit, marks the
program as "completed," and respawns it — but the original server child is
still holding port 8787 in a detached process group. The new bootstrap fails
to bind, exits non-zero, supervisor respawns again, loops until something
crashes the orphan and frees the port.
The fix
-------
Add ``--foreground`` flag and supervisor-environment auto-detection. In
foreground mode we replace the current process via ``os.execv`` with the
server, so the supervisor sees the long-lived server as the original child.
The legacy ``Popen + wait_for_health`` path is preserved for interactive
``bash start.sh`` runs.
Coverage
--------
1. ``--foreground`` is a recognized argparse flag
2. ``_detect_supervisor()`` returns None on a clean env
3. ``_detect_supervisor()`` returns the env-var name on each known supervisor
(``INVOCATION_ID`` / ``JOURNAL_STREAM`` / ``NOTIFY_SOCKET`` /
``XPC_SERVICE_NAME`` / ``SUPERVISOR_ENABLED``)
4. ``_detect_supervisor()`` returns ``HERMES_WEBUI_FOREGROUND`` for the
explicit opt-in, accepting ``1``/``true``/``yes``/``on`` (case-insensitive)
5. ``_detect_supervisor()`` ignores ``HERMES_WEBUI_FOREGROUND=0`` /
``=false`` / ``=`` and falls through to env-var probing
6. ``main()`` calls ``os.execv`` (NOT ``subprocess.Popen``) when
``--foreground`` is passed
7. ``main()`` calls ``os.execv`` (NOT ``subprocess.Popen``) when a supervisor
env var is set even without the explicit flag
8. Default ``main()`` path (no flag, clean env) still uses ``Popen``
9. Foreground path chdir's to ``agent_dir or REPO_ROOT`` before execv (matches
the cwd the legacy Popen uses)
10. Foreground path exports ``HERMES_WEBUI_HOST`` / ``HERMES_WEBUI_PORT`` /
``HERMES_WEBUI_AGENT_DIR`` / ``HERMES_WEBUI_STATE_DIR`` to ``os.environ``
so the post-exec server picks them up
11. Foreground path skips ``wait_for_health`` (no client to retry from)
12. ``--foreground`` help text mentions launchd / systemd / supervisord
These tests do NOT actually exec — ``os.execv`` is monkeypatched. We're
pinning the structural choice (which path runs, which cwd, which env) not the
post-exec behavior (which is the OS kernel's job).
"""
from __future__ import annotations
import os
import subprocess
import sys
from pathlib import Path
from unittest.mock import patch
import pytest
REPO_ROOT = Path(__file__).parent.parent
sys.path.insert(0, str(REPO_ROOT))
# ---------- helpers --------------------------------------------------------
@pytest.fixture
def clean_env(monkeypatch):
"""Strip all known supervisor env vars so detection starts from a clean state."""
for name in (
"INVOCATION_ID",
"JOURNAL_STREAM",
"NOTIFY_SOCKET",
"XPC_SERVICE_NAME",
"SUPERVISOR_ENABLED",
"HERMES_WEBUI_FOREGROUND",
):
monkeypatch.delenv(name, raising=False)
@pytest.fixture
def import_bootstrap():
"""Import bootstrap freshly each test to avoid module-level state bleed."""
# bootstrap.py runs ``_load_repo_dotenv()`` at import time; that's idempotent.
if "bootstrap" in sys.modules:
del sys.modules["bootstrap"]
import bootstrap as bs
return bs
# ---------- argparse coverage ---------------------------------------------
class TestForegroundFlag:
def test_foreground_is_recognized_flag(self, import_bootstrap, monkeypatch):
monkeypatch.setattr(sys, "argv", ["bootstrap.py", "--foreground"])
args = import_bootstrap.parse_args()
assert args.foreground is True
assert args.port == import_bootstrap.DEFAULT_PORT # default preserved
assert args.host == import_bootstrap.DEFAULT_HOST
def test_foreground_default_is_false(self, import_bootstrap, monkeypatch):
monkeypatch.setattr(sys, "argv", ["bootstrap.py"])
args = import_bootstrap.parse_args()
assert args.foreground is False
def test_foreground_help_mentions_supervisors(self, import_bootstrap, monkeypatch, capsys):
# argparse prints help and exits — capture and verify content.
monkeypatch.setattr(sys, "argv", ["bootstrap.py", "--help"])
with pytest.raises(SystemExit):
import_bootstrap.parse_args()
out = capsys.readouterr().out
assert "--foreground" in out
assert "launchd" in out
assert "systemd" in out
assert "supervisord" in out
# ---------- _detect_supervisor() ------------------------------------------
class TestDetectSupervisor:
def test_clean_env_returns_none(self, import_bootstrap, clean_env):
assert import_bootstrap._detect_supervisor() is None
@pytest.mark.parametrize("var", [
"INVOCATION_ID",
"JOURNAL_STREAM",
"NOTIFY_SOCKET",
"XPC_SERVICE_NAME",
"SUPERVISOR_ENABLED",
])
def test_each_supervisor_var_triggers(self, import_bootstrap, clean_env, monkeypatch, var):
monkeypatch.setenv(var, "anything-truthy")
assert import_bootstrap._detect_supervisor() == var
@pytest.mark.parametrize("value", ["1", "true", "TRUE", "yes", "Yes", "on", "ON"])
def test_explicit_opt_in_truthy_values(self, import_bootstrap, clean_env, monkeypatch, value):
monkeypatch.setenv("HERMES_WEBUI_FOREGROUND", value)
assert import_bootstrap._detect_supervisor() == "HERMES_WEBUI_FOREGROUND"
@pytest.mark.parametrize("value", ["0", "false", "FALSE", "no", "off", "", " "])
def test_explicit_opt_in_falsy_values_fall_through(self, import_bootstrap, clean_env, monkeypatch, value):
# When HERMES_WEBUI_FOREGROUND is falsy, we should NOT short-circuit on it.
# If no other supervisor var is set, returns None.
monkeypatch.setenv("HERMES_WEBUI_FOREGROUND", value)
assert import_bootstrap._detect_supervisor() is None
def test_explicit_opt_in_takes_precedence_over_supervisor_var(self, import_bootstrap, clean_env, monkeypatch):
# Both set → explicit flag wins (returned name reflects user intent).
monkeypatch.setenv("HERMES_WEBUI_FOREGROUND", "1")
monkeypatch.setenv("INVOCATION_ID", "deadbeef")
assert import_bootstrap._detect_supervisor() == "HERMES_WEBUI_FOREGROUND"
# ---------- main() routing ------------------------------------------------
class TestMainForegroundRouting:
"""Verify which code path main() takes under each input combination.
These are STRUCTURAL tests — they pin which call (execv vs Popen) is made,
not the result. We monkeypatch every external side effect so main() runs
in a hermetic environment.
"""
@pytest.fixture
def stub_main_dependencies(self, monkeypatch, tmp_path):
"""Stub out everything main() calls except the routing decision."""
import bootstrap as bs
monkeypatch.setattr(bs, "ensure_supported_platform", lambda: None)
monkeypatch.setattr(bs, "discover_agent_dir", lambda: tmp_path / "agent")
monkeypatch.setattr(bs, "hermes_command_exists", lambda: True)
monkeypatch.setattr(bs, "discover_launcher_python", lambda *a: "/usr/bin/python3")
monkeypatch.setattr(bs, "ensure_python_has_webui_deps", lambda p: p)
monkeypatch.setattr(bs, "wait_for_health", lambda *a, **kw: True)
monkeypatch.setattr(bs, "open_browser", lambda *a, **kw: None)
monkeypatch.setenv("HERMES_WEBUI_STATE_DIR", str(tmp_path / "state"))
# Make agent_dir exist so chdir doesn't fail.
(tmp_path / "agent").mkdir(parents=True, exist_ok=True)
return bs
def test_default_path_uses_popen(self, stub_main_dependencies, clean_env, monkeypatch):
bs = stub_main_dependencies
monkeypatch.setattr(sys, "argv", ["bootstrap.py", "--no-browser"])
execv_calls = []
popen_calls = []
monkeypatch.setattr(os, "execv", lambda *a: execv_calls.append(a))
class FakePopen:
pid = 12345
def __init__(self, *args, **kwargs):
popen_calls.append((args, kwargs))
monkeypatch.setattr(subprocess, "Popen", FakePopen)
rc = bs.main()
assert rc == 0
assert len(popen_calls) == 1, "Default path should call subprocess.Popen exactly once"
assert len(execv_calls) == 0, "Default path must NOT call os.execv"
def test_foreground_flag_uses_execv(self, stub_main_dependencies, clean_env, monkeypatch):
bs = stub_main_dependencies
monkeypatch.setattr(sys, "argv", ["bootstrap.py", "--foreground"])
execv_calls = []
popen_calls = []
# execv normally replaces the process; we capture+raise SystemExit so
# main() returns control to us instead of falling through to the
# legacy Popen branch.
def fake_execv(path, argv):
execv_calls.append((path, argv))
raise SystemExit(0)
monkeypatch.setattr(os, "execv", fake_execv)
monkeypatch.setattr(os, "chdir", lambda p: None)
def fake_popen(*args, **kwargs):
popen_calls.append((args, kwargs))
return None
monkeypatch.setattr(subprocess, "Popen", fake_popen)
with pytest.raises(SystemExit) as ei:
bs.main()
assert ei.value.code == 0
assert len(execv_calls) == 1, "--foreground must call os.execv exactly once"
assert len(popen_calls) == 0, "--foreground must NOT call subprocess.Popen"
path, argv = execv_calls[0]
assert path == "/usr/bin/python3"
# argv[0] is the program name (convention), argv[1] is the script
assert argv[0] == "/usr/bin/python3"
assert argv[1].endswith("server.py")
@pytest.mark.parametrize("var", [
"INVOCATION_ID",
"JOURNAL_STREAM",
"NOTIFY_SOCKET",
"XPC_SERVICE_NAME",
"SUPERVISOR_ENABLED",
])
def test_supervisor_env_var_auto_promotes_to_execv(self, stub_main_dependencies, clean_env, monkeypatch, var):
bs = stub_main_dependencies
monkeypatch.setattr(sys, "argv", ["bootstrap.py"]) # no --foreground
monkeypatch.setenv(var, "deadbeef")
execv_calls = []
popen_calls = []
def fake_execv(path, argv):
execv_calls.append((path, argv))
raise SystemExit(0)
monkeypatch.setattr(os, "execv", fake_execv)
monkeypatch.setattr(os, "chdir", lambda p: None)
def fake_popen(*args, **kwargs):
popen_calls.append((args, kwargs))
return None
monkeypatch.setattr(subprocess, "Popen", fake_popen)
with pytest.raises(SystemExit):
bs.main()
assert len(execv_calls) == 1, f"{var} must auto-promote to execv"
assert len(popen_calls) == 0, f"{var} must NOT use Popen"
def test_explicit_opt_in_env_auto_promotes_to_execv(self, stub_main_dependencies, clean_env, monkeypatch):
bs = stub_main_dependencies
monkeypatch.setattr(sys, "argv", ["bootstrap.py"]) # no --foreground flag
monkeypatch.setenv("HERMES_WEBUI_FOREGROUND", "1")
execv_calls = []
def fake_execv(path, argv):
execv_calls.append((path, argv))
raise SystemExit(0)
monkeypatch.setattr(os, "execv", fake_execv)
monkeypatch.setattr(os, "chdir", lambda p: None)
monkeypatch.setattr(subprocess, "Popen", lambda *a, **kw: None)
with pytest.raises(SystemExit):
bs.main()
assert len(execv_calls) == 1
class TestForegroundEnvAndCwd:
"""The post-execv server.py inherits os.environ and cwd from us."""
@pytest.fixture
def setup(self, monkeypatch, tmp_path):
import bootstrap as bs
monkeypatch.setattr(bs, "ensure_supported_platform", lambda: None)
agent_dir = tmp_path / "agent"
agent_dir.mkdir()
monkeypatch.setattr(bs, "discover_agent_dir", lambda: agent_dir)
monkeypatch.setattr(bs, "hermes_command_exists", lambda: True)
monkeypatch.setattr(bs, "discover_launcher_python", lambda *a: "/usr/bin/python3")
monkeypatch.setattr(bs, "ensure_python_has_webui_deps", lambda p: p)
monkeypatch.setattr(bs, "wait_for_health", lambda *a, **kw: True)
monkeypatch.setattr(bs, "open_browser", lambda *a, **kw: None)
# State-dir + every var we care about is captured.
monkeypatch.setenv("HERMES_WEBUI_STATE_DIR", str(tmp_path / "state"))
return bs, agent_dir
def test_foreground_chdirs_to_agent_dir_before_exec(self, setup, monkeypatch, clean_env):
bs, agent_dir = setup
monkeypatch.setattr(sys, "argv", ["bootstrap.py", "--foreground", "--host", "127.0.0.1", "9999"])
chdir_calls = []
monkeypatch.setattr(os, "chdir", lambda p: chdir_calls.append(p))
def fake_execv(*a):
raise SystemExit(0)
monkeypatch.setattr(os, "execv", fake_execv)
with pytest.raises(SystemExit):
bs.main()
assert len(chdir_calls) == 1
assert chdir_calls[0] == str(agent_dir)
def test_foreground_exports_resolved_env_vars(self, setup, monkeypatch, clean_env):
bs, agent_dir = setup
monkeypatch.setattr(sys, "argv", [
"bootstrap.py", "--foreground", "--host", "0.0.0.0", "9119"
])
monkeypatch.setattr(os, "chdir", lambda p: None)
def fake_execv(*a):
raise SystemExit(0)
monkeypatch.setattr(os, "execv", fake_execv)
with pytest.raises(SystemExit):
bs.main()
# Post-execv server.py inherits these — verify we set them on os.environ
# (not just a local copy).
assert os.environ["HERMES_WEBUI_HOST"] == "0.0.0.0"
assert os.environ["HERMES_WEBUI_PORT"] == "9119"
assert os.environ["HERMES_WEBUI_AGENT_DIR"] == str(agent_dir)
# state-dir was already set by the fixture; verify it survived.
assert "HERMES_WEBUI_STATE_DIR" in os.environ
def test_foreground_does_not_call_wait_for_health(self, setup, monkeypatch, clean_env):
bs, _ = setup
monkeypatch.setattr(sys, "argv", ["bootstrap.py", "--foreground"])
monkeypatch.setattr(os, "chdir", lambda p: None)
wait_calls = []
monkeypatch.setattr(bs, "wait_for_health", lambda *a, **kw: (wait_calls.append(a), True)[1])
def fake_execv(*a):
raise SystemExit(0)
monkeypatch.setattr(os, "execv", fake_execv)
with pytest.raises(SystemExit):
bs.main()
# In foreground mode there's no parent left to retry from — the
# supervisor's KeepAlive handles it. wait_for_health must not run.
assert len(wait_calls) == 0