feat: reproducible, scriptable Deploy (tag pin, clone reuse, --yes, distro check)#12

Open
mehowz wants to merge 5 commits into zenon-network:master from ZenonOrg:feat/deploy-robustness

mehowz commented Apr 23, 2026

Summary

Operator-facing polish for the Deploy flow, bundled as one PR because
the four changes (distro check, tag pin, clone reuse, `--yes`)
reinforce each other: together they turn Deploy from an
interactive-only tool into something reproducible and script-friendly.

Stacks on #11 — please review / merge that first. The diff this
PR adds over #11 is 256 insertions / 32 deletions across the three
commits below.

Changes (one commit per logical change)

1. Distro detection — commit `55c931e`

Today the tool assumes `apt`. On RHEL / Arch / NixOS / Alpine the
first apt call fails with a cryptic error and the operator has to
guess. New `_readOsRelease` / `_isDebianFamily` helpers parse
`/etc/os-release` and abort with one actionable message naming the
supported distros and listing the prereqs to install manually
(`git`, `build-essential`, `linux-libc-dev`, `wget`, `go 1.22+`).

Derivatives such as Linux Mint and Raspbian are recognised via
`ID_LIKE`, not just the narrow `ID` check.
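
For reviewers who want the detection logic at a glance, here is a minimal sketch of how `ID` / `ID_LIKE` matching could work. It is not the code in `55c931e`: the function names (`parseOsRelease`, `isDebianFamily`, `readOsRelease`) and the exact matching rules are assumptions made for illustration.

```dart
import 'dart:io';

/// Parses os-release style `KEY=value` lines into a map.
/// Blank lines and `#` comments are skipped; one pair of surrounding
/// quotes is stripped from each value.
Map<String, String> parseOsRelease(String content) {
  final fields = <String, String>{};
  for (final rawLine in content.split('\n')) {
    final line = rawLine.trim();
    if (line.isEmpty || line.startsWith('#')) continue;
    final eq = line.indexOf('=');
    if (eq <= 0) continue;
    var value = line.substring(eq + 1).trim();
    final quoted = value.length >= 2 &&
        ((value.startsWith('"') && value.endsWith('"')) ||
            (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.substring(1, value.length - 1);
    fields[line.substring(0, eq)] = value;
  }
  return fields;
}

/// True when ID, or any space-separated ID_LIKE token, is debian or ubuntu.
bool isDebianFamily(Map<String, String> fields) {
  const supported = {'debian', 'ubuntu'};
  final ids = <String>[
    fields['ID'] ?? '',
    ...(fields['ID_LIKE'] ?? '').split(RegExp(r'\s+')),
  ];
  return ids.any((id) => supported.contains(id.toLowerCase()));
}

/// Reads the os-release file; returns an empty map when it is missing.
Map<String, String> readOsRelease([String path = '/etc/os-release']) {
  final file = File(path);
  return file.existsSync()
      ? parseOsRelease(file.readAsStringSync())
      : <String, String>{};
}
```

With this shape, a Mint host (`ID=linuxmint`, `ID_LIKE="ubuntu debian"`) passes the check while an Arch host (`ID=arch`) does not, which matches the behaviour described above.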

2. Pin go-zenon to a tagged release + reuse existing clone — commit `428d5fd`

Two related reproducibility wins:

  • Tag pin (`znnRefTag = 'v0.0.8'`). Previously `git clone`
    pulled `master` at whatever commit it happened to be at when you
    ran Deploy, so two operators following the same runbook a day apart
    could land different `znnd` binaries. The tag pin makes the output
    deterministic across operators and time. The tag is bumped in
    lockstep with controller releases whenever a new go-zenon tag
    becomes the recommended mainnet version.
  • Clone reuse. If `/root/go-zenon` already exists and
    `remote.origin.url` matches `znnGithubUrl`, the tool now runs
    `git fetch` + `git reset --hard refs/tags/$znnRefTag` instead
    of deleting ~200MB and re-downloading. Foreign directories
    (different remote, scratch files) are still deleted and cloned
    fresh; only a legitimate prior clone is reused. This cuts
    re-deploys from several minutes to seconds when the tag hasn't
    changed.

Clone now uses `--depth 1 --branch $znnRefTag` on first deploy too.
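
The reuse-or-clone decision can be sketched roughly as below. This is an illustration, not the committed code: the helper names are hypothetical, the exact `git fetch` arguments are an assumption (the PR only says `git fetch`), and the URL comparison is a plain string match with no normalisation of trailing `.git` or slashes.

```dart
import 'dart:io';

/// Arguments for a first-time shallow clone pinned to [tag].
/// `git clone --branch` accepts a tag and leaves HEAD detached at it.
List<String> cloneArgs(String url, String dir, String tag) =>
    ['clone', '--depth', '1', '--branch', tag, url, dir];

/// Commands (run with `git -C <dir> ...`) that move an existing clone to
/// [tag] without re-downloading the repository.
List<List<String>> reuseArgs(String tag) => [
      ['fetch', '--depth', '1', 'origin', 'tag', tag],
      ['reset', '--hard', 'refs/tags/$tag'],
    ];

/// True when [dir] is a git checkout whose origin URL equals [expectedUrl].
/// Anything else (no .git, other remote, git error) counts as foreign.
bool isReusableClone(String dir, String expectedUrl) {
  if (!Directory('$dir/.git').existsSync()) return false;
  final result = Process.runSync(
      'git', ['-C', dir, 'config', '--get', 'remote.origin.url']);
  return result.exitCode == 0 &&
      (result.stdout as String).trim() == expectedUrl;
}
```

A caller would check `isReusableClone('/root/go-zenon', znnGithubUrl)` first, run the two `reuseArgs` commands on success, and otherwise delete the directory and run `cloneArgs`.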

3. Non-interactive `--yes --deploy` mode + action flags — commit `c31ff27`

A small argument-parsing block at the top of `main` so the controller
can be driven from Ansible / Terraform / bash scripts without a TTY.

| Flag | Effect |
| --- | --- |
| `--deploy` / `--status` / `--start-service` / `--stop-service` / `--resync` | Skip the menu, go straight to the matching action |
| `-y` / `--yes` | Auto-confirm every prompt (echoes what was answered so the run log shows the choice) |
| `-h` / `--help`, `-v` / `--version` | Info commands, no side effects |

Semantics:

  • All `confirm()` calls in Deploy / Resync now go through
    `_confirmOrDefault`, returning the default in `--yes` mode.
  • Existing-keystore password prompt reads `ZNN_KEYSTORE_PASSWORD`
    when `--yes` is set; aborts with an actionable error if missing,
    and retries are disabled in `--yes` mode to fail fast.
  • Fresh-keystore generation (the common first-deploy path) was
    already unattended-friendly — the existing
    `RandomStringGenerator` fallback produces a password without
    prompting.
  • Passing two action flags (e.g., `--deploy --status`) exits `2`
    with a clear error instead of silently picking one.

Interactive mode is unchanged when no flags are passed — menu, prompts,
retry counters all behave as before.
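
To make the flag semantics concrete, here is a minimal sketch of a parser with the same rules: one action flag at most, unknown arguments rejected, and no flags meaning "show the menu". `CliOptions`, `UsageError`, and `parseArgs` are hypothetical names; the block in `c31ff27` may be structured differently.

```dart
/// Action flags listed in the table above; at most one may be given.
const actionFlags = {
  '--deploy',
  '--status',
  '--start-service',
  '--stop-service',
  '--resync',
};

/// Parsed command line (hypothetical shape, not the controller's real type).
class CliOptions {
  final String? action; // null: fall through to the interactive menu
  final bool assumeYes;
  final bool help;
  final bool version;
  const CliOptions({
    this.action,
    this.assumeYes = false,
    this.help = false,
    this.version = false,
  });
}

/// Raised for unknown or conflicting arguments; callers should exit with 2.
class UsageError implements Exception {
  final String message;
  const UsageError(this.message);
  @override
  String toString() => 'UsageError: $message';
}

CliOptions parseArgs(List<String> args) {
  String? action;
  var assumeYes = false;
  var help = false;
  var version = false;
  for (final arg in args) {
    if (actionFlags.contains(arg)) {
      // Two different action flags are an error, never a silent pick.
      if (action != null && action != arg) {
        throw UsageError('conflicting action flags: $action and $arg');
      }
      action = arg;
    } else if (arg == '-y' || arg == '--yes') {
      assumeYes = true;
    } else if (arg == '-h' || arg == '--help') {
      help = true;
    } else if (arg == '-v' || arg == '--version') {
      version = true;
    } else {
      throw UsageError('unknown argument: $arg');
    }
  }
  return CliOptions(
      action: action, assumeYes: assumeYes, help: help, version: version);
}
```

In `main`, the caller would catch `UsageError`, write the message to stderr, and call `exit(2)` from `dart:io`. Keeping the parser free of I/O makes the exit-code behaviour easy to test.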

Test plan

Docker matrix against Ubuntu 24.04, all green:

  • `--help` / `--version` print + exit 0
  • Unknown arg → exit 2, clear message
  • Conflicting action flags → exit 2, clear message
  • Fake Arch `/etc/os-release` (bind-mount) → distro abort with
    "supports Debian and Ubuntu family" message
  • Pre-seeded `/root/go-zenon` with matching remote → reuse path
    fires, fetch + reset attempted (not delete + re-clone)
  • `--yes --deploy` with `</dev/null` stdin → enters Deploy flow,
    skips install, no hang on menu / confirm / ask
  • Regression: hang-fix + dpkg-lock detection from #11 (fix: Deploy
    hangs indefinitely on Ubuntu 24.04 during apt install) still green

Scope

Interactive behaviour is identical when no new flags are passed — this
PR is strictly additive. No CLI flag defaults change, no existing
prompts changed, no prompts removed.

mehowz added 5 commits April 23, 2026 19:03
The Deploy flow calls apt, wget, tar, git, and go build via
Process.runSync with no timeout, no DEBIAN_FRONTEND=noninteractive,
no streamed stdout. On Ubuntu 24.04 the first `apt -y install
linux-kernel-headers` call can block indefinitely on a dpkg lock
(unattended-upgrades) or an interactive prompt, and the operator
sees no output between "Git installation detected" and the hang.
Reproduced on a fresh Hetzner 24.04 host; only workaround was to
pre-install `golang-go build-essential` manually and rebuild znnd
from source.

This change:

- Adds `_runStreaming` that uses `Process.start`, streams stdout/
  stderr live, accepts a timeout, and escalates SIGTERM→SIGKILL if
  the timeout is exceeded
- Adds `_isDpkgLocked` precheck via `fuser /var/lib/dpkg/lock-frontend`
  so the deploy aborts with an actionable message instead of blocking
- Adds `_hasCommand` / `_hasDebPackage` helpers so already-installed
  tools (git, build-essential, linux-libc-dev, wget, Go) are skipped
  cleanly — a host with manual `apt install golang-go build-essential`
  now sails through prereqs instead of re-running every apt step
- Routes all apt invocations through `_aptInstall` with
  DEBIAN_FRONTEND=noninteractive + confold/confdef options so dpkg
  never waits on config-file prompts
- Replaces `linux-kernel-headers` (transitional name, may not exist
  on Ubuntu 22.04+) with `linux-libc-dev`, accepting either package
  in the skip check for legacy compatibility
- Makes `_buildFromSource` pick `go` from PATH if /usr/local/go is
  absent, matching the detection in the prereq step
- 15-minute timeout on `go build` (slow hosts), 10-min on apt,
  5-min on git clone / wget, 2-min on tar extract
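
As an illustration of the streaming-plus-timeout pattern described in the list above, a helper along these lines would work. This is a sketch under assumptions: the name `runStreaming`, the `killGrace` parameter, and its 5-second default are not taken from the commit.

```dart
import 'dart:async';
import 'dart:io';

/// Runs [executable], forwarding stdout/stderr as they arrive.
/// Sends SIGTERM once [timeout] elapses, then SIGKILL after [killGrace].
/// Returns true only when the process exits with code 0 before the timeout.
Future<bool> runStreaming(
  String executable,
  List<String> args, {
  required Duration timeout,
  Duration killGrace = const Duration(seconds: 5),
  Map<String, String>? environment,
}) async {
  final process =
      await Process.start(executable, args, environment: environment);

  // Forward output chunk by chunk so the operator sees progress live.
  final outDone = process.stdout.forEach(stdout.add);
  final errDone = process.stderr.forEach(stderr.add);

  var timedOut = false;
  Timer? killTimer;
  final termTimer = Timer(timeout, () {
    timedOut = true;
    process.kill(ProcessSignal.sigterm);
    // Escalate if the process ignores SIGTERM.
    killTimer =
        Timer(killGrace, () => process.kill(ProcessSignal.sigkill));
  });

  final code = await process.exitCode;
  termTimer.cancel();
  killTimer?.cancel();
  await Future.wait([outDone, errDone]); // drain remaining output
  return !timedOut && code == 0;
}
```

A build step would then be called as `await runStreaming('go', ['build', ...], timeout: const Duration(minutes: 15))`, with the shorter limits listed above for apt, git clone / wget, and tar.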

Caller in main() awaits both async functions — both were bool, now
Future<bool>. No other call sites.
…r bound

- goLinuxDlUrl bumped to go1.22.12 with verified go.dev/dl SHA256.
  go-zenon's go.mod still says `go 1.20`, so 1.20.x would technically
  compile it, but Go only ships security fixes for its two most
  recent releases. 1.22 is the minimum this tool now targets.
- pubspec.yaml sdk constraint relaxed from '>=2.14.0 <3.0.0' to
  '>=2.14.0 <4.0.0'. The existing constraint prevents `dart pub get`
  on any Dart 3.x release, which is what `dart-lang/setup-dart@v1.5.0`
  (used by the release workflow) installs by default. dcli 3.0.2
  targets Dart 3.0 exactly; keep that implicit via pubspec.lock.

Verified in Docker (Ubuntu 24.04 target):
- target-prepped (golang-go + build-essential pre-installed): controller
  detects each prereq, skips all apt calls, advances to `git clone` +
  `go build` with streamed output. Reproduces the zenonorg5 operator's
  working scenario.
- target-locked (flock holding /var/lib/dpkg/lock-frontend): controller
  aborts in ~2s with the actionable lock-contention error instead of
  hanging. No apt call issued.
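
The lock precheck and non-interactive apt invocation exercised by the target-locked scenario can be sketched as follows. The helper names are hypothetical and the argument list is an assumption about what "confold/confdef options" expands to; `fuser` (from psmisc) exits 0 when at least one process has the file open, and needs root to see other users' processes.

```dart
import 'dart:io';

/// Environment addition that stops debconf from prompting.
const aptEnv = {'DEBIAN_FRONTEND': 'noninteractive'};

/// apt arguments that keep existing config files instead of asking.
List<String> aptInstallArgs(List<String> packages) => [
      '-y',
      '-o',
      'Dpkg::Options::=--force-confdef',
      '-o',
      'Dpkg::Options::=--force-confold',
      'install',
      ...packages,
    ];

/// True when some process holds the dpkg frontend lock.
bool isDpkgLocked([String lockPath = '/var/lib/dpkg/lock-frontend']) {
  try {
    return Process.runSync('fuser', [lockPath]).exitCode == 0;
  } on ProcessException {
    // fuser is not installed: the lock state is unknown, assume unlocked.
    return false;
  }
}
```

Deploy would call `isDpkgLocked()` before the first install and abort with the lock-contention message when it returns true, so no apt process is ever started against a held lock.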

Today the Deploy path assumes apt. On a CentOS / RHEL / Arch / NixOS
host the first apt call fails with a cryptic error and the operator
has to guess what went wrong. Parse /etc/os-release up front and
abort with a one-shot actionable message: which distros are
supported, what prereqs to install manually, and that Deploy will
transparently skip the install step on a pre-prepped host thanks to
the _hasCommand / _hasDebPackage detection already landed.

Recognizes derivatives via ID_LIKE (e.g., Linux Mint, Raspbian) not
just the narrow ID==debian/ubuntu check.

Two related reproducibility fixes:

- Clone pins to `--branch znnRefTag --depth 1`. Previously Deploy
  pulled go-zenon master at whatever commit happened to be there at
  deploy time, so two operators following the same runbook a day
  apart could land different znnd binaries. Tag pin makes the deploy
  output deterministic across operators and across time. Bumped in
  lockstep with controller releases when a new go-zenon tag is the
  recommended mainnet version (v0.0.8 for 0.0.5 of this tool).
- Existing clone detection: if /root/go-zenon exists and its origin
  remote matches znnGithubUrl, fetch + hard-reset to the pinned tag
  instead of deleting and re-downloading ~200MB of modules. Foreign
  directories (different remote, scratch files) still get nuked and
  fresh-cloned — only a legit prior clone is reused. Accelerates
  re-deploys from several minutes to seconds when the tag hasn't
  changed.

Adds a small argument-parsing block at the top of main() so the
controller can be driven from Ansible / Terraform / bash scripts
without a TTY.

Flags:
  --deploy / --status / --start-service / --stop-service / --resync
      Jump straight to the matching menu action.
  -y / --yes
      Auto-confirm every prompt (echoes what was answered so the run
      log shows the choice).
  -h / --help, -v / --version
      Info commands, no side effects.

Semantics:
  - All confirm() calls in Deploy / Resync now go through
    _confirmOrDefault, returning the default in --yes mode.
  - Existing-keystore password prompt reads ZNN_KEYSTORE_PASSWORD
    when --yes is set; aborts with an actionable error if missing
    rather than blocking on ask(). Retries are disabled in --yes mode
    to fail fast.
  - Fresh-keystore generation (the common first-deploy path) is
    already unattended-friendly — the existing RandomStringGenerator
    fallback produces a password without any prompt.

Conflict detection: specifying two action flags (e.g., --deploy
--status) exits 2 with a clear error instead of silently picking one.
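
The `--yes` prompt semantics above can be illustrated with two small helpers. This is a sketch only: `confirmOrDefault` and `keystorePassword` are hypothetical stand-ins for the real `_confirmOrDefault` and password path, and the interactive `prompt` / `ask` callbacks are injected rather than calling dcli directly.

```dart
/// Returns [defaultValue] without prompting when [assumeYes] is set and
/// logs the automatic answer; otherwise defers to the interactive [prompt].
bool confirmOrDefault(
  String question, {
  required bool defaultValue,
  required bool assumeYes,
  required bool Function(String question) prompt,
  void Function(String line)? log,
}) {
  if (!assumeYes) return prompt(question);
  final answer = defaultValue ? 'yes' : 'no';
  final emit = log ?? (String line) => print(line);
  emit('$question -> $answer (auto-answered, --yes)');
  return defaultValue;
}

/// Resolves the password for an existing keystore. In --yes mode it must
/// come from ZNN_KEYSTORE_PASSWORD; a missing value fails fast, no retries.
String keystorePassword({
  required bool assumeYes,
  required Map<String, String> env,
  required String Function() ask,
}) {
  if (!assumeYes) return ask();
  final value = env['ZNN_KEYSTORE_PASSWORD'];
  if (value == null || value.isEmpty) {
    throw StateError(
        'ZNN_KEYSTORE_PASSWORD must be set when --yes is used with an '
        'existing keystore');
  }
  return value;
}
```

In the controller, `env` would be `Platform.environment` from `dart:io`, and `prompt` / `ask` would wrap the existing `confirm()` and `ask()` calls, so interactive behaviour is untouched when `--yes` is absent.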