feat: direct Proxmox API fast-path delegation on devkit lifecycle scripts #119

@hyde-repo

feat: direct Proxmox API fast-path delegation on devkit lifecycle + snapshot scripts

Umbrella: #114
Wave: WAVE_04 (since 2026-06-05)

Problem

Devkit lifecycle scripts (list, start, stop, stop_force, pause, resume, list_vm_and_extract_vm_name) and VM snapshot scripts (create_snapshot, revert_snapshot, list_snapshot, delete_snapshot) all go through ansible-playbook. Each call incurs a Python cold start (~1-2s) even when the Proxmox HTTPS API would answer in ~50ms. Scenario scripts (demo_lab.delete_vms_only.sh, delete_all, reset, setup) and the range42-context snapshot / revert / snapshot-list commands chain many of these calls and pay the cumulative cost.

Solution

Add a direct-API fast-path next to each lifecycle script, plus an auto-delegation block in the existing slow-path script. The fast-path becomes the default when the API is reachable ; the existing ansible+ssh path remains a transparent fallback.

  1. New helper proxmox__inc.api_reachable.sh : decrypts the vault, GETs /api2/json/nodes/<node>/qemu with PVEAPIToken auth, returns 0 on HTTP 200, 1 otherwise. 2s timeout. Silent on stdout/stderr.

  2. Eleven new <script>_with_api.to.jsons.sh scripts that call the API directly via curl -sk, mirror the canonical JSON shape (source set to "proxmox-api"), and preserve the pipeline stdin pattern (plain text or JSON lines, one line per vm_id). Snapshot scripts accept richer stdin (vm_id + optional vm_snapshot_name + vm_snapshot_description ; same shape as the slow path). UPID polling is preserved, with a per-action timeout :

    • 30s : start, stop, stop_force
    • 10s : pause, resume
    • 60s : snapshot create / delete
    • 120s : snapshot revert (can be slow on big disks)
    • none : list, list_vm_and_extract_vm_name, snapshot list (GET only)

  3. Existing slow-path scripts get an early delegation block, placed before proxmox__inc.warmup_checks.sh :

    if [[ "${RANGE42_PROXMOX_API_FORCE:-auto}" != "off" ]]; then
      if proxmox__inc.api_reachable.sh ; then
        devkit_utils.text.echo_trace.to.text.to.stderr.sh "..."
        exec <script>_with_api.to.jsons.sh "$@"
      else
        devkit_utils.text.echo_trace.to.text.to.stderr.sh "..."
      fi
    fi

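    Using exec keeps stdin, arguments, and signal handling consistent, because the fast-path script replaces the current process instead of running as a child. The delegation decision is announced on stderr via devkit_utils.text.echo_trace.to.text.to.stderr.sh.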

  4. Per-VM errors inside the fast-path loop (VM not running, transient API hiccup, non-extractable vm_id line) emit a :: TRACE :: on stderr and continue to the next VM. The batch is never aborted by a single bad VM.

  5. Override env var :

    • RANGE42_PROXMOX_API_FORCE=off keeps the ansible slow path unconditionally (debug, known-bad API).
    • RANGE42_PROXMOX_API_FORCE=auto (default) probes once per script invocation and delegates if reachable.
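
The per-action timeouts from step 2 and the "skip the bad VM, keep the batch going" rule from step 4 can be sketched as follows. This is a minimal illustration, not the project's code: the function names poll_timeout_for, extract_vm_id and for_each_vm are hypothetical, and the trace wording is invented. Only the timeout values, the vm_id field name, and the :: TRACE :: prefix come from the description above.

```shell
# Sketch only: function names and trace messages are illustrative assumptions.

# Map an action to its UPID polling budget in seconds (0 = GET only, no polling).
poll_timeout_for() {
  case "$1" in
    start|stop|stop_force)                          echo 30  ;;
    pause|resume)                                   echo 10  ;;
    create_snapshot|delete_snapshot)                echo 60  ;;
    revert_snapshot)                                echo 120 ;;
    list|list_vm_and_extract_vm_name|list_snapshot) echo 0   ;;
    *) return 1 ;;
  esac
}

# Extract a vm_id from one stdin line: either a bare number (plain text)
# or a JSON line containing "vm_id": 123 / "vm_id": "123".
# Prints the id on success, returns 1 when no id can be extracted.
extract_vm_id() {
  local id
  id="$(printf '%s\n' "$1" | sed -n -E \
    -e 's/^[[:space:]]*([0-9]+)[[:space:]]*$/\1/p' \
    -e 's/.*"vm_id"[[:space:]]*:[[:space:]]*"?([0-9]+).*/\1/p')"
  [ -n "$id" ] && printf '%s\n' "$id"
}

# Run "<command...> <vm_id>" once per stdin line. A non-extractable line or a
# failing VM is traced on stderr and skipped; the loop never aborts the batch.
for_each_vm() {
  local line vm_id
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue
    if ! vm_id="$(extract_vm_id "$line")"; then
      echo ":: TRACE :: no vm_id in line, skipping: $line" >&2
      continue
    fi
    # </dev/null keeps the per-VM command from consuming the remaining stdin lines.
    if ! "$@" "$vm_id" </dev/null; then
      echo ":: TRACE :: action failed for vm_id=$vm_id, continuing" >&2
    fi
  done
}
```

A call such as `printf '100\n{"vm_id": 101}\n' | for_each_vm start_one_vm` would invoke a (hypothetical) start_one_vm function once with 100 and once with 101, whatever the outcome of the first call.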

Flow

        ┌─────────────────────────────────────────┐
        │  proxmox_vm.list.to.jsons.sh            │  <- entry point (scenarios + range42-context call these)
        │  proxmox_vm.vm_id.{start,stop,           │
        │    stop_force,pause,resume,              │
        │    list_vm_and_extract_vm_name}.sh       │
        │  proxmox_snapshot_vm.vm_id.{list,        │
        │    create,revert,delete}_snapshot.sh    │
        └───────────────────┬─────────────────────┘
                            │
                            ▼
                ┌───────────────────────┐
                │  API reachable ?      │
                └───┬─────────────┬─────┘
                    │ yes         │ no
                    ▼             ▼
        ┌───────────────────┐  ┌──────────────────────┐
        │ exec _with_api    │  │ existing ansible     │
        │ (curl direct)     │  │ + ssh logic          │
        └───────────────────┘  └──────────────────────┘
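
The "API reachable ?" decision in the diagram could be implemented roughly as below. This is a hedged sketch rather than the actual proxmox__inc.api_reachable.sh: the variables PVE_HOST, PVE_NODE, PVE_TOKEN_ID and PVE_TOKEN_SECRET are hypothetical stand-ins for whatever the vault decryption yields, port 8006 is the Proxmox default and may differ in a given setup, and choose_path is an illustrative helper that only reports the branch instead of calling exec.

```shell
# Sketch only: variable names are assumptions; the real helper reads
# its credentials from the vault, which is not shown here.

# Return 0 if the Proxmox API answers HTTP 200 within 2 seconds, 1 otherwise.
# Nothing is written to stdout or stderr.
api_reachable() {
  local code
  code="$(curl -sk --max-time 2 -o /dev/null -w '%{http_code}' \
    -H "Authorization: PVEAPIToken=${PVE_TOKEN_ID}=${PVE_TOKEN_SECRET}" \
    "https://${PVE_HOST}:8006/api2/json/nodes/${PVE_NODE}/qemu" 2>/dev/null)" || return 1
  [ "$code" = "200" ]
}

# Print the branch a script invocation would take: "fast" or "slow".
# RANGE42_PROXMOX_API_FORCE=off skips the probe entirely.
choose_path() {
  if [ "${RANGE42_PROXMOX_API_FORCE:-auto}" != "off" ] && api_reachable; then
    echo fast
  else
    echo slow
  fi
}
```

Treating anything other than a clean HTTP 200 (timeouts, TLS failures, 401, 5xx) as "not reachable" keeps the fallback conservative: when in doubt, the existing ansible + ssh path runs.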

Impact on scenarios and range42-context

Scenario scripts and range42-context commands get the speedup transparently because they call the slow-path entry points, which now auto-delegate. No scenario-side or range42-context-side change is required.

  • demo_lab.delete_vms_only.sh, delete_all, reset, setup : faster, no diff.
  • range42-context snapshot, revert, snapshot-list : faster, no diff.
  • pipeline pattern preserved : list | pause and similar chains still work, since both paths emit the same JSON shape (the source field is the only visible difference : proxmox vs proxmox-api).
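
One way to check the "same JSON shape" claim during review is to compare a slow-path line and a fast-path line after masking the source field. The sketch below assumes one flat JSON object per line. The helper names and the sample fields in the usage note (vm_status in particular) are hypothetical, and only the two source values come from this issue.

```shell
# Sketch only: helper names are illustrative assumptions.

# Mask the value of the "source" field so lines from both paths compare equal.
normalize_source() {
  sed -E 's/"source"[[:space:]]*:[[:space:]]*"[^"]*"/"source":"*"/'
}

# Return 0 when two JSON lines differ at most by their "source" value.
same_shape() {
  [ "$(printf '%s\n' "$1" | normalize_source)" = "$(printf '%s\n' "$2" | normalize_source)" ]
}
```

For example, `same_shape '{"vm_id":100,"vm_status":"running","source":"proxmox"}' '{"vm_id":100,"vm_status":"running","source":"proxmox-api"}'` succeeds, while changing any other field makes it fail.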

References
