Problem
The deployment workspace lock has a heartbeat mechanism, but nothing calls it while a deploy is running. Any deploy that outlasts the stale threshold therefore leaves its lock open to takeover by a second, concurrent attempt.
Evidence
app/core/locks.py — lock is stale after 3 * heartbeat_interval_s (default 30s → 90s); heartbeat() exists at locks.py:89 but is not called from the deploy path.
app/core/deploy_trigger.py:80 — acquire_lock(...) is called once at start; no heartbeat pump loop for the duration of the ansible-runner execution.
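To make the threshold arithmetic concrete, here is a minimal sketch of the staleness rule as described above. The function names, the `missed_beats` parameter, and the strict `>` comparison are assumptions for illustration; the real check in locks.py may differ in detail.

```python
def stale_after_s(heartbeat_interval_s: float = 30.0, missed_beats: int = 3) -> float:
    """Seconds without a heartbeat before the lock counts as stale (3 * 30s = 90s by default)."""
    return missed_beats * heartbeat_interval_s


def is_stale(last_heartbeat_ts: float, now_ts: float,
             heartbeat_interval_s: float = 30.0) -> bool:
    """Hypothetical staleness check: True once the last heartbeat is older than the threshold."""
    return (now_ts - last_heartbeat_ts) > stale_after_s(heartbeat_interval_s)


# A deploy that acquired the lock at t=0 and never heartbeats:
acquired_at = 0.0
print(is_stale(acquired_at, now_ts=60.0))   # still inside the 90s window
print(is_stale(acquired_at, now_ts=120.0))  # past the window, lock can be taken over
```

With the defaults, the only timestamp the lock ever sees is the one written at acquire time, so the deploy is unprotected from roughly the 90 second mark onward.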
Impact (High / P1)
Real deploys routinely exceed 90s. After the lock goes stale, a second POST to deploy the same workspace can acquire the lock and run concurrently against the same target — racing inventory/state writes.
Suggested fix
Pump heartbeat() on an interval (an apscheduler job or a background task tied to the attempt lifecycle) while the runner is active, and release the lock on completion. Alternatively, raise the stale threshold above the maximum expected deploy time, but a real heartbeat is preferred.
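One possible shape for the background-task option is a small context manager that runs the heartbeat on a daemon thread for as long as the runner is active. This is a sketch, not the project's implementation: `heartbeat_pump` and its `beat` callable are hypothetical names, and it assumes heartbeat() is safe to call from a separate thread.

```python
import threading
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def heartbeat_pump(beat: Callable[[], None], interval_s: float) -> Iterator[None]:
    """Call beat() every interval_s seconds on a daemon thread until the block exits."""
    stop = threading.Event()

    def _run() -> None:
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop beats on each timeout and exits promptly on shutdown.
        while not stop.wait(interval_s):
            beat()

    thread = threading.Thread(target=_run, name="lock-heartbeat", daemon=True)
    thread.start()
    try:
        yield
    finally:
        # Stop pumping whether the wrapped block succeeded or raised.
        stop.set()
        thread.join()
```

In the deploy path this would wrap the ansible-runner call, for example `with heartbeat_pump(lock.heartbeat, heartbeat_interval_s): ...`, where `lock.heartbeat` stands in for however heartbeat() is actually exposed, with the lock release kept in an outer `finally`. If beat() raises, the thread exits and pumping stops silently, so a real version would probably want to log or propagate that failure.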