Problem
The UI shows "Thinking" for as long as the latest agent run remains in the started state. We fixed one runtime failure path, where a handler throws after starting a run. However, hard process death, laptop sleep, or runtime shutdown can still leave a started run without any terminal update, because the process never gets a chance to run catch/finally cleanup.
Proposed second layer
Add recovery semantics for abandoned started runs, e.g. one of:
- a run lease / heartbeat renewed while a run is active, plus a janitor/startup sweep that marks stale started runs failed, or
- a simpler startup recovery sweep that marks old started runs failed with a clear interrupted/abandoned error.
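The two options above can be sketched together as a single sweep. This is a minimal illustration under stated assumptions, not the project's actual schema or API: the `AgentRun` shape, the `heartbeatAt` field, the `LEASE_TTL_MS` value, and the function names `renewLease` and `sweepAbandonedRuns` are all hypothetical. A run with no heartbeat falls back to its start time, which gives the simpler "old started run" behavior.

```typescript
// Hypothetical run record; field names are assumptions for illustration only.
type RunStatus = "started" | "completed" | "failed";

interface AgentRun {
  id: string;
  status: RunStatus;
  startedAt: number;    // epoch milliseconds
  heartbeatAt?: number; // epoch milliseconds of the last lease renewal, if any
  error?: string;
}

// Assumed lease duration; a real value would depend on the heartbeat interval.
const LEASE_TTL_MS = 30_000;

// Called periodically by the process that owns an active run.
function renewLease(run: AgentRun, now: number): void {
  if (run.status === "started") {
    run.heartbeatAt = now;
  }
}

// Called at startup (or by a janitor). Marks started runs whose lease has
// expired as failed and returns the runs it recovered.
function sweepAbandonedRuns(
  runs: AgentRun[],
  now: number,
  ttlMs: number = LEASE_TTL_MS,
): AgentRun[] {
  const recovered: AgentRun[] = [];
  for (const run of runs) {
    if (run.status !== "started") continue; // already terminal
    // Without a heartbeat, fall back to the start time (simple sweep).
    const lastSeen = run.heartbeatAt ?? run.startedAt;
    if (now - lastSeen <= ttlMs) continue; // lease still valid, leave it alone
    run.status = "failed";
    run.error =
      "Run interrupted: the runtime stopped before the run reached a terminal state.";
    recovered.push(run);
  }
  return recovered;
}
```

In this sketch, a long-running run that keeps calling `renewLease` is never swept, while a run whose owner died stops renewing and is failed on the next sweep once `ttlMs` has elapsed. Without heartbeat support, `ttlMs` would need to exceed the longest legitimate run, or the sweep should only run at startup when no run can still be active.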
Acceptance criteria
- A runtime restart after an interrupted run does not leave the UI stuck on Thinking indefinitely.
- Stale started runs are marked terminal (failed or equivalent) with an explanatory interrupted/abandoned error message.
- Long-running active runs are not incorrectly failed if heartbeat/lease support is implemented.
- Tests cover the recovery path for an abandoned started run.
Context
Current PR addresses handler exceptions after a run starts by failing the newly-started run in the handler catch path. This issue tracks crash/shutdown recovery where no in-process catch/finally can run.