Bound slave timeout killing #67

Open
salva wants to merge 1 commit into master from fix/slave-timeout-kill-escalation

salva (Owner) commented Jun 4, 2026

Summary

Prevent _waitpid from waiting indefinitely after a slave process times out.

Changes

  • Add a bounded slave signal escalation sequence.
  • Send TERM first, escalate to KILL, and stop retrying once the existing limit (more than 20 attempts) is exceeded.
  • Keep the existing timeout error reporting.
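
The escalation described in the list above can be sketched as a small lookup. This is an illustration only, not the code in the PR: the helper name `signal_for_attempt`, the 1-based attempt counter, and the choice to keep sending KILL once the schedule is exhausted are all assumptions. The schedule and the cap of 20 are taken from the snippets quoted in the review thread.

```perl
use strict;
use warnings;

# Schedule modelled on the one quoted in the review thread.
my @schedule = qw(TERM TERM TERM KILL);

# Assumed cap, mirroring the "> 20" check quoted in the review thread.
my $max_attempts = 20;

# Hypothetical helper: return the signal name for a 1-based attempt
# number, or undef once the cap has been exceeded.
sub signal_for_attempt {
    my ($attempt) = @_;
    return undef if $attempt > $max_attempts;
    my $ix = $attempt - 1;
    $ix = $#schedule if $ix > $#schedule;   # past the end: keep using KILL
    return $schedule[$ix];
}

# Print the first few steps of the escalation.
for my $n (1 .. 5) {
    print "attempt $n: ", signal_for_attempt($n), "\n";
}
```

Keeping the schedule as data makes the upper bound easy to see: there is a fixed number of attempts, and every attempt after the third sends a signal the child cannot ignore.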

Fixes #38.

Testing

  • perl -Ilib -c lib/Net/OpenSSH.pm
  • perl -Ilib t/1_run.t

Copilot AI review requested due to automatic review settings June 4, 2026 11:59

Copilot AI left a comment

Pull request overview

This PR addresses issue #38. It introduces a bounded signal escalation sequence so that Net::OpenSSH::_waitpid can no longer wait forever when a slave operation times out and the child ignores TERM.

Changes:

  • Add a slave-specific kill escalation schedule (TERM → KILL).
  • Rate-limit kill attempts to once per second and stop retrying after the existing “>20 attempts” limit.
  • Preserve the existing timeout error reporting (OSSH_SLAVE_TIMEOUT).
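
A minimal sketch of how rate-limited, bounded killing could fit into a non-blocking wait loop. It is not the `_waitpid` implementation from Net::OpenSSH: the helper `reap_with_escalation`, its parameters, and the 10 ms polling sleep are hypothetical. The only library calls it relies on are `waitpid` with `WNOHANG` from POSIX and `time`/`sleep` from Time::HiRes.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(time sleep);

# Hypothetical helper: reap $pid, sending at most one signal per
# $interval seconds. Returns the wait status ($?) once the child is
# reaped, or undef if the child is unknown or still alive after
# $max_kills signals have been sent.
sub reap_with_escalation {
    my ($pid, $interval, $max_kills) = @_;
    my @schedule = qw(TERM TERM TERM KILL);
    my ($kill_count, $last_kill) = (0, 0);
    while (1) {
        my $r = waitpid($pid, WNOHANG);
        return $?    if $r == $pid;   # child reaped
        return undef if $r < 0;       # no such child
        my $now = time;
        if ($now - $last_kill >= $interval) {
            return undef if $kill_count >= $max_kills;   # bounded: give up
            my $ix = $kill_count < @schedule ? $kill_count : $#schedule;
            kill $schedule[$ix] => $pid;
            $kill_count++;
            $last_kill = $now;
        }
        sleep 0.01;   # short pause between polls to avoid a busy loop
    }
}
```

A caller matching the PR description would pass an interval of 1 second and a limit of 20. Note that when the helper gives up, the child has not been reaped, so the caller still has to decide what to do with the leftover process.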

Comment thread lib/Net/OpenSSH.pm
Comment on lines 701 to +702
my @kill_signal = qw(0 0 TERM TERM TERM KILL);
my @slave_kill_signal = qw(TERM TERM TERM KILL);
Comment thread lib/Net/OpenSSH.pm
kill $sig => $pid;
}
$self->_or_set_error(OSSH_SLAVE_TIMEOUT, "ssh slave failed", "timed out");
return undef if $kill_count > 20;
Development

Successfully merging this pull request may close these issues.

Slave timeout handling can hang forever after TERM

2 participants