feat(hook): receipt failed tool calls via Claude Code PostToolUseFailure #856

Open
ojongerius wants to merge 2 commits into main from worktree-issue-853-failed-receipts

@ojongerius

Closes #853.

Problem

In the attribution demo (#850), three agents raced on a shared file; one agent's Edit lost the race (old_string no longer matched), retried, and succeeded. The retry showed in the chain as extra reads, but the failed edit attempt produced no receipt — leaving a lost update / failed action with no failure row in the audit trail, only inferable retry activity.

Root cause: Claude Code fires PostToolUse only on success and a separate PostToolUseFailure event on failure. The hook only handled PostToolUse, so failed (and interrupted) tool calls were never receipted.

Change

  • detect() now recognises PostToolUseFailure as a claude-code frame.
  • readClaudeCode maps a PostToolUseFailure frame to decision="allowed" carrying the frame's error. The daemon already maps decision="allowed" + a non-empty error to outcome.status=failure (daemon/internal/pipeline/build.go), so no daemon change is needed — the gap was purely the hook never emitting these frames.
  • A PostToolUseFailure frame carries error (a non-empty string in the payloads observed from Claude Code) and is_interrupt, but no tool_response. It still carries tool_input, so the action target and parameters hash are captured as usual.
  • Defensive: a blank error is replaced with "tool call failed" (or "tool call interrupted" when is_interrupt is set) so a failure frame is never silently downgraded to success by the daemon's empty-error → success rule.
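The mapping and the defensive fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `frame` and `receipt` types, `mapFrame`, and `outcomeStatus` are hypothetical names, and `outcomeStatus` only restates the daemon rule as described here (decision="allowed" plus a non-empty error means failure).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// frame is a hypothetical subset of the Claude Code hook payload.
type frame struct {
	HookEventName string          `json:"hook_event_name"`
	ToolName      string          `json:"tool_name"`
	ToolInput     json.RawMessage `json:"tool_input"`
	Error         string          `json:"error"`
	IsInterrupt   bool            `json:"is_interrupt"`
}

// receipt is a hypothetical stand-in for what the hook emits.
type receipt struct {
	Decision string
	Error    string
}

// mapFrame maps a frame to a receipt. Failure frames always end up
// with a non-empty error; success frames carry none.
func mapFrame(f frame) receipt {
	r := receipt{Decision: "allowed"}
	if f.HookEventName != "PostToolUseFailure" {
		return r
	}
	r.Error = f.Error
	if strings.TrimSpace(f.Error) == "" {
		// Blank error: substitute a fallback so the receipt is
		// not read as a success downstream.
		r.Error = "tool call failed"
		if f.IsInterrupt {
			r.Error = "tool call interrupted"
		}
	}
	return r
}

// outcomeStatus restates the daemon rule as described in this PR
// (assumed shape, not the daemon's actual code).
func outcomeStatus(r receipt) string {
	if r.Decision == "allowed" && r.Error != "" {
		return "failure"
	}
	return "success"
}

func main() {
	raw := []byte(`{"hook_event_name":"PostToolUseFailure","tool_name":"Edit",` +
		`"tool_input":{"file_path":"shared.txt"},"error":"","is_interrupt":true}`)
	var f frame
	if err := json.Unmarshal(raw, &f); err != nil {
		fmt.Println("decode:", err)
		return
	}
	r := mapFrame(f)
	fmt.Println(r.Decision, r.Error, outcomeStatus(r))
	// prints: allowed tool call interrupted failure
}
```

Without the fallback, the blank error in this example would pass through unchanged and the daemon's empty-error rule would record the interrupted Edit as a success.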

Enabling it

Add a PostToolUseFailure block to ~/.claude/settings.json alongside the existing PostToolUse one (documented in hook/README.md).
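As a rough illustration, the resulting settings might look like the fragment below. It follows the general shape of Claude Code hook settings; the command path `/path/to/receipt-hook` and the `"*"` matcher are placeholders assumed for this example, and hook/README.md remains the authoritative reference for the exact block.

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "*",
        "hooks": [
          { "type": "command", "command": "/path/to/receipt-hook" }
        ]
      }
    ],
    "PostToolUseFailure": [
      {
        "matcher": "*",
        "hooks": [
          { "type": "command", "command": "/path/to/receipt-hook" }
        ]
      }
    ]
  }
}
```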

Tests

  • TestReadClaudeCode_PostToolUseFailure — error passes through; decision=allowed; input + target captured; no output; empty-error and interrupt fallbacks; success frames carry no error.
  • TestDetectPostToolUseFailure detected as claude-code.
  • TestIntegration_ClaudeCodeFailureFrame — end-to-end: the failure error reaches the listener on the wire.

go vet ./... and go test ./... pass in the hook module.

Notes

No schema change. The PostToolUseFailure payload shape was confirmed against the installed Claude Code (v2.1.177): {hook_event_name, tool_name, tool_input, tool_use_id, error, is_interrupt, duration_ms}.

Claude Code fires PostToolUse only on success and PostToolUseFailure on
failure. A hook wired to PostToolUse alone left an errored or interrupted
tool call — e.g. a lost concurrent write whose Edit no longer matched —
with no receipt at all, only inferable retry activity (#853).

The hook now handles PostToolUseFailure: detect() recognises it, and
readClaudeCode maps it to decision="allowed" with a non-empty error,
which the daemon already records as outcome.status=failure. A blank error
is replaced with "tool call failed" ("tool call interrupted" when
is_interrupt is set) so a failure frame is never silently downgraded to
success by the daemon's empty-error rule. Failure frames carry tool_input
(target + parameters still captured) but no tool_response.

Docs updated with the PostToolUseFailure settings.json block.

Code review follow-ups on the failure-frame handling:

- Decode `error` as json.RawMessage and coerce leniently. A non-string
  error value (object/number/array) previously aborted the whole-frame
  unmarshal, exiting the hook 1 and dropping the failure receipt — the
  exact loss this feature prevents. Now a schema variation degrades to
  the raw JSON text instead.
- Cap the failure message at 16 KiB (rune-safe). The error text was
  uncapped at every layer (only the 1 MiB whole-frame limit bounded it),
  so a very large message could push the frame over MaxFrameSize and make
  Emit fail. Truncation degrades to a truncated receipt, not no receipt.
- Reword the struct doc (was "always non-empty", contradicting the
  fallback below it; was "PostToolUse and PreToolUse" only).

Tests: non-string error parses without error; oversized error is
truncated with a marker rather than dropped.
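
The two follow-ups above (lenient decoding of `error` and a rune-safe cap) could look roughly like this. It is a sketch under assumptions rather than the committed code: `coerceError`, `capError`, and the `" [truncated]"` marker text are hypothetical, and only the 16 KiB limit comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"unicode/utf8"
)

const maxErrorBytes = 16 * 1024 // 16 KiB cap from the commit message

// truncMarker is a hypothetical marker; the real text may differ.
const truncMarker = " [truncated]"

// coerceError turns the raw JSON "error" value into text without
// ever failing: strings decode normally, anything else (object,
// number, array) degrades to its raw JSON text.
func coerceError(raw json.RawMessage) string {
	if len(raw) == 0 {
		return ""
	}
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		return s // a JSON string, or null (which leaves s empty)
	}
	return strings.TrimSpace(string(raw))
}

// capError limits s to at most limit bytes, cutting on a rune
// boundary and appending truncMarker. It assumes limit is larger
// than len(truncMarker).
func capError(s string, limit int) string {
	if len(s) <= limit {
		return s
	}
	cut := limit - len(truncMarker)
	if cut < 0 {
		cut = 0
	}
	// Back up until cut sits at the start of a rune, so that no
	// multi-byte character is split.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut] + truncMarker
}

func main() {
	payload := []byte(`{"error":{"code":7,"msg":"old_string not found"}}`)
	var f struct {
		Error json.RawMessage `json:"error"`
	}
	if err := json.Unmarshal(payload, &f); err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println(capError(coerceError(f.Error), maxErrorBytes))

	long := strings.Repeat("é", 20000) // 40000 bytes of 2-byte runes
	out := capError(long, maxErrorBytes)
	fmt.Println(len(out) <= maxErrorBytes, utf8.ValidString(out))
}
```

Decoding into `json.RawMessage` first means a non-string `error` no longer aborts the whole-frame unmarshal, and capping the text keeps a single oversized message from pushing the frame past the size limit.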
Development

Successfully merging this pull request may close these issues.

Failed tool calls aren't receipted — a lost concurrent write leaves no failure row