Agent halts after Edit/Write phase with stop_reason: end_turn, never invokes Bash; multi-step prompts fail to complete (claude-sonnet-4-6)

## Summary

A previously-working multi-step prompt no longer completes. The agent reads files, makes Edits/Writes, then voluntarily halts with `stop_reason: end_turn` and an empty `result` string — never invoking Bash for the verification step, git ops, or `gh pr create` that the prompt explicitly directs. TodoWrite items for the unfinished steps are left in `pending`.

Reproducible across **seven runs in ~14 hours**; the immediately preceding baseline ran the same prompt successfully ~16 hours before the first failure.

## Environment

- Action SHAs tested: `8a953dedac4f533f912f13656070914693ed0575` (now-dangling — `gh api repos/anthropics/claude-code-action/git/commits/8a953ded...` returns 404) and `62238ddb33772a079b0a6d8665a1ff3043583067` (current `v1` tag target as of 2026-05-06 07:48Z).
- Bundled `@anthropic-ai/claude-agent-sdk` versions tested: `0.2.114` and `0.2.131`. Both confirmed via `bun install` output.
- Model: `claude-sonnet-4-6` (unchanged across all runs).
- `--max-turns 75`. `--allowedTools` includes the relevant Bash patterns: `Bash(pnpm install:*)`, `Bash(pnpm run lint:*)`, `Bash(pnpm run typecheck:*)`, `Bash(pnpm run test:unit:*)`, `Bash(pnpm run build:*)`, `Bash(pnpm exec prettier:*)`, `Bash(pnpm exec prisma generate:*)`, `Bash(git checkout:*)`, `Bash(git add:*)`, `Bash(git commit:*)`, `Bash(git push:*)`, `Bash(gh pr create:*)`.
- `permission_denials: []` in the result envelope (no permission blocks).

## Symptom

In the working baseline (~16h before first failure):
- 42 Bash invocations
- Multi-step prompt completed: install → prisma generate → prettier → lint → typecheck → test:unit → build → branch → commit → push → `gh pr create` → done
- Drafted PR successfully

In all seven subsequent runs:
- **0 Bash invocations**
- Agent reads spec + relevant files (Read/Grep/Glob/ToolSearch)
- Creates TodoWrite items including "Run full verification suite" and "Create branch, commit, push, open draft PR"
- Makes a sequence of Edit/Write tool calls implementing parts of the spec
- Halts with `stop_reason: end_turn`, `result: ""` (empty), `is_error: false`
- Pending TodoWrite items remain pending; nothing else happens
- Action returns `outcome: success`
- **Net effect**: silently broken — looks like a normal completion to the orchestrating workflow, but no work product is produced

## Tool-name histogram of a typical broken run

(post-bump, SDK 0.2.131)

```
   ~16  Read
     5  TodoWrite
     5  Grep
     4  Glob
     4  Edit
     2  Write
     1  ToolSearch
     0  Bash         ← here's the problem
```

## Reproducibility matrix

| Run | Action SHA | SDK | Prompt state | Bash | PR |
|---|---|---|---|---|---|
| 1 (baseline) | 8a953ded | 0.2.114 | working baseline | **42** | yes |
| 2 | 8a953ded | 0.2.114 | + 5-line skip-rerun rule | 0 | no |
| 3 | 8a953ded | 0.2.114 | + 5-line skip-rerun rule | 0 | no |
| 4 | 8a953ded | 0.2.114 | skip-rerun removed | 0 | no |
| 5 | 8a953ded | 0.2.114 | byte-identical to baseline | 0 | no |
| 6 | 8a953ded | 0.2.114 | byte-identical to baseline | 0 | no |
| 7 | **62238ddb** | **0.2.131** | byte-identical to baseline | 0 | no |

The only invariant across all seven runs is `model: claude-sonnet-4-6`. The working baseline used the same model.

## Hypothesis

A behavioral change occurred in `claude-sonnet-4-6` (or its serving stack) between ~2026-05-05 21:38Z (last working run) and ~2026-05-05 22:45Z (first failed run) that affects how the model handles long agentic loops with our prompt shape. The model decides it's "done" after the Edit/Write phase even though the prompt's procedure explicitly requires verification + git ops afterwards.

## Asks

1. **Is there a known regression** in `claude-sonnet-4-6` around `end_turn` semantics for long agentic loops?
2. **Should `claude-code-action` surface a warning** when `result` is empty AND `stop_reason` is `end_turn` AND TodoWrite items are pending? Right now `outcome: success` masks this failure mode — orchestrating workflows think the run succeeded.
3. **If reproducible on your side**, can you suggest a temporary workaround (different model, different prompt shape, etc.)?

Happy to share the full streamed transcripts (1–1.5 MB each, JSONL) — let me know how you'd like them. Also happy to share the full prompt text if helpful.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Agent halts after Edit/Write phase with stop_reason: end_turn, never invokes Bash; multi-step prompts fail to complete (claude-sonnet-4-6) #1292

Summary

Environment

Symptom

Tool-name histogram of a typical broken run

Reproducibility matrix

Hypothesis

Asks

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Run	Action SHA	SDK	Prompt state	Bash	PR
1 (baseline)	8a953ded	0.2.114	working baseline	42	yes
2	8a953ded	0.2.114	+ 5-line skip-rerun rule	0	no
3	8a953ded	0.2.114	+ 5-line skip-rerun rule	0	no
4	8a953ded	0.2.114	skip-rerun removed	0	no
5	8a953ded	0.2.114	byte-identical to baseline	0	no
6	8a953ded	0.2.114	byte-identical to baseline	0	no
7	`62238dd`	0.2.131	byte-identical to baseline	0	no

Agent halts after Edit/Write phase with stop_reason: end_turn, never invokes Bash; multi-step prompts fail to complete (claude-sonnet-4-6) #1292

Description

Summary

Environment

Symptom

Tool-name histogram of a typical broken run

Reproducibility matrix

Hypothesis

Asks

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions