[pull] main from hookdeck:main #149

Merged
pull[bot] merged 2 commits into erickirt:main from hookdeck:main on Jun 12, 2026
pull[bot] commented Jun 12, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

alexbouchardd and others added 2 commits June 12, 2026 13:46
…ailing checks (#950)

Trigger: PR #945 hit a heuristic FAIL on scenario 02 because the TS SDK
flattened `outpost.publish.event(...)` to `outpost.publish(...)` in v1.3.0
(commit d875c66), but the eval check + scenario criterion + prompt were
never updated. Prior runs masked it: the agent often echoed the literal
"publish.event" from the prompt in comments, accidentally satisfying the
string-presence check. This run stuck to the SDK README wording, and the
check exited with status 1, leaving no clue in the GH Actions log, only
in the artifact.

Impact: PR #945 (docs only) was blocked by an unrelated stale check. After
this fix the heuristic matches the current SDK shape, and any future
heuristic/LLM failure prints the failing check id and detail directly in
the main CI log.
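
As a rough illustration of the logging behavior described above, the sketch below formats each failing check as one log line and derives the exit code from whether any failed. The `CheckResult` shape and the names `formatFailures` and `exitCodeFor` are hypothetical and not taken from run-agent-eval; the real script may structure its results differently.

```typescript
// Hypothetical result shape; the actual eval script may use different fields.
interface CheckResult {
  id: string;
  pass: boolean;
  detail: string;
}

// Build one log line per failing check so the CI log names the check that failed.
function formatFailures(kind: string, results: CheckResult[]): string[] {
  return results
    .filter((r) => !r.pass)
    .map((r) => `[${kind}] FAIL ${r.id}: ${r.detail}`);
}

// Print every failure to stderr, then return the exit code (0 = all passed).
function exitCodeFor(heuristic: CheckResult[], llm: CheckResult[]): number {
  const lines = [
    ...formatFailures("heuristic", heuristic),
    ...formatFailures("llm", llm),
  ];
  for (const line of lines) {
    console.error(line);
  }
  return lines.length === 0 ? 0 : 1;
}
```

A caller could pass the returned value to `process.exit`, so the failure reason appears in the main log before the job stops.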

Changes:
- scoreScenario02 regex now matches `.publish(` and keeps `publish.event`
  as a fallback for older transcripts.
- run-agent-eval logs each failing heuristic check + LLM criterion on
  pass=false (was: silent exit 1).
- Scenario 02 success criterion lists `outpost.publish` not `publish.event`.
- Prompt's TS counter-example uses `outpost.publish({ ... })`.
- Trajectory SDK hint pattern renamed `ts_publish` with `/\.publish\s*\(/`.
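
The regex change in the list above can be sketched roughly as follows. Only the `/\.publish\s*\(/` pattern comes from the commit message; the legacy pattern and the function name `usesPublishCall` are assumptions made for illustration, and the actual scoreScenario02 check may differ.

```typescript
// Pattern quoted in the commit message: matches the flattened `.publish(` call.
const TS_PUBLISH = /\.publish\s*\(/;

// Assumed fallback for older transcripts that still call `.publish.event(`.
const LEGACY_PUBLISH_EVENT = /\.publish\.event\s*\(/;

// Hypothetical helper: true when the source contains either call shape.
function usesPublishCall(source: string): boolean {
  return TS_PUBLISH.test(source) || LEGACY_PUBLISH_EVENT.test(source);
}
```

Note that the first pattern alone does not match `outpost.publish.event(`, because `.publish` is followed by `.event` rather than an opening parenthesis, which is why a separate fallback is needed for older transcripts.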

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pull[bot] locked and limited conversation to collaborators Jun 12, 2026
pull[bot] added the ⤵️ pull label Jun 12, 2026
pull[bot] merged commit 6a3eb14 into erickirt:main Jun 12, 2026