claim-grounding: agent prose state-claims have no turn-end gate #620

Description

@ahuimanu

Observed

A live failure this session. I (the agent) asserted to the operator that
"OBPI-0.0.37-22 is attested-complete" by relaying the handoff document's prose,
BEFORE reading the ledger. The claim happened to be true — but only by luck,
confirmed after the fact at .gzkit/ledger.jsonl:9937
(obpi_receipt_emitted / receipt_event:"completed" / human_attestation:true).
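For illustration, the ledger read that should have preceded the claim could be sketched as below. This is a minimal sketch, not gzkit's actual API: the function name and the `event` and `obpi_id` key names are assumptions made for the example. Only the values `obpi_receipt_emitted`, `receipt_event: "completed"`, and `human_attestation: true` come from the ledger line cited above.

```python
import json
from pathlib import Path

# ASSUMPTION: these two key names are hypothetical; the real ledger schema may differ.
EVENT_KEY = "event"
OBPI_KEY = "obpi_id"


def has_attested_completion(ledger_path: Path, obpi_id: str) -> bool:
    """Return True only if the ledger holds an attested 'completed' receipt."""
    with ledger_path.open(encoding="utf-8") as fh:
        for raw in fh:
            raw = raw.strip()
            if not raw:
                continue
            try:
                record = json.loads(raw)
            except json.JSONDecodeError:
                continue  # a malformed line never counts as proof
            if not isinstance(record, dict):
                continue
            if (
                record.get(EVENT_KEY) == "obpi_receipt_emitted"
                and record.get(OBPI_KEY) == obpi_id
                and record.get("receipt_event") == "completed"
                and record.get("human_attestation") is True
            ):
                return True
    # No matching receipt found: the claim is unbacked.
    return False
```

The function returns False unless a matching receipt is found, so a missing or malformed ledger entry can never be read as proof of completion.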

General shape: an agent's conversational state-claims to the operator —
"OBPI-X is attested-complete", "the lock is held", "tests pass", "the tree is
clean" — are emitted as prose with NO mechanical gate verifying them against
Layer-1/Layer-2 truth. Correctness on this surface rests entirely on author
discipline (goodwill class). This is V.I.B.E.S. on the one surface the
governance does not cover.

Expected

Per the anti-vibing mantra (AGENTS.md § MAKE LLM STOCHASTIC VIBES INERT):
"gzkit's purpose is to make stochastic LLM vibing structurally inert."
Operative claim 1: "Governance is the steering and accountability surface for
agent-driven work, not overhead." A state-claim surface trustable only by
goodwill is exactly the surface the doctrine says should be structurally gated,
not left to discipline.

AGENTS.md § Never #7 already names the correct agent behavior ("Do not read
YAML frontmatter status: Completed as proof of completion — read the ledger")
— but it is prose-only; nothing fails closed when an agent skips it.

Canonical contradiction

ADR-0.0.70's Stop-hook proves the turn-end seam CAN hold a fail-closed gate,
but its scope does not reach this surface. Per
docs/design/adr/foundation/ADR-0.0.70-turn-end-feedback-and-correction-mining/obpis/OBPI-0.0.70-01-stop-hook-turn-end-feedback.md:83:
"The hook MUST scope checks to git-dirty *.py files only" — i.e. ruff lint.
That hook (a) checks lint state, not claim-vs-receipt; (b) fires AFTER the
agent's prose is already emitted (it can force a continue, never retract a
sentence already said); (c) fails open by design (:84). The last-turn vibe
touched no Python file and produced no lint finding — it sails straight past
this hook.

So the Stop seam is reachable by a mechanical gate (0.0.70 = existence proof),
but the conversational-claim surface is uncovered.

Class of failure

T1→T2 doctrine drift: a declared invariant (anti-vibing mantra; § Never #7)
with no mechanical fail-close — the same family as GHI #574 (resume
"advise-not-execute" gate is prose, not mechanized) and the canonical
#459/#460 sibling-cut class. Input family: ANY agent turn that emits a
present-tense state/completion claim about governance truth without a
co-present receipt.

Why this is hard (honest framing — do NOT overclaim a fix)

Claim-checking is not deterministically decidable the way a ruff lint check
is. A stochastic LLM judge policing a stochastic claimer just relocates the
vibe (turtles all the way down). The root cause is known; the FIX SHAPE is the
open question and needs an operator design conversation. Candidate shapes to
investigate, root-cause-first:

  • (a) Structured-claim discipline. Completion/state assertions must carry an
    inline receipt token (ledger event id / ARB receipt id / file:line) whose
    PRESENCE a Stop-hook regex can verify deterministically — gating form
    (claim-without-citation), not truth.
  • (b) Telemetry-only turn-end sensor. Logs unbacked-claim patterns to feed
    the session-correction-mining ladder (ADR-0.0.70 OBPI-02). Pairs with
    GHI #614 (correction-mining: miner has no negative-signal run telemetry).
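
To make the candidate shapes above concrete, here is a hedged sketch of the form check they describe: flag any sentence that makes a state-claim without an inline `file:line` receipt token. The claim phrases are taken from the examples in this issue; the regexes, the function name, and the choice of `file:line` as the only accepted token are assumptions for illustration, not an agreed design. The sketch gates form (claim without citation), not truth.

```python
import re

# ASSUMPTION: a hypothetical, non-exhaustive claim vocabulary drawn from this issue.
CLAIM_RE = re.compile(
    r"\b(attested-complete|lock is held|tests pass|tree is clean)\b",
    re.IGNORECASE,
)

# ASSUMPTION: a receipt token is a path with an extension plus a line number,
# e.g. ".gzkit/ledger.jsonl:9937". Ledger event ids or ARB receipt ids would
# need their own patterns once their formats are settled.
RECEIPT_RE = re.compile(r"[\w./-]+\.\w+:\d+")


def unbacked_claims(text: str) -> list[str]:
    """Return the sentences that make a state-claim but cite no receipt token."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        s for s in sentences
        if CLAIM_RE.search(s) and not RECEIPT_RE.search(s)
    ]
```

The same function could serve either shape: a gating hook would fail when the returned list is non-empty (shape a), while a telemetry-only sensor would log the list and exit cleanly (shape b). How the turn's prose reaches the hook is left open here.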

Scope hint (advisory, for routing)

  • Estimated diff: larger / unknown — needs design conversation before sizing.
  • Surfaces touched (candidate): src/gzkit/hooks/scripts/quality.py (Stop-hook
    template source), a new turn-end sensor, the ADR-0.0.70 OBPI-02 mining surface.
  • In-flight vs. new feature: planned (new mechanization fence).

Campaign home

Phase E (ceremony/harness mechanization) — additive proof-mechanization,
sibling to E.4 (dispatch drift-guards) and E.5 (governance-friction drainage),
descendant of B.0 / ADR-0.0.70. Booked as campaign item E.6. The campaign
owns sequencing the true root-cause fix per the 2026-06-14 governance-friction
drainage amendment (file at the moment of discovery; discover root cause before
booking the fix-shape). Open-with-blocker, not a same-session destination:
the fix shape is genuinely a pending operator design conversation, not
dead-lettered.

    Labels

    defect: Tracked defect discovered during implementation or governance work
    runtime: Runtime-affecting change (eligible for patch release)
