Skip to content

fix: align the LLM's conversational view with the SMS transcript#37

Merged
1xabhay merged 6 commits into
mainfrom
fix/llm-context-fidelity
Jun 11, 2026
Merged

fix: align the LLM's conversational view with the SMS transcript#37
1xabhay merged 6 commits into
mainfrom
fix/llm-context-fidelity

Conversation

@1xabhay

@1xabhay 1xabhay commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Why

Prod utterance eb02e4ed (Bedrock / Llama 4 Maverick) showed the bot repeatedly denying it had chat history it actually had, hallucinating the content of a hub opening it couldn't see, and confusing times across days. Root causes: hub openings were stripped from history (only the first-ever one was injected into the system prompt — stale since #35), failed sends stayed in history, multi-day history had no day boundaries, and the prompt never told the model what it remembers.

What

Five test-gated steps, in deploy-safe order:

  1. Bedrock engine normalizationnormalize_converse_messages merges consecutive same-role turns, prepends a [start of conversation] user placeholder for assistant-first history, drops empty/None content. Owns the Converse user-first/alternating constraint at the engine boundary so the transcript layer doesn't have to lie. No-op for current traffic.
  2. Transcript fidelity — history = exactly what crossed SMS: hub openings included as assistant turns ([Opening message] injection and get_opening_message removed), bot messages only when status=sent (failed/queued dropped), moderated excluded on both sides.
  3. Day markers + conventions — first message of each user-local calendar day is prefixed [Tuesday, June 9] (offset from per-utterance user_local_time, UTC fallback); every system prompt ends with a code-owned [Conversation history conventions] section telling the model its context is a real multi-day SMS thread it does remember. texet_generation snapshot version → 2.
  4. Prompt v2 + label — daily section becomes [Today's Activity (Day N)]; docs/prompts/charla-system-prompt-v2.md is the deployable base prompt (memory self-knowledge, SMS constraints, anti-repetition, instruction privacy without amnesia claims) — paste via admin console after this deploys.
  5. Verification — e2e tests through the real Kani round (assistant-first history survives generation; two openings merge correctly for Bedrock), plus scripts/replay_generation.py to diff any stored generation snapshot against current-code context.

Testing

  • 203 tests pass (make test), mypy clean, no new lint errors vs main.
  • Post-deploy: replay eb02e4ed on prod and inspect the context diff before any prompt/model change.

🤖 Generated with Claude Code

Abhay Singh and others added 6 commits June 11, 2026 15:50
Merge consecutive same-role turns, prepend a placeholder user turn when
history starts with an assistant message, and drop empty/None content.
Converse requires user-first, strictly alternating, non-empty turns;
owning that constraint in the engine lets build_chat_history stay a
faithful transcript. No-op for current alternating histories.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The LLM now sees exactly what was exchanged over SMS: hub opening
messages (texet_hub_initial) stay in history as assistant turns, bot
messages count only once delivered (sent) — dropping failed/queued —
and moderated exchanges remain withheld on both sides.

Previously only the first-ever opening was injected into the system
prompt; since conversations merged to one-per-user (#35) every later
daily opening was invisible to the model, which then hallucinated their
content or denied having context. Remove the [Opening message]
injection and get_opening_message entirely; Bedrock's user-first/
alternating constraint is owned by the engine-boundary normalization.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
History spanning multiple days was an undifferentiated blob: the model
could not map 'what we talked about this week' onto its context, and
stale time references in old replies contradicted [User's Local Time].

- build_chat_history(annotate_days=True) prefixes the first message of
  each user-local calendar day with a [Tuesday, June 9] marker; the
  offset comes from per-utterance user_local_time meta (bot rows via
  their generation snapshot), backfilled for leading messages, UTC
  fallback. Only the LLM view is annotated — stored text and exports
  are untouched, and the moderation-email caller keeps the default.
- compose_instruction_prompt always appends a code-owned
  [Conversation history conventions] section telling the model what
  its context actually is: a real SMS thread since Sunday, day-marked,
  with its own openings included and safety-withheld messages absent.
- texet_generation snapshot version bumped to 2 (history semantics
  changed).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
compose_instruction_prompt takes day_number and labels the daily
section [Today's Activity (Day N)] so the model can tie the curriculum
to the study day.

docs/prompts/charla-system-prompt-v2.md is the deployable base prompt
(paste via admin console; latest row wins). It adds what v1 lacked:
a memory self-knowledge section (the model sees this week's real SMS
thread + last week's summary — never deny it, never invent beyond it),
usage guidance for the activity/summary sections, SMS length and
anti-repetition rules, stale-time handling, and instruction privacy
decoupled from memory denial. Also recommends moving off Llama 4
Maverick 17B.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The autouse kani_stub bypassed kani entirely, so nothing proved an
assistant-first history survives a real chat round. Two new e2e tests
restore the real _generate_reply: one drives a capture engine through
the full Kani round (hub opening reaches the engine assistant-first,
day-marked, reply persisted and sent), the other drives a stubbed
BedrockEngine and asserts two back-to-back openings reach the Converse
payload merged behind the placeholder user turn.

scripts/replay_generation.py loads a bot utterance's texet_generation
snapshot and prints unified diffs of the snapshot system prompt/history
vs what current code would build — read-only, for replaying prod
generations like eb02e4ed against context changes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@1xabhay 1xabhay merged commit 31a5d58 into main Jun 11, 2026
1 check passed
@1xabhay 1xabhay deleted the fix/llm-context-fidelity branch June 11, 2026 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant