Skip to content

fix: one open conversation per user, unkeyed from day_number#35

Merged
1xabhay merged 1 commit into
mainfrom
fix/single-open-conversation-per-user
Jun 11, 2026
Merged

fix: one open conversation per user, unkeyed from day_number#35
1xabhay merged 1 commit into
mainfrom
fix/single-open-conversation-per-user

Conversation

@1xabhay

@1xabhay 1xabhay commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Problem

Chat history sent to the LLM was empty at the start of every study day, observed on prod via the texet_generation snapshot (chat_history: [] on a Wednesday despite a week of messages — the bot literally replied "I don't have access to past conversations").

Root cause: conversations were keyed on (owner, status, day_number), so each hub day_number created a fresh conversation, while build_chat_history only reads the current conversation. The intended weekly context window (since_timestamp=week_start) never had a chance to apply across days. Latent since #13; made visible by the #33 snapshots.

Fix

  • One open conversation per user, keyed on (owner_speaker_id, status='open') — created on first message, reused forever. The weekly window stays enforced by the existing since_timestamp filter.
  • day_number removed from the Conversation model; it remains in utterance metadata (daily prompt selection + generation snapshot).
  • Conversation rows no longer accumulate per-request prompt meta (texet_instruction_prompt, texet_day_number, texet_user_local_time) — all captured per-reply in the bot utterance's texet_generation snapshot.
  • Admin UI: Day column/filter removed from Conversations; meta renders generically.
  • chat_history now passed via the Kani constructor.

Migration (9228797f839a)

Runs automatically on deploy via the entrypoint's alembic upgrade head:

  1. Merges each user's per-day open conversations into their earliest one (reassigns utterances, keeps max last_activity_at, deletes emptied rows) — restores full week context immediately on deploy.
  2. Strips stale prompt keys from conversations.meta.
  3. Replaces the day-scoped unique indexes with ux_conversations_owner_open.

Note: ux_conversations_owner_open_no_day had been silently cascade-dropped by d740e0090bdc (its predicate referenced the dropped day_identifier column), so prod had no uniqueness protection for no-day conversations; this migration drops it conditionally and the new index restores protection.

Verification

  • 189 tests pass; mypy clean. New regression test test_response_single_conversation_across_day_numbers asserts one conversation across day_number 1/2/none and that day-2 generation sees day-1 history.
  • Verified live locally end-to-end: migration merged real fragmented data (3 per-day conversations → 1 with all 10 utterances), then a day-4 "what was my goal this week?" message generated with a 10-message history and the bot correctly recalled the goal. Admin UI renders the snapshot and conversation list correctly.

🤖 Generated with Claude Code

Conversations were keyed on (owner, status, day_number), so every study day
started a fresh conversation. Since chat history is scoped to one conversation,
the LLM lost all context at each day boundary — observed on prod as an empty
chat_history in the generation snapshot despite a week of messages.

Conversations are now keyed only on (owner, status='open'): created on the
user's first message and reused for all subsequent ones. The weekly history
window is enforced by the existing since_timestamp filter, not by conversation
boundaries. day_number remains in utterance metadata (daily prompt selection,
generation snapshot).

The migration merges each user's per-day open conversations into their earliest
one (reassigning utterances, keeping max last_activity_at), strips per-request
prompt keys from conversations.meta (now captured per-reply in the
texet_generation snapshot), and replaces the day-scoped unique indexes with a
single one-open-conversation-per-user index. Note: the no-day partial index had
been silently cascade-dropped by d740e0090bdc when it removed the day_identifier
column its predicate referenced, so the migration drops it conditionally.

Also passes chat_history via the Kani constructor instead of attribute
assignment, and removes the now-stale day/prompt fields from the conversation
admin view.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@1xabhay 1xabhay merged commit 4a5f514 into main Jun 11, 2026
1 check passed
@1xabhay 1xabhay deleted the fix/single-open-conversation-per-user branch June 11, 2026 04:30
1xabhay added a commit that referenced this pull request Jun 11, 2026
* feat: normalize Converse messages at the Bedrock engine boundary

Merge consecutive same-role turns, prepend a placeholder user turn when
history starts with an assistant message, and drop empty/None content.
Converse requires user-first, strictly alternating, non-empty turns;
owning that constraint in the engine lets build_chat_history stay a
faithful transcript. No-op for current alternating histories.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: make chat history a faithful SMS transcript

The LLM now sees exactly what was exchanged over SMS: hub opening
messages (texet_hub_initial) stay in history as assistant turns, bot
messages count only once delivered (sent) — dropping failed/queued —
and moderated exchanges remain withheld on both sides.

Previously only the first-ever opening was injected into the system
prompt; since conversations merged to one-per-user (#35) every later
daily opening was invisible to the model, which then hallucinated their
content or denied having context. Remove the [Opening message]
injection and get_opening_message entirely; Bedrock's user-first/
alternating constraint is owned by the engine-boundary normalization.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat: day markers in chat history + history-conventions prompt section

History spanning multiple days was an undifferentiated blob: the model
could not map 'what we talked about this week' onto its context, and
stale time references in old replies contradicted [User's Local Time].

- build_chat_history(annotate_days=True) prefixes the first message of
  each user-local calendar day with a [Tuesday, June 9] marker; the
  offset comes from per-utterance user_local_time meta (bot rows via
  their generation snapshot), backfilled for leading messages, UTC
  fallback. Only the LLM view is annotated — stored text and exports
  are untouched, and the moderation-email caller keeps the default.
- compose_instruction_prompt always appends a code-owned
  [Conversation history conventions] section telling the model what
  its context actually is: a real SMS thread since Sunday, day-marked,
  with its own openings included and safety-withheld messages absent.
- texet_generation snapshot version bumped to 2 (history semantics
  changed).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat: day-numbered activity label + CHARLA system prompt v2 doc

compose_instruction_prompt takes day_number and labels the daily
section [Today's Activity (Day N)] so the model can tie the curriculum
to the study day.

docs/prompts/charla-system-prompt-v2.md is the deployable base prompt
(paste via admin console; latest row wins). It adds what v1 lacked:
a memory self-knowledge section (the model sees this week's real SMS
thread + last week's summary — never deny it, never invent beyond it),
usage guidance for the activity/summary sections, SMS length and
anti-repetition rules, stale-time handling, and instruction privacy
decoupled from memory denial. Also recommends moving off Llama 4
Maverick 17B.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test: real-Kani e2e coverage + generation replay script

The autouse kani_stub bypassed kani entirely, so nothing proved an
assistant-first history survives a real chat round. Two new e2e tests
restore the real _generate_reply: one drives a capture engine through
the full Kani round (hub opening reaches the engine assistant-first,
day-marked, reply persisted and sent), the other drives a stubbed
BedrockEngine and asserts two back-to-back openings reach the Converse
payload merged behind the placeholder user turn.

scripts/replay_generation.py loads a bot utterance's texet_generation
snapshot and prints unified diffs of the snapshot system prompt/history
vs what current code would build — read-only, for replaying prod
generations like eb02e4ed against context changes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: clarify prompt v2 deploy ordering (code first, paste after)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Abhay Singh <abby@Abhays-MacBook-Pro.local>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant