
[Feature] Memory calibration agent — validate thread memory against project config #21

@davesleal


Problem

Thread memory persists stale or incorrect context across threads. When the agent hallucinates during one conversation (e.g., calls a web app a "macOS weather app"), that gets saved to thread memory and poisons every future conversation in that channel.

Currently the only fix is manually deleting .shellack/thread-memory/*.md.

Solution

A memory calibration agent (Haiku) that runs on every thread memory load:

  1. Reads the thread memory
  2. Compares against the project config (name, language, platform, description from projects.yaml)
  3. Flags or removes contradictions
  4. Returns only validated context

Flow

New thread starts → load thread memory → calibration check:

Haiku:
  "Does this thread memory match the project config?
   Project: Atmos, platform: web, language: typescript
   Memory says: macOS app, Swift
   → INVALID. Discard memory, start fresh."
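A minimal sketch of how the calibration question could be assembled and its answer interpreted. The names `ProjectConfig`, `build_calibration_prompt`, and `parse_verdict` are hypothetical, and the "VALID or INVALID on the first line" reply format is an assumption rather than anything Shellack defines today. The actual model call is left out on purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectConfig:
    # Fields mirror the ones this issue lists from projects.yaml.
    name: str
    platform: str
    language: str
    description: str = ""


def build_calibration_prompt(config: ProjectConfig, memory: str) -> str:
    """Assemble the question sent to the model (wording is illustrative)."""
    return (
        "Does this thread memory match the project config?\n"
        f"Project: {config.name}, platform: {config.platform}, "
        f"language: {config.language}\n"
        f"Description: {config.description}\n"
        "Thread memory:\n"
        f"{memory}\n"
        "Answer VALID or INVALID on the first line, then give a short reason."
    )


def parse_verdict(reply: str) -> bool:
    """Return True only when the reply's first line is exactly VALID.

    Anything else (INVALID, empty, malformed) counts as invalid, so a
    confused reply leads to a fresh start instead of reused memory.
    """
    lines = reply.strip().splitlines()
    first = lines[0].strip().upper() if lines else ""
    return first == "VALID"
```

Treating every non-VALID reply as invalid is a deliberate bias: losing good memory costs a little context, while keeping bad memory repeats the original problem.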

Calibration rules

  • If memory references a different platform than config → discard
  • If memory references a different language than config → discard
  • If memory contains open questions that STATE.md already answers → resolve them
  • If memory is older than 24h → flag for review, prefer STATE.md
  • If memory aligns with config → pass through
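The platform, language, and age rules above could also be approximated without a model call, for example as a cheap pre-check. The sketch below is only an illustration: `check_memory`, `Verdict`, and the two keyword sets are hypothetical, and plain keyword matching is a crude stand-in for what Haiku would judge. The STATE.md rule is not covered here.

```python
import re
from enum import Enum
from typing import Set


class Verdict(Enum):
    DISCARD = "discard"
    FLAG = "flag"
    PASS = "pass"


# Hypothetical vocabularies; a real deployment would need broader lists.
KNOWN_PLATFORMS = {"web", "macos", "ios", "android"}
KNOWN_LANGUAGES = {"typescript", "swift", "python", "kotlin"}


def _mentions(text: str, vocabulary: Set[str]) -> Set[str]:
    """Return the vocabulary terms that appear as whole words in text."""
    words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return words & vocabulary


def check_memory(
    memory: str,
    platform: str,
    language: str,
    age_hours: float,
    max_age_hours: float = 24.0,
) -> Verdict:
    """Apply the platform, language, and age rules in that order."""
    other_platforms = _mentions(memory, KNOWN_PLATFORMS) - {platform.lower()}
    other_languages = _mentions(memory, KNOWN_LANGUAGES) - {language.lower()}
    if other_platforms or other_languages:
        return Verdict.DISCARD  # memory contradicts the project config
    if age_hours > max_age_hours:
        return Verdict.FLAG  # stale: review it and prefer STATE.md
    return Verdict.PASS
```

Note that this heuristic would also discard memory that legitimately mentions a second platform (say, a web app with an iOS companion), which is one reason a model-based check may still be worth the cost.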

Cost

One Haiku call per new thread (~$0.0005). Only runs when thread memory exists. Skip if no thread memory file found.
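The "skip if no file" guard might look like the following sketch. `load_thread_memory` and the `calibrate` callback are hypothetical names; the callback stands in for whatever performs the Haiku call and returns True when the memory is valid.

```python
from pathlib import Path
from typing import Callable, Optional


def load_thread_memory(
    path: Path, calibrate: Callable[[str], bool]
) -> Optional[str]:
    """Return validated memory, or None when there is nothing usable."""
    if not path.is_file():
        return None  # no memory file, so no model call and no cost
    memory = path.read_text(encoding="utf-8")
    # calibrate is invoked at most once per load.
    return memory if calibrate(memory) else None
```

At the estimate above, 1,000 new threads that each have a memory file would cost roughly $0.50 in calibration calls.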

Alternative: TTL-based expiry

Simpler approach — thread memory files expire after a configurable TTL (e.g., 24h). Stale files are deleted on load. No Haiku call needed.

Could combine both: TTL for age-based cleanup + calibration for content validation.
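For the TTL alternative, a minimal sketch that deletes expired files on load. It assumes file modification time is an acceptable proxy for memory age and that memory files are the `*.md` files in one directory; `expire_stale_memory` is a hypothetical name.

```python
import time
from pathlib import Path
from typing import List, Optional


def expire_stale_memory(
    memory_dir: Path,
    ttl_seconds: float = 24 * 60 * 60,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete *.md files whose mtime is older than the TTL.

    Returns the paths that were removed. `now` can be injected for tests.
    """
    current = time.time() if now is None else now
    removed: List[Path] = []
    for path in sorted(memory_dir.glob("*.md")):
        if current - path.stat().st_mtime > ttl_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

In a combined setup this would run first, so the calibration call only ever sees files that survived the age check.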

Priority

High — this is actively causing bad UX. Every wrong answer gets persisted and repeated.
