Real decay write-back + contradiction supersede + token budget + typed LLM errors (Q-1…Q-5) #8
Merged
MemoryEntry (Q-1): id defaults to "" — tools constructed entries
without it and crashed the demo in Session 1 with TypeError.
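A minimal sketch of why the default matters, assuming `MemoryEntry` is a dataclass. Every field other than `id` (and the 0.9 default confidence) is hypothetical and chosen only for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    # Hypothetical fields; only `id` and the 0.9 default come from the notes above.
    content: str
    category: str = "general"
    id: str = ""  # defaulted, so callers that omit it no longer raise TypeError
    confidence: float = 0.9
    tags: list[str] = field(default_factory=list)


# A tools-layer style construction that never passes an id.
entry = MemoryEntry(content="example fact stored by a tool")
```

Without the default, the same call would fail with a missing-argument `TypeError`, which matches the Session 1 crash described above.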
Decay (Q-2): _decay_confidence now writes decayed confidence back
(was a pass statement). New facts default to 0.9 so they CAN decay;
1.0 is pinned for explicitly verified facts. Age measured from
created_at so the write-back doesn't reset the clock. end_session
reports and prints decayed/forgotten memories.
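The write-back can be sketched as follows. This is not the project's `_decay_confidence`: the half-life curve, the 7-day constant, and the separate `base_confidence` field are assumptions made so that repeated write-backs do not compound. Only the 0.9 default, the pinned 1.0, and measuring age from `created_at` come from the notes above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 7.0  # assumption: the real decay curve is not stated
PINNED = 1.0          # explicitly verified facts never decay


@dataclass
class Fact:
    content: str
    created_at: datetime
    base_confidence: float = 0.9  # hypothetical field: confidence at creation
    confidence: float = 0.9       # current, decayed value that gets written back


def decay_confidence(fact: Fact, now: datetime) -> float:
    """Recompute confidence from creation time and write it back onto the fact."""
    if fact.base_confidence >= PINNED:
        return fact.confidence
    # Age always comes from created_at, so writing back does not reset the clock.
    age_days = (now - fact.created_at).total_seconds() / 86400
    fact.confidence = fact.base_confidence * 0.5 ** (age_days / HALF_LIFE_DAYS)
    return fact.confidence


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
stale = Fact("hypothetical two-week-old fact", created_at=now - timedelta(days=14))
pinned = Fact("hypothetical verified fact", created_at=now - timedelta(days=365),
              base_confidence=1.0, confidence=1.0)
decay_confidence(stale, now)
decay_confidence(pinned, now)
```

Because the result depends only on `base_confidence` and `created_at`, calling the function twice in one session yields the same value rather than decaying twice.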
Contradiction supersede (differentiator): detect_contradictions()
finds same-topic/same-category facts via token overlap; supersede()
demotes the old fact to 0.2 (below the 0.3 recall floor) and stores
the new one at 0.9. Demo: Session 1 stores "We use Pinecone...",
Session 3 supersedes with pgvector and announces it.
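One way to picture the detect-then-supersede flow is the sketch below. It assumes Jaccard overlap on stopword-filtered tokens with a 0.3 threshold; the actual overlap measure and threshold are not stated. The example sentences are invented and are not the demo's strings. The 0.2 and 0.9 values come from the notes above.

```python
import re
from dataclasses import dataclass

STOPWORDS = {"we", "use", "the", "a", "an", "for", "to", "our", "is", "in", "of"}
OVERLAP_THRESHOLD = 0.3  # assumption: the real threshold is not stated
SUPERSEDED = 0.2         # below the 0.3 recall floor
NEW_FACT = 0.9


@dataclass
class Fact:
    content: str
    category: str
    confidence: float = NEW_FACT


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens with common filler words removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def detect_contradictions(new: Fact, existing: list[Fact]) -> list[Fact]:
    """Return same-category facts whose wording overlaps the new fact but differs."""
    new_tokens = tokens(new.content)
    hits = []
    for old in existing:
        if old.category != new.category:
            continue
        old_tokens = tokens(old.content)
        if old_tokens == new_tokens:
            continue  # identical content: nothing to supersede
        union = old_tokens | new_tokens
        if union and len(old_tokens & new_tokens) / len(union) >= OVERLAP_THRESHOLD:
            hits.append(old)
    return hits


def supersede(old: Fact, new: Fact, store: list[Fact]) -> None:
    """Demote the old fact below the recall floor and store the new one."""
    old.confidence = SUPERSEDED
    new.confidence = NEW_FACT
    store.append(new)


store = [
    Fact("vector search runs on Pinecone", "tech_stack"),
    Fact("deploys happen on Fridays", "tech_stack"),
]
incoming = Fact("vector search runs on pgvector", "tech_stack")
conflicts = detect_contradictions(incoming, store)
for old_fact in conflicts:
    supersede(old_fact, incoming, store)
```

The old fact stays in the store at 0.2, so it remains auditable but falls below the recall floor.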
Context budget (Q-3): recall_context keeps highest-confidence
memories within config.max_context_chars (default 8000) and appends
a truncation note — compounding no longer grows the prompt unboundedly.
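The budgeting idea can be sketched like this. It is an illustration, not the project's `recall_context`: the tuple input, the line format, and the wording of the truncation note are assumptions. Only the highest-confidence-first ordering, the 8000-character default, and the presence of a note come from the notes above.

```python
def recall_context(memories: list[tuple[float, str]], max_chars: int = 8000) -> str:
    """Keep the highest-confidence memories that fit within max_chars.

    `memories` is a list of (confidence, text) pairs; this shape is hypothetical.
    """
    ranked = sorted(memories, key=lambda m: m[0], reverse=True)
    kept, used = [], 0
    for confidence, text in ranked:
        line = f"- {text} ({confidence:.0%})"
        cost = len(line) + 1  # +1 for the joining newline
        if used + cost > max_chars:
            break  # everything after this has lower confidence, so stop here
        kept.append(line)
        used += cost
    omitted = len(ranked) - len(kept)
    if omitted:
        # The note is appended outside the budget so the model knows context was cut.
        kept.append(f"[{omitted} lower-confidence memories omitted to fit the context budget]")
    return "\n".join(kept)


context = recall_context(
    [(0.9, "a" * 50), (0.5, "b" * 50), (0.7, "c" * 50)],
    max_chars=120,
)
```

Stopping at the first memory that does not fit keeps the selection strictly confidence-ordered; a variant could keep scanning for shorter, lower-confidence entries that still fit.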
LLM errors (Q-4): QwenClient raises typed LLMError; 429/5xx/network
retried with backoff (1s,2s,4s, max 3 retries), 4xx fail fast.
process_message returns a clean message instead of "[LLM Error...]"
text; _auto_store skips extraction on LLM failure so error strings
are never persisted as memories.
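The retry policy above can be sketched as a small wrapper. `call_with_retry`, `is_retryable`, and the injectable `sleep` parameter are hypothetical names for illustration; the `status`/`retryable` attributes, the 1s/2s/4s delays, and the three-retry cap come from the PR text.

```python
import time


class LLMError(Exception):
    """Typed error carrying the HTTP status (None for network failures)."""

    def __init__(self, message, status=None, retryable=False):
        super().__init__(message)
        self.status = status
        self.retryable = retryable


def is_retryable(status):
    """429, 5xx, and network failures (status None) are retried; other 4xx are not."""
    return status is None or status == 429 or 500 <= status < 600


def call_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send(); on a retryable LLMError wait 1s, 2s, 4s, then give up."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except LLMError as err:
            if not err.retryable or attempt == max_retries:
                raise  # fail fast on 4xx, or retries exhausted
            sleep(base_delay * 2 ** attempt)


# Usage with a fake transport that is rate-limited twice, then succeeds.
delays = []
responses = iter([429, 429, 200])


def fake_send():
    status = next(responses)
    if status != 200:
        raise LLMError(f"HTTP {status}", status=status, retryable=is_retryable(status))
    return "ok"


result = call_with_retry(fake_send, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller such as `process_message` would catch the final `LLMError` and return a clean sentence instead of storing the error text.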
Session accounting (Q-5): session_memories reset in start_session.
Engram backend rewritten as direct SQLite (the installed engram 0.5.x
exposes only `engram serve` — the old CLI verbs store/recall/delete
never existed, so the demo could not run against the real binary).
The backend reads/writes engram's own facts/facts_fts schema (WAL,
quoted FTS queries, soft-invalidation forget, confidence write-back),
so `engram serve` on the same db serves these memories over MCP/REST.
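To illustrate what "quoted FTS queries" buys, here is a sketch against a throwaway in-memory table. The `notes_fts` table and its single column are hypothetical and are not engram's `facts`/`facts_fts` schema. It assumes the local SQLite build includes FTS5.

```python
import sqlite3


def fts_quote(query: str) -> str:
    """Wrap each whitespace-separated term in double quotes so FTS5 reads it
    as a literal string rather than as an operator (AND, OR, NOT, NEAR, ...)."""
    return " ".join('"' + term.replace('"', '""') + '"' for term in query.split())


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    quoted = fts_quote(query)
    if not quoted:
        return []  # an empty MATCH expression would be a syntax error
    rows = conn.execute(
        "SELECT content FROM notes_fts WHERE notes_fts MATCH ?", (quoted,)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO notes_fts(content) VALUES (?)",
    [("we use pgvector for search",), ("deploys run on fridays",)],
)

plain = search(conn, "pgvector")
# Unquoted, OR would be an operator and match both rows; quoted, it is just a word.
injected = search(conn, "pgvector OR fridays")
```

Quoting every term trades away operator syntax for safety, which suits a backend whose queries come from model or user text.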
get_project_context now uses wildcard+category recall split by tags —
keyword queries ("conventions") never matched stored content.
requirements.txt documents stdlib-only. 23 tests + CI.
Verified end-to-end: MEMORY_BACKEND=engram demo runs against a real
SQLite store — health ok, memories persist across sessions, "4
conventions found", contradiction beat prints, LLM 401s degrade
cleanly without polluting memory.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Resolved merge conflicts from shared library extraction:
- Updated imports: agent.memory → perseus_agent_core.memory
- Updated imports: agent.tools → perseus_agent_core.tools
- Removed deleted memory/tools files (now in perseus-agent-core)
- Kept all PR features: decay write-back, contradiction detection, token-budgeted recall, typed LLMError with 429 retry, session reset
Test: import paths verified, no remaining agent.memory references.
Implements PROMPT 1 from the pre-submission fix plan — the MemoryAgent Track requirements that were claimed in docs but absent in code, plus one discovery that changed the approach (see "Engram backend" below).
Track-requirement fixes
- `MemoryEntry.id` now defaults to `""` (the handoff doc said this was already applied; it was not on the remote). The tools layer constructed entries without `id` and crashed Session 1 with a verified TypeError.
- `_decay_confidence` writes decayed confidence back (was a literal `pass`). New facts default to 0.9 so they can decay; 1.0 is pinned for explicitly verified facts. Age is measured from `created_at` so the write-back doesn't reset the decay clock. `end_session` reports and prints every decay (~ forgotten: "…" 90% → 20% (14d old)).
- `detect_contradictions()` (same-category token-overlap) + `supersede()`: old fact demoted to 0.2 (below the 0.3 recall floor), new fact stored at 0.9. The demo announces it.
- `recall_context` keeps highest-confidence memories within `max_context_chars` (default 8000, env-overridable) with an explicit truncation note. Compounding can no longer grow the prompt until the context window 400s.
- `LLMError` with `status`/`retryable`; 429/5xx/network retried with 1s/2s/4s backoff, 4xx fail fast. `process_message` returns a clean sentence instead of `[LLM Error…]` text, and `_auto_store` skips extraction on failure, so error strings can never be persisted as memories.
- `session_memories` resets in `start_session`.
Engram backend: rewritten as direct SQLite (discovery)
The installed engram-rs (v0.5.0) exposes only `engram serve`; the CLI verbs the old backend shelled out to (store/recall/delete/health) don't exist, so the demo could never have run against the real binary. Per the fix plan's "update it via the CLI or direct DB access", the backend now reads/writes engram's own `facts`/`facts_fts` schema directly (stdlib `sqlite3`, WAL, FTS5 with operator-safe quoting, soft-invalidation `forget`, id-upsert for confidence write-back). An `engram serve` pointed at the same db file serves these memories over MCP/REST: same store, two access paths. Timestamps round-trip, so age-based decay has real ages.
Also fixed while verifying:
- `get_project_context` searched for the literal words "conventions"/"tech stack", which stored content never contains. It now uses wildcard recall scoped by category, split by tags ("4 conventions found" in the demo, previously 0).
Verification (end-to-end, real store)
Tests cover: decay write-back values + forgotten floor + pinned-1.0 + fresh-facts, supersede demote/unrelated/identical cases, context budget keep-best/truncation-note, LLM error non-storage, FTS operator injection inertness, project isolation, and the engram round-trip with tags/timestamps. CI workflow added.
🤖 Generated with Claude Code