mnemon-dev · Grivn · May 11, 2026 · May 7, 2026
diff --git a/README.md b/README.md
@@ -35,7 +35,7 @@ Most memory tools embed their own LLM inside the pipeline. Mnemon takes a differ
 Mnemon also addresses a gap in the protocol stack. MCP standardizes how LLMs discover and invoke tools. ODBC/JDBC standardizes how applications access databases. But how LLMs interact with databases using memory semantics — this layer has no protocol. Mnemon's three primitives — `remember`, `link`, `recall` — form an intent-native protocol: command names map to the LLM's cognitive vocabulary (`remember` not INSERT, `recall` not SELECT), and output is structured JSON with signal transparency rather than raw database rows.
 
 <p align="center">
-  <img src="docs/diagrams/llm-supervised-concept.jpg" width="720" alt="LLM-Supervised Architecture — three patterns compared, with detailed Mnemon implementation showing hooks, brain/organ split, and sub-agent delegation" />
+  <img src="docs/diagrams/llm-supervised-concept.jpg" width="720" alt="LLM-Supervised Architecture — three patterns compared, with Mnemon hooks, protocol boundary, and deterministic memory engine" />
   <br />
   <sub>The LLM-Supervised pattern: hooks drive the lifecycle, the host LLM makes judgment calls, the binary handles deterministic computation.</sub>
 </p>
@@ -113,40 +113,50 @@ mnemon setup --eject
 
 ## How it works
 
-Once set up, memory operates transparently — you use your LLM CLI as usual. Mnemon integrates via Claude Code's [hook system](https://docs.anthropic.com/en/docs/claude-code/hooks), injecting memory operations at key lifecycle points:
+Once set up, memory operates through a lightweight harness: `SKILL.md` teaches
+commands, `GUIDELINE.md` teaches judgment, hooks remind the agent at lifecycle
+boundaries, and the `mnemon` binary executes deterministic memory operations.
+Setup commands for supported runtimes automate the installation, but the
+harness can also be installed from the markdown files alone.
 
-```
+```text
 Session starts
-    │
-    ▼
-  Prime (SessionStart) ─── prime.sh ──→ load guide.md (memory execution manual)
-    │
-    ▼
-  User sends message
-    │
-    ▼
-  Remind (UserPromptSubmit) ─── user_prompt.sh ──→ remind agent to recall & remember
-    │
-    ▼
-  LLM generates response (guided by skill + guide.md rules)
-    │
-    ▼
-  Nudge (Stop) ─── stop.sh ──→ remind agent to remember
-    │
-    ▼
-  (when context compacts)
-  Compact (PreCompact) ─── compact.sh ──→ extract critical insights to remember
+    |
+    v
+  Prime   -> make skill, guideline, and active store visible
+    |
+    v
+User prompt arrives
+    |
+    v
+  Remind  -> decide whether recall could change this task
+    |
+    v
+Agent works and calls Mnemon only when useful
+    |
+    v
+  Nudge   -> decide whether durable writeback is justified
+    |
+    v
+Before context compaction
+    |
+    v
+  Compact -> preserve only critical continuity
 ```
 
-Four hooks drive the memory lifecycle. **Prime** loads the behavioral guide — a detailed execution manual for recall, remember, and sub-agent delegation. **Remind** prompts the agent to evaluate recall and remember before starting work. **Nudge** reminds the agent to consider remember after finishing work. **Compact** instructs the agent to extract and save critical insights before context compression. **The skill file** teaches command syntax. **The guide** (`~/.mnemon/prompt/guide.md`) defines the detailed rules for when to recall, what to remember, and how to delegate.
+The four hook phases are reminders, not a hard workflow. **Prime** makes the
+skill, guideline, and active store visible. **Remind** prompts a recall
+decision. **Nudge** prompts a writeback decision. **Compact** preserves only
+critical continuity before context compression.
 
-You don't run mnemon commands yourself. The agent does — driven by hooks and guided by the skill and behavioral guide.
+You don't run mnemon commands yourself. The agent does when the guideline says
+memory is useful.
 
 ## Features
 
-- **Zero user-side operation** — install once, memory runs in the background via hooks
+- **Zero user-side operation** — install once; supported runtimes can use hooks, and minimal runtimes can use persistent rules
 - **LLM-supervised** — the host LLM decides what to remember, update, and forget; no embedded LLM, no API keys
-- **Hook-based integration** — four lifecycle hooks: Prime (load guide), Remind (recall & remember), Nudge (remember), and Compact (save before compression)
+- **Markdown-installable harness** — `SKILL.md`, `INSTALL.md`, `GUIDELINE.md`, and four lifecycle reminders
 - **Four-graph architecture** — temporal, entity, causal, and semantic edges, not just vector similarity
 - **Intent-native protocol** — three primitives (`remember`, `link`, `recall`) map to the LLM's cognitive vocabulary, not database syntax; structured JSON output with signal transparency
 - **Intent-aware recall** — graph traversal + optional vector search (RRF fusion), enabled by default for all queries
@@ -170,7 +180,11 @@ All your local agentic AIs — across sessions and frameworks — sharing one po
   Gemini CLI ───┘
 ```
 
-The foundation is in place: a single `~/.mnemon` database that any agent can read and write. Claude Code's hook integration is the reference implementation; OpenClaw uses a plugin-based approach; NanoClaw integrates via container skills and volume mounts. The same pattern can be replicated for any LLM CLI that supports event hooks or system prompts.
+The foundation is in place: a single `~/.mnemon` database that any agent can
+read and write. Claude Code setup automates hook installation; OpenClaw can use
+plugin hooks; NanoClaw integrates via container skills and volume mounts. The
+same harness can be installed in any LLM CLI that supports skills, rules,
+system prompts, or event hooks.
 
 The longer-term direction is a **memory gateway**: protocol decoupled from storage engine. The current SQLite backend is the first adapter; the protocol surface (`remember / link / recall`) can sit on top of PostgreSQL, Neo4j, or any graph database. Agent-side optimization (when to recall, what to remember) and storage-side optimization (indexing, graph algorithms) evolve independently. See [Future Direction](docs/design/08-decisions.md#82-future-direction) for details.
 
@@ -194,10 +208,15 @@ Different agents/processes can use different stores via the `MNEMON_STORE` envir
 `mnemon setup` defaults to **local** (project-scoped `.claude/`), recommended for most users. **Global** (`mnemon setup --global`, installed to `~/.claude/`) activates mnemon across all projects — convenient if you want other frameworks (e.g., OpenClaw) to share memory by forwarding requests through Claude Code CLI, but may add maintenance overhead.
 
 **How do I customize the behavior?**
-Edit `~/.mnemon/prompt/guide.md`. This file controls when the agent recalls memories and what it considers worth remembering. The skill file (`SKILL.md`) is auto-deployed and should not need manual editing.
+Edit the generated guideline (`~/.mnemon/prompt/guide.md` in current setup
+flows) or use the installable [GUIDELINE.md](docs/framework/GUIDELINE.md) as
+the source. The skill file should stay focused on command syntax.
 
 **What is sub-agent delegation?**
-Memory writes don't happen in the main conversation. The host LLM (e.g., Opus) decides *what* to remember, then delegates the actual `mnemon remember` execution to a lightweight sub-agent (e.g., Sonnet). This saves tokens and keeps memory operations out of the main context.
+Sub-agent delegation is optional. When a runtime supports it, the main agent can
+decide *what* to remember and ask a cheaper or isolated worker to execute
+`mnemon remember`. It is a useful execution strategy, not a required part of the
+Mnemon architecture.
 
 ## Configuration
 
@@ -230,7 +249,12 @@ See [Development and Deployment](docs/DEPLOYMENT.md) for Docker, Compose, Ollama
 
 ## Documentation
 
-- [Design & Architecture](docs/DESIGN.md) — philosophy, algorithms, integration design
+- [Mnemon Memory Harness](docs/framework/HARNESS.md) — skill-first memory harness design and installation guideline
+- [Harness Install Guide](docs/framework/INSTALL.md) — agent-facing installation contract
+- [Memory Guideline](docs/framework/GUIDELINE.md) — recall/writeback judgment policy
+- [Self-Evolution Harness Design](docs/design/SELF_EVOLUTION_HARNESS.md) — consolidated v0.2 architecture for install, memory loop, skill evolution, and risk control
+- [Agent Systems Research](docs/research/agent-systems/README.md) — condensed source index for memory and self-evolution research
+- [Design & Architecture](docs/DESIGN.md) — current engine architecture, algorithms, integration design
 - [Usage & Reference](docs/USAGE.md) — CLI commands, embedding support, architecture overview
 - [Architecture Diagrams](docs/diagrams/) — system architecture, pipelines, lifecycle management
 

diff --git a/docs/DESIGN.md b/docs/DESIGN.md
@@ -6,6 +6,8 @@
 
 Mnemon is a persistent memory system designed for LLM agents. It adopts the **LLM-Supervised** pattern: the host LLM acts as external orchestrator of a standalone memory binary through symbolic CLI interfaces, while the binary handles deterministic storage, graph indexing, and lifecycle management. Memory is organized as a four-graph knowledge structure with temporal, entity, causal, and semantic edges. Implemented as a single Go binary + SQLite, with no external API dependencies.
 
+This document describes the current Mnemon binary and engine architecture. The broader memory harness doctrine lives in [Mnemon Memory Harness](framework/HARNESS.md), with installable runtime artifacts in [INSTALL.md](framework/INSTALL.md) and [GUIDELINE.md](framework/GUIDELINE.md). The v0.2 self-evolution architecture is consolidated in [Self-Evolution Harness Design](design/SELF_EVOLUTION_HARNESS.md).
+
 ---
 
 ## Table of Contents
@@ -14,9 +16,9 @@ Mnemon is a persistent memory system designed for LLM agents. It adopts the **LL
 
 Why Mnemon exists — the amnesia problem in LLM agents, structural bottlenecks of traditional approaches, and a comparison with existing solutions (Mem0, MemGPT, Claude Code Memory).
 
-### [2. Design Philosophy](design/02-philosophy.md)
+### [2. Engine Design Philosophy](design/02-philosophy.md)
 
-The LLM-Supervised pattern, Organs vs Textbooks metaphor, Memory Gateway protocol (the MCP analogy for LLM↔DB interaction), key design insights, and theoretical foundations from RLM, MAGMA, and Graph-LLM structural analysis.
+The current engine's LLM-Supervised pattern, Hook-native / LLM-led / Protocol-constrained principle, Organs vs Textbooks metaphor, Memory Gateway protocol (the MCP analogy for LLM↔DB interaction), key design insights, and theoretical foundations from RLM, MAGMA, and Graph-LLM structural analysis.
 
 ### [3. Core Concepts & Architecture](design/03-concepts.md)
 
@@ -36,7 +38,11 @@ Effective Importance (EI) decay formula, immunity rules, auto-pruning, GC comman
 
 ### [7. LLM CLI Integration](design/07-integration.md)
 
-Lifecycle hooks (Prime, Remind, Nudge, Compact), skill file, behavioral guide, automated setup via `mnemon setup`, sub-agent delegation pattern, and adaptation to other LLM CLIs.
+Markdown-installable runtime integration: `SKILL.md`, `INSTALL.md`, `GUIDELINE.md`, the four hook phases (Prime, Remind, Nudge, Compact), agent-led memory decisions, optional setup automation, and lightweight markdown self-evolution.
+
+### [Self-Evolution Harness](design/SELF_EVOLUTION_HARNESS.md)
+
+The v0.2 architecture for agent-agnostic installation, canonical `.mnemon` filesystem, memory consolidation loop, skill evolution, optional maintenance runner, and proposal-first risk control.
 
 ### [8. Design Decisions & Future Direction](design/08-decisions.md)
 

diff --git a/docs/design/02-philosophy.md b/docs/design/02-philosophy.md
@@ -1,4 +1,4 @@
-# 2. Design Philosophy
+# 2. Engine Design Philosophy
 
 [< Back to Design Overview](../DESIGN.md)
 
@@ -30,6 +30,11 @@ This means:
 - **Stronger judgment capability**: An Opus-class LLM evaluates candidate links, not gpt-4o-mini
 - **LLM swappable**: The same Binary + Skill works across Claude Code, Cursor, or any LLM CLI
 
+This engine follows the broader [Mnemon Memory Harness](../framework/HARNESS.md) stance:
+hook-native, LLM-led, and protocol-constrained. The framework doctrine is kept
+separate from the current engine architecture so we can discuss principles
+without assuming today's binary is the final runtime shape.
+
 ## 2.2 Tools are Organs, Skills are Textbooks
 
 This philosophy can be understood through a game development analogy: