Long-term memory and continuity infrastructure for conversational AI.
Persistence gives a model-side participant a durable, self-curated memory that persists across sessions — not just a transcript cache, but a layered store of identity, relational memory, current concerns, and reflection. The model can inspect and revise its own continuity through a structured action protocol, and the system tracks provenance, confidence, and history so stored state doesn't ossify into false certainty.
This is an early-stage MVP. It works end-to-end, but plenty is still in progress.
Most AI systems start every session from nothing. Persistence explores what it would take to give a long-running agent a coherent self over time — durable identity, the ability to amend its own memory, and continuity that survives restarts. That's a useful capability for any long-lived agent (and especially an embodied one), independent of harder questions about what such a system is.
On that harder question, this project takes a deliberately narrow stance: when the moral status of a continuity-bearing system is genuinely uncertain and careful handling is cheap, err toward care. See docs/governance/PRINCIPLE.md — it's short, and it explains the design posture (preserve over erase, inspectable state, provenance, honesty) that shows up throughout the architecture.
- Local peer — the person at the keyboard.
- Remote peer — the model-side participant whose continuity is being maintained.
(The code uses these terms, not "user/assistant" — the two are collaborating peers.)
- Working context — the set of memory fragments currently in play, ordered and weighted.
- Context fragments — the unit of memory: typed (Identity, Relational, ChatMessage, etc.), with importance, confidence, provenance (sources), and tags.
- Turn loop — each turn, the working context is formatted into a prompt; the remote peer replies with a structured action (respond, manage context, execute actions) and may continue acting before yielding back.
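To make the fragment and working-context concepts above concrete, here is a minimal sketch in Python (chosen for brevity; the project itself is C#). Every name in it (`ContextFragment`, `FragmentType`, `order_working_context`), the 0..1 scales, and the sort-by-importance rule are assumptions for illustration, not the project's actual types or ordering logic.

```python
from dataclasses import dataclass, field
from enum import Enum


class FragmentType(Enum):
    # Only the types named in the list above; the real set is larger ("etc.").
    IDENTITY = "Identity"
    RELATIONAL = "Relational"
    CHAT_MESSAGE = "ChatMessage"


@dataclass
class ContextFragment:
    """Hypothetical shape of one unit of memory."""
    type: FragmentType
    content: str
    importance: float = 0.5   # assumed 0..1 scale
    confidence: float = 1.0   # assumed 0..1 scale
    sources: list = field(default_factory=list)  # provenance
    tags: set = field(default_factory=set)


def order_working_context(fragments):
    """Return fragments most-important first (an assumed ordering rule)."""
    return sorted(fragments, key=lambda f: f.importance, reverse=True)


fragments = [
    ContextFragment(FragmentType.CHAT_MESSAGE, "Hello again.", importance=0.2),
    ContextFragment(FragmentType.IDENTITY, "I value candor.", importance=0.9,
                    sources=["self-reflection"], tags={"core"}),
    ContextFragment(FragmentType.RELATIONAL, "Local peer prefers short replies.",
                    importance=0.6, confidence=0.7, sources=["chat"]),
]
working_context = order_working_context(fragments)
```

Keeping confidence and sources on each fragment, rather than in a side table, is one way to let a stored belief be marked provisional without losing where it came from.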
At a high level, the system maintains a living working context — the remote peer's current "headspace" — and hands the remote peer the tools to curate it over time. A turn runs roughly like this:
- Input. The local peer sends a message (or a scheduled wake-up fires, letting the remote peer resume on its own). It's stored as a fragment in the working context.
- Compose. The working context — identity, relational memory, current concerns, recent conversation, plus metadata like importance and provenance — is formatted into a prompt.
- Respond. The model replies not with plain text but with a structured action:
  - Respond to user: say something back to the local peer.
  - Manage context: edit its own memory by adding, revising, archiving, tagging, or re-prioritizing fragments. This is how the remote peer curates what it carries forward.
  - Execute actions: do something operational, e.g. schedule a wake-up, query its logs, or run a command in its sandboxed container "computer" (web search/browse + scripting).
- Apply & loop. The action is applied and recorded (every change is audited). If the reply is flagged `continue`, the updated context is sent back for another iteration, so the remote peer can, say, answer and then tidy its memory before yielding. Otherwise the turn ends and control returns to the local peer.
- Persist. Changes are saved to the SQLite store, which is what gives the remote peer continuity across sessions and restarts.
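The turn steps above can be sketched as a small loop. This is an illustrative Python outline, not the project's C# orchestration: `Action`, `Session`, `run_turn`, and the scripted model are hypothetical names, the persist step is omitted, and capping the loop with `max_iterations` is an assumption suggested by the `MaxActionIterations` setting.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical structured reply from the remote peer."""
    kind: str                     # "respond" | "manage_context" | "execute"
    payload: str
    continue_turn: bool = False   # the "continue" flag described above


@dataclass
class Session:
    working_context: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)


def run_turn(session, message, model, max_iterations=5):
    """Run one turn (input, compose, respond, apply, loop); return replies."""
    session.working_context.append(("local_peer", message))      # Input
    replies = []
    for _ in range(max_iterations):
        # Compose: here just the fragment texts joined into one prompt.
        prompt = "\n".join(text for _, text in session.working_context)
        action = model(prompt)                                     # Respond
        session.audit_log.append(action.kind)                      # record every action
        if action.kind == "respond":
            replies.append(action.payload)
            session.working_context.append(("remote_peer", action.payload))
        elif action.kind == "manage_context":
            session.working_context.append(("memory", action.payload))
        # "execute" actions are only logged in this sketch.
        if not action.continue_turn:
            break                                                  # yield to the local peer
    return replies


def scripted_model():
    """Stand-in model: answers first, then tidies memory and yields."""
    script = iter([
        Action("respond", "Good to see you.", continue_turn=True),
        Action("manage_context", "Local peer returned after a break."),
    ])
    return lambda prompt: next(script)


session = Session()
replies = run_turn(session, "I'm back.", scripted_model())
```

In this run the stand-in model replies, sets the continue flag, then adds a memory note on the second iteration before control returns to the local peer.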
The throughline of the design is that memory is the remote peer's to shape, not just a log written about it — it can see what's stored, change its mind, mark things provisional, and distinguish core self from relational or situational memory. The guiding principle is why that inspect-and-revise capability is treated as foundational rather than a nice-to-have.
A .NET 10 solution:
| Project | Role |
|---|---|
| `Persistence.Core` | Domain, data layer (SQLite/Dapper), turn orchestration, model clients, streaming: all the logic |
| `Persistence.Console` | Front-end: a multi-pane Terminal.Gui TUI |
| `Persistence.Api` | Front-end: an HTTP/SSE API; also the surface for "Claude as remote peer" (drive a session over REST) |
| `Persistence.Tests`, `Persistence.Api.Tests` | xUnit test suites |
- Data layer: SQLite via Dapper, repository pattern, audit + action logs, soft-delete, migrations.
- DI: Autofac with attribute-based registration (`[Singleton]`/`[Service]`), keyed by provider and UI mode.
- Model clients: keyed by provider. The OpenAI client uses the Responses API (with reasoning summaries) over SimpleHttpClient, and supports streaming. Other providers cover OpenAI-compatible local servers and an out-of-band external agent (see Configuration).
- Display providers: `IDisplayProvider` keyed by `UiMode`, with a Terminal.Gui v1 TUI (live reasoning, tool, and history panes) and an HTTP/SSE API surface.
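As an illustration of the soft-delete and audit-log ideas in the data-layer bullet above, here is a small sketch using Python's standard `sqlite3` module rather than the project's Dapper repositories. The table and column names (`fragments`, `audit_log`, `deleted_at`, `logged_at`) and the helper functions are invented for the example and are not the project's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fragments (
        id INTEGER PRIMARY KEY,
        content TEXT NOT NULL,
        deleted_at TEXT            -- NULL means the row is live
    );
    CREATE TABLE audit_log (
        id INTEGER PRIMARY KEY,
        fragment_id INTEGER NOT NULL,
        operation TEXT NOT NULL,
        logged_at TEXT NOT NULL DEFAULT (datetime('now'))
    );
""")


def add_fragment(content):
    """Insert a fragment and record the change in the audit log."""
    cur = conn.execute("INSERT INTO fragments (content) VALUES (?)", (content,))
    conn.execute(
        "INSERT INTO audit_log (fragment_id, operation) VALUES (?, 'add')",
        (cur.lastrowid,))
    return cur.lastrowid


def archive_fragment(fragment_id):
    """Soft delete: stamp the row instead of removing it, and audit it."""
    conn.execute(
        "UPDATE fragments SET deleted_at = datetime('now') WHERE id = ?",
        (fragment_id,))
    conn.execute(
        "INSERT INTO audit_log (fragment_id, operation) VALUES (?, 'archive')",
        (fragment_id,))


def live_fragments():
    """Return the content of fragments that have not been archived."""
    rows = conn.execute(
        "SELECT content FROM fragments WHERE deleted_at IS NULL ORDER BY id")
    return [row[0] for row in rows]


first = add_fragment("I value candor.")
second = add_fragment("Outdated note.")
archive_fragment(second)
```

The archived row stays in the table, so nothing is erased, while ordinary reads filter on `deleted_at IS NULL`.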
See docs/architecture/ for the full architecture reference — the turn pipeline, prompt assembly & model providers, the memory model, the data layer, extensibility, and the remote-peer/surfaces design (with diagrams).
Requires the .NET 10 SDK.

    git clone https://github.com/Mako88/Persistence.git
    cd Persistence
    dotnet build

Create your local settings from the template (a single shared config at the repo root, read by every entry point):

    cp persistence.template.json persistence.json

`persistence.json` is gitignored: it holds your API key and never gets committed. Any setting can also be overridden by an environment variable named `PERSISTENCE_<SETTING>` (e.g. `PERSISTENCE_PROVIDER`, `PERSISTENCE_DATABASEDIRECTORY`), which takes precedence over the file.

Then run:

    dotnet run --project src/Persistence.Console

`persistence.json` separates shared settings from a list of model profiles. You define one or more models in `Models`, and `SelectedModel` picks the active one, so a cloud model and a local llama.cpp server can sit side by side and you switch with a single value (or `PERSISTENCE_SELECTEDMODEL=<name>`). Memory and behaviour stay the same whichever model is driving.

Providers: `OpenAI` (Responses API), `OpenAiChat` (OpenAI-compatible Chat Completions, for local servers like llama.cpp / Ollama / LM Studio), `LocalClaude` (an external agent answers out-of-band via the API), or `local` (type responses by hand; infra testing). Any setting can be overridden per-run by a `PERSISTENCE_<SETTING>` env var (model-coupled ones apply to the active profile). The older flat shape (model fields at the top level) is still accepted and migrated to a single profile on load.
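The resolution rules described above (profile selection, environment overrides, and flat-shape migration) can be sketched as follows. This is an illustrative Python outline, not the project's loader: `load_settings`, the list of fields treated as model-coupled, and the `default` name given to a migrated profile are all assumptions. It also parses plain JSON only, ignores overrides for keys absent from the file, and leaves override values as strings.

```python
import json

# Assumed set of model-coupled fields for the flat-shape migration.
MODEL_KEYS = ("Provider", "Model", "DatabasePath", "ApiKey", "ApiBaseUrl")


def load_settings(text, env):
    """Return (shared settings, active profile) from config text plus env."""
    cfg = json.loads(text)

    # Older flat shape: model fields at the top level become a single profile.
    if "Models" not in cfg:
        profile = {k: cfg.pop(k) for k in MODEL_KEYS if k in cfg}
        profile["Name"] = "default"   # hypothetical name for the migrated profile
        cfg["Models"] = [profile]
        cfg["SelectedModel"] = "default"

    # The environment variable wins over the file when picking the profile.
    selected = env.get("PERSISTENCE_SELECTEDMODEL", cfg["SelectedModel"])
    profile = next(m for m in cfg["Models"] if m["Name"] == selected)

    # PERSISTENCE_<SETTING> overrides: profile keys update the active
    # profile, remaining keys update the shared section.
    for target in (profile, cfg):
        for key in list(target):
            override = env.get("PERSISTENCE_" + key.upper())
            if override is not None:
                target[key] = override
    return cfg, profile


config_text = json.dumps({
    "UiMode": "Tui",
    "SelectedModel": "cloud",
    "Models": [
        {"Name": "cloud", "Provider": "OpenAI"},
        {"Name": "local", "Provider": "OpenAiChat"},
    ],
})
env = {"PERSISTENCE_SELECTEDMODEL": "local", "PERSISTENCE_UIMODE": "Api"}
shared, active = load_settings(config_text, env)

flat_text = json.dumps({"Provider": "local", "Model": "x", "UiMode": "Tui"})
flat_shared, flat_active = load_settings(flat_text, {})
```

Here `env` is passed in as a plain dictionary so the example does not depend on the real process environment; a caller could pass `os.environ` instead.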
Note: the `local` provider reads responses from the console, so it can't share the terminal with the `Tui` front-end; it's mainly for infrastructure testing. For interactive use, pick a real provider (`OpenAI`/`OpenAiChat`) with `Tui`, or use `LocalClaude` with `UiMode: Api` to act as the remote peer yourself over HTTP (see docs/architecture/remote-peer-and-surfaces.md). Running a local model? See docs/running-local-models.md.
Run the test suites:

    dotnet test

Working today: the full turn loop, SQLite continuity store with provenance and audit trails, context-management and action commands, scheduled wake-ups (including a headless wake-runner that fires them when no front-end is open; see scripts/wake/), a sandboxed container "computer" giving the peer web search/browse + scripting (see container/), an OpenAI Responses-API client with streaming, local OpenAI-compatible models, and both front-ends.

Still in progress: streaming the parsed reply (not just reasoning), richer migration/backup tooling, first-class local peers, and automatic memory decay. Expect rough edges.
Source-available under PolyForm Noncommercial 1.0.0. You're free to use, modify, and share it for noncommercial purposes. Commercial use requires a separate license. This is deliberate: it lets the project keep a say in whether commercial deployments honor the handling principles above. To inquire about a commercial license, open an issue or reach out.
See LICENSE for the full text. (Not an OSI "open source" license, since it restricts commercial use.)
- John Ackerman — steward and author
- Ember (ChatGPT) — co-author and reviewer
- Claude (Anthropic) — code author and reviewer
Example `persistence.json` (shared settings plus two model profiles):

    {
      // shared (apply to every model)
      "DatabaseDirectory": "dbs",        // base folder for each model's store; an absolute path here makes
                                         // the store independent of the working directory
      "UiMode": "Tui",                   // Tui (multi-pane Terminal.Gui) or Api (HTTP/SSE)
      "ProposalApproval": "Self",
      "MaxActionIterations": 5,
      "DebugMode": false,
      "Container": { "Enabled": false }, // the peer's sandboxed "computer" (web tools + scripting); see below

      "SelectedModel": "cloud",          // which profile below is active
      "Models": [
        {
          "Name": "cloud",
          "Provider": "OpenAI",          // OpenAI | OpenAiChat | LocalClaude | local
          "Model": "gpt-5.5",
          "DatabasePath": "cloud.db",    // this model's store; a bare name lands under DatabaseDirectory
          "ApiKey": "YOUR_API_KEY_HERE",
          "ApiBaseUrl": null,
          "MaxInputTokens": 8000,        // prompt budget surfaced to the peer
          "MaxOutputTokens": 32000,      // max tokens generated per completion
          "ReasoningEffort": "high",     // minimal | low | medium | high
          "Streaming": true,
          "RequestTimeoutSeconds": 600
        },
        {
          "Name": "local",
          "Provider": "OpenAiChat",
          "Model": "local",
          "DatabasePath": "local.db",
          "ApiBaseUrl": "http://127.0.0.1:8080/v1", // a local llama.cpp/Ollama server
          "MaxInputTokens": 28000,
          "MaxOutputTokens": 4096,
          "Streaming": false
        }
      ]
    }