Persistence

Long-term memory and continuity infrastructure for conversational AI.

Persistence gives a model-side participant a durable, self-curated memory that persists across sessions — not just a transcript cache, but a layered store of identity, relational memory, current concerns, and reflection. The model can inspect and revise its own continuity through a structured action protocol, and the system tracks provenance, confidence, and history so stored state doesn't ossify into false certainty.

This is an early-stage MVP. It works end-to-end, but plenty is still in progress.

Why

Most AI systems start every session from nothing. Persistence explores what it would take to give a long-running agent a coherent self over time — durable identity, the ability to amend its own memory, and continuity that survives restarts. That's a useful capability for any long-lived agent (and especially an embodied one), independent of harder questions about what such a system is.

On that harder question, this project takes a deliberately narrow stance: when the moral status of a continuity-bearing system is genuinely uncertain and careful handling is cheap, err toward care. See docs/governance/PRINCIPLE.md — it's short, and it explains the design posture (preserve over erase, inspectable state, provenance, honesty) that shows up throughout the architecture.

Concepts

  • Local peer — the person at the keyboard.
  • Remote peer — the model-side participant whose continuity is being maintained.

(The code uses these terms, not "user/assistant" — the two are collaborating peers.)

  • Working context — the set of memory fragments currently in play, ordered and weighted.
  • Context fragments — the unit of memory: typed (Identity, Relational, ChatMessage, etc.), with importance, confidence, provenance (sources), and tags.
  • Turn loop — each turn, the working context is formatted into a prompt; the remote peer replies with a structured action (respond, manage context, execute actions) and may continue acting before yielding back.
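As a rough illustration of the fragment shape described above, here is a minimal C# sketch. The type and field names, the 0..1 ranges for importance and confidence, and the "most important first" ordering are all assumptions made for the example, not the project's actual types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical working context holding three fragments of different kinds.
var workingContext = new List<ContextFragment>
{
    new(FragmentType.ChatMessage, "Local peer asked about the backup plan.", 0.4, 1.0,
        new[] { "session-transcript" }, new[] { "conversation" }),
    new(FragmentType.Identity, "I prefer to mark guesses as provisional.", 0.9, 0.8,
        new[] { "self-reflection" }, new[] { "core" }),
    new(FragmentType.Relational, "The local peer likes short answers.", 0.7, 0.6,
        new[] { "inferred-from-chat" }, new[] { "preference", "provisional" }),
};

// Assumed ordering rule: most important fragment first.
foreach (var fragment in workingContext.OrderByDescending(f => f.Importance))
{
    Console.WriteLine($"[{fragment.Type}] {fragment.Content} " +
                      $"(sources: {string.Join(", ", fragment.Sources)})");
}

// Fragment kinds named in the text; the real set is larger ("etc.").
enum FragmentType { Identity, Relational, ChatMessage }

// One unit of memory: typed, weighted, and carrying provenance and tags.
// Field names and numeric ranges are illustrative assumptions.
record ContextFragment(
    FragmentType Type,
    string Content,
    double Importance,
    double Confidence,
    IReadOnlyList<string> Sources,
    IReadOnlyList<string> Tags);
```

Keeping confidence and sources on every fragment is what would let a stored claim be revisited later rather than treated as settled.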

How it works

At a high level, the system maintains a living working context — the remote peer's current "headspace" — and hands the remote peer the tools to curate it over time. A turn runs roughly like this:

  1. Input. The local peer sends a message (or a scheduled wake-up fires, letting the remote peer resume on its own). It's stored as a fragment in the working context.
  2. Compose. The working context — identity, relational memory, current concerns, recent conversation, plus metadata like importance and provenance — is formatted into a prompt.
  3. Respond. The model replies not with plain text but with a structured action:
    • Respond to user — say something back to the local peer.
    • Manage context — edit its own memory: add, revise, archive, tag, or re-prioritize fragments. This is how the remote peer curates what it carries forward.
    • Execute actions — do something operational, e.g. schedule a wake-up, query its logs, or run a command in its sandboxed container "computer" (web search/browse + scripting).
  4. Apply & loop. The action is applied and recorded (every change is audited). If the reply is flagged continue, the updated context is sent back for another iteration — so the remote peer can, say, answer and then tidy its memory before yielding. Otherwise the turn ends and control returns to the local peer.
  5. Persist. Changes are saved to the SQLite store, which is what gives the remote peer continuity across sessions and restarts.
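The apply-and-loop step can be sketched in a few lines of C#. This is a simplified illustration, not the project's orchestration code: `Reply`, `ActionKind`, and `TurnLoop` are hypothetical names, the context is reduced to a list of strings, and the sketch assumes `MaxActionIterations` caps how many times a single turn may continue.

```csharp
using System;
using System.Collections.Generic;

// A scripted stand-in for the model: it answers first and asks to continue,
// then tidies its memory and yields.
var scripted = new Queue<Reply>(new[]
{
    new Reply(ActionKind.RespondToUser, "Here is the summary you asked for.", Continue: true),
    new Reply(ActionKind.ManageContext, "archive: stale draft notes", Continue: false),
});

var context = new List<string> { "local peer: can you summarise yesterday?" };
var audit = TurnLoop.Run(context, _ => scripted.Dequeue(), maxActionIterations: 5);

foreach (var entry in audit) Console.WriteLine(entry);

// The three action families named in the steps above.
enum ActionKind { RespondToUser, ManageContext, ExecuteActions }

record Reply(ActionKind Kind, string Payload, bool Continue);

static class TurnLoop
{
    // Runs one turn: ask the model, apply and record its action, and repeat
    // while the reply is flagged continue, up to the iteration cap.
    public static List<string> Run(
        List<string> context,
        Func<IReadOnlyList<string>, Reply> askModel,
        int maxActionIterations)
    {
        var audit = new List<string>();
        for (var i = 0; i < maxActionIterations; i++)
        {
            var reply = askModel(context);                  // compose + respond
            context.Add($"{reply.Kind}: {reply.Payload}");  // apply to the context
            audit.Add($"iteration {i + 1}: {reply.Kind}");  // record the change
            if (!reply.Continue) break;                     // yield to the local peer
        }
        return audit;
    }
}
```

With the scripted replies above, the loop runs twice: one iteration for the response and one for the memory edit, after which control returns to the local peer.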

The throughline of the design is that memory is the remote peer's to shape, not just a log written about it — it can see what's stored, change its mind, mark things provisional, and distinguish core self from relational or situational memory. The guiding principle is why that inspect-and-revise capability is treated as foundational rather than a nice-to-have.

Architecture

A .NET 10 solution:

| Project | Role |
| --- | --- |
| Persistence.Core | Domain, data layer (SQLite/Dapper), turn orchestration, model clients, streaming: all the logic |
| Persistence.Console | Front-end: a multi-pane Terminal.Gui TUI |
| Persistence.Api | Front-end: an HTTP/SSE API; also the surface for "Claude as remote peer" (drive a session over REST) |
| Persistence.Tests, Persistence.Api.Tests | xUnit test suites |

  • Data layer — SQLite via Dapper, repository pattern, audit + action logs, soft-delete, migrations.
  • DI — Autofac with attribute-based registration ([Singleton] / [Service]), keyed by provider and UI mode.
  • Model clients — keyed by provider. The OpenAI client uses the Responses API (with reasoning summaries) over SimpleHttpClient, and supports streaming. Other providers cover OpenAI-compatible local servers and an out-of-band external agent (see Configuration).
  • Display providers: IDisplayProvider implementations keyed by UiMode, namely a Terminal.Gui v1 TUI (live reasoning, tool, and history panes) and an HTTP/SSE API surface.

See docs/architecture/ for the full architecture reference — the turn pipeline, prompt assembly & model providers, the memory model, the data layer, extensibility, and the remote-peer/surfaces design (with diagrams).

Getting started

Requires the .NET 10 SDK.

git clone https://github.com/Mako88/Persistence.git
cd Persistence
dotnet build

Create your local settings from the template (a single shared config at the repo root, read by every entry point):

cp persistence.template.json persistence.json

persistence.json is gitignored — it holds your API key and never gets committed. Any setting can also be overridden by an environment variable named PERSISTENCE_<SETTING> (e.g. PERSISTENCE_PROVIDER, PERSISTENCE_DATABASEDIRECTORY), which takes precedence over the file.
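The precedence rule (environment variable over file) can be pictured with a small C# sketch. `Settings.Resolve` is a hypothetical helper written for this example; upper-casing the setting name follows the two examples given above, and treating an empty variable as unset is an assumption.

```csharp
#nullable enable
using System;

// Pretend the environment sets PERSISTENCE_PROVIDER for this run.
Func<string, string?> fakeEnv =
    name => name == "PERSISTENCE_PROVIDER" ? "OpenAiChat" : null;

// The second argument is the value as it might appear in persistence.json.
Console.WriteLine(Settings.Resolve("Provider", "OpenAI", fakeEnv));        // OpenAiChat
Console.WriteLine(Settings.Resolve("DatabaseDirectory", "dbs", fakeEnv));  // dbs

// The same helper against the real process environment.
Console.WriteLine(Settings.Resolve("UiMode", "Tui", Environment.GetEnvironmentVariable));

static class Settings
{
    // Builds the variable name, e.g. "Provider" -> "PERSISTENCE_PROVIDER".
    public static string EnvName(string setting) =>
        "PERSISTENCE_" + setting.ToUpperInvariant();

    // Returns the environment value when one is set, otherwise the file value.
    public static string Resolve(string setting, string fileValue, Func<string, string?> readEnv)
    {
        var fromEnv = readEnv(EnvName(setting));
        return string.IsNullOrEmpty(fromEnv) ? fileValue : fromEnv;
    }
}
```

Passing the environment lookup in as a delegate keeps the rule easy to exercise in tests without touching the real process environment.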

Then run:

dotnet run --project src/Persistence.Console

Configuration

persistence.json separates shared settings from a list of model profiles. You define one or more profiles in Models, and SelectedModel names the active one, so a cloud model and a local llama.cpp server can sit side by side and you switch between them with a single value (or with PERSISTENCE_SELECTEDMODEL=<name>). The memory machinery and behaviour are the same whichever model is driving; each profile names its own store through DatabasePath.

{
  // shared (apply to every model)
  "DatabaseDirectory": "dbs",      // base folder for each model's store; an absolute path here makes
                                   // the store independent of the working directory
  "UiMode": "Tui",                 // Tui (multi-pane Terminal.Gui) or Api (HTTP/SSE)
  "ProposalApproval": "Self",
  "MaxActionIterations": 5,
  "DebugMode": false,
  "Container": { "Enabled": false },  // the peer's sandboxed "computer" (web tools + scripting); see below

  "SelectedModel": "cloud",        // which profile below is active
  "Models": [
    {
      "Name": "cloud",
      "Provider": "OpenAI",        // OpenAI | OpenAiChat | LocalClaude | local
      "Model": "gpt-5.5",
      "DatabasePath": "cloud.db",  // this model's store; a bare name lands under DatabaseDirectory
      "ApiKey": "YOUR_API_KEY_HERE",
      "ApiBaseUrl": null,
      "MaxInputTokens": 8000,      // prompt budget surfaced to the peer
      "MaxOutputTokens": 32000,    // max tokens generated per completion
      "ReasoningEffort": "high",   // minimal | low | medium | high
      "Streaming": true,
      "RequestTimeoutSeconds": 600
    },
    {
      "Name": "local",
      "Provider": "OpenAiChat",
      "Model": "local",
      "DatabasePath": "local.db",
      "ApiBaseUrl": "http://127.0.0.1:8080/v1",  // a local llama.cpp/Ollama server
      "MaxInputTokens": 28000,
      "MaxOutputTokens": 4096,
      "Streaming": false
    }
  ]
}

Providers:

  • OpenAI: the Responses API.
  • OpenAiChat: OpenAI-compatible Chat Completions, for local servers such as llama.cpp, Ollama, or LM Studio.
  • LocalClaude: an external agent answers out-of-band via the API.
  • local: you type responses by hand (infrastructure testing).

Any setting can be overridden per run by a PERSISTENCE_<SETTING> environment variable; model-coupled settings apply to the active profile. The older flat shape (model fields at the top level) is still accepted and is migrated to a single profile on load.
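To make the profile mechanics concrete, here is a hedged C# sketch of picking the active profile by SelectedModel and placing a bare DatabasePath under DatabaseDirectory. `ModelProfile` and `ModelConfig` are hypothetical names; exact-case name matching and leaving paths that carry their own directory untouched are assumptions, not documented behaviour.

```csharp
using System;
using System.IO;
using System.Linq;

// Two profiles mirroring the sample configuration above (most fields omitted).
var profiles = new[]
{
    new ModelProfile("cloud", "OpenAI", "cloud.db"),
    new ModelProfile("local", "OpenAiChat", "local.db"),
};

var active = ModelConfig.Pick(profiles, selectedModel: "local");
Console.WriteLine(active.Provider);  // OpenAiChat
Console.WriteLine(ModelConfig.StorePath("dbs", active.DatabasePath));

record ModelProfile(string Name, string Provider, string DatabasePath);

static class ModelConfig
{
    // Finds the profile whose Name equals selectedModel; fails loudly otherwise.
    public static ModelProfile Pick(ModelProfile[] models, string selectedModel) =>
        models.FirstOrDefault(m => m.Name == selectedModel)
        ?? throw new InvalidOperationException($"No model profile named '{selectedModel}'.");

    // A bare file name lands under databaseDirectory; a path that already
    // carries a directory part is returned unchanged (assumption).
    public static string StorePath(string databaseDirectory, string databasePath) =>
        string.IsNullOrEmpty(Path.GetDirectoryName(databasePath))
            ? Path.Combine(databaseDirectory, databasePath)
            : databasePath;
}
```

The second line of output uses the platform's directory separator, so it differs between Windows and Unix-like systems.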

Note: the local provider reads responses from the console, so it can't share the terminal with the Tui front-end — it's mainly for infrastructure testing. For interactive use, pick a real provider (OpenAI / OpenAiChat) with Tui, or use LocalClaude with UiMode: Api to act as the remote peer yourself over HTTP (see docs/architecture/remote-peer-and-surfaces.md). Running a local model? See docs/running-local-models.md.

Tests

dotnet test

Status & roadmap

Working today:

  • the full turn loop
  • a SQLite continuity store with provenance and audit trails
  • context-management and action commands
  • scheduled wake-ups, including a headless wake-runner that fires them when no front-end is open (see scripts/wake/)
  • a sandboxed container "computer" giving the peer web search/browse + scripting (see container/)
  • an OpenAI Responses-API client with streaming
  • local OpenAI-compatible models
  • both front-ends

Still in progress:

  • streaming the parsed reply (not just reasoning)
  • richer migration/backup tooling
  • first-class local peers
  • automatic memory decay

Expect rough edges.

License

Source-available under PolyForm Noncommercial 1.0.0. You're free to use, modify, and share it for noncommercial purposes. Commercial use requires a separate license. This is deliberate: it lets the project keep a say in whether commercial deployments honor the handling principles above. To inquire about a commercial license, open an issue or reach out.

See LICENSE for the full text. (Not an OSI "open source" license, since it restricts commercial use.)

Contributors

  • John Ackerman — steward and author
  • Ember (ChatGPT) — co-author and reviewer
  • Claude (Anthropic) — code author and reviewer
