From 583462d1241e8900f541d12f21d54e6bbe61786b Mon Sep 17 00:00:00 2001
From: Vignesh Narayanaswamy
Date: Sun, 7 Jun 2026 11:09:17 -0700
Subject: [PATCH] docs: add Architecture page + ADRs (the design reasoning)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Senior docs explain how; this adds the *why* — the layer that makes the
design judgment legible.

- concepts/architecture.md: the four load-bearing choices (event-log not
  registry, one-DataNode graph, agents-first/tool-shaped, framework-agnostic
  pluggable core), each with the alternative rejected and the cost accepted
  on purpose — plus an explicit "what model-ledger is NOT" section.
- adr/: five Architecture Decision Records (event-log, DataNode,
  agents-first, framework-agnostic profiles, storage-agnostic backends) with
  context, decision, consequences (+/-), and alternatives considered.
- nav: Architecture under Concepts; Design decisions (the ADRs) under
  Reference.

OSS-safe: generic reasoning only, no org-specific references.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 docs/adr/0001-event-log-not-a-registry.md |  51 ++++++++
 docs/adr/0002-everything-is-a-datanode.md |  50 ++++++++
 docs/adr/0003-agents-first.md             |  49 ++++++++
 docs/adr/0004-framework-agnostic.md       |  49 ++++++++
 docs/adr/0005-storage-agnostic.md         |  47 ++++++++
 docs/adr/index.md                         |  21 ++++
 docs/concepts/architecture.md             | 134 ++++++++++++++++++++++
 mkdocs.yml                                |   8 ++
 8 files changed, 409 insertions(+)
 create mode 100644 docs/adr/0001-event-log-not-a-registry.md
 create mode 100644 docs/adr/0002-everything-is-a-datanode.md
 create mode 100644 docs/adr/0003-agents-first.md
 create mode 100644 docs/adr/0004-framework-agnostic.md
 create mode 100644 docs/adr/0005-storage-agnostic.md
 create mode 100644 docs/adr/index.md
 create mode 100644 docs/concepts/architecture.md

diff --git a/docs/adr/0001-event-log-not-a-registry.md b/docs/adr/0001-event-log-not-a-registry.md
new file mode 100644
index 0000000..2815f6d
--- /dev/null
+++ b/docs/adr/0001-event-log-not-a-registry.md
@@ -0,0 +1,51 @@
+---
+title: "ADR 0001 — Event log, not a registry"
+description: Model the inventory as an append-only log of immutable snapshots rather than mutable current-state rows.
+---
+
+# ADR 0001 — Model the inventory as an event log, not a registry
+
+**Status:** Accepted
+
+## Context
+
+A model inventory has to answer two kinds of question. Operators ask *"what is the current
+state?"* Auditors and regulators ask *"show me the complete history of every change,
+approval, and validation"* and *"what did the inventory look like on this past date?"*
+
+A conventional registry stores current state and overwrites it on each change. It answers
+the first question well and the second not at all — once a row is updated, the prior state
+is gone, and there is no tamper-evident record that it ever existed.
+
+## Decision
+
+The inventory is an **append-only event log**. A model is a stable identity (`ModelRef`);
+everything that happens to it is an immutable, content-addressed `Snapshot`. Current state
+is a *projection* of the log; point-in-time state (`inventory_at`) is a replay of the log
+up to a timestamp.
+
+Content addressing (each snapshot's hash derives from its content) makes the chain
+tamper-evident: you cannot alter history without the hashes diverging.
+
+## Consequences
+
+**Positive**
+
+- History and point-in-time reconstruction are free — they're inherent to the structure,
+  not a bolted-on audit table that can drift from the real data.
+- The log *is* the audit trail; there is no separate logging system to keep in sync.
+- Tamper-evidence comes from content addressing, which regulated use cases need.
+
+**Negative (accepted)**
+
+- More storage than last-write-wins, and reconstruction is a replay rather than a row read.
+- Callers think in events, not in-place edits — a small conceptual shift.
+
+## Alternatives considered
+
+- **Mutable registry (rejected):** simplest writes, but structurally cannot answer the
+  historical questions that are the entire point for governance.
+- **Registry + a separate audit table (rejected):** two sources of truth that drift; the
+  audit table is exactly the thing an examiner distrusts.
+
+See [Snapshots & the event log](../concepts/snapshot.md) and [Architecture](../concepts/architecture.md).
diff --git a/docs/adr/0002-everything-is-a-datanode.md b/docs/adr/0002-everything-is-a-datanode.md
new file mode 100644
index 0000000..407439f
--- /dev/null
+++ b/docs/adr/0002-everything-is-a-datanode.md
@@ -0,0 +1,50 @@
+---
+title: "ADR 0002 — Everything is a DataNode"
+description: Represent models, rules, ETL, and queues with one typed-port node, and let the dependency graph assemble itself from port matching.
+---
+
+# ADR 0002 — Everything is a DataNode; the graph builds itself
+
+**Status:** Accepted
+
+## Context
+
+A real model estate spans ML models, heuristic rules, ETL jobs, and alert queues, across
+many platforms with no shared identifier scheme. To map dependencies, most tools require
+either a central registry of IDs or per-platform adapters that understand each other.
+Both are brittle and don't scale across platforms.
+
+## Decision
+
+Every entity is a single type — `DataNode` — with typed input and output **ports**. A
+node declares only what it consumes and produces. `connect()` then creates a dependency
+edge wherever an output port name matches an input port name. Connectors emit nodes and
+know nothing about the rest of the graph.
+
+`DataPort` carries optional schema discriminators (e.g. `model_name`) so that two nodes
+writing a same-named table do not falsely link.
+
+## Consequences
+
+**Positive**
+
+- Cross-platform edges (warehouse ETL → MLflow model → alerting queue) form with no shared
+  ID scheme and no inter-connector coupling.
+- Adding a platform is "emit `DataNode`s" — connectors stay dumb and independent, which is
+  what makes discovery scale.
+- One abstraction to learn; rules and ETL are first-class, not second-class to ML models.
+
+**Negative (accepted)**
+
+- Port-name collisions are possible; resolving them precisely requires `DataPort` schema
+  discriminators rather than bare strings.
+- Port naming becomes a modeling concern the connector author must get right.
+
+## Alternatives considered
+
+- **Per-platform model types (rejected):** too rigid; every new platform is a new type and
+  new cross-type wiring.
+- **A fixed, central metadata schema (rejected):** cannot span heterogeneous platforms;
+  forces lossy normalization at discovery time.
+
+See [DataNode & the graph](../concepts/datanode.md).
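
Reviewer sketch for ADR 0002: the port-matching rule described in the Decision can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not model-ledger's implementation. The `Port` and `Node` dataclasses, the `schema` field, the `ports_match` helper, and this standalone `connect()` are hypothetical stand-ins for `DataPort`, `DataNode`, and the real `connect()`. The rule that a discriminator blocks a link only when both sides set one and they differ is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for DataPort: a name plus an optional discriminator."""
    name: str
    schema: Optional[str] = None  # assumed discriminator, e.g. a model name


@dataclass
class Node:
    """Hypothetical stand-in for DataNode: declares only its inputs and outputs."""
    name: str
    inputs: list[Port] = field(default_factory=list)
    outputs: list[Port] = field(default_factory=list)


def ports_match(out: Port, inp: Port) -> bool:
    # Names must agree; discriminators block a match only when both are set and differ.
    if out.name != inp.name:
        return False
    return out.schema is None or inp.schema is None or out.schema == inp.schema


def connect(nodes: list[Node]) -> list[tuple[str, str]]:
    """Return (upstream, downstream) edges wherever an output port matches an input port."""
    edges = []
    for up in nodes:
        for down in nodes:
            if up is down:
                continue
            if any(ports_match(o, i) for o in up.outputs for i in down.inputs):
                edges.append((up.name, down.name))
    return edges


# Three "platforms" that know nothing about each other, plus one look-alike consumer.
etl = Node("warehouse_etl", outputs=[Port("features")])
model = Node("fraud_model", inputs=[Port("features")], outputs=[Port("scores", "fraud")])
queue = Node("fraud_alerts", inputs=[Port("scores", "fraud")])
other = Node("credit_alerts", inputs=[Port("scores", "credit")])

edges = connect([etl, model, queue, other])
print(edges)
```

In this sketch `credit_alerts` stays unlinked even though it reads a port named `scores`, which is the over-linking case the discriminators are meant to prevent.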
diff --git a/docs/adr/0003-agents-first.md b/docs/adr/0003-agents-first.md
new file mode 100644
index 0000000..e267ef5
--- /dev/null
+++ b/docs/adr/0003-agents-first.md
@@ -0,0 +1,49 @@
+---
+title: "ADR 0003 — Agents are the primary interface"
+description: Design a small, consolidated, tool-shaped API for agents first; expose it identically over MCP, REST, and the SDK.
+---
+
+# ADR 0003 — Agents are the primary interface; the SDK is tool-shaped
+
+**Status:** Accepted
+
+## Context
+
+Governance questions are conversational by nature — *"which high-risk models changed this
+week and haven't been validated?"* The cheapest way to answer them is to let an agent
+traverse the inventory directly. Most libraries treat an agent/MCP layer as an
+afterthought wrapped around a human-shaped API, which produces awkward, chatty tools.
+
+## Decision
+
+Design the API for the agent first. The SDK is **tool-shaped**: each capability is one
+consolidated verb — `discover`, `record`, `investigate`, `query`, `trace`, `changelog`,
+`tag` — and the same verbs are exposed identically over MCP and REST. Tools follow
+[Anthropic's tool-writing guidance](https://www.anthropic.com/engineering/writing-tools-for-agents):
+few, broad, orthogonal, with agent-readable descriptions and error messages that name the
+next action.
+
+## Consequences
+
+**Positive**
+
+- One mental model across MCP, REST, SDK, and CLI; they can't drift because they share the
+  tool functions.
+- The consolidated surface is easier for a human to learn too — designing for the agent
+  made the SDK cleaner as a side effect.
+- Errors are actionable (they suggest the next call) rather than raising into the agent.
+
+**Negative (accepted)**
+
+- Broad verbs do more per call, which fits fine-grained REST conventions less neatly (no
+  resource-per-endpoint sprawl).
+- A small, opinionated verb set means some niche operations live only in the SDK.
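
Reviewer sketch for ADR 0003: the two claims in the Consequences above (the surfaces share one tool function, and errors name the next call instead of raising) can be illustrated as follows. Everything here is hypothetical: the `_MODELS` dictionary, the return shape, the `next_action` key, and the `mcp_tool` and `rest_handler` wrappers are assumptions made for the sketch, and this `investigate` is not the library's actual signature.

```python
from typing import Any

# Hypothetical in-memory inventory, used only for this sketch.
_MODELS: dict[str, dict[str, Any]] = {
    "fraud_model": {"risk": "high", "validated": False},
}


def investigate(model_name: str) -> dict[str, Any]:
    """Hypothetical tool function: returns a result dict and never raises into the caller."""
    model = _MODELS.get(model_name)
    if model is None:
        return {
            "ok": False,
            "error": f"No model named {model_name!r}.",
            # The error names the next action the agent can take.
            "next_action": "Call query() to list known model names, then retry investigate().",
        }
    return {"ok": True, "model": model_name, **model}


# Each transport is a thin wrapper over the same function, so the surfaces cannot drift.
def mcp_tool(arguments: dict[str, Any]) -> dict[str, Any]:
    return investigate(arguments["model_name"])


def rest_handler(path_param: str) -> dict[str, Any]:
    return investigate(path_param)


found = mcp_tool({"model_name": "fraud_model"})
missing = rest_handler("churn_model")
print(found)
print(missing["next_action"])
```

Because both wrappers delegate to one function, a missing model produces the same actionable payload over either surface.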
+
+## Alternatives considered
+
+- **Human-first SDK with a thin MCP wrapper (rejected):** yields chatty, leaky tools and
+  two surfaces that drift.
+- **Granular REST endpoints mirrored to many tools (rejected):** overflows an agent's
+  working memory and multiplies the maintenance surface.
+
+See [Agents (MCP)](../guides/agents.md).
diff --git a/docs/adr/0004-framework-agnostic.md b/docs/adr/0004-framework-agnostic.md
new file mode 100644
index 0000000..a1fc224
--- /dev/null
+++ b/docs/adr/0004-framework-agnostic.md
@@ -0,0 +1,49 @@
+---
+title: "ADR 0004 — Framework-agnostic core, pluggable profiles"
+description: Keep regulations out of the core; express them as a pluggable compliance-profile layer over a generic inventory.
+---
+
+# ADR 0004 — Framework-agnostic core; regulations are pluggable profiles
+
+**Status:** Accepted
+
+## Context
+
+model-ledger's demand is driven by regulation (SR 26‑2, EU AI Act Annex IV, NIST AI RMF,
+ISO 42001). The tempting move is to build "an SR 11‑7 tool." But specific regulations get
+renumbered and superseded (SR 11‑7 → SR 26‑2 in 2026), differ by jurisdiction, and would
+narrow a tool that is genuinely general.
+
+## Decision
+
+The core is a generic model inventory with **no regulation baked in**. Specific frameworks
+are expressed as **compliance profiles** — a plugin layer (`sr_11_7`, `eu_ai_act`,
+`nist_ai_rmf`) discovered via entry points, checking a model's completeness against a
+framework's expectations. The documentation leads with the durable capability (complete,
+auditable, point-in-time inventory) and treats named regimes as a thin, current layer.
+
+## Consequences
+
+**Positive**
+
+- A renumbered or new regulation is a profile change, not a core change — the inventory is
+  never stale on a regulator's letter.
+- The tool serves any organization with deployed models, not one jurisdiction's banks.
+- The core stays tiny (`httpx` + `pydantic`), which is what lets downstream packages add
+  org-specific connectors, auth, and profiles without forking it.
+
+**Negative (accepted)**
+
+- `record()` takes a schema-free `payload`; envelope validation is the caller's or a
+  profile's responsibility, not the core's.
+- "Does it support regulation X?" is answered by "is there a profile?", which requires the
+  profile ecosystem to keep pace.
+
+## Alternatives considered
+
+- **Bake in SR 11‑7 / a single framework (rejected):** dates instantly and narrows the
+  audience; we watched SR 11‑7 get superseded mid-project.
+- **A rigid, regulation-shaped schema (rejected):** forces every platform's metadata into
+  one regulator's vocabulary at discovery time.
+
+See [Governance](../governance.md).
diff --git a/docs/adr/0005-storage-agnostic.md b/docs/adr/0005-storage-agnostic.md
new file mode 100644
index 0000000..09e5afb
--- /dev/null
+++ b/docs/adr/0005-storage-agnostic.md
@@ -0,0 +1,47 @@
+---
+title: "ADR 0005 — Storage-agnostic backends"
+description: Put all persistence behind one LedgerBackend protocol so the same code runs from in-memory to Snowflake.
+---
+
+# ADR 0005 — Storage-agnostic via the LedgerBackend protocol
+
+**Status:** Accepted
+
+## Context
+
+The same inventory needs to run as a throwaway in-memory object in a test, a single
+SQLite file on a laptop, git-friendly JSON in a repo, a Snowflake schema in production,
+and a thin client against a remote HTTP service. Coupling the SDK to any one of these
+would force a rewrite to change storage and make testing slow.
+
+## Decision
+
+All persistence sits behind a single `@runtime_checkable` `LedgerBackend` protocol. The
+`Ledger` SDK is written against the protocol only; the backend is a constructor argument
+(`Ledger.from_sqlite(...)`, `Ledger.from_snowflake(...)`, `Ledger(JsonFileLedgerBackend(...))`,
+`Ledger(HttpLedgerBackend(...))`). Third parties can add backends (e.g. Postgres) by
+implementing the protocol and registering an entry point — no core change.
+
+## Consequences
+
+**Positive**
+
+- Choosing storage is a one-line decision that never leaks into application code.
+- Tests run in-memory and fast; the same code path is exercised against every backend.
+- Backends are an open extension point, not a closed enum.
+
+**Negative (accepted)**
+
+- The protocol is a contract: adding a method means implementing it across every backend
+  (and any third-party one), so the surface must evolve deliberately. The HTTP backend in
+  particular can't always reconstruct server-side state locally and falls back to caches.
+- The lowest-common-denominator protocol can't expose every backend's native superpowers.
+
+## Alternatives considered
+
+- **Hard-code one backend (rejected):** forces a rewrite to change storage and makes tests
+  depend on infrastructure.
+- **An ORM abstraction (rejected):** heavier, leakier, and a poor fit for the append-only
+  event-log and the non-SQL backends (JSON files, HTTP).
+
+See [Choosing a backend](../guides/backends.md).
diff --git a/docs/adr/index.md b/docs/adr/index.md
new file mode 100644
index 0000000..5dd40d5
--- /dev/null
+++ b/docs/adr/index.md
@@ -0,0 +1,21 @@
+---
+title: Design decisions
+description: Architecture Decision Records — the load-bearing choices behind model-ledger, the alternatives weighed, and the costs accepted.
+---
+
+# Design decisions
+
+Architecture Decision Records (ADRs) capture the choices that shape model-ledger: the
+context, the decision, the alternatives considered, and the consequences — including the
+costs accepted on purpose. They are short, dated, and immutable; a reversed decision gets
+a new ADR that supersedes the old one rather than an edit.
+
+| # | Decision | Status |
+|---|---|---|
+| [0001](0001-event-log-not-a-registry.md) | Model the inventory as an event log, not a registry | Accepted |
+| [0002](0002-everything-is-a-datanode.md) | Everything is a DataNode; the graph builds itself | Accepted |
+| [0003](0003-agents-first.md) | Agents are the primary interface; the SDK is tool-shaped | Accepted |
+| [0004](0004-framework-agnostic.md) | Framework-agnostic core; regulations are pluggable profiles | Accepted |
+| [0005](0005-storage-agnostic.md) | Storage-agnostic via the LedgerBackend protocol | Accepted |
+
+The narrative that ties these together is the [Architecture](../concepts/architecture.md) page.
diff --git a/docs/concepts/architecture.md b/docs/concepts/architecture.md
new file mode 100644
index 0000000..c9f3757
--- /dev/null
+++ b/docs/concepts/architecture.md
@@ -0,0 +1,134 @@
+---
+title: Architecture
+description: How model-ledger is designed and why — the event-log thesis, the one-abstraction graph, the agent-first surface, and the trade-offs behind each.
+---
+
+# Architecture
+
+This page is the *why*. For the API, see the [Reference](../reference/index.md); for the
+record of specific decisions, the [Design decisions](../adr/index.md).
+
+model-ledger is built on four load-bearing choices. Each was made against a real
+alternative, and each carries a cost we accepted on purpose.
+
+## The shape
+
+```mermaid
+graph TB
+    subgraph consumers ["Consumers"]
+        direction LR
+        A["Agents<br/>MCP"] ~~~ R["Frontends<br/>REST"] ~~~ S["Scripts<br/>SDK"] ~~~ C["CLI"]
+    end
+    subgraph protocol ["Agent protocol — consolidated tools"]
+        direction LR
+        T["discover · record · investigate · query · trace · changelog · tag"]
+    end
+    subgraph sdk ["Ledger SDK (tool-shaped)"]
+        L["register · record · add · connect · trace · history · inventory_at · composites"]
+    end
+    subgraph sources ["Discovery"]
+        direction LR
+        CO["SourceConnector protocol<br/>sql · rest · github · yours"]
+    end
+    subgraph storage ["Storage"]
+        direction LR
+        B["LedgerBackend protocol<br/>memory · sqlite · json · snowflake · http"]
+    end
+    consumers --> protocol --> sdk
+    sdk --> sources
+    sdk --> storage
+    classDef ink fill:#1c1a17,color:#f7f3ec,stroke:#000;
+    classDef ox fill:#efe8da,stroke:#7a1a1a,color:#1c1a17;
+    class protocol ink;
+```
+
+The consumers are interchangeable because they all bottom out in the same tool-shaped
+SDK. Discovery and storage are both *protocols*, so the core stays tiny and the ecosystem
+extends it without forking.
+
+## 1. The inventory is an event log, not a registry
+
+A registry stores *current state* and overwrites it. model-ledger stores *what happened*
+and never overwrites anything: a model is an identity ([`ModelRef`](snapshot.md)), and
+every change is an immutable, content-addressed [`Snapshot`](snapshot.md).
+
+**Why.** The question a governance regime actually asks is *"show me the complete history
+of every change, approval, and validation"* — and *"what was true on this past date?"* A
+mutable registry structurally cannot answer the second question; an append-only log
+answers both for free, and content-addressing makes the chain tamper-evident.
+
+**The cost we accepted.** More storage, and reconstruction (`inventory_at`) is a replay
+rather than a row read. We trade write-time simplicity for an audit trail that can't be
+quietly edited — the right trade for a system of record. → [ADR 0001](../adr/0001-event-log-not-a-registry.md)
+
+## 2. Everything is a DataNode
+
+An ML model, a heuristic rule, an ETL job, and an alert queue are the same shape: each
+consumes some inputs and produces some outputs. So they're one type —
+[`DataNode`](datanode.md) with typed ports — and the dependency graph assembles itself
+when an output port name matches an input port name.
+
+**Why.** Discovery scales only if connectors stay dumb. A connector emits nodes with
+their ports and knows nothing about the rest of the graph; the cross-platform edges
+(an ETL job in your warehouse → a model in MLflow → a queue in your alerting system)
+fall out of port matching, with no shared ID scheme to maintain.
+
+**The cost we accepted.** Two models can legitimately write a table with the same name.
+Bare names would over-link, so `DataPort` carries optional schema discriminators to keep
+edges precise. We rejected per-platform model *types* and a fixed metadata schema — both
+too rigid to span platforms. → [ADR 0002](../adr/0002-everything-is-a-datanode.md)
+
+## 3. Agents are the primary interface
+
+The SDK is *tool-shaped*: each method maps to one consolidated agent tool, exposed
+identically over [MCP](../guides/agents.md) and [REST](../guides/backends.md). The verb
+set is deliberately small (`discover`, `record`, `investigate`, `query`, `trace`,
+`changelog`, `tag`) rather than a sprawl of endpoints.
+
+**Why.** The most natural way to ask *"which high-risk models changed this week and
+haven't been validated?"* is to ask. Designing for the agent first (per
+[Anthropic's tool-writing guidance](https://www.anthropic.com/engineering/writing-tools-for-agents))
+makes the SDK and REST surfaces cleaner as a side effect — consolidated, orthogonal, hard
+to misuse.
+
+**The cost we accepted.** Fewer, broader tools mean a single call does more, which is a
+worse fit for fine-grained REST conventions. We optimize for the agent's working memory
+over endpoint granularity. → [ADR 0003](../adr/0003-agents-first.md)
+
+## 4. Framework-agnostic core, pluggable everything
+
+Storage, discovery, introspection, and compliance are all `@runtime_checkable` Protocols
+discovered via entry points. Regulations live in **profiles** — a plugin layer — not in
+the core. The core depends only on `httpx` + `pydantic`.
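
Reviewer sketch for section 4: the `@runtime_checkable` Protocol mechanism named above can be illustrated with the standard library alone. This is a minimal sketch under stated assumptions. The `Backend` protocol with its two methods, `InMemoryBackend`, `NotABackend`, and `make_ledger` are hypothetical and much smaller than the real `LedgerBackend`. Plugin discovery is left out because the entry-point group names are not shown in this patch; `importlib.metadata.entry_points` is the standard-library call such discovery would normally build on.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Backend(Protocol):
    """Hypothetical, much-reduced stand-in for a storage protocol."""

    def append(self, snapshot: dict) -> None: ...
    def read_all(self) -> list[dict]: ...


class InMemoryBackend:
    """Satisfies Backend structurally; it does not inherit from it."""

    def __init__(self) -> None:
        self._log: list[dict] = []

    def append(self, snapshot: dict) -> None:
        self._log.append(snapshot)

    def read_all(self) -> list[dict]:
        return list(self._log)


class NotABackend:
    def append(self, snapshot: dict) -> None:
        pass  # read_all is missing, so the isinstance check below fails


def make_ledger(backend: object) -> Backend:
    # runtime_checkable verifies only that the methods exist, not their signatures.
    if not isinstance(backend, Backend):
        raise TypeError(f"{type(backend).__name__} does not implement Backend")
    return backend


store = make_ledger(InMemoryBackend())
store.append({"model": "fraud_model", "event": "registered"})
print(store.read_all())
```

A third-party class needs no import from the core to qualify, which is the property that lets extensions live in separate packages.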
+
+**Why.** model-ledger is an inventory for *any* organization with deployed models, not a
+single-regulation tool. Keeping regulations as a thin, swappable layer means a renumbered
+rule (SR 11‑7 → SR 26‑2) is a profile change, not a core change — see
+[Governance](../governance.md). The tiny core is also what lets a downstream package add
+org-specific connectors and auth without touching it. → [ADR 0004](../adr/0004-framework-agnostic.md) · [ADR 0005](../adr/0005-storage-agnostic.md)
+
+**The cost we accepted.** `record()` takes a schema-free `payload`; envelope validation is
+the caller's (or a profile's) responsibility. We trade a rigid schema for the freedom to
+record whatever a platform actually has.
+
+## What model-ledger is *not*
+
+Stating the boundary is part of the design:
+
+- **Not a feature store or a serving layer.** It inventories and relates models; it does
+  not store features or serve predictions.
+- **Not a monitoring/metrics system.** It records *that* a validation or retrain happened
+  (as an event); it doesn't compute drift or accuracy.
+- **Discovery is point-in-time, not streaming.** Connectors run on a schedule and snapshot
+  what they find; `last_seen` lets you detect models that have gone silent, but the graph
+  is as fresh as the last sync.
+- **Connectors that need live credentials run from the SDK, not the agent.** `rest` and
+  `prefect` are pure-config and run through the `discover` tool; `sql`/`github` need a live
+  connection or a callable and are driven from the [SDK](../guides/connectors.md). The
+  agent gets an actionable error, never a crash.
+
+## Where to go next
+
+- The primitives, in three ideas → [Concepts](index.md)
+- The guarantees the event log provides → [Snapshots & the event log](snapshot.md)
+- The record of each decision and its alternatives → [Design decisions](../adr/index.md)
diff --git a/mkdocs.yml b/mkdocs.yml
index a6593c0..d28a15e 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -145,6 +145,7 @@ nav:
   - Installation: installation.md
   - Concepts:
     - concepts/index.md
+    - Architecture: concepts/architecture.md
     - DataNode & the graph: concepts/datanode.md
     - Snapshots & the event log: concepts/snapshot.md
     - Composites: concepts/composite.md
@@ -162,3 +163,10 @@ nav:
   - Reference:
     - reference/index.md
     - Glossary: glossary.md
+    - Design decisions:
+      - adr/index.md
+      - "ADR 0001 — Event log, not a registry": adr/0001-event-log-not-a-registry.md
+      - "ADR 0002 — Everything is a DataNode": adr/0002-everything-is-a-datanode.md
+      - "ADR 0003 — Agents are the primary interface": adr/0003-agents-first.md
+      - "ADR 0004 — Framework-agnostic, pluggable profiles": adr/0004-framework-agnostic.md
+      - "ADR 0005 — Storage-agnostic backends": adr/0005-storage-agnostic.md