Merged
3 changes: 3 additions & 0 deletions README.md
@@ -5,6 +5,9 @@
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.10+-blue.svg)](https://python.org)
[![PyPI](https://img.shields.io/pypi/v/model-ledger)](https://pypi.org/project/model-ledger/)
[![Docs](https://img.shields.io/badge/docs-block.github.io/model--ledger-7a1a1a.svg)](https://block.github.io/model-ledger/)

📖 **[Read the documentation →](https://block.github.io/model-ledger/)**

---

56 changes: 56 additions & 0 deletions docs/glossary.md
@@ -0,0 +1,56 @@
---
title: Glossary
description: The vocabulary of model-ledger — DataNode, Snapshot, ModelRef, Composite, and the rest, in one place.
---

# Glossary

The whole system is a handful of nouns. (These terms also get hover-definitions
wherever they appear in the docs.)

`Backend`
: Pluggable storage behind the `LedgerBackend` protocol — in-memory, SQLite, JSON
files, Snowflake, or a remote HTTP service. Swapping it never changes your code.

`Composite`
: A governed group whose members are themselves models — a business-level entity (e.g.
a "Credit Decision System") that rolls up its scorecard, rules, and ETL. See
[Composites](concepts/composite.md).

`Connector`
: A source that emits `DataNode`s from a platform (SQL, REST, GitHub, …) via the
`SourceConnector` protocol. See [Connectors & discovery](guides/connectors.md).

`DataNode`
: The core graph primitive: anything with typed input/output ports — an ML model, a
heuristic rule, an ETL job, an alert queue. See [DataNode & the graph](concepts/datanode.md).

`DataPort`
: A named connection point on a `DataNode`, optionally carrying schema so identically
named outputs from different models don't falsely link.

`Dependency graph`
: The links between nodes, built automatically when an output port name matches an
input port name (`connect()`).

`Event log`
: The inventory itself — an append-only sequence of immutable Snapshots. Nothing is
overwritten, so history is always reconstructable.

`ModelRef`
: A model's stable identity: name, owner, type, risk `tier`, purpose, status. The
minimum a regulator needs. See [Snapshots & the event log](concepts/snapshot.md).

`Point-in-time`
: Reconstruction of the inventory as it stood on any past date, via `inventory_at()`.

`Profile`
: A pluggable compliance check (`sr_11_7`, `eu_ai_act`, `nist_ai_rmf`) that validates a
model's completeness against a framework. See [Governance](governance.md).

`Snapshot`
: An immutable, content-addressed record of one thing that happened to a model — a
registration, a retrain, a validation. The unit of the event log.

`Tag`
: A mutable named pointer to a specific Snapshot (e.g. `production`, `latest-validated`).
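
To make the `Dependency graph` entry concrete, here is a minimal pure-Python sketch of linking nodes by port name. The `Node` class and this `connect` function are hypothetical stand-ins written for illustration; they are not model-ledger's `DataNode` or its `connect()`, and they ignore the optional schema that a `DataPort` can carry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a DataNode: a name plus input/output port names."""
    name: str
    inputs: frozenset = field(default_factory=frozenset)
    outputs: frozenset = field(default_factory=frozenset)


def connect(nodes):
    """Return (upstream, downstream, port) edges wherever an output port name
    on one node matches an input port name on a different node."""
    edges = []
    for up in nodes:
        for down in nodes:
            if up is down:
                continue
            for port in sorted(up.outputs & down.inputs):
                edges.append((up.name, down.name, port))
    return edges


nodes = [
    Node("feature_etl", outputs=frozenset({"features"})),
    Node("credit_scorecard", inputs=frozenset({"features"}), outputs=frozenset({"score"})),
    Node("decision_rules", inputs=frozenset({"score"})),
]
edges = connect(nodes)
print(edges)
```

Matching on the bare name is what makes the graph automatic, and it is also why the `DataPort` entry mentions schema: two unrelated models that both emit a port called `score` would otherwise be linked to the same consumer.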
72 changes: 72 additions & 0 deletions docs/governance.md
@@ -0,0 +1,72 @@
---
title: Governance
description: A complete, auditable, point-in-time model inventory — the durable building blocks every model-risk regime asks for, mapped to model-ledger primitives.
---

# Governance

Model-risk regimes change their names and their numbers. What they *ask for* barely
changes. Strip away the acronyms and every regime — US banking, EU, insurance — wants
the same six things from your model inventory. model-ledger is built to produce them as
a byproduct of normal use, not as a separate compliance chore.

## What every regime actually asks for

| The durable need | What an examiner says | The model-ledger primitive |
|---|---|---|
| **Complete inventory** | "Show me *every* model — including the shadow ones." | Cross-platform [discovery & connectors](guides/connectors.md) — ML models, rules, and ETL as one graph |
| **Risk tiering** | "Which are high-materiality?" | `tier` on every [`ModelRef`](reference/index.md); business systems roll up as [composites](concepts/composite.md) |
| **Change control + audit trail** | "What changed, when, and who did it?" | Immutable, content-addressed [Snapshots](concepts/snapshot.md) — append-only, tamper-evident |
| **Dependency & lineage** | "How do these components feed each other?" | The [dependency graph](concepts/datanode.md), built from port matching |
| **Validation records** | "Prove this was validated, and find what wasn't." | `record_validation()` events live in the same immutable log |
| **Point-in-time reconstruction** | "Show me the inventory as it stood on December 31." | [`inventory_at(date)`](recipes/point-in-time.md) replays the log |

That's the whole compliance story: **nothing is overwritten, so the answer to "what was
true then?" is always reconstructable.**
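
The reconstruction idea can be sketched in a few lines of standard-library Python. This is an illustration only: the `Event` class, the `inventory_as_of` function, and the `retiered` event type are hypothetical, and model-ledger's own `inventory_at()` may fold events differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a Snapshot: one immutable fact about a model."""
    on: date
    model: str
    event_type: str
    payload: dict


def inventory_as_of(log, cutoff):
    """Replay events dated on or before `cutoff`, folding payloads per model."""
    state = {}
    for ev in sorted(log, key=lambda e: e.on):
        if ev.on > cutoff:
            break
        state.setdefault(ev.model, {}).update(ev.payload)
        state[ev.model]["last_event"] = ev.event_type
    return state


log = [
    Event(date(2024, 3, 1), "credit_scorecard", "registered", {"tier": "medium"}),
    Event(date(2024, 9, 15), "credit_scorecard", "validated", {"result": "pass"}),
    Event(date(2025, 2, 1), "credit_scorecard", "retiered", {"tier": "high"}),
]

year_end = inventory_as_of(log, date(2024, 12, 31))
today = inventory_as_of(log, date(2025, 6, 30))
print(year_end)
print(today)
```

Because the log is only ever appended to, the year-end answer stays the same no matter how many later events arrive; the later tier change affects only queries with a later cutoff.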

## It falls out of normal use

```python
from model_ledger import Ledger

ledger = Ledger.from_sqlite("./inventory.db")

# Identity + risk tier — the minimum a regulator needs
ledger.register(
    name="credit_scorecard", owner="risk-team",
    model_type="ml_model", tier="high",
    purpose="Consumer credit decisioning",
)

# Validation outcomes are just events in the same immutable log
ledger.record("credit_scorecard", event="validated", actor="mrm-team",
              payload={"result": "pass", "validator": "second-line"})

# The full, ordered, tamper-evident history an examiner can replay
for snap in ledger.history("credit_scorecard"):
    print(snap.timestamp, snap.event_type, snap.actor)
```
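
For readers wondering what "content-addressed" and "tamper-evident" mean in practice, the following sketch derives an address from a record's content with `hashlib`. It is an assumption for illustration: this diff does not show how model-ledger actually hashes a Snapshot, and `content_address` is a hypothetical helper, not part of the library.

```python
import hashlib
import json


def content_address(record: dict) -> str:
    """SHA-256 of the record's canonical JSON (sorted keys, compact separators)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = {"model": "credit_scorecard", "event": "validated", "actor": "mrm-team",
          "payload": {"result": "pass"}}
address = content_address(record)

# Same content in a different key order yields the same address.
reordered = {"payload": {"result": "pass"}, "actor": "mrm-team",
             "event": "validated", "model": "credit_scorecard"}

# Any change to the content yields a different address.
tampered = {**record, "payload": {"result": "fail"}}

print(content_address(reordered) == address)
print(content_address(tampered) == address)
```

If a stored record is edited after the fact, recomputing its address no longer matches the one it was filed under, which is the sense in which such a log is tamper-evident.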

## Frameworks it maps to

The primitives above satisfy the documentation and inventory expectations of the major
model-risk and AI-governance regimes:

- **US banking — SR 26‑2 / OCC Bulletin 2026‑13** (the 2026 revision that superseded
SR 11‑7): tiered model inventory, materiality classification, lifecycle documentation,
and validation status.
- **EU AI Act — Annex IV**: version-tracked technical documentation, component
dependencies, and change history for high-risk systems.
- **NIST AI RMF** and **ISO/IEC 42001**: inventory, risk management, and lifecycle
governance practices.

model-ledger ships **pluggable validation profiles** (`sr_11_7`, `eu_ai_act`,
`nist_ai_rmf`) that check a model's completeness against a framework, and you can add
your own — profiles are a plugin layer, not the core. Run them with
`model-ledger validate --profile <name>` (see the [CLI guide](guides/cli.md)).
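
As a rough picture of what a pluggable completeness check could look like, here is a small registry sketch. Everything in it is hypothetical (the `PROFILES` registry, the `profile` decorator, the `minimal_identity` profile, and `validate`); the real plugin interface is not shown in this diff. The required fields are simply the ones passed to `ledger.register()` in the example above.

```python
from typing import Callable, Dict, List

# Hypothetical registry: profile name -> function returning a list of findings.
PROFILES: Dict[str, Callable[[dict], List[str]]] = {}


def profile(name: str):
    """Register a completeness check under `name`."""
    def register(fn):
        PROFILES[name] = fn
        return fn
    return register


@profile("minimal_identity")
def minimal_identity(model: dict) -> List[str]:
    """Flag any identity field that is missing or empty."""
    required = ("name", "owner", "model_type", "tier", "purpose")
    return [f"missing field: {key}" for key in required if not model.get(key)]


def validate(model: dict, profile_name: str) -> List[str]:
    """Run one registered profile and return its findings (empty means complete)."""
    return PROFILES[profile_name](model)


complete = {"name": "credit_scorecard", "owner": "risk-team", "model_type": "ml_model",
            "tier": "high", "purpose": "Consumer credit decisioning"}
incomplete = {"name": "shadow_rule", "owner": "risk-team"}

print(validate(complete, "minimal_identity"))
print(validate(incomplete, "minimal_identity"))
```

Keeping the checks in a registry separate from the stored records is what lets a rule change become a profile change: the inventory data is untouched and only the function that reads it is swapped.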

!!! note "Framework-agnostic on purpose"
    model-ledger is a model inventory for *any* organization with deployed models, not
    a single-regulation tool. The frameworks above are examples of what the underlying
    capability is good for; they are a thin, swappable layer over a durable foundation.
    When a regulator renumbers a rule, you update a profile, not your inventory.
68 changes: 68 additions & 0 deletions docs/guides/cli.md
@@ -0,0 +1,68 @@
---
title: CLI
description: The model-ledger command line — launch the MCP server or REST API over any backend, and work with a local inventory.
---

# CLI

Install the CLI extra, then `model-ledger --help` lists everything:

```bash
pip install "model-ledger[cli]"
model-ledger --help
```

The CLI has two jobs: **launch the agent and HTTP surfaces** (the bridge to the rest of
this documentation), and **work with a local inventory** from the terminal.

## Launch a surface

These serve the [Ledger](../reference/index.md) over any [backend](backends.md) — in-memory,
SQLite, JSON, Snowflake, or a remote HTTP service.

=== "MCP (for agents)"

    ```bash
    model-ledger mcp                                    # in-memory
    model-ledger mcp --demo                             # sample inventory
    model-ledger mcp --backend sqlite --path ./inv.db   # persistent
    model-ledger mcp --backend snowflake --schema DB.MODEL_LEDGER
    model-ledger mcp --backend http --path https://model-ledger.internal:8000
    ```

=== "REST API"

    ```bash
    model-ledger serve --demo --port 8000
    # → OpenAPI docs at http://localhost:8000/docs
    ```

`--backend` accepts `memory` · `sqlite` · `json` · `snowflake` · `http`; `--path` is the
file path (sqlite/json) or URL (http); Snowflake reads credentials from the environment
(see [Choosing a backend](backends.md)).

## Work with a local inventory

These commands operate on a local file-based inventory (`--db`, default `inventory.db`
or `$MODEL_LEDGER_DB`) and render as a table or `--format json`.

| Command | What it does |
|---|---|
| `model-ledger list` | List registered models |
| `model-ledger show <name>` | Show one model's details and versions |
| `model-ledger validate <name> --profile <p>` | Check a model against a compliance profile (`sr_11_7`, `eu_ai_act`, `nist_ai_rmf`) |
| `model-ledger audit-log <name>` | Print the model's audit trail |
| `model-ledger export <name> --output <dir>` | Export an audit pack |
| `model-ledger introspect <artifact> --allow-pickle` | Extract algorithm/features from a fitted model file |

```bash
model-ledger list --format json
model-ledger validate credit_scorecard --profile sr_11_7
model-ledger audit-log credit_scorecard
```
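
If you want to consume `--format json` output from a script, a thin `subprocess` wrapper is one option. The sketch below is hedged in two ways: the shape of the JSON is not documented here, so the parsed value is returned as-is, and placing `--db` after the subcommand is an assumption. `build_list_command` and `list_models` are hypothetical helper names.

```python
import json
import os
import subprocess


def build_list_command(db=None):
    """Build argv for `model-ledger list --format json`, optionally with --db.

    Assumption: --db is accepted after the subcommand.
    """
    argv = ["model-ledger", "list", "--format", "json"]
    if db is not None:
        argv += ["--db", os.fspath(db)]
    return argv


def list_models(db=None, run=subprocess.run):
    """Run the CLI and parse its stdout as JSON.

    `run` is injectable so the function can be exercised without the CLI installed.
    """
    result = run(build_list_command(db), capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


print(build_list_command("inventory.db"))
```

Passing `check=True` makes a non-zero exit status raise `subprocess.CalledProcessError` instead of silently handing empty output to `json.loads`.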

!!! info "Which command for which surface"
    `mcp` and `serve` expose the full [event-log Ledger](../concepts/snapshot.md), the one
    the [SDK](../quickstart.md), [agents](agents.md), and [REST API](backends.md) all share.
    Use them to point Claude or a dashboard at your inventory. The `validate` profiles map
    to the frameworks in the [Governance guide](../governance.md).
8 changes: 8 additions & 0 deletions docs/includes/abbreviations.md
@@ -0,0 +1,8 @@
*[DataNode]: The core graph primitive — anything with typed input/output ports (model, rule, ETL, queue).
*[DataPort]: A named connection point on a DataNode; dependency edges form when port names match.
*[Snapshot]: An immutable, content-addressed record of one thing that happened to a model.
*[ModelRef]: A model's stable identity — name, owner, type, risk tier, purpose, status.
*[Composite]: A governed group whose members are themselves models (e.g. a "Credit Decision System").
*[MCP]: Model Context Protocol — the agent-native interface; model-ledger's primary surface.
*[SR 26-2]: 2026 US interagency model-risk-management guidance (OCC 2026-13), which superseded SR 11-7.
*[Annex IV]: The EU AI Act's technical-documentation requirements for high-risk AI systems.
42 changes: 42 additions & 0 deletions docs/installation.md
@@ -0,0 +1,42 @@
---
title: Installation
description: Install model-ledger and just the extras you need — SDK core is tiny; surfaces and backends are opt-in.
---

# Installation

model-ledger requires **Python 3.10+**. The core is deliberately tiny (`httpx` +
`pydantic` only); everything else is an opt-in extra, so you install just the surfaces
and backends you use.

```bash
pip install model-ledger # core: SDK + dependency graph + connectors
# or
uv add model-ledger
```

## Extras

| Install | Adds | For |
|---|---|---|
| `model-ledger` | SDK, graph, SQL/REST/GitHub connectors | the core library |
| `model-ledger[mcp]` | MCP server (`model-ledger mcp`) | AI agents — Claude, Goose, Cursor |
| `model-ledger[rest-api]` | FastAPI app (`model-ledger serve`) | frontends, dashboards |

**Review comment (P2): Include CLI dependencies for the serve extra**

With only `model-ledger[rest-api]`, the console entry point still imports `model_ledger.cli.app`, which unconditionally imports `typer` and `rich`, while the `rest-api` extra in `pyproject.toml` only adds FastAPI/uvicorn. Users following this row will hit an import error before `model-ledger serve` can run unless they also install `[cli]` or the REST extra includes the CLI dependencies.

| `model-ledger[cli]` | Typer + Rich CLI | terminal use |
| `model-ledger[snowflake]` | Snowflake backend | production storage |
| `model-ledger[introspect-sklearn]` | scikit-learn introspector | extract algorithm/features from fitted models |
| `model-ledger[introspect-xgboost]` | XGBoost introspector | " |
| `model-ledger[introspect-lightgbm]` | LightGBM introspector | " |
| `model-ledger[excel]` | openpyxl | spreadsheet import/export |
| `model-ledger[all]` | Snowflake + pandas + httpx | the common production set |

Combine them: `pip install "model-ledger[mcp,rest-api,snowflake]"`.

## Which extra for which surface

- **Python SDK** — core install is enough.
- **Talk to it from an agent** — `[mcp]`, then `claude mcp add model-ledger -- model-ledger mcp` (see the [Agent guide](guides/agents.md)).
- **Serve it over HTTP** — `[rest-api]`, then `model-ledger serve` (see [Backends](guides/backends.md)).
- **From the terminal** — `[cli]` (see the [CLI guide](guides/cli.md)).
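
Given the review comment above about `model-ledger serve` importing `typer` and `rich`, a quick pre-flight check can tell you whether an environment has what a surface needs. This is a sketch under assumptions: the module names per extra are inferred from the table and the review comment, `pyproject.toml` remains the authoritative list, and `EXTRA_MODULES`, `missing_modules`, and `missing_for_serve` are hypothetical names.

```python
from importlib.util import find_spec

# Assumed top-level import names for each extra (inferred, not authoritative).
EXTRA_MODULES = {
    "cli": ["typer", "rich"],
    "rest-api": ["fastapi", "uvicorn"],
}


def missing_modules(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


def missing_for_serve():
    """Modules `model-ledger serve` appears to need: the REST extra plus the CLI deps."""
    return missing_modules(EXTRA_MODULES["rest-api"] + EXTRA_MODULES["cli"])


if __name__ == "__main__":
    missing = missing_for_serve()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all serve dependencies found")
```

`find_spec` only locates a module without importing it, so the check is cheap and has no side effects.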

Next: the [60-second quickstart](quickstart.md).
65 changes: 65 additions & 0 deletions docs/javascripts/md-actions.js
@@ -0,0 +1,65 @@
/* Per-page Markdown actions: Copy / View / Open in Claude.
* Exposes the raw .md the docs already publish (built by docs_hooks/llms_txt.py),
* so this site is as consumable by an agent as it is by a human — fitting for a
* tool whose product is an MCP server. Re-runs on Material's instant navigation. */
document$.subscribe(function () {
  var content = document.querySelector(".md-content__inner");
  if (!content) return;
  var h1 = content.querySelector("h1");
  if (!h1 || content.querySelector(".md-actions")) return;

  // Rendered pages use directory URLs (/x/); their source .md lives at /x.md,
  // and the site root maps to /index.md. The logo href gives the base path.
  var logo = document.querySelector(".md-header__button.md-logo");
  var base = logo ? new URL(logo.href).pathname : "/";
  var path = location.pathname;
  var mdUrl =
    path === base || path === base.replace(/\/$/, "")
      ? base.replace(/\/$/, "") + "/index.md"
      : path.replace(/\/$/, "") + ".md";

  var bar = document.createElement("div");
  bar.className = "md-actions";

  var copy = document.createElement("button");
  copy.type = "button";
  copy.className = "md-action";
  copy.textContent = "Copy as Markdown";
  copy.addEventListener("click", function () {
    fetch(mdUrl)
      .then(function (r) {
        return r.text();
      })
      .then(function (text) {
        return navigator.clipboard.writeText(text);
      })
      .then(function () {
        copy.textContent = "Copied ✓";
        setTimeout(function () {
          copy.textContent = "Copy as Markdown";
        }, 1600);
      })
      .catch(function () {
        copy.textContent = "Copy failed";
      });
  });

  var view = document.createElement("a");
  view.className = "md-action";
  view.href = mdUrl;
  view.textContent = "View as Markdown";

  var claude = document.createElement("a");
  claude.className = "md-action";
  claude.target = "_blank";
  claude.rel = "noopener";
  claude.href =
    "https://claude.ai/new?q=" +
    encodeURIComponent("Read " + location.origin + mdUrl + " and help me use model-ledger.");
  claude.textContent = "Open in Claude";

  bar.appendChild(copy);
  bar.appendChild(view);
  bar.appendChild(claude);
  h1.insertAdjacentElement("afterend", bar);
});
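
Review aid: the path-to-source mapping in the script is easy to misread, so here is a Python port of just that expression with a few sample inputs. The function name `md_url` and the sample paths are illustrative; only the mapping logic is taken from the script.

```python
def _strip_trailing_slash(s: str) -> str:
    """Mirror of JS `s.replace(/\\/$/, "")`: drop at most one trailing slash."""
    return s[:-1] if s.endswith("/") else s


def md_url(path: str, base: str = "/") -> str:
    """Map a rendered page path to the URL of its published .md source."""
    if path == base or path == _strip_trailing_slash(base):
        # Site root (with or without the trailing slash) maps to index.md.
        return _strip_trailing_slash(base) + "/index.md"
    # Directory URL /x/ maps to /x.md.
    return _strip_trailing_slash(path) + ".md"


for sample in ("/model-ledger/", "/model-ledger", "/model-ledger/guides/cli/"):
    print(sample, "->", md_url(sample, base="/model-ledger/"))
```

The port makes one property easy to check: the root is handled with and without its trailing slash, so a request that arrives as `/model-ledger` still resolves to `/model-ledger/index.md`.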
4 changes: 4 additions & 0 deletions docs/robots.txt
@@ -0,0 +1,4 @@
User-agent: *
Allow: /

Sitemap: https://block.github.io/model-ledger/sitemap.xml
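
Review aid: the rules above can be sanity-checked offline with Python's standard `urllib.robotparser`. The sketch parses the same four lines from a string rather than fetching them; the sample page URL is illustrative.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://block.github.io/model-ledger/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Any crawler may fetch any path, including the published .md sources.
allowed = parser.can_fetch("*", "https://block.github.io/model-ledger/glossary.md")

# site_maps() is available on Python 3.8+.
sitemaps = parser.site_maps()

print(allowed, sitemaps)
```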
50 changes: 50 additions & 0 deletions docs/stylesheets/extra.css
@@ -193,3 +193,53 @@
margin-top: 1.6rem; display: flex; gap: .6rem;
justify-content: center; flex-wrap: wrap;
}

/* ---------- Responsive display type ---------- */
.ml-hero__title { font-size: clamp(2rem, 7vw, 2.9rem); }
.ml-404 .ml-hero__title { font-size: clamp(1.7rem, 6vw, 2.4rem); }

/* ---------- Per-page Markdown actions (Copy / View / Open in Claude) ---------- */
.md-actions { display: flex; flex-wrap: wrap; gap: .5rem; margin: -.2rem 0 1.5rem; }
.md-action {
font-family: "JetBrains Mono", monospace; font-size: .66rem; letter-spacing: .02em;
line-height: 1; padding: .42rem .62rem; cursor: pointer; text-decoration: none;
background: transparent; color: var(--md-default-fg-color--light);
border: 1px solid var(--ml-hairline); border-radius: 3px;
transition: border-color .12s ease, color .12s ease;
}
.md-action:hover { border-color: var(--md-accent-fg-color); color: var(--md-accent-fg-color); }

/* ---------- API reference: extend the theme into mkdocstrings output ---------- */
.md-typeset .doc-heading { font-family: "Spectral", Georgia, serif; font-weight: 600; }
.md-typeset .doc-object > .doc-heading {
border-top: 1px solid var(--ml-hairline); padding-top: .7em; margin-top: 1.6em;
}
.md-typeset .doc-section-title {
font-family: "Spectral", Georgia, serif;
text-transform: uppercase; letter-spacing: .07em; font-size: .78rem;
color: var(--md-default-fg-color--light);
}
.md-typeset .doc-label {
font-family: "JetBrains Mono", monospace; font-size: .62rem; letter-spacing: .02em;
background: var(--md-code-bg-color); color: var(--md-accent-fg-color);
border: 1px solid var(--ml-hairline); border-radius: 3px; padding: .1rem .4rem;
}
.doc-symbol { color: var(--md-accent-fg-color) !important; opacity: .8; }

/* ---------- Blockquote: signature treatment (the agent transcript) ---------- */
.md-typeset blockquote {
border-left: 2px solid var(--md-accent-fg-color);
background: var(--md-code-bg-color);
border-radius: 0 3px 3px 0;
color: var(--md-default-fg-color);
}

/* ---------- Code-fence filename labels (when title= is set) ---------- */
.md-typeset .highlight span.filename {
font-family: "JetBrains Mono", monospace; font-size: .68rem;
background: var(--md-code-bg-color); color: var(--md-default-fg-color--light);
border-bottom: 1px solid var(--ml-hairline);
}

/* ---------- Glossary / definition lists ---------- */
.md-typeset dl dt { font-family: "Spectral", Georgia, serif; font-weight: 600; }