From 064a8704a09acc83975e5f9f70b1344c37217f6a Mon Sep 17 00:00:00 2001
From: Vignesh Narayanaswamy
Date: Sat, 6 Jun 2026 09:29:11 -0700
Subject: [PATCH] docs: governance page + slickness pass (capability-first, on-aesthetic)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Substance + discoverability + the one place the aesthetic broke — all
within the technical-archive style.

New pages
- governance.md: capability-first — maps the six durable model-risk needs
  (complete inventory, tiering, change control, lineage, validation
  records, point-in-time) to model-ledger primitives. Regulations are a
  thin, current layer (SR 26-2/OCC 2026-13 which superseded SR 11-7,
  EU AI Act Annex IV, NIST AI RMF, ISO 42001) — update a profile, not the
  inventory.
- guides/cli.md: all eight verbs (the CLI card no longer dead-ends).
- installation.md: the extras matrix + Python 3.10+.
- glossary.md (+ includes/abbreviations.md auto-appended for site-wide
  hover-tooltips on DataNode/Snapshot/Composite/etc.).

Slickness (style-preserving)
- Style the mkdocstrings API reference into the theme (serif headings,
  hairline separators, oxblood symbol/label chips) — the one page that
  fell back to default Material.
- Per-page "Copy as Markdown / View / Open in Claude" bar over the .md
  corpus we already build (docs/javascripts/md-actions.js).
- Signature blockquote treatment for the agent transcript; responsive
  hero via clamp(); code-fence filename-label styling; richer 404 with a
  destinations list + llms.txt link.
- Fix the doubled <title> ("model-ledger — git for models" on home).
- Enable free Material niceties: breadcrumbs, instant prefetch/preview,
  code-line selection.

Meta/discoverability
- robots.txt declaring the sitemap; README now links to the docs site.

Deferred (own follow-ups): CHANGELOG reconcile+surface (needs real version
archaeology — won't fabricate versions), vs-MLflow/DataHub comparison page,
build-time per-page OG via the social plugin, print stylesheet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md                      |  3 ++
 docs/glossary.md               | 56 ++++++++++++++++++++++++++
 docs/governance.md             | 72 ++++++++++++++++++++++++++++++++++
 docs/guides/cli.md             | 68 ++++++++++++++++++++++++++++++++
 docs/includes/abbreviations.md |  8 ++++
 docs/installation.md           | 42 ++++++++++++++++++++
 docs/javascripts/md-actions.js | 65 ++++++++++++++++++++++++++++++
 docs/robots.txt                |  4 ++
 docs/stylesheets/extra.css     | 50 +++++++++++++++++++++++
 mkdocs.yml                     | 20 +++++++++-
 overrides/404.html             | 12 ++++++
 overrides/main.html            | 11 ++++++
 12 files changed, 409 insertions(+), 2 deletions(-)
 create mode 100644 docs/glossary.md
 create mode 100644 docs/governance.md
 create mode 100644 docs/guides/cli.md
 create mode 100644 docs/includes/abbreviations.md
 create mode 100644 docs/installation.md
 create mode 100644 docs/javascripts/md-actions.js
 create mode 100644 docs/robots.txt

diff --git a/README.md b/README.md
index 21689e4..f9d68f7 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,9 @@
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
 [![Python](https://img.shields.io/badge/python-3.10+-blue.svg)](https://python.org)
 [![PyPI](https://img.shields.io/pypi/v/model-ledger)](https://pypi.org/project/model-ledger/)
+[![Docs](https://img.shields.io/badge/docs-block.github.io/model--ledger-7a1a1a.svg)](https://block.github.io/model-ledger/)
+
+📖 **[Read the documentation →](https://block.github.io/model-ledger/)**
 
 ---
 
diff --git a/docs/glossary.md b/docs/glossary.md
new file mode 100644
index 0000000..91d951b
--- /dev/null
+++ b/docs/glossary.md
@@ -0,0 +1,56 @@
+---
+title: Glossary
+description: The vocabulary of model-ledger — DataNode, Snapshot, ModelRef, Composite, and the rest, in one place.
+---
+
+# Glossary
+
+The whole system is a handful of nouns. (These terms also get hover-definitions
+wherever they appear in the docs.)
+
+`Backend`
+:   Pluggable storage behind the `LedgerBackend` protocol — in-memory, SQLite, JSON
+    files, Snowflake, or a remote HTTP service. Swapping it never changes your code.
+
+`Composite`
+:   A governed group whose members are themselves models — a business-level entity (e.g.
+    a "Credit Decision System") that rolls up its scorecard, rules, and ETL. See
+    [Composites](concepts/composite.md).
+
+`Connector`
+:   A source that emits `DataNode`s from a platform (SQL, REST, GitHub, …) via the
+    `SourceConnector` protocol. See [Connectors & discovery](guides/connectors.md).
+
+`DataNode`
+:   The core graph primitive: anything with typed input/output ports — an ML model, a
+    heuristic rule, an ETL job, an alert queue. See [DataNode & the graph](concepts/datanode.md).
+
+`DataPort`
+:   A named connection point on a `DataNode`, optionally carrying schema so identically
+    named outputs from different models don't falsely link.
+
+`Dependency graph`
+:   The links between nodes, built automatically when an output port name matches an
+    input port name (`connect()`).
+
+`Event log`
+:   The inventory itself — an append-only sequence of immutable Snapshots. Nothing is
+    overwritten, so history is always reconstructable.
+
+`ModelRef`
+:   A model's stable identity: name, owner, type, risk `tier`, purpose, status. The
+    minimum a regulator needs. See [Snapshots & the event log](concepts/snapshot.md).
+
+`Point-in-time`
+:   Reconstruction of the inventory as it stood on any past date, via `inventory_at()`.
+
+`Profile`
+:   A pluggable compliance check (`sr_11_7`, `eu_ai_act`, `nist_ai_rmf`) that validates a
+    model's completeness against a framework. See [Governance](governance.md).
+
+`Snapshot`
+:   An immutable, content-addressed record of one thing that happened to a model — a
+    registration, a retrain, a validation. The unit of the event log.
+
+`Tag`
+:   A mutable named pointer to a specific Snapshot (e.g. `production`, `latest-validated`).
diff --git a/docs/governance.md b/docs/governance.md
new file mode 100644
index 0000000..eb29577
--- /dev/null
+++ b/docs/governance.md
@@ -0,0 +1,72 @@
+---
+title: Governance
+description: A complete, auditable, point-in-time model inventory — the durable building blocks every model-risk regime asks for, mapped to model-ledger primitives.
+---
+
+# Governance
+
+Model-risk regimes change their names and their numbers. What they *ask for* barely
+changes. Strip away the acronyms and every regime — US banking, EU, insurance — wants
+the same six things from your model inventory. model-ledger is built to produce them as
+a byproduct of normal use, not as a separate compliance chore.
+
+## What every regime actually asks for
+
+| The durable need | What an examiner says | The model-ledger primitive |
+|---|---|---|
+| **Complete inventory** | "Show me *every* model — including the shadow ones." | Cross-platform [discovery & connectors](guides/connectors.md) — ML models, rules, and ETL as one graph |
+| **Risk tiering** | "Which are high-materiality?" | `tier` on every [`ModelRef`](reference/index.md); business systems roll up as [composites](concepts/composite.md) |
+| **Change control + audit trail** | "What changed, when, and who did it?" | Immutable, content-addressed [Snapshots](concepts/snapshot.md) — append-only, tamper-evident |
+| **Dependency & lineage** | "How do these components feed each other?" | The [dependency graph](concepts/datanode.md), built from port matching |
+| **Validation records** | "Prove this was validated, and find what wasn't." | `record_validation()` events live in the same immutable log |
+| **Point-in-time reconstruction** | "Show me the inventory as it stood on December 31." | [`inventory_at(date)`](recipes/point-in-time.md) replays the log |
+
+That's the whole compliance story: **nothing is overwritten, so the answer to "what was
+true then?" is always reconstructable.**
+
+## It falls out of normal use
+
+```python
+from model_ledger import Ledger
+
+ledger = Ledger.from_sqlite("./inventory.db")
+
+# Identity + risk tier — the minimum a regulator needs
+ledger.register(
+    name="credit_scorecard", owner="risk-team",
+    model_type="ml_model", tier="high",
+    purpose="Consumer credit decisioning",
+)
+
+# Validation outcomes are just events in the same immutable log
+ledger.record("credit_scorecard", event="validated", actor="mrm-team",
+              payload={"result": "pass", "validator": "second-line"})
+
+# The full, ordered, tamper-evident history an examiner can replay
+for snap in ledger.history("credit_scorecard"):
+    print(snap.timestamp, snap.event_type, snap.actor)
+```
+
+## Frameworks it maps to
+
+The primitives above satisfy the documentation and inventory expectations of the major
+model-risk and AI-governance regimes:
+
+- **US banking — SR 26‑2 / OCC Bulletin 2026‑13** (the 2026 revision that superseded
+  SR 11‑7): tiered model inventory, materiality classification, lifecycle documentation,
+  and validation status.
+- **EU AI Act — Annex IV**: version-tracked technical documentation, component
+  dependencies, and change history for high-risk systems.
+- **NIST AI RMF** and **ISO/IEC 42001**: inventory, risk management, and lifecycle
+  governance practices.
+
+model-ledger ships **pluggable validation profiles** (`sr_11_7`, `eu_ai_act`,
+`nist_ai_rmf`) that check a model's completeness against a framework, and you can add
+your own — profiles are a plugin layer, not the core. Run them with
+`model-ledger validate --profile <name>` (see the [CLI guide](guides/cli.md)).
+
+!!! note "Framework-agnostic on purpose"
+    model-ledger is a model inventory for *any* organization with deployed models — not
+    a single-regulation tool. The frameworks above are examples of what the underlying
+    capability is good for; they are a thin, swappable layer over a durable foundation.
+    When a regulator renumbers a rule, you update a profile — not your inventory.
diff --git a/docs/guides/cli.md b/docs/guides/cli.md
new file mode 100644
index 0000000..24a3dbc
--- /dev/null
+++ b/docs/guides/cli.md
@@ -0,0 +1,68 @@
+---
+title: CLI
+description: The model-ledger command line — launch the MCP server or REST API over any backend, and work with a local inventory.
+---
+
+# CLI
+
+Install the CLI extra, then `model-ledger --help` lists everything:
+
+```bash
+pip install "model-ledger[cli]"
+model-ledger --help
+```
+
+The CLI has two jobs: **launch the agent and HTTP surfaces** (the bridge to the rest of
+this documentation), and **work with a local inventory** from the terminal.
+
+## Launch a surface
+
+These serve the [Ledger](../reference/index.md) over any [backend](backends.md) — in-memory,
+SQLite, JSON, Snowflake, or a remote HTTP service.
+
+=== "MCP (for agents)"
+
+    ```bash
+    model-ledger mcp  # in-memory
+    model-ledger mcp --demo  # sample inventory
+    model-ledger mcp --backend sqlite --path ./inv.db  # persistent
+    model-ledger mcp --backend snowflake --schema DB.MODEL_LEDGER
+    model-ledger mcp --backend http --path https://model-ledger.internal:8000
+    ```
+
+=== "REST API"
+
+    ```bash
+    model-ledger serve --demo --port 8000
+    # → OpenAPI docs at http://localhost:8000/docs
+    ```
+
+`--backend` accepts `memory` · `sqlite` · `json` · `snowflake` · `http`; `--path` is the
+file path (sqlite/json) or URL (http); Snowflake reads credentials from the environment
+(see [Choosing a backend](backends.md)).
+
+## Work with a local inventory
+
+These commands operate on a local file-based inventory (`--db`, default `inventory.db`
+or `$MODEL_LEDGER_DB`) and render as a table or `--format json`.
+
+| Command | What it does |
+|---|---|
+| `model-ledger list` | List registered models |
+| `model-ledger show <name>` | Show one model's details and versions |
+| `model-ledger validate <name> --profile <p>` | Check a model against a compliance profile (`sr_11_7`, `eu_ai_act`, `nist_ai_rmf`) |
+| `model-ledger audit-log <name>` | Print the model's audit trail |
+| `model-ledger export <name> --output <dir>` | Export an audit pack |
+| `model-ledger introspect <artifact> --allow-pickle` | Extract algorithm/features from a fitted model file |
+
+```bash
+model-ledger list --format json
+model-ledger validate credit_scorecard --profile sr_11_7
+model-ledger audit-log credit_scorecard
+```
+
+!!! info "Which command for which surface"
+    `mcp` and `serve` expose the full [event-log Ledger](../concepts/snapshot.md) — the one
+    the [SDK](../quickstart.md), [agents](agents.md), and [REST API](backends.md) all share.
+    Use them to point Claude or a dashboard at your inventory. The `validate` profiles map
+    to the frameworks in the [Governance guide](../governance.md).
diff --git a/docs/includes/abbreviations.md b/docs/includes/abbreviations.md
new file mode 100644
index 0000000..262624f
--- /dev/null
+++ b/docs/includes/abbreviations.md
@@ -0,0 +1,8 @@
+*[DataNode]: The core graph primitive — anything with typed input/output ports (model, rule, ETL, queue).
+*[DataPort]: A named connection point on a DataNode; dependency edges form when port names match.
+*[Snapshot]: An immutable, content-addressed record of one thing that happened to a model.
+*[ModelRef]: A model's stable identity — name, owner, type, risk tier, purpose, status.
+*[Composite]: A governed group whose members are themselves models (e.g. a "Credit Decision System").
+*[MCP]: Model Context Protocol — the agent-native interface; model-ledger's primary surface.
+*[SR 26-2]: 2026 US interagency model-risk-management guidance (OCC 2026-13), which superseded SR 11-7.
+*[Annex IV]: The EU AI Act's technical-documentation requirements for high-risk AI systems.
diff --git a/docs/installation.md b/docs/installation.md
new file mode 100644
index 0000000..d0ec78f
--- /dev/null
+++ b/docs/installation.md
@@ -0,0 +1,42 @@
+---
+title: Installation
+description: Install model-ledger and just the extras you need — SDK core is tiny; surfaces and backends are opt-in.
+---
+
+# Installation
+
+model-ledger requires **Python 3.10+**. The core is deliberately tiny (`httpx` +
+`pydantic` only); everything else is an opt-in extra, so you install just the surfaces
+and backends you use.
+
+```bash
+pip install model-ledger  # core: SDK + dependency graph + connectors
+# or
+uv add model-ledger
+```
+
+## Extras
+
+| Install | Adds | For |
+|---|---|---|
+| `model-ledger` | SDK, graph, SQL/REST/GitHub connectors | the core library |
+| `model-ledger[mcp]` | MCP server (`model-ledger mcp`) | AI agents — Claude, Goose, Cursor |
+| `model-ledger[rest-api]` | FastAPI app (`model-ledger serve`) | frontends, dashboards |
+| `model-ledger[cli]` | Typer + Rich CLI | terminal use |
+| `model-ledger[snowflake]` | Snowflake backend | production storage |
+| `model-ledger[introspect-sklearn]` | scikit-learn introspector | extract algorithm/features from fitted models |
+| `model-ledger[introspect-xgboost]` | XGBoost introspector | " |
+| `model-ledger[introspect-lightgbm]` | LightGBM introspector | " |
+| `model-ledger[excel]` | openpyxl | spreadsheet import/export |
+| `model-ledger[all]` | Snowflake + pandas + httpx | the common production set |
+
+Combine them: `pip install "model-ledger[mcp,rest-api,snowflake]"`.
+
+## Which extra for which surface
+
+- **Python SDK** — core install is enough.
+- **Talk to it from an agent** — `[mcp]`, then `claude mcp add model-ledger -- model-ledger mcp` (see the [Agent guide](guides/agents.md)).
+- **Serve it over HTTP** — `[rest-api]`, then `model-ledger serve` (see [Backends](guides/backends.md)).
+- **From the terminal** — `[cli]` (see the [CLI guide](guides/cli.md)).
+
+Next: the [60-second quickstart](quickstart.md).
diff --git a/docs/javascripts/md-actions.js b/docs/javascripts/md-actions.js
new file mode 100644
index 0000000..33a66c4
--- /dev/null
+++ b/docs/javascripts/md-actions.js
@@ -0,0 +1,65 @@
+/* Per-page Markdown actions: Copy / View / Open in Claude.
+ * Exposes the raw .md the docs already publish (built by docs_hooks/llms_txt.py),
+ * so this site is as consumable by an agent as it is by a human — fitting for a
+ * tool whose product is an MCP server. Re-runs on Material's instant navigation. */
+document$.subscribe(function () {
+  var content = document.querySelector(".md-content__inner");
+  if (!content) return;
+  var h1 = content.querySelector("h1");
+  if (!h1 || content.querySelector(".md-actions")) return;
+
+  // Rendered pages use directory URLs (/x/); their source .md lives at /x.md,
+  // and the site root maps to /index.md. The logo href gives the base path.
+  var logo = document.querySelector(".md-header__button.md-logo");
+  var base = logo ? new URL(logo.href).pathname : "/";
+  var path = location.pathname;
+  var mdUrl =
+    path === base || path === base.replace(/\/$/, "")
+      ? base.replace(/\/$/, "") + "/index.md"
+      : path.replace(/\/$/, "") + ".md";
+
+  var bar = document.createElement("div");
+  bar.className = "md-actions";
+
+  var copy = document.createElement("button");
+  copy.type = "button";
+  copy.className = "md-action";
+  copy.textContent = "Copy as Markdown";
+  copy.addEventListener("click", function () {
+    fetch(mdUrl)
+      .then(function (r) {
+        return r.text();
+      })
+      .then(function (text) {
+        return navigator.clipboard.writeText(text);
+      })
+      .then(function () {
+        copy.textContent = "Copied ✓";
+        setTimeout(function () {
+          copy.textContent = "Copy as Markdown";
+        }, 1600);
+      })
+      .catch(function () {
+        copy.textContent = "Copy failed";
+      });
+  });
+
+  var view = document.createElement("a");
+  view.className = "md-action";
+  view.href = mdUrl;
+  view.textContent = "View as Markdown";
+
+  var claude = document.createElement("a");
+  claude.className = "md-action";
+  claude.target = "_blank";
+  claude.rel = "noopener";
+  claude.href =
+    "https://claude.ai/new?q=" +
+    encodeURIComponent("Read " + location.origin + mdUrl + " and help me use model-ledger.");
+  claude.textContent = "Open in Claude";
+
+  bar.appendChild(copy);
+  bar.appendChild(view);
+  bar.appendChild(claude);
+  h1.insertAdjacentElement("afterend", bar);
+});
diff --git a/docs/robots.txt b/docs/robots.txt
new file mode 100644
index 0000000..f287ba9
--- /dev/null
+++ b/docs/robots.txt
@@ -0,0 +1,4 @@
+User-agent: *
+Allow: /
+
+Sitemap: https://block.github.io/model-ledger/sitemap.xml
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index f12b979..44588dd 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -193,3 +193,53 @@
   margin-top: 1.6rem;
   display: flex; gap: .6rem; justify-content: center; flex-wrap: wrap;
 }
+
+/* ---------- Responsive display type ---------- */
+.ml-hero__title { font-size: clamp(2rem, 7vw, 2.9rem); }
+.ml-404 .ml-hero__title { font-size: clamp(1.7rem, 6vw, 2.4rem); }
+
+/* ---------- Per-page Markdown actions (Copy / View / Open in Claude) ---------- */
+.md-actions { display: flex; flex-wrap: wrap; gap: .5rem; margin: -.2rem 0 1.5rem; }
+.md-action {
+  font-family: "JetBrains Mono", monospace; font-size: .66rem; letter-spacing: .02em;
+  line-height: 1; padding: .42rem .62rem; cursor: pointer; text-decoration: none;
+  background: transparent; color: var(--md-default-fg-color--light);
+  border: 1px solid var(--ml-hairline); border-radius: 3px;
+  transition: border-color .12s ease, color .12s ease;
+}
+.md-action:hover { border-color: var(--md-accent-fg-color); color: var(--md-accent-fg-color); }
+
+/* ---------- API reference: extend the theme into mkdocstrings output ---------- */
+.md-typeset .doc-heading { font-family: "Spectral", Georgia, serif; font-weight: 600; }
+.md-typeset .doc-object > .doc-heading {
+  border-top: 1px solid var(--ml-hairline); padding-top: .7em; margin-top: 1.6em;
+}
+.md-typeset .doc-section-title {
+  font-family: "Spectral", Georgia, serif;
+  text-transform: uppercase; letter-spacing: .07em; font-size: .78rem;
+  color: var(--md-default-fg-color--light);
+}
+.md-typeset .doc-label {
+  font-family: "JetBrains Mono", monospace; font-size: .62rem; letter-spacing: .02em;
+  background: var(--md-code-bg-color); color: var(--md-accent-fg-color);
+  border: 1px solid var(--ml-hairline); border-radius: 3px; padding: .1rem .4rem;
+}
+.doc-symbol { color: var(--md-accent-fg-color) !important; opacity: .8; }
+
+/* ---------- Blockquote: signature treatment (the agent transcript) ---------- */
+.md-typeset blockquote {
+  border-left: 2px solid var(--md-accent-fg-color);
+  background: var(--md-code-bg-color);
+  border-radius: 0 3px 3px 0;
+  color: var(--md-default-fg-color);
+}
+
+/* ---------- Code-fence filename labels (when title= is set) ---------- */
+.md-typeset .highlight span.filename {
+  font-family: "JetBrains Mono", monospace; font-size: .68rem;
+  background: var(--md-code-bg-color); color: var(--md-default-fg-color--light);
+  border-bottom: 1px solid var(--ml-hairline);
+}
+
+/* ---------- Glossary / definition lists ---------- */
+.md-typeset dl dt { font-family: "Spectral", Georgia, serif; font-weight: 600; }
diff --git a/mkdocs.yml b/mkdocs.yml
index 53d6bb6..6ff00c8 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -11,9 +11,10 @@ copyright: >-
   Apache-2.0 · Built in the open by
   <a href="https://opensource.block.xyz/">Block</a>
 
-# Keep internal planning specs out of the public site.
+# Keep internal planning specs and snippet-only includes out of the built site.
 exclude_docs: |
   superpowers/
+  includes/
 
 theme:
   name: material
@@ -57,10 +58,17 @@ theme:
     - content.code.annotate
     - content.tabs.link
     - content.tooltips
+    - content.code.select
+    - navigation.path
+    - navigation.instant.prefetch
+    - navigation.instant.preview
 
 extra_css:
   - stylesheets/extra.css
 
+extra_javascript:
+  - javascripts/md-actions.js
+
 markdown_extensions:
   - abbr
   - admonition
@@ -78,7 +86,11 @@ markdown_extensions:
       line_spans: __span
       pygments_lang_class: true
   - pymdownx.inlinehilite
-  - pymdownx.snippets
+  - pymdownx.snippets:
+      base_path: ["."]
+      check_paths: true
+      auto_append:
+        - docs/includes/abbreviations.md
   - pymdownx.superfences:
       custom_fences:
         - name: mermaid
@@ -129,6 +141,7 @@ extra:
 nav:
   - Home: index.md
   - Quickstart: quickstart.md
+  - Installation: installation.md
   - Concepts:
     - concepts/index.md
     - DataNode & the graph: concepts/datanode.md
@@ -136,6 +149,7 @@ nav:
     - Composites: concepts/composite.md
   - Guides:
     - Agents (MCP): guides/agents.md
+    - CLI: guides/cli.md
     - Choosing a backend: guides/backends.md
     - Connectors & discovery: guides/connectors.md
   - Recipes:
@@ -143,4 +157,6 @@ nav:
     - Impact analysis: recipes/impact-analysis.md
     - Point-in-time inventory: recipes/point-in-time.md
     - Discover from a SQL registry: recipes/discover-sql.md
+  - Governance: governance.md
   - Reference: reference/index.md
+  - Glossary: glossary.md
diff --git a/overrides/404.html b/overrides/404.html
index d2e90d8..67c621a 100644
--- a/overrides/404.html
+++ b/overrides/404.html
@@ -10,5 +10,17 @@ <h1 class="ml-hero__title">This node isn’t in the graph.</h1>
       <a class="md-button" href="{{ 'quickstart/' | url }}">Quickstart</a>
       <a class="md-button" href="{{ 'reference/' | url }}">API reference</a>
     </p>
+    <p>
+      Or pick up a thread:
+      <a href="{{ 'concepts/' | url }}">Concepts</a> ·
+      <a href="{{ 'guides/agents/' | url }}">Agents</a> ·
+      <a href="{{ 'governance/' | url }}">Governance</a> ·
+      <a href="{{ 'recipes/' | url }}">Recipes</a> ·
+      <a href="{{ 'glossary/' | url }}">Glossary</a>
+    </p>
+    <p style="margin-top:1.4rem;">
+      <small>Reading this as an agent? The whole site is at
+      <a href="{{ 'llms.txt' | url }}"><code>/llms.txt</code></a>.</small>
+    </p>
   </div>
 {% endblock %}
diff --git a/overrides/main.html b/overrides/main.html
index de84bcb..d81a248 100644
--- a/overrides/main.html
+++ b/overrides/main.html
@@ -1,5 +1,16 @@
 {% extends "base.html" %}
 
+<!-- Clean <title>: the homepage gets a tagline instead of "model-ledger - model-ledger". -->
+{% block htmltitle %}
+  {%- if page and page.is_homepage -%}
+    <title>{{ config.site_name }} — git for models</title>
+  {%- elif page and page.title and page.title | string != config.site_name -%}
+    <title>{{ page.title | striptags }} · {{ config.site_name }}</title>
+  {%- else -%}
+    <title>{{ config.site_name }}</title>
+  {%- endif -%}
+{% endblock %}
+