Commit 0023fd8

vigneshnarayanaswamy (Vignesh Narayanaswamy) and claude authored
docs: documentation site (MkDocs Material) — no-drift, AI-native (#12)
* docs: add MkDocs Material site with no-drift API ref + llms.txt

  Build the public documentation site so the advertised Documentation URL
  (https://block.github.io/model-ledger) stops 404ing. Docs-only; no SDK changes.

  - mkdocs.yml: Material theme, mkdocstrings/Griffe API reference (generated from src/, cannot drift), mermaid, multi-surface content tabs
  - docs/: landing (four surfaces + self-building-graph demo), 60s quickstart, concepts (DataNode / Snapshot / Composite), guides (agents/MCP, backends, connectors), recipe cookbook, auto API reference
  - docs/stylesheets/extra.css: "technical-archive" theme with serif display, warm paper, oxblood accent, hairline rules
  - docs_hooks/llms_txt.py: build-time llms.txt + llms-full.txt + per-page .md for AI agents, zero extra dependencies
  - .github/workflows/docs.yml: build --strict on every PR, gh-deploy to GitHub Pages on push to main
  - pyproject.toml: add [docs] optional-dependencies extra
  - .pre-commit-config.yaml: exclude mkdocs.yml from check-yaml (custom tags)
  - .gitignore: ignore /site/ build output

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: make it sing (animated hero, OG card, favicon, on-brand 404)

  Craft pass over the docs landing + shareability:

  - Landing moment: the hero is now two columns; a dependency graph assembles itself on the right. Nodes pop in (declared), then edges draw in (connect() links them), visualizing the core thesis. Pure inline SVG + CSS, no JS; respects prefers-reduced-motion.
  - Open Graph card: scripts/make_og_card.py renders a 1200x630 "technical-archive" social card (docs/assets/og-image.png); overrides/main.html injects per-page og:/twitter: meta so shared links preview beautifully.
  - Favicon + logo: docs/assets/favicon.svg, a minimal oxblood graph glyph, scalable, light/dark safe.
  - 404: overrides/404.html says "This node isn't in the graph." and routes home, to the quickstart, and to the reference.

  Docs-only; strict build green.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: execute examples in CI + griffe API check (close the drift hole)

  The API reference is generated from source and `mkdocs build --strict` catches
  broken references/links. The remaining risk was hand-written prose: a tutorial
  snippet calling an API that has since changed. Now the snippets run in CI.

  - tests/test_docs_examples.py: extracts the runnable fenced python blocks from each docs page (including tab-indented ones), skips blocks needing external resources (Snowflake/boto3/live connectors/GitHub), and execs each page's blocks in one namespace. A renamed param or removed method fails the PR.
  - Fixes two real bugs the test caught in the snippets:
    * Ledger.record() is record(model, *, event, payload, actor); event/payload/actor are keyword-only and required (had conflated it with the MCP tool).
    * inventory_at() needs a timezone-aware datetime; naive dates raised TypeError. The point-in-time recipe is rewritten to a self-contained, honest demonstration.
  - .github/workflows/docs.yml: run the example tests (blocking) before build + deploy; add an advisory `griffe check` step flagging breaking public-API changes vs. the base branch.
  - pyproject.toml: add griffecli to the [docs] extra for `griffe check`.

  Verified: deliberately breaking a snippet turns the test red; reverting is green.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vignesh Narayanaswamy <Vigneshn@squareup.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent eabf0fc commit 0023fd8

29 files changed

Lines changed: 2091 additions & 0 deletions

.github/workflows/docs.yml

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
name: docs

on:
  push:
    branches: [main]
    paths:
      - "docs/**"
      - "docs_hooks/**"
      - "mkdocs.yml"
      - "src/**"
      - "tests/test_docs_examples.py"
      - ".github/workflows/docs.yml"
  pull_request:
    paths:
      - "docs/**"
      - "docs_hooks/**"
      - "mkdocs.yml"
      - "src/**"
      - "tests/test_docs_examples.py"

# Allow one concurrent deployment; cancel superseded runs.
concurrency:
  group: docs-${{ github.ref }}
  cancel-in-progress: true

permissions:
  contents: write  # mkdocs gh-deploy pushes the built site to the gh-pages branch

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      # NOTE: tags are pinned to digests by Renovate (see renovate.json) on its next run.
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # gh-deploy needs full history to update gh-pages

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install
        run: pip install -e ".[docs,dev]"

      # The docs cannot drift from the SDK: every runnable code example in the
      # docs is executed against the installed package. A renamed param or
      # removed method breaks a snippet here and fails the PR.
      - name: Test docs examples
        run: pytest tests/test_docs_examples.py -q

      # Always validate: a broken link, a missing nav entry, or a bad
      # mkdocstrings reference fails the build (and therefore the PR).
      - name: Build (strict)
        run: mkdocs build --strict

      # Advisory radar: flag breaking public-API changes vs. the base branch so
      # they're noticed and changelog'd. Non-blocking during alpha.
      - name: Check public API (griffe)
        continue-on-error: true
        run: |
          git fetch origin main --depth=1 || true
          python -m griffe check model_ledger -s src -a origin/main

      # Deploy only from main, only on push (not PRs).
      - name: Deploy to GitHub Pages
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        run: mkdocs gh-deploy --force --no-history
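The "Test docs examples" step relies on tests/test_docs_examples.py, which is not shown in this excerpt. As a rough illustration of the idea described in the commit message (extract fenced python blocks, including tab-indented ones, and exec each page's blocks in one namespace), a minimal sketch might look like the following. The names `extract_python_blocks` and `run_page` are hypothetical, and the sketch omits the skipping of blocks that need external resources.

```python
import re
import textwrap

TICKS = "`" * 3  # built indirectly so this example can itself live inside a fenced block

# A fenced python block, possibly indented (for example inside a content tab).
FENCE = re.compile(
    r"^([ \t]*)" + TICKS + r"python[^\n]*\n(.*?)^\1" + TICKS + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_python_blocks(markdown: str) -> list[str]:
    """Return the dedented source of each fenced python block, in page order."""
    return [textwrap.dedent(match.group(2)) for match in FENCE.finditer(markdown)]


def run_page(markdown: str) -> dict:
    """Exec every block of one page in a single shared namespace.

    Any exception (a renamed parameter, a removed method) propagates,
    which is what would turn a test red.
    """
    namespace: dict = {}
    for source in extract_python_blocks(markdown):
        exec(compile(source, "<docs-example>", "exec"), namespace)
    return namespace


# A tiny stand-in page: one top-level block and one tab-indented block.
page = "\n".join([
    "Intro text.",
    "",
    TICKS + "python",
    "x = 2",
    TICKS,
    "",
    '=== "Tab"',
    "",
    "    " + TICKS + "python",
    "    y = x + 1",
    "    " + TICKS,
    "",
])

namespace = run_page(page)
print(namespace["y"])  # the second block can see x defined by the first
```

Sharing one namespace per page lets later snippets build on earlier ones, as a tutorial reader would, while keeping pages isolated from each other.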

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -26,6 +26,10 @@ uv.lock
 # Internal design docs (contain org-specific context)
 docs/superpowers/

+# MkDocs build output (the docs site is built + deployed in CI)
+/site/
+.cache/
+
 # Detailed internal blocklist (committed config is .gitleaks.toml)
 .gitleaks-internal.toml


.pre-commit-config.yaml

Lines changed: 3 additions & 0 deletions
@@ -5,6 +5,9 @@ repos:
       - id: trailing-whitespace
       - id: end-of-file-fixer
       - id: check-yaml
+        # mkdocs.yml uses MkDocs' custom !!python/ YAML tags that the safe
+        # loader can't parse; `mkdocs build --strict` validates it in CI instead.
+        exclude: ^mkdocs\.yml$
       - id: check-json
       - id: check-toml
       - id: check-added-large-files

docs/assets/favicon.svg

Lines changed: 12 additions & 0 deletions

docs/assets/og-image.png

45.7 KB

docs/concepts/composite.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
---
title: Composites
description: Governed groups whose members are themselves models. A business-level entity that rolls up its technical components, each governed in its own right.
---

# Composites

A regulator doesn't approve "a SQL job." They approve a **Credit Decision System**.
But that system is really a scorecard, some policy rules, and an ETL pipeline, each
of which deserves its own governance.

A **composite** is the business-level entity that aggregates technical components.
Critically, a member *is itself a model*, so it has its own owner, history, and
validation. Composites are the layer that no plain registry or catalog captures.

## Register a group and its members

`register_group()` creates the composite and links each member with the
`member_of` relationship:

```python
from model_ledger import Ledger
ledger = Ledger.from_sqlite("./inventory.db")

group = ledger.register_group(
    name="Credit Scorecard",
    owner="risk-team",
    model_type="ml_model",
    tier="high",
    purpose="Credit risk scoring pipeline",
    members=["feature_pipeline", "scoring_model", "alert_queue"],
    actor="system",
)
```

```mermaid
graph TD
    G["Credit Scorecard<br/><small>composite · tier: high</small>"]
    G --- M1["feature_pipeline"]
    G --- M2["scoring_model"]
    G --- M3["alert_queue"]
    classDef ink fill:#1c1a17,color:#f7f3ec,stroke:#000;
    classDef ox fill:#efe8da,stroke:#7a1a1a,color:#1c1a17;
    class G ink; class M1,M2,M3 ox;
```

## Membership is an event, too

Add and remove members over time; each change is recorded as a snapshot, so you can
ask *who belonged to this system on any past date*:

```python
ledger.add_member("Credit Scorecard", "challenger_model", role="challenger", actor="risk-team")
ledger.remove_member("Credit Scorecard", "scoring_model", actor="risk-team")

ledger.members("Credit Scorecard")   # current members (replayed from the event log)
ledger.groups("scoring_model")       # which composites a model belongs to

from datetime import datetime
ledger.membership_at("Credit Scorecard", datetime(2025, 12, 31))  # membership as of a date
```
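To make "replayed from the event log" concrete, here is a minimal standalone sketch of how membership as of a date could be derived from add/remove events. It does not use model-ledger; the `MembershipEvent` shape and the local `membership_at` function are hypothetical stand-ins, and the real snapshot schema may differ. The sketch uses timezone-aware datetimes throughout, since Python raises `TypeError` when a naive datetime is compared with an aware one.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MembershipEvent:
    """Hypothetical event shape, for illustration only."""
    at: datetime
    action: str  # "add" or "remove"
    member: str


def membership_at(events: list[MembershipEvent], as_of: datetime) -> set[str]:
    """Replay events in time order and stop at the first one after as_of."""
    members: set[str] = set()
    for event in sorted(events, key=lambda e: e.at):
        if event.at > as_of:
            break
        if event.action == "add":
            members.add(event.member)
        elif event.action == "remove":
            members.discard(event.member)
    return members


utc = timezone.utc
log = [
    MembershipEvent(datetime(2025, 1, 10, tzinfo=utc), "add", "feature_pipeline"),
    MembershipEvent(datetime(2025, 1, 10, tzinfo=utc), "add", "scoring_model"),
    MembershipEvent(datetime(2025, 6, 1, tzinfo=utc), "add", "challenger_model"),
    MembershipEvent(datetime(2025, 9, 1, tzinfo=utc), "remove", "scoring_model"),
]

print(sorted(membership_at(log, datetime(2025, 3, 1, tzinfo=utc))))
print(sorted(membership_at(log, datetime(2025, 12, 31, tzinfo=utc))))
```

Because the log is never edited, the same query with the same date returns the same answer however many events are appended later.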

## Roll-up view

`composite_summary()` aggregates a composite and its members into a single governance
view covering tiers, statuses, open observations, and validation state across the whole
system:

```python
summary = ledger.composite_summary("Credit Scorecard")
```

This is what makes composites the **primary inventory entry** for governance: an
examiner reads roughly one entry per business system, and every technical component
beneath it remains individually traceable.
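The shape of the value returned by `composite_summary()` is not shown here. As an illustration of what a roll-up of this kind involves, the following standalone sketch aggregates hypothetical member records; every field name (`tier`, `status`, `open_observations`, `validated`) and the `summarize` function are assumptions made for the example, not the library's schema.

```python
from collections import Counter

# Hypothetical member records; the field names are illustrative only.
members = [
    {"name": "feature_pipeline", "tier": "medium", "status": "active",
     "open_observations": 0, "validated": True},
    {"name": "scoring_model", "tier": "high", "status": "active",
     "open_observations": 2, "validated": True},
    {"name": "alert_queue", "tier": "medium", "status": "active",
     "open_observations": 1, "validated": False},
]


def summarize(composite: str, members: list[dict]) -> dict:
    """Collapse per-member governance state into one entry per business system."""
    return {
        "composite": composite,
        "member_count": len(members),
        "tiers": dict(Counter(m["tier"] for m in members)),
        "statuses": dict(Counter(m["status"] for m in members)),
        "open_observations": sum(m["open_observations"] for m in members),
        "unvalidated": sorted(m["name"] for m in members if not m["validated"]),
    }


summary = summarize("Credit Scorecard", members)
print(summary)
```

The roll-up keeps member names in the result (`unvalidated`) so that a reader of the single entry can still drill down to the component that needs attention.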

!!! note "Observations & validations"
    Composites also carry governance events (`record_observation()`,
    `resolve_observation()`, and `record_validation()`), so findings and validation
    outcomes live in the same immutable log as everything else. See the
    [API reference](../reference/index.md).

docs/concepts/datanode.md

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
---
title: DataNode & the graph
description: Everything is a DataNode with typed ports. Declare inputs and outputs; the dependency graph builds itself.
---

# DataNode & the graph

The core insight: **a model, a rule, an ETL job, and an alert queue are the same
shape.** Each consumes some things and produces others. So they're all one type,
`DataNode`, and the dependency graph falls out of matching what they produce to
what others consume.

## A node is what it reads and writes

```python
from model_ledger import DataNode

DataNode(
    name="fraud_scorer",
    platform="ml",
    inputs=["customer_features"],   # what it consumes
    outputs=["risk_scores"],        # what it produces
    metadata={"framework": "xgboost", "owner": "risk-team"},
)
```

`inputs` and `outputs` are **ports**: the names of the data flowing in and out. A
plain string becomes a [`DataPort`](#dataport-precision) automatically.

## The graph builds itself

You never draw edges. You call `connect()`, and every place an output port name
matches an input port name becomes a dependency:

```python
from model_ledger import Ledger, DataNode

ledger = Ledger()
ledger.add([
    DataNode("segmentation", platform="etl", outputs=["customer_segments"]),
    DataNode("fraud_scorer", platform="ml", inputs=["customer_segments"], outputs=["risk_scores"]),
    DataNode("fraud_alerts", platform="alerting", inputs=["risk_scores"]),
])
ledger.connect()

ledger.trace("fraud_alerts")        # ['segmentation', 'fraud_scorer', 'fraud_alerts']
ledger.upstream("fraud_alerts")     # everything that feeds it
ledger.downstream("segmentation")   # everything that depends on it
```

```mermaid
graph LR
    A["segmentation"] -->|customer_segments| B["fraud_scorer"] -->|risk_scores| C["fraud_alerts"]
    classDef n fill:#efe8da,stroke:#7a1a1a,color:#1c1a17;
    class A,B,C n;
```
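The matching step described above can be sketched without the library. The following is a minimal standalone illustration, assuming edges are formed purely by comparing output port names with input port names; the `Node` class and the `connect` and `upstream` functions here are hypothetical and are not model-ledger's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a node with named input and output ports."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def connect(nodes: list[Node]) -> list[tuple[str, str, str]]:
    """Return (producer, port, consumer) edges wherever an output name equals an input name."""
    producers: dict[str, list[str]] = defaultdict(list)
    for node in nodes:
        for port in node.outputs:
            producers[port.lower()].append(node.name)
    edges = []
    for node in nodes:
        for port in node.inputs:
            for producer in producers.get(port.lower(), []):
                edges.append((producer, port, node.name))
    return edges


def upstream(edges: list[tuple[str, str, str]], target: str) -> set[str]:
    """Collect every node that feeds target, directly or transitively."""
    parents: dict[str, set[str]] = defaultdict(set)
    for producer, _port, consumer in edges:
        parents[consumer].add(producer)
    seen: set[str] = set()
    stack = [target]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


nodes = [
    Node("segmentation", outputs=["customer_segments"]),
    Node("fraud_scorer", inputs=["customer_segments"], outputs=["risk_scores"]),
    Node("fraud_alerts", inputs=["risk_scores"]),
]
edges = connect(nodes)
print(edges)
print(sorted(upstream(edges, "fraud_alerts")))
```

Indexing producers by port name first keeps the matching pass linear in the number of ports, which matters once connectors emit thousands of nodes.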

This is why discovery scales. A connector just emits `DataNode`s with their ports,
and the cross-platform graph assembles itself: an ETL job in your warehouse links to
a model in MLflow, which links to a queue in your alerting system, with no shared ID scheme.

## DataPort precision

When two models legitimately write a table with the same name, a bare port name would
collide. `DataPort` carries an optional schema to disambiguate; edges only form when the
schema matches too:

```python
from model_ledger import DataNode, DataPort

DataNode("check_rules", outputs=[DataPort("alerts", model_name="checks")])
DataNode("card_rules", outputs=[DataPort("alerts", model_name="cards")])
DataNode("check_queue", inputs=[DataPort("alerts", model_name="checks")])
# check_queue connects to check_rules only; model_name must match.
```

Port matching is case-insensitive, and schema values support `%` wildcards.
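The exact matching rules are not spelled out here, so the following standalone sketch rests on stated assumptions: `%` behaves like SQL `LIKE` (it matches any run of characters), the wildcard may appear on either side, only schema keys present on both ports are compared, and a port with no schema matches on name alone. The `Port`, `like`, and `ports_match` names are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Port:
    """Hypothetical stand-in for a port with optional schema attributes."""
    name: str
    schema: dict = field(default_factory=dict)


def like(pattern: str, value: str) -> bool:
    """LIKE-style match: % matches any run of characters; comparison ignores case."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, re.IGNORECASE | re.DOTALL) is not None


def ports_match(output: Port, input_: Port) -> bool:
    """Names must agree (ignoring case); shared schema keys must agree too."""
    if output.name.lower() != input_.name.lower():
        return False
    shared = output.schema.keys() & input_.schema.keys()
    return all(
        like(str(input_.schema[key]), str(output.schema[key]))
        or like(str(output.schema[key]), str(input_.schema[key]))
        for key in shared
    )


check_rules = Port("alerts", {"model_name": "checks"})
card_rules = Port("alerts", {"model_name": "cards"})
check_queue = Port("ALERTS", {"model_name": "checks"})
any_c_queue = Port("alerts", {"model_name": "c%"})

print(ports_match(check_rules, check_queue))
print(ports_match(card_rules, check_queue))
print(ports_match(card_rules, any_c_queue))
```

Splitting the pattern on `%` and escaping each piece avoids treating other characters in a table or model name (dots, brackets) as regex syntax.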

## From node to governed model

A `DataNode` gives you structure. To give a node an **identity and history**
(owner, risk tier, purpose, and an audit trail), you
[`register()`](../reference/index.md) it as a [`ModelRef`](snapshot.md) and
[`record()`](snapshot.md) events against it. Discovery and registration are two views
of the same inventory: the graph (what connects to what) and the ledger (what each
thing *is* and how it changed).

[Next: Snapshots & the event log :octicons-arrow-right-24:](snapshot.md)

docs/concepts/index.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
---
title: Concepts
description: "The whole model in three ideas: the DataNode graph, the Snapshot event log, and Composites."
---

# Concepts

model-ledger is small on purpose. Three ideas carry the whole system.

<div class="grid cards" markdown>

-   :material-graph-outline:{ .lg } &nbsp;__[DataNode & the graph](datanode.md)__

    ---

    Everything is a `DataNode` with typed input/output ports. Declare what a node
    reads and writes; the dependency graph builds itself from port matching.

-   :material-history:{ .lg } &nbsp;__[Snapshot & the event log](snapshot.md)__

    ---

    A model is an identity (`ModelRef`). Everything that happens to it is an
    immutable, content-addressed `Snapshot`. The inventory is an append-only log.

-   :material-layers-outline:{ .lg } &nbsp;__[Composites](composite.md)__

    ---

    Governed groups whose members are themselves models. A "credit decision system"
    that rolls up its scorecard, policy rules, and ETL, each governed in its own right.

</div>

## How they fit together

```mermaid
graph TB
    subgraph identity ["Identity"]
        REF["ModelRef<br/><small>name · owner · type · tier · purpose</small>"]
    end
    subgraph history ["History (append-only)"]
        S1["Snapshot<br/><small>registered</small>"] --> S2["Snapshot<br/><small>retrained</small>"] --> S3["Snapshot<br/><small>validated</small>"]
    end
    subgraph graph ["Graph"]
        N1["DataNode"] -->|port match| N2["DataNode"]
    end
    REF --- S1
    REF -.is a node in.- N1
    classDef ink fill:#1c1a17,color:#f7f3ec,stroke:#000;
    classDef ox fill:#7a1a1a,color:#fff,stroke:#5a1010;
    class REF ink; class S1,S2,S3 ox;
```

- **Identity** is the minimum a regulator needs: who owns it, what kind of model,
  how risky, what it's for.
- **History** is every change, immutable and ordered. You can ask the inventory what
  it looked like on any past date.
- **Graph** is how models relate. Declare ports; dependencies follow.
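
The History idea (immutable, content-addressed snapshots in an append-only log) can be illustrated with a small standalone sketch. It assumes a snapshot's address is a SHA-256 digest of its canonical JSON form; the `Snapshot` fields, the `digest` property, and the `Log` class are hypothetical and are not the library's actual types.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical immutable record of one event on one model."""
    model: str
    event: str
    payload: dict
    actor: str
    at: datetime

    @property
    def digest(self) -> str:
        """Content address: hash of a canonical (sorted-key) JSON encoding."""
        body = json.dumps(
            {"model": self.model, "event": self.event, "payload": self.payload,
             "actor": self.actor, "at": self.at.isoformat()},
            sort_keys=True,
            separators=(",", ":"),
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Log:
    """Append-only: entries are added, never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[Snapshot] = []

    def append(self, snapshot: Snapshot) -> str:
        self._entries.append(snapshot)
        return snapshot.digest

    def as_of(self, model: str, when: datetime) -> list[Snapshot]:
        """Everything recorded for a model up to and including a point in time."""
        return [s for s in self._entries if s.model == model and s.at <= when]


utc = timezone.utc
log = Log()
log.append(Snapshot("fraud_scorer", "registered", {"tier": "high"}, "risk-team",
                    datetime(2025, 1, 5, tzinfo=utc)))
log.append(Snapshot("fraud_scorer", "retrained", {"auc": 0.91}, "ml-platform",
                    datetime(2025, 4, 2, tzinfo=utc)))
log.append(Snapshot("fraud_scorer", "validated", {"outcome": "pass"}, "validation",
                    datetime(2025, 7, 9, tzinfo=utc)))

history = log.as_of("fraud_scorer", datetime(2025, 5, 1, tzinfo=utc))
print([s.event for s in history])
```

Because the address is derived from the content, two snapshots with identical fields share a digest, and any change to a recorded field would produce a different one.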

A fourth idea, **compliance profiles** (SR 11-7, EU AI Act, NIST AI RMF), reads this
data to check completeness. It's a pluggable layer, not part of the core model; see the
[API reference](../reference/index.md).
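
As a sketch of what "a pluggable layer that checks completeness" could mean in code, the example below defines a small profile interface and one toy profile. Everything in it is hypothetical: the `Profile` protocol, `RequiredFieldsProfile`, and the required-field list are illustrative only and do not represent the actual requirements of SR 11-7, the EU AI Act, or NIST AI RMF, nor the library's profile API.

```python
from typing import Protocol


class Profile(Protocol):
    """Hypothetical plug-in interface: report which fields a record lacks."""
    name: str

    def missing(self, record: dict) -> list[str]: ...


class RequiredFieldsProfile:
    """Toy profile: a record is complete when every required field is non-empty."""

    def __init__(self, name: str, required: tuple[str, ...]) -> None:
        self.name = name
        self.required = required

    def missing(self, record: dict) -> list[str]:
        return [key for key in self.required if not record.get(key)]


def check(records: list[dict], profile: Profile) -> dict[str, list[str]]:
    """Map each incomplete record's name to the fields it is missing."""
    report = {}
    for record in records:
        gaps = profile.missing(record)
        if gaps:
            report[record["name"]] = gaps
    return report


# Identity fields as listed in the diagram above: name, owner, type, tier, purpose.
example = RequiredFieldsProfile("example-profile", ("owner", "model_type", "tier", "purpose"))
records = [
    {"name": "fraud_scorer", "owner": "risk-team", "model_type": "ml_model",
     "tier": "high", "purpose": "Fraud scoring"},
    {"name": "segmentation", "owner": "data-eng", "model_type": "etl",
     "tier": "", "purpose": None},
]
print(check(records, example))
```

Keeping profiles behind a narrow interface means the inventory data stays the same while different frameworks ask different questions of it.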
