docs: add RFC-021 canonical law-model crate and schema/type codegen #796

Open
tdjager wants to merge 2 commits into main from docs-rfc-law-model

Conversation

@tdjager tdjager commented Jun 11, 2026

Collaborator

Summary

Adds RFC-021 (docs/src/content/rfcs/rfc-021.md), status Proposed: extract the law-YAML document model into a canonical leaf crate and wire schema/type codegen to it.

Motivation

A repo-wide architecture audit (2026-06-11) found:

  • F1 (high) — the law-YAML model is implemented 6 times: the full serde model in engine/src/article.rs, a private partial mirror plus line-based $id/name string parsing in corpus/src/source_map.rs, a third struct set serializing law YAML in harvester/src/yaml/writer.rs (stringly-typed regulatory_layer), a fourth hand-rolled line scanner in tui/src/backend/corpus_scanner.rs, untyped serde_yaml_ng::Value tree-walking in pipeline/src/enrich.rs, and 18 frontend JS files accessing machine_readable.* on js-yaml objects with no types. Root cause: corpus deliberately avoids depending on engine (the dep is reversed as a dev-dep), so every consumer re-derives a partial model.
  • F2 (high) — a law-format change touches 7–9 uncoordinated places (schema/vX dir, latest symlink, article.rs, two hand-synced supported-schemas lists, validate.rs include_str! chain, corpus mirror, harvester writer, frontend JS, conformance fixtures) with no codegen in either direction.

What the RFC proposes

  1. New leaf crate regelrecht-law-model (rather than growing regelrecht-shared) holding the document model, consumed by engine, corpus, harvester, tui and pipeline, with engine re-exports so nothing breaks. It also adds a deliberately tolerant parse_law_header() for the $id/name-only consumers (tolerance to malformed bodies is load-bearing for editor drafts and mid-enrichment files).
  2. Codegen direction: schemars (structs → schema) with a CI gate diffing the generated schema against schema/latest — fits the append-only schema/vX constraint and the multi-version supported-schemas table better than typify; includes an honest one-time reconciliation step since the generated schema won't be byte-identical to the hand-written one.
  3. TS types via ts-rs on the law model (the law object is client-side-parsed YAML, so OpenAPI/utoipa would type the envelope, not the payload); utoipa for API DTOs stays open as a separate step.
  4. Six migration stages that each keep every package compiling (extract+re-export → corpus → harvester → tui+pipeline → codegen gates → TS types), with explicit non-goals: no law-format changes, no behavior changes.
  5. Risks: WASM build weight (feature-gate schemars/ts-rs out of the wasm build), compile-time impact, the corpus↔engine dev-dep, schema-version multiplicity (the model tracks one current shape; frozen per-version JSON schemas keep validating older docs), tolerant-parsing semantics.
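
To make the tolerant-parsing point in item 1 concrete, here is a minimal sketch of what a header-only parser could look like. It is an illustration under assumptions, not the RFC's implementation: the `LawHeader` struct, the return type, and the rule that `$id` and `name` are unindented top-level keys are all hypothetical, and only the function name `parse_law_header` comes from the description above. It uses the standard library only, so it scans lines instead of running a YAML parser.

```rust
/// Hypothetical header type: the two fields the `$id`/name-only consumers need.
#[derive(Debug, Default, PartialEq)]
struct LawHeader {
    id: Option<String>,
    name: Option<String>,
}

/// Strip one pair of matching single or double quotes, if present.
fn unquote(s: &str) -> &str {
    for q in ['"', '\''] {
        if let Some(inner) = s.strip_prefix(q).and_then(|rest| rest.strip_suffix(q)) {
            return inner;
        }
    }
    s
}

/// Tolerant scan (assumed behaviour): read top-level `$id` and `name`,
/// ignore everything else, and never fail on a malformed body.
fn parse_law_header(text: &str) -> LawHeader {
    let mut header = LawHeader::default();
    for line in text.lines() {
        // Indented lines belong to nested structures; '#' starts a comment.
        if line.starts_with(char::is_whitespace) || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once(':') {
            let value = unquote(value.trim());
            if value.is_empty() {
                continue;
            }
            // First occurrence wins; later duplicates are ignored.
            match key.trim() {
                "$id" if header.id.is_none() => header.id = Some(value.to_string()),
                "name" if header.name.is_none() => header.name = Some(value.to_string()),
                _ => {}
            }
        }
        // Stop early once both fields are known; the body is never inspected.
        if header.id.is_some() && header.name.is_some() {
            break;
        }
    }
    header
}
```

Given a draft whose body contains an unterminated flow sequence (something a strict YAML parser would be expected to reject), this sketch still returns the `$id` and `name`. It deliberately does not handle trailing comments, multi-line scalars, or anchors; whether the real function should is part of the "tolerant-parsing semantics" risk listed in item 5.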

Docs-only change; the RFC proposes, the team decides.

github-actions Bot commented Jun 11, 2026


Preview Deployment — docs — docs

Your changes have been deployed to a preview environment:

URL: https://docs-pr796-regel-k4c.rig.prd1.gn2.quattro.rijksapps.nl

This deployment will be automatically cleaned up when the PR is closed.

tdjager added a commit that referenced this pull request Jun 11, 2026
…f to RFC-022

Review discussion on #766 surfaced that single-article repeal (Stb. 2018,
362 art. 2 / art. 76 Gwwd) deserved its own RFC rather than a scope-note
paragraph. RFC-020 now documents the succession model: a repealed article
is a new version of the same $id (BWB consolidation -> harvester version
file -> newest-wins), no article-level valid_to exists or is needed, and a
post-repeal lookup fails with the OutputNotFound data fact.

Since RFC-020 merges with this PR, the date-aware resolution RFC (as_of,
#767) is renumbered to RFC-022 (021 is claimed by #796); all references in
RFC-019 and the engine doc comment follow.
tdjager added a commit that referenced this pull request Jun 11, 2026
Renumbered from RFC-020: the article-repeal-as-version-succession RFC took
number 020 because it merges with #766; this follow-up becomes RFC-022
(021 is claimed by #796). Content otherwise unchanged; rebased onto the
updated #766 base so the RFC-019/020 links resolve.
@tdjager tdjager changed the title docs(rfcs): RFC-021 canonical law-model crate and schema/type codegen docs: add RFC-021 canonical law-model crate and schema/type codegen Jun 11, 2026
corpus, tui and auth do not depend on regelrecht-shared, so 'every
crate' overstated its reach; say 'most crates' instead.
@tdjager tdjager marked this pull request as ready for review June 11, 2026 15:07
tdjager added a commit that referenced this pull request Jun 12, 2026
* docs(rfc): add RFC-019 temporal validity and date-aware reference resolution

Draft RFC for law end dates (valid_to) and date-aware cross-law reference
resolution. Adds the instrument end date and selection upper bound, an as_of
that lets a reference resolve at a date differing from the global
calculation_date and propagate down its subtree (ex tunc / statische
verwijzing / eerbiedigende werking), and conservative diagnostics for ended
references. Builds on RFC-001/003/007/008/010/012/013/015; every claim is
grounded in actual law (Aanwijzingen, Awb 7:11, Twm art. VIII, CEK23, WOR)
with a worked YAML example per mechanism.

* docs: tighten RFC-019 — remove duplication, fix valid_to consistency

Remove repeated inclusive-bound explanation, the duplicated axis-separation
note, and the duplicate Twm YAML block; trim wordy phrasing. Fix §1 which
still showed valid_to 2027-01-01 (now 2026-12-31, consistent with the
inclusive-bound conversion).

* docs(rfc-019): resolve open questions with legal grounding

Investigate the open questions against law and jurisprudence and fold the
findings back in:
- applicable-date rule is layered (art. 7:11 Awb + jurisprudential ex-nunc
  reading; overgangsrecht default = onmiddellijke werking, Aanwijzing 5.61,
  with exceptions in the specific law per Aanwijzing 5.60) -> jurisprudential
  defaults belong in Engine Policy, not engine code
- multiple dates per beoordeling are codified: art. 5:46 lid 4 Awb lex mitior
  (ABRvS 21-05-2025, 30-08-2023; CBb 07-07-2021) -> confirms as_of must be
  per-reference; add a lex-mitior MIN-over-two-as_of example
- as_of propagation, instrument-vs-substantive scope, and hooks/overrides
  resolved as design decisions

* docs(rfc-019): sharpen lex mitior — mandatory, offender-favourable, punitive-only

Clarify that art. 5:46 lid 4 Awb lex mitior is mandatory (not a free choice
and not favourable-to-administration), means favourable to the offender,
applies only to punitive sanctions (not a general best-of-dates rule), and
that the favourable-selector lives in the law (MIN for a fine, IF if the
conduct is no longer punishable) — the engine never optimizes over dates.

* docs(rfc-019): cite primary source for Aanwijzingen + clarify their status

Add the official consolidated text (BWBR0005730) as the primary source for the
Aanwijzingen voor de regelgeving, and clarify that they are a besluit van de
Minister-President (drafting guidelines binding on the wetgever, not
substantive law) — substantive temporal effects rest on the laws and
jurisprudence. kcbr.nl deep links kept as the official navigation portal.

* feat(engine): law end dates (valid_to) — RFC-019 concept 1

Schema v0.5.3 adds an optional law-level valid_to (instrument end date), set
only when a law genuinely ends with no successor. Engine version selection
gains an inclusive upper bound: past valid_to a version is no longer in force
and the engine does NOT fall through to an older version (RFC-019 §2). A typed
SelectionReason (NotFound / NotYetInForce / EndedOn) backs honest diagnostics
(§3) without asserting 'geen grondslag'. The harvester derives valid_to from
BWB einddatum only for the final consolidation with a finite einddatum
(manifest::resolve_valid_to), wired through the pipeline, and the writer emits
it. Engine version bumped 0.2.0 -> 0.3.0.

Scope: concept 1 of the split. Date-aware resolution (as_of, ex nunc/ex tunc,
statische verwijzing, lex mitior) is RFC-020.

* fix(harvester): add valid_to to LawMetadata doctest example

* docs(rfc-019): 'routinely' -> 'can' end without a successor

Drop the subjective frequency claim; a single instance is enough to motivate
the feature.

* feat(corpus): add CEK23 prijsplafond regeling as executable valid_to example

Harvest Subsidieregeling bekostiging plafond energietarieven kleinverbruikers
2023 (BWBR0047628) and interpret art. 2.2 (plafondtarief) and art. 2.3
(volumeplafond) as machine_readable — grounded, reverse-validated constants the
articles literally vaststellen. valid_to '2026-12-31' from art. 8.1 lid 2
('vervalt met ingang van 1 januari 2027'). Demonstrates RFC-019 end-to-end:
evaluate at 2023-06-01 returns the caps; at 2027-06-01 the law is not in force.

The consumer korting (verbruik x (contracttarief - plafondtarief)) is NOT in
the regeling text and is deliberately not modeled. RFC-019 worked example A
updated to match the real law. Layer corrected to MINISTERIELE_REGELING.

* docs(rfc-019): reframe valid_to around termination; ground end date in art. 8.1

- Reframe semantics: valid_to marks that the law is *terminated* (vervalt /
  ingetrokken), not 'ends without a successor'. A law repealed and replaced by a
  different $id is still terminated and still carries valid_to.
- Two provenance modes: intrinsic (horizonbepaling in the law's own text) vs
  extrinsic (terminated by another instrument -> date derived from that
  instrument, analogous to RFC-007 overrides; future work).
- Boundary note: validity is static version-selection metadata today; deriving
  it (extrinsic, or 'bij KB te bepalen tijdstip') makes in-force status an
  executed property -- a deliberate boundary RFC-019 does not cross.
- CEK23: art. 8.1 now outputs the grounded vervaldatum ('2027-01-01', verbatim
  lid 2); top-level valid_to is the denormalised inclusive last day (D-1).
- Align schema description + engine/harvester doc comments with the framing.

* docs(rfc-019): fix Twm example, ground the valid_to-per-version rejection, add KB example

- The syntax/example snippets used the Twm with a literal valid_to '2022-05-18'
  — but that date is not in the law: art. VIII lets a KB extend the geldingsduur
  (or parts) and the end followed from the rejected 5th-extension goedkeuringswet.
  Replace the §1 syntax snippet with the clean intrinsic CEK23 case, and reframe
  example B as the hard case: dynamic, KB-determined, per-provision end that a
  static law-level valid_to cannot capture (the §1 boundary).
- Reframe the 'valid_to on every version' rejection around faithfulness: a
  superseded version has no end date in the text, so adding one would be invented
  — not 'redundant with newest-wins'.
- Add concrete extrinsic KB examples (Twm verlengings-KBs; Wet open overheid
  art. 10.2f 'vervalt op een bij KB te bepalen tijdstip').

* docs(rfc-019): replace em dashes with hyphens

* feat(engine): wire honest expired-law diagnostics into the cross-law path (RFC-019 §3)

Review findings on #766:

- SelectionReason was only used by its own unit tests; an ended law still
  surfaced as 'Law not found' on the cross-law path - exactly the misleading
  diagnostic §3 prohibits. The three law lookups in service.rs now map the
  typed reason to honest errors: LawNotYetInForce / LawEnded carry the
  reference date and valid_to ('no version of law X in force on D; last in
  force until E'), never a legal verdict.
- EndedOn now carries a NaiveDate instead of a String.
- ResolutionContext rejects malformed calculation dates up front: an
  unparseable date silently became None and bypassed valid_from/valid_to
  version selection entirely.
- The receipt's loaded_regulations records valid_to next to valid_from (§4).
- schema v0.5.3: valid_to pattern is date-only. It had inherited valid_from's
  '#.+' alternative, but the engine reads valid_to as static metadata and
  silently ignores unparseable values - a schema-valid '#ref' would make a
  law never end. Derived end dates are explicitly future work (§1 boundary).
- BDD scenarios promised by the RFC's Testing section: features/einddatum.feature
  with test laws test_einddatum / test_einddatum_afnemer (inclusive upper
  bound, no resolution after the end date, expired cross-reference states
  the data fact).

* chore: point annotation tooling at schema v0.5.3

The PR adds schema/v0.5.3 (including an unchanged annotation-schema.json copy)
but the annotation validators and the editor's annotation writers still
referenced v0.5.2, leaving the new copy unused. Bump the embedded schema
includes and the $schema URLs to v0.5.3.

* docs(rfc-019): correct legal facts against the official sources

Every claim re-verified against wetten.overheid.nl, Staatsblad and the
Eerste Kamer dossiers:

- Twm: measures vervielen per 20 mei 2022 (last day in force 19 mei), not
  'on 19 mei'; the rejected goedkeuringswet is dossier 36.042; art. VIII
  lid 3 allows extension by *at most* three months ('steeds ten hoogste');
  the hypothetical literal valid_to is 2022-05-19.
- Tijdelijke regeling (BWBR0044416): vervallen per the *same* date as the
  Twm (20 mei 2022), not one day later.
- Aanwijzingen: only the old Ar 243 (pre-2018) stated 'van rechtswege
  vervallen'; the current 6.24 prescribes explicit mede-intrekking
  precisely because that doctrine caused unclarity.
- Woo art. 10.2f lid 1 quoted verbatim ('Hoofdstuk 6 vervalt bij koninklijk
  besluit'), replacing a paraphrase presented as a quote.
- CEK23 art. 8.1 lid 2 quoted in full, including the staartzin ('blijft van
  toepassing op subsidies die voor die datum zijn verstrekt') - eerbiedigende
  werking inside the worked example itself, now flagged as the canonical
  real-corpus bridge case for RFC-020's as_of.
- Faithfulness note corrected: the supplier subsidy formula *is* in
  art. 3.1-3.5 (volume-capped verbruik x (contracttarief - plafondtarief));
  what the regeling never computes is a consumer-facing korting.
- source.regulation lives in RFC-001 §9, not §6.
- CEK23 YAML: plafondtarief outputs typed number (euro-per-unit rates with
  five decimals; the corpus amount convention is whole eurocents and
  type_spec has no tariff units yet - schema gap noted in the RFC).
- Testing/§3 text updated to match the implemented behaviour (typed errors,
  features/einddatum.feature; evaluate at 2027-06-01 now fails with the
  data fact instead of returning no outputs).

* docs(rfc-019): add a real Aanwijzing 6.24 example (Stb. 2018, 362)

The verweesde-regelingen paragraph cited Aanwijzing 6.24 without a law that
actually follows it. Stb. 2018, 362 is the textbook case: one besluit drops
the delegation basis (art. 76 Gwwd) and explicitly repeals the Fokkerijbesluit
and Fokkerijregeling resting on it, with a nota van toelichting that cites
aanwijzing 6.24 verbatim. Quotes verified against officielebekendmakingen.nl.

* docs(rfc-019): article repeal is succession; terminates binds at harvest/load time, not execution time

Two clarifications from review discussion:

- Scope note: repealing a single article (Stb. 2018, 362 art. 2 / art. 76
  Gwwd) is an ordinary amendment handled by version succession (new
  consolidation, new version of the same $id, newest-wins) - no
  article-level valid_to exists or is needed.
- Extrinsic mode: the unilateral declaration shape matches
  overrides/implements, but the binding moment cannot - in-force status must
  be known before execution, so a terminates-effect binds at harvest time
  today (BWB einddatum -> literal valid_to) and at load/index time in the
  future mechanism; moving end dates (Twm verlengings-KBs) favour load-time
  derivation over harvest-time literals.

* docs(rfc-019): separate the two halves of the Stb. 2018, 362 example

Art. 2 (article repeal) is ordinary succession, not this RFC's problem;
art. 3 (Fokkerijbesluit + Fokkerijregeling repealed as whole instruments)
is exactly the valid_to case. Make that explicit so the example is not read
as an article-level valid_to.

* docs: add RFC-020 article repeal as version succession; renumber as_of to RFC-022

Review discussion on #766 surfaced that single-article repeal (Stb. 2018,
362 art. 2 / art. 76 Gwwd) deserved its own RFC rather than a scope-note
paragraph. RFC-020 now documents the succession model: a repealed article
is a new version of the same $id (BWB consolidation -> harvester version
file -> newest-wins), no article-level valid_to exists or is needed, and a
post-repeal lookup fails with the OutputNotFound data fact.

Since RFC-020 merges with this PR, the date-aware resolution RFC (as_of,
#767) is renumbered to RFC-022 (021 is claimed by #796); all references in
RFC-019 and the engine doc comment follow.

* docs(rfc-020): precise grounding of the file-per-version convention

The one-file-per-consolidation-date layout was only visible in RFC-003/018
examples, never decided; RFC-020 now states it makes that explicit instead
of mis-citing RFC-001 §5 (which grounds valid_from, not the layout). Also
link the Stb. 2018, 362 art. 2 half in RFC-019 directly to RFC-020.

* docs(rfc-019): fold the article-repeal RFC back in as an Alternatives entry

On reflection, article repeal contains no decision of its own: like any
amendment it is enacted by another instrument and is already fully expressed
by version succession (new consolidation -> new version of the same $id ->
newest-wins). A standalone RFC that decides nothing dilutes the register, so
rfc-020.md is removed and its substance lands where the question is actually
asked: the scope note (succession + the file-per-consolidation-date layout)
and a rejected alternative 'Article-level valid_from/valid_to' with the
Stb. 2018, 362 / art. 76 Gwwd grounding. The as_of follow-up reclaims number
RFC-020 (renumber reverted in #767).

* docs(rfc-019): tighten - problem/solution/examples, bullets over prose

Same verified facts, ~35% shorter. Restructured per review feedback:
problem statement and scope as bullet lists, a Grounding table naming the
real legislation the RFC is built on (CEK23, WOR, Twm/Trm, Stb. 2018, 362,
Woo 10.2f, Aanwijzingen, BWB einddatum), the decision sections as bullets,
worked examples kept with their YAML, and the per-law reference links
deduplicated into the table.

* fix: address claude-review findings on valid_to robustness

- schema v0.5.3: $id and title still self-reported as v0.5.2 in both
  schema.json and annotation-schema.json (copy-paste from the v0.5.2 dir);
  external JSON Schema tooling keys on $id, so both bumped to v0.5.3.
- engine: a format-valid but calendar-invalid valid_to ('2023-02-30')
  passed the schema regex and parse_date(..).ok() then silently skipped the
  expiry check - the law would stay in force forever. load_law now rejects
  an unparseable valid_to with a LoadError (valid_from keeps its lenient
  '#'-reference convention). Unit test added.
- resolver: documented that the None arm of get_law_for_date_reported
  deliberately skips the valid_to window (display/listing semantics) and
  that execution paths always pass Some(date).

* fix: carry RFC-019 §3 data facts through the external error boundary

Second claude-review round:

- ExternalError collapsed LawNotYetInForce/LawEnded into a single
  LawNotInForce(law_id), dropping the reference date and 'last in force
  until' fact for WASM/API consumers. The validity window is public legal
  data, not internal detail - the external variants now mirror the internal
  ones with the same honest messages.
- harvester resolve_valid_to: comment documenting that the lexicographic
  date comparison is sound because BWB dates are ISO 8601.
- The eurocent/tariff-unit gap flagged on CEK23 art. 2.2 is already
  documented in the YAML comments and the RFC (acknowledged by the review
  itself); no change.
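
The version-selection rule described in the `feat(engine): law end dates (valid_to)` entry above (inclusive upper bound, no fall-through to an older version, typed `SelectionReason`) can be sketched as follows. This is a simplified illustration, not the engine's code: `LawVersion`, `select_version` and the tuple-based `Date` are hypothetical stand-ins (the commit log says `EndedOn` carries a `NaiveDate` and names `get_law_for_date_reported`, whose signature is not shown here), and the `valid_from` used in the usage note is an assumed value.

```rust
/// Calendar date as (year, month, day); tuples compare chronologically.
/// Stand-in for the engine's `NaiveDate` so the sketch needs no crates.
type Date = (i32, u32, u32);

/// Hypothetical, reduced view of one version of a law.
#[derive(Debug, Clone, PartialEq)]
struct LawVersion {
    valid_from: Date,
    /// Inclusive last day in force; `None` means the law has no end date.
    valid_to: Option<Date>,
}

/// Variant names follow the commit message; the payload type is simplified.
#[derive(Debug, PartialEq)]
enum SelectionReason {
    NotFound,
    NotYetInForce,
    EndedOn(Date),
}

/// Pick the newest version that has started on `on`, then apply the
/// inclusive `valid_to` bound to that version only.
fn select_version(versions: &[LawVersion], on: Date) -> Result<&LawVersion, SelectionReason> {
    if versions.is_empty() {
        return Err(SelectionReason::NotFound);
    }
    let newest_started = versions
        .iter()
        .filter(|v| v.valid_from <= on)
        .max_by_key(|v| v.valid_from);
    match newest_started {
        None => Err(SelectionReason::NotYetInForce),
        Some(v) => match v.valid_to {
            // Past the end date: report the data fact and do NOT fall
            // through to an older version.
            Some(end) if on > end => Err(SelectionReason::EndedOn(end)),
            _ => Ok(v),
        },
    }
}
```

With a single version ending on 2026-12-31 (the CEK23 `valid_to` from the log) and an assumed `valid_from` of 2023-01-01, selecting at 2023-06-01 and at 2026-12-31 succeeds, while 2027-06-01 yields `EndedOn((2026, 12, 31))`. An older open-ended version in the same slice does not change that last result, which is the no-fall-through behaviour the commit describes.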