Skip to content

docs: schema-validated documentation examples#359

Merged
igrigorik merged 18 commits into
mainfrom
docs/example-validation
May 21, 2026
Merged

docs: schema-validated documentation examples#359
igrigorik merged 18 commits into
mainfrom
docs/example-validation

Conversation

@igrigorik

@igrigorik igrigorik commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Validation harness to ensure every JSON example in the spec docs validates against UCP schemas, with enforcement to keep it that way: when schemas change, stale examples break CI before they mislead readers.

Contract

Every ```json block in the spec MUST be preceded by an HTML comment annotation. Unannotated blocks are hard failures.

<!-- ucp:example schema=shopping/checkout op=create direction=request -->
<!-- ucp:example schema=shopping/cart target=$.discounts -->
<!-- ucp:example schema=common/loyalty def=loyalty extract=$.loyalty -->
<!-- ucp:example schema=profile def=business_schema -->
<!-- ucp:example skip reason="OAuth metadata, not UCP payload" -->

Authoring conveniences beyond strict JSON (small fixed set):

  • // line comments (stripped before parse)
  • {{ ucp_version }} template variable
  • HTTP envelope unwrap (request/response line + headers, body extracted)
  • Elision sentinels: { ... }, [ ... ], "..." mean "this field/container is present, value not asserted"

Full grammar: docs/documentation/schema-authoring.md.

Coverage

292 JSON blocks across 41 files
 253 validated (86.6%)
  39 skipped (all with precise, grep-able reasons)
   0 unannotated, 0 failed, 0 errors

The 39 skips fall into two categories:

  • Meta docs (25): JSON Schema fragments teaching how to author schemas;
  • Non-UCP payloads (14): third-party OAuth metadata, tokenization API responses, payment-handler templates

Capabilities now covered end-to-end: catalog (search + lookup), cart, checkout, order, discount, fulfillment, loyalty, identity-linking, profile, plus all transport bindings (REST, MCP, A2A, Embedded Protocol) and JSON-RPC envelopes.


Bugs caught and fixed

Schema-vs-doc drift surfaced by the validation pass:

  • checkout.md: item missing required price field in disclosure example
  • checkout.md: totals array missing required subtotal entry
  • checkout-rest.md: business outcome used non-schema available_quantity field on line_item; replaced with message-based pattern
  • discount.md (4 instances): quantity wrongly nested inside item object instead of on line_item
  • discount.md: price and title shown in create requests where the request schema omits them (request: only item.id valid)
  • ap2-mandates.md, checkout-rest.md, discount.md, overview.md: trailing commas masked by lenient parser, exposed when strict-JSON enforcement landed
  • checkout-a2a.md: examples showed bare A2A Message payloads; rewrapped in the JSON-RPC message/send envelope they actually travel in on the wire
  • Various: {...spread} JS-style pseudo-elision replaced with the validator's canonical bare-form { ... } / [ ... ]

Enforcement

  • CI (.github/workflows/docs.yml): full-corpus validator + unit tests on every PR. Mandatory backstop.
  • Local hooks (opt-in via pre-commit install --hook-type pre-commit --hook-type pre-push): three hooks scoped by file filter:
    • Changed docs/*.md files validated incrementally on commit
    • Full-corpus check when source/schemas/ or the validator itself changes (cross-file regression case)
    • Unit tests when validator code changes

Type of change

  • Documentation update
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings

@igrigorik igrigorik self-assigned this Apr 16, 2026
@igrigorik igrigorik added the TC review Ready for TC review label Apr 16, 2026
@igrigorik igrigorik requested review from a team as code owners April 16, 2026 05:28

@drewolson-google drewolson-google left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, one small comment on empty request bodies.

Comment thread docs/specification/checkout-rest.md Outdated

@wry-ry wry-ry left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scaffold approach would require that we write (and maintain) a scaffold for most types, right? An alternative approach that I was exploring was:

  • Force every example to not omit any required root level properties (value can be ellipsis)
  • Use ucp-schema to resolve the schema based on the annotation
  • Use jsonschema library to validate the example against the schema
    • Use jsonschema's extend function to create a custom validator class that skips ellipses

May be more convoluted, but would save the extra work of maintaining additional scaffold files. Curious to hear your thoughts.

@raginpirate

raginpirate commented Apr 22, 2026

Copy link
Copy Markdown
Contributor

I have no issue with this PR, but I would love to also challenge the fundamental requirement to have all required top-level keys be provided in an unignored example.
We don't want to bloat every snippet with whole complete top level properties, so I feel like many docs will end up leveraging the ignore block 🤔

Can we solve this using a custom key like "...": true or "__rest__": true to ignore evaluating any other required but not present fields in that obj. And perhaps we could display that nicely in the final output docs? {"id": 1234, "...": true} => {"id": 1234, ...}

This can be viewed as non-blocking if we just want to get V0 out, but it could be nice to disallow the "skip" unless you are providing completely custom, non-ucp json 😄

igrigorik added 15 commits May 19, 2026 09:04
  Introduce a validation harness that ensures every JSON example in the spec
  docs is either validated against UCP schemas or explicitly skipped. This
  catches documentation drift — when schemas change, stale examples break CI
  instead of silently misleading readers.

  ## Contract

  Every ```json block requires an annotation on the preceding line:

      <!-- ucp:example schema=shopping/checkout op=read -->
      <!-- ucp:example schema=shopping/checkout path=$.totals op=read -->
      <!-- ucp:example schema=shopping/cart op=create direction=request -->
      <!-- ucp:example skip reason="JSON-RPC transport binding" -->

  Unannotated blocks are hard failures.

  Principle: "if you open it, you own it." When an example opens an object's
  braces, every required field (per the resolved schema for the declared
  op + direction) must be shown or acknowledged. This applies recursively.

  Three-tier value contract for annotated (non-skip) examples:
  - Full value: validated against schema (types, constraints, enums)
  - Required fields ({ ... }, [ ... ], "..."): field must exists but
    values are optional; scaffold fills it for validation.
  - Absent fields only permitted for optional schema fields

  ## Pipeline

  1. Extract — find annotated ```json blocks (handles tab indentation)
  2. Unwrap — strip HTTP headers from request/response examples
  3. Expand — substitute {{ ucp_version }} with valid date
  4. Preprocess — convert bare ... to valid JSON ({ ... } → {"...":"..."})
  5. Coverage — walk resolved schema, verify required fields acknowledged
  6. Strip — remove ellipsis markers
  7. Merge — deep-merge into scaffold (example wins, scaffold fills gaps)
  8. Validate — ucp-schema validate on merged payload
  9. Report — map errors to source file:line

  ## Bugs caught and fixed

  - checkout.md: item missing required `price` field in disclosure example
  - checkout.md: totals array missing required `subtotal` entry
  - checkout-rest.md: business outcome used non-schema `available_quantity`
    field on line_item; replaced with message-based pattern
  - discount.md (4 instances): `quantity` wrongly nested inside `item`
    object instead of on `line_item` — schema requires quantity at line_item
    level
  - discount.md: `price` and `title` shown in create requests where schema
    omits them (request-only fields: only `item.id` is valid)

  ## Current state

  268 blocks across 39 files:
  - 52 validated (REST request/response pairs, core spec examples, error
    responses, extension examples)
  - 216 skipped (transport bindings, schema definitions, configs — future
    work: payload_path= for JSON-RPC unwrap)

  Run: python3 scripts/validate_examples.py --schema-base source/schemas/
Address review feedback (drewolson-google): empty `{}` request bodies
for GET and cancel operations are valid requests, not skip-worthy.

The harness now recognizes `{}` as trivially valid — no coverage or
schema validation needed for empty bodies. Changed annotations from
`skip reason="empty request body"` to proper schema declarations.
Add schema $defs targeting via def= annotation parameter for schemas
that define request/response types in $defs (catalog_search,
catalog_lookup). The harness resolves the schema, extracts the named
$def, and validates against it.

Fix ellipsis handling: [ ... ] and { ... } now preserve field presence
(as [] and {}) instead of stripping the key entirely. Validation errors
at ellipsis-acknowledged paths are suppressed — the coverage check
already verified the field was acknowledged.

Catalog bugs caught and fixed:
- catalog/rest.md: variant missing required description field
- catalog/index.md: product missing description and price_range
- catalog/rest.md: error response wrongly annotated as get_product

14 new validated blocks (catalog REST + index).
Replace blanket "conceptual/profile fragment" with specific reasons:
service endpoint fragment, schema definition, profile structure,
JSON-RPC transport/error format, payment handler advertisement, etc.

Validate the one block that maps cleanly to a schema (version
unsupported error response → error_response).

1 validated, 30 skipped with descriptive reasons.
Validate: complete checkout request (line 1282), version error response
(line 1910). The rest have specific skip reasons:

- 8x JSON-RPC transport (needs payload_path= feature)
- 8x profile documents (ucp metadata nested under "ucp" key, no
  wrapper schema — needs payload_path=$.ucp to extract inner object)
- 3x schema definitions (meta-schemas, not data payloads)
- 2x invalid JSON (contains // comments)
- 6x fragments and declarations
Main has moved 28 commits since branch point. Reconcile in two parts:

1. Annotate 11 new doc blocks introduced upstream:
   - core-concepts.md (2): capability declaration + profile fragment
   - schema-authoring.md (3): cancellation_reason / status / totals
     anti-pattern from the new "extensibility" section
   - identity-linking.md (5): scope metadata, OAuth token response,
     profile documents, incomplete checkout fragment (#354 OAuth rewrite)
   - overview.md (1): attribution fragment (#391)

   All map cleanly to existing skip-reason taxonomy.

2. Fix checkout-rest.md:1298 — upstream rewrote the example in #371
   but dropped required top-level fields. Added "currency" and "links"
   to satisfy the schema. The validator caught the bug; this is exactly
   the drift-prevention this work exists for.

After this commit:
  69 passed, 1 failed, 0 errors, 209 skipped

The single remaining failure (discount.md:336 trailing comma) is fixed
by the next commit (// comment + trailing comma stripping).
Documentation examples occasionally use JSONC conventions — line
comments to annotate fields, trailing commas for diff hygiene. The
content is still meaningful UCP payload; only the syntax is non-strict.

Add a preprocessing step that strips // comments (outside strings) and
trailing commas before json.loads. Pattern-matching, not a full JSONC
parser — the doc corpus is well-behaved enough that the simpler approach
holds. If we ever hit a false positive (e.g. a string containing `//`
or `,]`), swap in a real JSONC tokenizer.

Convert 8 previously-skipped blocks to validated:
  - ap2-mandates.md (2): merchant_authorization checkout response,
    AP2 complete payment request
  - buyer-consent.md (1): consent-bound checkout response
  - checkout-rest.md (4): three update requests + one read response
  - overview.md (2): two complete checkout requests with // annotations

After this commit:
  79 passed, 0 failed, 0 errors, 200 skipped
Four examples used trailing commas before closing braces, which is
invalid RFC 8259 JSON. The validator's previous lenient stripper
masked these — they parse correctly with strict JSON.

  - ap2-mandates.md: card display.description
  - checkout-rest.md: line_items[0].quantity in update example
  - discount.md: item.id in cart create (introduced via #371)
  - overview.md: payment.instruments token, ap2.checkout_mandate

These fixes stand on their own as bug fixes against strict JSON.
A follow-up commit removes the validator's trailing-comma tolerance.
Document the bespoke JSON capability set used in spec examples and
align the validator with that contract. The author-facing guide and
the implementation-facing module docstring describe the same
three-layer model and MUST stay in sync.

Author guide (docs/documentation/schema-authoring.md):
  New "Documenting JSON Examples" section covering annotation grammar,
  supported authoring conveniences (line comments, template variable,
  HTTP envelope, elision markers), explicitly-unsupported features,
  skip-reason taxonomy, and common patterns.

Validator (scripts/validate_examples.py):
  - Replace module docstring with the full three-layer contract.
  - Restructure pipeline into named layer functions:
      reduce_to_canonical_json (Layer 1 -> 2)
      parse_example (Layer 2 boundary)
      process_block (Layer 3 orchestrator)
  - Drop trailing-comma stripping. Was tolerated; now rejected to keep
    wire-format honesty. The four corpus violations were fixed in the
    previous commit.
  - Reject unknown annotation attribute keys (catches typos like
    "shema=" before they become confusing errors).
  - Reject multiple stacked annotations before a fence.
  - Track non-json fence state -- annotations inside any fenced block
    are documentation, not directives, and are now ignored.

  Run: python3 scripts/test_validate_examples.py
   Make the docs validator explicit about the small amount of authoring syntax it
   accepts and route final validation back through `ucp-schema`. `extract=` selects
   the nested UCP payload from a displayed envelope, `target=` splices fragments
   into a scaffold and selects the matching sub-schema, and `def=` selects profile
   or schema variants without requiring reader-facing docs to expose JSON Pointer
   machinery. Full examples no longer need fake empty scaffolds; fragments still
   require a real parent scaffold so the final object under validation is
   meaningful.

   Treat profile documents as first-class protocol artifacts rather than a
   discovery implementation detail. Move the canonical profile schema to
   `source/schemas/profile.json`, remove the duplicate discovery schema, and inline
   the reader-facing profile contract into Overview where profile negotiation is
   already explained. The schema keeps a variant-neutral wrapper with
   business/platform validation definitions, requires the registries needed for
   negotiation, keeps `capabilities` optional, and constrains `signing_keys` to
   public EC JWKs so the profile surface stays aligned with the signature spec.

   Convert profile, capability, MCP/JSON-RPC, and fragment examples to validated
   annotations and repair examples where stricter validation exposed drift. MCP
   text fallbacks now use abbreviated serialized JSON instead of unhelpful prose,
   and remaining skipped blocks are limited to non-UCP payloads, schema-authoring
   examples, or formats that do not get a validation schema in this change.

   Add dependency-free validator contract tests for the annotation grammar,
   extraction, target insertion, ellipsis lowering, template expansion, HTTP body
   unwrapping, implicit full-example scaffolds, and trailing-comma rejection.

   Validated with `python3 scripts/test_validate_examples.py`,
   `python3 scripts/validate_examples.py --schema-base source/schemas/ --audit`,
   `python3 scripts/validate_examples.py --schema-base source/schemas/`,
   `ucp-schema lint source/`, and pre-commit on the changed files.
   Add four narrow envelope schemas under source/schemas/transports/ so
   documentation examples that previously had to skip with reason="JSON-RPC
   transport binding" / "embedded protocol binding" / "A2A transport binding"
   can be machine-validated:

     - jsonrpc.json: JSON-RPC 2.0 request / success response / error response
     - mcp_tool_call.json: UCP's MCP tools/call mapping (params.name,
       params.arguments, result.structuredContent)
     - embedded_message.json: Embedded Protocol envelope and ec.* / ep.cart.*
       method namespaces
     - a2a_message.json: UCP's A2A mapping points (Agent Card extension
       advertisement, Message request wrappers, JSON-RPC responses carrying
       A2A Message results)

   Each schema validates only UCP's mapping into the underlying transport,
   not the transport protocol itself. Operation payloads inside these
   envelopes remain validated by the relevant capability schema.
   Convert ~30 previously skipped transport-binding blocks to validated
   annotations against the new transports/* schemas:

     - cart-mcp, order-mcp JSON-RPC tools/call requests now validate against
       transports/mcp_tool_call def=request
     - embedded-cart, embedded-checkout, embedded-protocol messages validate
       against transports/embedded_message (def=request / response /
       error_response)
     - overview.md JSON-RPC error responses validate against
       transports/jsonrpc def=error_response
     - signatures.md JWK validates against profile.signing_keys via target=;
       JSON-RPC error response against transports/jsonrpc

   Drive-by cleanups in the same files:

     - cart-rest.md, order-rest.md: HTTP-headers-only blocks moved from
       ```json (with skip) to ```http fences
     - cart-rest.md: cancel block converted from skip to real op=cancel
       validation
     - embedded-checkout.md: JWT structure example moved to ```text fence
       (not JSON)
     - embedded-cart, embedded-checkout, embedded-protocol: replace
       `{/* ... */}` JS-comment elision pseudo-syntax with the validator's
       canonical bare-form `{ ... }` / `[ ... ]`
   A2A `message/send` is JSON-RPC 2.0. Prior examples on this page showed
   bare A2A Message payloads, which was internally consistent but didn't
   match what an implementer sees on the wire — and couldn't be validated
   end-to-end against the A2A binding's envelope shape.

   Rewrap every example as a full JSON-RPC request/response:

     jsonrpc: "2.0"
     id: <synthetic 1..N>
     method: "message/send"
     params: { message: { ... } }

   Also replace JS-spread pseudo-syntax (`{...checkoutObject}`,
   `...paymentObject`) with the validator's canonical bare-form elision
   markers (`{ ... }`), and flip skip annotations to validate against
   transports/a2a_message (def=message_request / def=message_response /
   def=agent_card).

   No normative change to the A2A binding contract — just bringing the
   shown payloads in line with what implementations actually exchange.
   Upstream main moved 15 widely-shared types from source/schemas/shopping/types/
   to source/schemas/common/types/ (#436) to enable cross-vertical reuse — most
   recently the loyalty extension's common/loyalty.json depends on the relocated
   amount.json.

   Update every `schema=shopping/types/error_response` annotation on our branch
   to `schema=common/types/error_response` so the validator can resolve the
   schema at its new location. 31 annotations across 14 spec docs; pure path
   swap, no semantic change.
   The loyalty extension (#340) landed 12 ```json blocks in loyalty.md with
   no annotations, which trips the validator's "no unannotated blocks" rule.
   Annotate each block against the right schema; no changes to the example
   content itself.
igrigorik added 2 commits May 19, 2026 09:47
   CI (.github/workflows/docs.yml):
     - Add `scripts/**` to the pull_request paths filter so PRs touching the
       validator itself trigger the workflow.
     - After the existing `ucp-schema lint source/` step, run the full-corpus
       example validator and its unit tests. Both are fast (~13s combined)
       and use tooling already in setup-build-env (Python via uv, ucp-schema
       binary via cargo install). CI is the mandatory backstop.

   Validator API (scripts/validate_examples.py):
     - `--file` now accepts one or more paths (nargs='+') instead of a single
       path. Backward compatible — single-file usage still works — but enables
       incremental validation when a tool (or pre-commit) knows which files
       actually changed.

   pre-commit (.pre-commit-config.yaml):
     Three hooks at the pre-commit and pre-push stages, scoped by file filter
     for fast local feedback:

     - validate-examples-changed: fires on docs/*.md edits. Validates only
       the changed files via `--file <paths>` (~0.6s per file, typically
       <2s total). Catches direct annotation/schema-binding errors at commit
       time.

     - validate-examples-full: fires on source/schemas/* or validator-code
       edits. Runs the full corpus (~12s) because a single schema change
       can invalidate examples across many docs — the cross-file regression
       case the incremental hook intentionally trades off.

     - validate-examples-tests: fires on validator code/test edits. Runs the
       43 unit tests (~0.3s).

     All three use `language: system`, requiring Python (already needed for
     ruff/pre-commit) and the ucp-schema binary on PATH (cargo install
     ucp-schema). Contributors opt in by running:

         pre-commit install --hook-type pre-commit --hook-type pre-push

     Without the install step, contributors get caught at PR time by CI.
     With it, they get early feedback locally. CI remains the mandatory
     gate; pre-commit/pre-push is the opt-in early-warning system.
   The "Running the validator locally" subsection of the schema-authoring
   guide was a three-line code snippet. Now that the validator is enforced
   via pre-commit hooks and CI, authors need to know:

     - How to install the tooling (uv sync + cargo install ucp-schema +
       pre-commit install --hook-type pre-commit --hook-type pre-push).
     - Why both hook types matter: pre-commit only installs the pre-commit
       stage hook by default, but this repo also uses pre-push hooks as a
       safety net against --no-verify bypasses.
     - What runs automatically and where — a table covering the three
       enforcement surfaces (pre-commit changed-files, pre-commit/pre-push
       full-corpus, CI full-corpus) with scope and trigger.
     - Why the full-corpus check fires on schema/validator edits but not on
       pure doc edits — the cross-file regression case the incremental
       hook intentionally trades off.
@igrigorik igrigorik force-pushed the docs/example-validation branch from 7079377 to a04d055 Compare May 19, 2026 16:58
@igrigorik igrigorik added this to the Working Draft milestone May 19, 2026
@igrigorik

Copy link
Copy Markdown
Contributor Author

@wry-ry re, scaffold maintenance + jsonschema custom validator: scaffols are now optional. I deliberately kept ucp-schema as the validation backend so docs validate against the same canonical tool every other UCP consumer uses. Ellipsis skipping is implemented in the Python wrapper.

@raginpirate re, partial-object elision: current contract supports full-object elision ({ ... }) and per-field value elision ("x": "..."), but not partial-object with "show some fields, ignore other required fields in the same object". My preference is to keep strong and simple contract: required is required; if you open an object, make it pass required.

@igrigorik igrigorik requested a review from wry-ry May 19, 2026 17:34

@ihoosain ihoosain left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@igrigorik igrigorik merged commit b5c64c1 into main May 21, 2026
13 checks passed
@igrigorik igrigorik deleted the docs/example-validation branch May 21, 2026 00:30
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label May 21, 2026
igrigorik added a commit that referenced this pull request May 30, 2026
* feat: schema-validated documentation examples

  Introduce a validation harness that ensures every JSON example in the spec
  docs is either validated against UCP schemas or explicitly skipped. This
  catches documentation drift — when schemas change, stale examples break CI
  instead of silently misleading readers.

  ## Contract

  Every ```json block requires an annotation on the preceding line:

      <!-- ucp:example schema=shopping/checkout op=read -->
      <!-- ucp:example schema=shopping/checkout path=$.totals op=read -->
      <!-- ucp:example schema=shopping/cart op=create direction=request -->
      <!-- ucp:example skip reason="JSON-RPC transport binding" -->

  Unannotated blocks are hard failures.

  Principle: "if you open it, you own it." When an example opens an object's
  braces, every required field (per the resolved schema for the declared
  op + direction) must be shown or acknowledged. This applies recursively.

  Three-tier value contract for annotated (non-skip) examples:
  - Full value: validated against schema (types, constraints, enums)
  - Required fields ({ ... }, [ ... ], "..."): field must exists but
    values are optional; scaffold fills it for validation.
  - Absent fields only permitted for optional schema fields

  ## Pipeline

  1. Extract — find annotated ```json blocks (handles tab indentation)
  2. Unwrap — strip HTTP headers from request/response examples
  3. Expand — substitute {{ ucp_version }} with valid date
  4. Preprocess — convert bare ... to valid JSON ({ ... } → {"...":"..."})
  5. Coverage — walk resolved schema, verify required fields acknowledged
  6. Strip — remove ellipsis markers
  7. Merge — deep-merge into scaffold (example wins, scaffold fills gaps)
  8. Validate — ucp-schema validate on merged payload
  9. Report — map errors to source file:line

  ## Bugs caught and fixed

  - checkout.md: item missing required `price` field in disclosure example
  - checkout.md: totals array missing required `subtotal` entry
  - checkout-rest.md: business outcome used non-schema `available_quantity`
    field on line_item; replaced with message-based pattern
  - discount.md (4 instances): `quantity` wrongly nested inside `item`
    object instead of on `line_item` — schema requires quantity at line_item
    level
  - discount.md: `price` and `title` shown in create requests where schema
    omits them (request-only fields: only `item.id` is valid)

  ## Current state

  268 blocks across 39 files:
  - 52 validated (REST request/response pairs, core spec examples, error
    responses, extension examples)
  - 216 skipped (transport bindings, schema definitions, configs — future
    work: payload_path= for JSON-RPC unwrap)

  Run: python3 scripts/validate_examples.py --schema-base source/schemas/

* fix: validate empty request bodies instead of skipping

Address review feedback (drewolson-google): empty `{}` request bodies
for GET and cancel operations are valid requests, not skip-worthy.

The harness now recognizes `{}` as trivially valid — no coverage or
schema validation needed for empty bodies. Changed annotations from
`skip reason="empty request body"` to proper schema declarations.

* feat: validate catalog examples, add def= and ellipsis suppression

Add schema $defs targeting via def= annotation parameter for schemas
that define request/response types in $defs (catalog_search,
catalog_lookup). The harness resolves the schema, extracts the named
$def, and validates against it.

Fix ellipsis handling: [ ... ] and { ... } now preserve field presence
(as [] and {}) instead of stripping the key entirely. Validation errors
at ellipsis-acknowledged paths are suppressed — the coverage check
already verified the field was acknowledged.

Catalog bugs caught and fixed:
- catalog/rest.md: variant missing required description field
- catalog/index.md: product missing description and price_range
- catalog/rest.md: error response wrongly annotated as get_product

14 new validated blocks (catalog REST + index).

* chore: improve overview.md skip annotations with precise reasons

Replace blanket "conceptual/profile fragment" with specific reasons:
service endpoint fragment, schema definition, profile structure,
JSON-RPC transport/error format, payment handler advertisement, etc.

Validate the one block that maps cleanly to a schema (version
unsupported error response → error_response).

1 validated, 30 skipped with descriptive reasons.

* chore: overview.md — validate 2 blocks, precise skip reasons

Validate: complete checkout request (line 1282), version error response
(line 1910). The rest have specific skip reasons:

- 8x JSON-RPC transport (needs payload_path= feature)
- 8x profile documents (ucp metadata nested under "ucp" key, no
  wrapper schema — needs payload_path=$.ucp to extract inner object)
- 3x schema definitions (meta-schemas, not data payloads)
- 2x invalid JSON (contains // comments)
- 6x fragments and declarations

* chore: reconcile annotations after main rebase

Main has moved 28 commits since branch point. Reconcile in two parts:

1. Annotate 11 new doc blocks introduced upstream:
   - core-concepts.md (2): capability declaration + profile fragment
   - schema-authoring.md (3): cancellation_reason / status / totals
     anti-pattern from the new "extensibility" section
   - identity-linking.md (5): scope metadata, OAuth token response,
     profile documents, incomplete checkout fragment (#354 OAuth rewrite)
   - overview.md (1): attribution fragment (#391)

   All map cleanly to existing skip-reason taxonomy.

2. Fix checkout-rest.md:1298 — upstream rewrote the example in #371
   but dropped required top-level fields. Added "currency" and "links"
   to satisfy the schema. The validator caught the bug; this is exactly
   the drift-prevention this work exists for.

After this commit:
  69 passed, 1 failed, 0 errors, 209 skipped

The single remaining failure (discount.md:336 trailing comma) is fixed
by the next commit (// comment + trailing comma stripping).

* feat: strip // comments and trailing commas, validate 8 more blocks

Documentation examples occasionally use JSONC conventions — line
comments to annotate fields, trailing commas for diff hygiene. The
content is still meaningful UCP payload; only the syntax is non-strict.

Add a preprocessing step that strips // comments (outside strings) and
trailing commas before json.loads. Pattern-matching, not a full JSONC
parser — the doc corpus is well-behaved enough that the simpler approach
holds. If we ever hit a false positive (e.g. a string containing `//`
or `,]`), swap in a real JSONC tokenizer.

Convert 8 previously-skipped blocks to validated:
  - ap2-mandates.md (2): merchant_authorization checkout response,
    AP2 complete payment request
  - buyer-consent.md (1): consent-bound checkout response
  - checkout-rest.md (4): three update requests + one read response
  - overview.md (2): two complete checkout requests with // annotations

After this commit:
  79 passed, 0 failed, 0 errors, 200 skipped

* fix(docs): remove trailing commas in JSON examples

Four examples used trailing commas before closing braces, which is
invalid RFC 8259 JSON. The validator's previous lenient stripper
masked these — they parse correctly with strict JSON.

  - ap2-mandates.md: card display.description
  - checkout-rest.md: line_items[0].quantity in update example
  - discount.md: item.id in cart create (introduced via #371)
  - overview.md: payment.instruments token, ap2.checkout_mandate

These fixes stand on their own as bug fixes against strict JSON.
A follow-up commit removes the validator's trailing-comma tolerance.

* feat(validator): formalize JSON example contract

Document the bespoke JSON capability set used in spec examples and
align the validator with that contract. The author-facing guide and
the implementation-facing module docstring describe the same
three-layer model and MUST stay in sync.

Author guide (docs/documentation/schema-authoring.md):
  New "Documenting JSON Examples" section covering annotation grammar,
  supported authoring conveniences (line comments, template variable,
  HTTP envelope, elision markers), explicitly-unsupported features,
  skip-reason taxonomy, and common patterns.

Validator (scripts/validate_examples.py):
  - Replace module docstring with the full three-layer contract.
  - Restructure pipeline into named layer functions:
      reduce_to_canonical_json (Layer 1 -> 2)
      parse_example (Layer 2 boundary)
      process_block (Layer 3 orchestrator)
  - Drop trailing-comma stripping. Was tolerated; now rejected to keep
    wire-format honesty. The four corpus violations were fixed in the
    previous commit.
  - Reject unknown annotation attribute keys (catches typos like
    "shema=" before they become confusing errors).
  - Reject multiple stacked annotations before a fence.
  - Track non-json fence state -- annotations inside any fenced block
    are documentation, not directives, and are now ignored.

  Run: python3 scripts/test_validate_examples.py

* feat(validator): validate profile and docs examples

   Make the docs validator explicit about the small amount of authoring syntax it
   accepts and route final validation back through `ucp-schema`. `extract=` selects
   the nested UCP payload from a displayed envelope, `target=` splices fragments
   into a scaffold and selects the matching sub-schema, and `def=` selects profile
   or schema variants without requiring reader-facing docs to expose JSON Pointer
   machinery. Full examples no longer need fake empty scaffolds; fragments still
   require a real parent scaffold so the final object under validation is
   meaningful.

   Treat profile documents as first-class protocol artifacts rather than a
   discovery implementation detail. Move the canonical profile schema to
   `source/schemas/profile.json`, remove the duplicate discovery schema, and inline
   the reader-facing profile contract into Overview where profile negotiation is
   already explained. The schema keeps a variant-neutral wrapper with
   business/platform validation definitions, requires the registries needed for
   negotiation, keeps `capabilities` optional, and constrains `signing_keys` to
   public EC JWKs so the profile surface stays aligned with the signature spec.

   Convert profile, capability, MCP/JSON-RPC, and fragment examples to validated
   annotations and repair examples where stricter validation exposed drift. MCP
   text fallbacks now use abbreviated serialized JSON instead of unhelpful prose,
   and remaining skipped blocks are limited to non-UCP payloads, schema-authoring
   examples, or formats that do not get a validation schema in this change.

   Add dependency-free validator contract tests for the annotation grammar,
   extraction, target insertion, ellipsis lowering, template expansion, HTTP body
   unwrapping, implicit full-example scaffolds, and trailing-comma rejection.

   Validated with `python3 scripts/test_validate_examples.py`,
   `python3 scripts/validate_examples.py --schema-base source/schemas/ --audit`,
   `python3 scripts/validate_examples.py --schema-base source/schemas/`,
   `ucp-schema lint source/`, and pre-commit on the changed files.

* feat(validator): transport envelope schemas

   Add four narrow envelope schemas under source/schemas/transports/ so
   documentation examples that previously had to skip with reason="JSON-RPC
   transport binding" / "embedded protocol binding" / "A2A transport binding"
   can be machine-validated:

     - jsonrpc.json: JSON-RPC 2.0 request / success response / error response
     - mcp_tool_call.json: UCP's MCP tools/call mapping (params.name,
       params.arguments, result.structuredContent)
     - embedded_message.json: Embedded Protocol envelope and ec.* / ep.cart.*
       method namespaces
     - a2a_message.json: UCP's A2A mapping points (Agent Card extension
       advertisement, Message request wrappers, JSON-RPC responses carrying
       A2A Message results)

   Each schema validates only UCP's mapping into the underlying transport,
   not the transport protocol itself. Operation payloads inside these
   envelopes remain validated by the relevant capability schema.

* docs: validate transport binding examples

   Convert ~30 previously skipped transport-binding blocks to validated
   annotations against the new transports/* schemas:

     - cart-mcp, order-mcp JSON-RPC tools/call requests now validate against
       transports/mcp_tool_call def=request
     - embedded-cart, embedded-checkout, embedded-protocol messages validate
       against transports/embedded_message (def=request / response /
       error_response)
     - overview.md JSON-RPC error responses validate against
       transports/jsonrpc def=error_response
     - signatures.md JWK validates against profile.signing_keys via target=;
       JSON-RPC error response against transports/jsonrpc

   Drive-by cleanups in the same files:

     - cart-rest.md, order-rest.md: HTTP-headers-only blocks moved from
       ```json (with skip) to ```http fences
     - cart-rest.md: cancel block converted from skip to real op=cancel
       validation
     - embedded-checkout.md: JWT structure example moved to ```text fence
       (not JSON)
     - embedded-cart, embedded-checkout, embedded-protocol: replace
       `{/* ... */}` JS-comment elision pseudo-syntax with the validator's
       canonical bare-form `{ ... }` / `[ ... ]`

* a2a: wrap examples in JSON-RPC envelopes

   A2A `message/send` is JSON-RPC 2.0. Prior examples on this page showed
   bare A2A Message payloads, which was internally consistent but didn't
   match what an implementer sees on the wire — and couldn't be validated
   end-to-end against the A2A binding's envelope shape.

   Rewrap every example as a full JSON-RPC request/response:

     jsonrpc: "2.0"
     id: <synthetic 1..N>
     method: "message/send"
     params: { message: { ... } }

   Also replace JS-spread pseudo-syntax (`{...checkoutObject}`,
   `...paymentObject`) with the validator's canonical bare-form elision
   markers (`{ ... }`), and flip skip annotations to validate against
   transports/a2a_message (def=message_request / def=message_response /
   def=agent_card).

   No normative change to the A2A binding contract — just bringing the
   shown payloads in line with what implementations actually exchange.

* point error_response annotations at common/types

   Upstream main moved 15 widely-shared types from source/schemas/shopping/types/
   to source/schemas/common/types/ (#436) to enable cross-vertical reuse — most
   recently the loyalty extension's common/loyalty.json depends on the relocated
   amount.json.

   Update every `schema=shopping/types/error_response` annotation on our branch
   to `schema=common/types/error_response` so the validator can resolve the
   schema at its new location. 31 annotations across 14 spec docs; pure path
   swap, no semantic change.

* validate loyalty extension examples

   The loyalty extension (#340) landed 12 ```json blocks in loyalty.md with
   no annotations, which trips the validator's "no unannotated blocks" rule.
   Annotate each block against the right schema; no changes to the example
   content itself.

* enforce example validation in pre-commit + CI

   CI (.github/workflows/docs.yml):
     - Add `scripts/**` to the pull_request paths filter so PRs touching the
       validator itself trigger the workflow.
     - After the existing `ucp-schema lint source/` step, run the full-corpus
       example validator and its unit tests. Both are fast (~13s combined)
       and use tooling already in setup-build-env (Python via uv, ucp-schema
       binary via cargo install). CI is the mandatory backstop.

   Validator API (scripts/validate_examples.py):
     - `--file` now accepts one or more paths (nargs='+') instead of a single
       path. Backward compatible — single-file usage still works — but enables
       incremental validation when a tool (or pre-commit) knows which files
       actually changed.

   pre-commit (.pre-commit-config.yaml):
     Three hooks at the pre-commit and pre-push stages, scoped by file filter
     for fast local feedback:

     - validate-examples-changed: fires on docs/*.md edits. Validates only
       the changed files via `--file <paths>` (~0.6s per file, typically
       <2s total). Catches direct annotation/schema-binding errors at commit
       time.

     - validate-examples-full: fires on source/schemas/* or validator-code
       edits. Runs the full corpus (~12s) because a single schema change
       can invalidate examples across many docs — the cross-file regression
       case the incremental hook intentionally trades off.

     - validate-examples-tests: fires on validator code/test edits. Runs the
       43 unit tests (~0.3s).

     All three use `language: system`, requiring Python (already needed for
     ruff/pre-commit) and the ucp-schema binary on PATH (cargo install
     ucp-schema). Contributors opt in by running:

         pre-commit install --hook-type pre-commit --hook-type pre-push

     Without the install step, contributors get caught at PR time by CI.
     With it, they get early feedback locally. CI remains the mandatory
     gate; pre-commit/pre-push is the opt-in early-warning system.

* document validator setup, hooks, and CI enforcement

   The "Running the validator locally" subsection of the schema-authoring
   guide was a three-line code snippet. Now that the validator is enforced
   via pre-commit hooks and CI, authors need to know:

     - How to install the tooling (uv sync + cargo install ucp-schema +
       pre-commit install --hook-type pre-commit --hook-type pre-push).
     - Why both hook types matter: pre-commit only installs the pre-commit
       stage hook by default, but this repo also uses pre-push hooks as a
       safety net against --no-verify bypasses.
     - What runs automatically and where — a table covering the three
       enforcement surfaces (pre-commit changed-files, pre-commit/pre-push
       full-corpus, CI full-corpus) with scope and trigger.
     - Why the full-corpus check fires on schema/validator edits but not on
       pure doc edits — the cross-file regression case the incremental
       hook intentionally trades off.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation TC review Ready for TC review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants