From c61f0ea59bcfc77dafe91a94aa45327472e9a4cf Mon Sep 17 00:00:00 2001
From: Matthias Brenninkmeijer
Date: Sat, 9 May 2026 20:16:24 +0200
Subject: [PATCH 1/3] Update gitignore

---
 .gitignore | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/.gitignore b/.gitignore
index 92e83b4..58c5168 100644
--- a/.gitignore
+++ b/.gitignore
@@ -225,3 +225,10 @@ docs/REFACTORING_PLAN.md
 docs/UNTTP_PLUGIN_PLAN.md
 docs/VC_WALLET_ROADMAP.md
 docs/STRATEGIC_ROADMAP.md
+
+# CIRPASS-2 spec-snapshot scratch (verbatim downloads).
+# Phase 0 of docs/plans/CIRPASS_2_MIGRATION.md. The canonical bundled
+# bytes land in src/dppvalidator/vocabularies/data/ontologies/ in
+# Phase 1; the gitignored scratch dir is the operator-audit workspace.
+tools/snapshot/cirpass-2/
+tools/snapshot/manifest-rows.json

From 290e30341795689c3478566a223c70e326ae4217 Mon Sep 17 00:00:00 2001
From: Matthias Brenninkmeijer
Date: Sat, 9 May 2026 23:46:17 +0200
Subject: [PATCH 2/3] feat: add CIRPASS-2 reference structure v1.3.0 and complete UNTP 0.7.0 alignment
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

End-to-end implementation of the CIRPASS-2 migration plan
(docs/plans/CIRPASS_2_MIGRATION.md, Phases 0–9). Adds the second
parallel schema family alongside UNTP DPP, completes the UNTP 0.7.0
alignment, and stages the 0.5.0 Preview release prep.

CIRPASS-2 (Phases 0–7):
- New SchemaFamily.CIRPASS family with v1.3.0 registered as a
  first-class entry. Auto-detection routes via @context and shape
  signatures; --target {auto,untp,cirpass} CLI flag plus DET001
  family-mismatch diagnostic + EXIT_FAMILY_MISMATCH (3) exit code.
- 20 Pydantic models under dppvalidator.models.cirpass.v1_3.* covering
  every required field of the CIRPASS reference schema.
- 11 semantic rules under validators/rules/cirpass_v1_3/ across 4
  prefixes (ACT, REL, SOC, LCA) covering every axiom in the v1.9.x
  EUDPP module specs.
- Cross-family forward + reverse compat shims at compat/ with 5 MAP00X
  warning codes and a 15-row M01-M15 declarative step table.
- CIRPASSJsonLDExporter accepting both native CIRPASS and UNTP-shimmed
  inputs; EUDPP exporter rebased onto v1.9.1 namespaces with
  EUDPP_CONTEXT_URL deprecated via PEP 562.
- migrate --to {untp-0.7,cirpass-1.3} generalises the migrate command
  with --default-language for LocalisedText wrapping.
- 6 EUDPP v1.9.x ontology TTLs vendored, every IRI rebased onto the
  canonical https://w3id.org/eudpp# fragment namespace (ADR 0002).
- Six-code CLI exit surface formalised at module level
  (EXIT_VALID/INVALID/ERROR/FAMILY_MISMATCH/BLOCKING_WARNINGS/IO_ERROR),
  documented at docs/reference/cli/exit-codes.md (ADR 0005).

Plugins (Phase 7):
- Textile v2 profile built-in (--profile textile-v2): 7 rules
  TXT001-TXT007 including TXT006 recycled-content disclosure and
  TXT007 repair-info (both new in v2). The legacy textile-v1 profile
  remains available; both ship as TEXTILE_PROFILES registry entries
  (ADR 0004).
- Tyres GPL plugin under plugins/tyres/ (dppvalidator-tyres==0.1.0,
  Pre-1.0 / Experimental): 4 GDSO declaration models (Birth v0.9,
  Collection v0.1, Retread v0.1, Recycling v0.1) +
  TyreLifecycleHistory aggregate enforcing UUID-chain /
  chronological-order / single-Recycling invariants. 8 TYR-coded
  validators auto-registered via entry-points (ADR 0003).
- Plugin license isolation enforced via tools/check_imports.py
  (AST-walks the core source tree; fails on any import from
  plugins/* — R8 mitigation).

UNTP 0.7.0 alignment (Phase 8 polish + Phase 9 BLOCKER fixes):
- DEFAULT_VERSIONS[UNTP] flipped from 0.6.1 to 0.7.0 (Phase 9.1).
  v0.6.x remains supported via auto-detection and the v0.6 -> v0.7
  upgrade shim (compat/upgrade_0_6_to_0_7.py).
- D1 BLOCKER fix (Phase 9.7): BitstringStatusListEntry.statusListIndex
  is now int | None with ge=0; new before-validator transparently
  coerces numeric strings (whitespace and leading-zero tolerant) for
  v0.6 fixture back-compat. Non-breaking.
- D2 BLOCKER fix (Phase 9.8): PartyRoleEnum acceptance gradient
  documented + new SCHEMA_STRICT_ROLES constant + new advisory rule
  PRT001 surfaces the gap when payloads use one of the 14 wider
  values. New ValidationEngine(strict_role_enum=True) opt-in upgrades
  PRT001 from info to error. CIRPASS reverse-shim mapping fidelity
  preserved. Non-breaking.
- 3-tier alignment guard test (Phase 9.9):
  tests/unit/test_v07_model_schema_alignment.py registers the full
  Phase 8.9 drift baseline (12 tests across strict / drift-watch /
  compat tiers), forcing every future PR widening drift to update the
  registered baseline atomically.
- 3 deprecation surfaces activated for Phase 10 removal (Phase 9.4):
  bare-string SCHEMA_REGISTRY[version] lookup, is_dpp_document()
  alias, EUDPP_CONTEXT_URL legacy URL.

Polish (Phase 8.5-8.6):
- Centralised cross-CLI _load_input helper at cli/_io.py.
- Centralised UNTP-CIRPASS shim helpers at compat/_shared.py and
  rules/cirpass_v1_3/_helpers.py.
- Removed 6 dead pre-Phase-2 detection aliases.
- Authored 2 missing error-doc pages (TXT006, TXT007).

Tests:
- 2525 passed / 36 skipped (+57 new tests vs the 0.4.0 baseline of
  2468).
- Coverage 92.04% (effectively flat).
- 101/101 cross-version regression baseline (Phase 9.11 release-gate
  command).

Documentation:
- New CHANGELOG.md 0.5.0 entry (family-keyed: CIRPASS-2, UNTP,
  Plugins; Bug fixes; Deprecations; Cross-version compatibility;
  Known limitations referencing all 27 Phase 8.9 drift items;
  Migration guide).
- New concept doc cirpass-2-alignment.md, user-facing guide
  migrate-untp-to-cirpass.md, and reference/cirpass/ section
  auto-generated via mkdocstrings.
- Two new ADRs (0004 — textile v2 built-in; 0005 — six-code CLI exit
  surface) plus two refreshed earlier ADRs.
Compatibility constraints honoured throughout (Phase 9 plan addendum): v0.6.0 / v0.6.1 fixtures parse, validate, and upgrade without regression; CIRPASS v1.3.0 round-trips remain bit-stable; mkdocs build --strict, ruff check, ruff format --check, ty check all clean. Co-Authored-By: Claude Opus 4.7 (1M context) --- .gitignore | 6 +- AGENTS.md | 2 +- CHANGELOG.md | 188 + README.md | 68 +- .../0001-cirpass-json-schema-derivation.md | 109 + docs/adr/0002-canonical-eudpp-iri.md | 129 + docs/adr/0003-tyre-license.md | 92 + docs/adr/0004-textile-v2-built-in.md | 87 + docs/adr/0005-cli-exit-codes.md | 93 + docs/adr/README.md | 31 + docs/concepts/cirpass-2-alignment.md | 155 + docs/concepts/cirpass-2-spec-snapshot.md | 142 + docs/concepts/eudpp-1.9-changelog.md | 170 + docs/concepts/untp-cirpass-mapping.md | 208 + docs/errors/PRT001.md | 104 + docs/errors/TXT006.md | 56 + docs/errors/TXT007.md | 50 + docs/guides/migrate-untp-to-cirpass.md | 315 ++ docs/plans/CIRPASS_2_MIGRATION.md | 3434 +++++++++++++++++ docs/plugins/tyres.md | 136 + docs/reference/api/validators.md | 2 +- docs/reference/cirpass/index.md | 149 + docs/reference/cli/exit-codes.md | 80 + mkdocs.yml | 17 +- plugins/tyres/LICENSE | 30 + plugins/tyres/README.md | 56 + plugins/tyres/pyproject.toml | 68 + plugins/tyres/samples/birth.json | 70 + .../tyres/src/dppvalidator_tyres/__init__.py | 46 + .../tyres/src/dppvalidator_tyres/exporters.py | 112 + .../src/dppvalidator_tyres/models/__init__.py | 45 + .../src/dppvalidator_tyres/models/actor.py | 53 + .../src/dppvalidator_tyres/models/birth.py | 200 + .../dppvalidator_tyres/models/collection.py | 78 + .../src/dppvalidator_tyres/models/history.py | 150 + .../dppvalidator_tyres/models/recycling.py | 126 + .../src/dppvalidator_tyres/models/retread.py | 88 + .../dppvalidator_tyres/validators/__init__.py | 69 + .../dppvalidator_tyres/validators/rules.py | 423 ++ scripts/smoke_test.py | 9 +- src/dppvalidator/cli/_io.py | 67 + src/dppvalidator/cli/commands/export.py | 92 
+- src/dppvalidator/cli/commands/migrate.py | 246 +- src/dppvalidator/cli/commands/schema.py | 94 +- src/dppvalidator/cli/commands/validate.py | 143 +- src/dppvalidator/cli/main.py | 31 + src/dppvalidator/compat/__init__.py | 74 +- .../compat/_identifier_schemes.py | 255 ++ src/dppvalidator/compat/_mapping_codes.py | 198 + src/dppvalidator/compat/_shared.py | 81 + src/dppvalidator/compat/_untp_cirpass_map.py | 338 ++ .../compat/cirpass_1_3_to_untp_0_7.py | 739 ++++ .../compat/untp_0_7_to_cirpass_1_3.py | 645 ++++ src/dppvalidator/exporters/__init__.py | 39 +- src/dppvalidator/exporters/cirpass_jsonld.py | 341 ++ src/dppvalidator/exporters/eudpp_jsonld.py | 51 +- src/dppvalidator/models/__init__.py | 15 +- src/dppvalidator/models/cirpass/__init__.py | 16 + .../models/cirpass/v1_3/__init__.py | 88 + src/dppvalidator/models/cirpass/v1_3/actor.py | 168 + .../models/cirpass/v1_3/connector.py | 130 + src/dppvalidator/models/cirpass/v1_3/i18n.py | 88 + src/dppvalidator/models/cirpass/v1_3/lca.py | 141 + .../models/cirpass/v1_3/material.py | 146 + .../models/cirpass/v1_3/passport.py | 130 + .../models/cirpass/v1_3/product.py | 174 + .../models/cirpass/v1_3/substances.py | 188 + .../models/cirpass/v1_3/temporal.py | 90 + src/dppvalidator/models/v0_7/envelope.py | 35 +- src/dppvalidator/models/v0_7/identifiers.py | 50 +- src/dppvalidator/schemas/data/MANIFEST.json | 174 +- .../schemas/data/cirpass-reference-1.3.0.json | 1301 +++++++ src/dppvalidator/schemas/registry.py | 244 +- src/dppvalidator/validators/__init__.py | 20 +- src/dppvalidator/validators/detection.py | 465 ++- src/dppvalidator/validators/engine.py | 140 +- src/dppvalidator/validators/model.py | 113 +- src/dppvalidator/validators/rules/__init__.py | 35 +- .../validators/rules/cirpass_v1_3/__init__.py | 115 + .../validators/rules/cirpass_v1_3/_helpers.py | 31 + .../validators/rules/cirpass_v1_3/actor.py | 269 ++ .../validators/rules/cirpass_v1_3/base.py | 337 ++ .../rules/cirpass_v1_3/connector.py | 158 + 
.../validators/rules/cirpass_v1_3/lca.py | 248 ++ .../rules/cirpass_v1_3/substances.py | 216 ++ .../validators/rules/v0_7/__init__.py | 32 +- .../validators/rules/v0_7/party_role.py | 107 + .../validators/rules/v0_7/textile_v2.py | 500 +++ src/dppvalidator/validators/schema.py | 23 +- src/dppvalidator/validators/semantic.py | 102 +- src/dppvalidator/validators/shacl_cirpass.py | 485 +++ .../data/ontologies/actors_roles_v1.9.1.ttl | 526 +++ .../data/ontologies/connector_v1.9.1.ttl | 173 + .../data/ontologies/eudpp_core_v1.9.1.ttl | 39 + .../data/ontologies/lca_v1.9.4_Maki.ttl | 1652 ++++++++ .../data/ontologies/product_dpp_v1.9.1.ttl | 749 ++++ .../data/ontologies/soc_v1.9.1.ttl | 216 ++ src/dppvalidator/vocabularies/eudpp_actors.py | 78 +- .../vocabularies/eudpp_classes.py | 47 +- src/dppvalidator/vocabularies/eudpp_lca.py | 194 +- .../vocabularies/eudpp_relations.py | 92 +- .../vocabularies/eudpp_substances.py | 23 +- src/dppvalidator/vocabularies/ontology.py | 283 +- .../golden/eudpp_ld_export__untp_v0_7.json | 284 ++ .../cirpass-1.3.0/bad_bcp47_language_tag.json | 15 + .../invalid/cirpass-1.3.0/bad_cas_number.json | 24 + .../effective_period_inverted.json | 19 + .../cirpass-1.3.0/empty_product_name.json | 15 + .../cirpass-1.3.0/mass_fraction_overflow.json | 31 + .../cirpass-1.3.0/missing_dpp_identifier.json | 11 + tests/fixtures/valid/cirpass-1.3.0/full.json | 135 + .../fixtures/valid/cirpass-1.3.0/minimal.json | 20 + .../valid/cirpass-1.3.0/multilingual.json | 40 + tests/integration/shacl/__init__.py | 0 .../shacl/test_per_module_attribution.py | 225 ++ .../integration/test_cirpass_v1_3_pipeline.py | 169 + tests/integration/test_cli_back_compat.py | 276 ++ tests/integration/test_cli_cirpass.py | 363 ++ tests/integration/test_cli_export_matrix.py | 133 + tests/integration/test_cli_workflows.py | 31 +- .../test_cross_family_isolation.py | 184 + tests/integration/test_eudpp_export_v1_9.py | 122 + tests/integration/test_i18n_roundtrip.py | 170 + 
tests/integration/test_real_world_samples.py | 22 +- .../test_round_trip_untp_cirpass.py | 366 ++ tests/integration/test_textile_profiles.py | 265 ++ tests/integration/test_validation_pipeline.py | 30 +- tests/integration/test_version_matrix.py | 76 + tests/plugins/__init__.py | 0 tests/plugins/test_license_isolation.py | 160 + tests/plugins/tyres/__init__.py | 0 tests/plugins/tyres/test_tyres_models.py | 377 ++ tests/plugins/tyres/test_tyres_pipeline.py | 138 + tests/plugins/tyres/test_tyres_validators.py | 466 +++ tests/property/test_property_validators.py | 9 +- tests/property/test_round_trip_invariants.py | 281 ++ tests/unit/test_cirpass_v1_3_rules.py | 900 +++++ tests/unit/test_cli.py | 62 +- tests/unit/test_cli_migrate.py | 9 +- tests/unit/test_codegen_regenerate_enums.py | 185 + tests/unit/test_cold_start_import.py | 107 + tests/unit/test_deep_validation.py | 16 +- tests/unit/test_detection.py | 43 +- tests/unit/test_detection_ambiguity.py | 134 + tests/unit/test_detection_cirpass.py | 240 ++ tests/unit/test_eudpp_classes.py | 8 +- tests/unit/test_eudpp_export.py | 17 +- tests/unit/test_eudpp_export_v07.py | 8 +- tests/unit/test_eudpp_lca.py | 22 +- tests/unit/test_eudpp_relations.py | 2 +- tests/unit/test_eudpp_term_mapping.py | 161 + tests/unit/test_identifier_schemes.py | 151 + tests/unit/test_manifest_integrity.py | 57 + tests/unit/test_mapping_codes.py | 420 ++ tests/unit/test_model_validator.py | 5 +- tests/unit/test_models_cirpass_v1_3.py | 436 +++ tests/unit/test_namespace_canonicality.py | 157 + tests/unit/test_no_version_literals.py | 34 +- tests/unit/test_ontology_alignment.py | 65 +- tests/unit/test_ontology_v07.py | 8 +- tests/unit/test_party_role_gradient.py | 245 ++ tests/unit/test_registry_back_compat.py | 157 + tests/unit/test_samples_classification.py | 41 +- tests/unit/test_schema_dual_mode.py | 6 +- tests/unit/test_schemas.py | 6 +- tests/unit/test_semantic_rules.py | 16 +- tests/unit/test_v07_model_schema_alignment.py | 357 ++ 
tests/unit/test_v07_models.py | 61 + tools/check_imports.py | 187 + tools/codegen/check_drift.py | 124 + tools/codegen/cirpass/README.md | 137 + tools/codegen/cirpass/derive_schema.py | 136 + tools/codegen/cirpass/regenerate_enums.py | 314 ++ tools/snapshot/README.md | 112 + tools/snapshot/cirpass2_artefacts.json | 178 + tools/snapshot/fetch_cirpass.py | 532 +++ 176 files changed, 31131 insertions(+), 708 deletions(-) create mode 100644 docs/adr/0001-cirpass-json-schema-derivation.md create mode 100644 docs/adr/0002-canonical-eudpp-iri.md create mode 100644 docs/adr/0003-tyre-license.md create mode 100644 docs/adr/0004-textile-v2-built-in.md create mode 100644 docs/adr/0005-cli-exit-codes.md create mode 100644 docs/adr/README.md create mode 100644 docs/concepts/cirpass-2-alignment.md create mode 100644 docs/concepts/cirpass-2-spec-snapshot.md create mode 100644 docs/concepts/eudpp-1.9-changelog.md create mode 100644 docs/concepts/untp-cirpass-mapping.md create mode 100644 docs/errors/PRT001.md create mode 100644 docs/errors/TXT006.md create mode 100644 docs/errors/TXT007.md create mode 100644 docs/guides/migrate-untp-to-cirpass.md create mode 100644 docs/plans/CIRPASS_2_MIGRATION.md create mode 100644 docs/plugins/tyres.md create mode 100644 docs/reference/cirpass/index.md create mode 100644 docs/reference/cli/exit-codes.md create mode 100644 plugins/tyres/LICENSE create mode 100644 plugins/tyres/README.md create mode 100644 plugins/tyres/pyproject.toml create mode 100644 plugins/tyres/samples/birth.json create mode 100644 plugins/tyres/src/dppvalidator_tyres/__init__.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/exporters.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/models/__init__.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/models/actor.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/models/birth.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/models/collection.py create mode 100644 
plugins/tyres/src/dppvalidator_tyres/models/history.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/models/recycling.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/models/retread.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/validators/__init__.py create mode 100644 plugins/tyres/src/dppvalidator_tyres/validators/rules.py create mode 100644 src/dppvalidator/cli/_io.py create mode 100644 src/dppvalidator/compat/_identifier_schemes.py create mode 100644 src/dppvalidator/compat/_mapping_codes.py create mode 100644 src/dppvalidator/compat/_shared.py create mode 100644 src/dppvalidator/compat/_untp_cirpass_map.py create mode 100644 src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py create mode 100644 src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py create mode 100644 src/dppvalidator/exporters/cirpass_jsonld.py create mode 100644 src/dppvalidator/models/cirpass/__init__.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/__init__.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/actor.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/connector.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/i18n.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/lca.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/material.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/passport.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/product.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/substances.py create mode 100644 src/dppvalidator/models/cirpass/v1_3/temporal.py create mode 100644 src/dppvalidator/schemas/data/cirpass-reference-1.3.0.json create mode 100644 src/dppvalidator/validators/rules/cirpass_v1_3/__init__.py create mode 100644 src/dppvalidator/validators/rules/cirpass_v1_3/_helpers.py create mode 100644 src/dppvalidator/validators/rules/cirpass_v1_3/actor.py create mode 100644 src/dppvalidator/validators/rules/cirpass_v1_3/base.py create mode 100644 
src/dppvalidator/validators/rules/cirpass_v1_3/connector.py create mode 100644 src/dppvalidator/validators/rules/cirpass_v1_3/lca.py create mode 100644 src/dppvalidator/validators/rules/cirpass_v1_3/substances.py create mode 100644 src/dppvalidator/validators/rules/v0_7/party_role.py create mode 100644 src/dppvalidator/validators/rules/v0_7/textile_v2.py create mode 100644 src/dppvalidator/validators/shacl_cirpass.py create mode 100644 src/dppvalidator/vocabularies/data/ontologies/actors_roles_v1.9.1.ttl create mode 100644 src/dppvalidator/vocabularies/data/ontologies/connector_v1.9.1.ttl create mode 100644 src/dppvalidator/vocabularies/data/ontologies/eudpp_core_v1.9.1.ttl create mode 100644 src/dppvalidator/vocabularies/data/ontologies/lca_v1.9.4_Maki.ttl create mode 100644 src/dppvalidator/vocabularies/data/ontologies/product_dpp_v1.9.1.ttl create mode 100644 src/dppvalidator/vocabularies/data/ontologies/soc_v1.9.1.ttl create mode 100644 tests/fixtures/golden/eudpp_ld_export__untp_v0_7.json create mode 100644 tests/fixtures/invalid/cirpass-1.3.0/bad_bcp47_language_tag.json create mode 100644 tests/fixtures/invalid/cirpass-1.3.0/bad_cas_number.json create mode 100644 tests/fixtures/invalid/cirpass-1.3.0/effective_period_inverted.json create mode 100644 tests/fixtures/invalid/cirpass-1.3.0/empty_product_name.json create mode 100644 tests/fixtures/invalid/cirpass-1.3.0/mass_fraction_overflow.json create mode 100644 tests/fixtures/invalid/cirpass-1.3.0/missing_dpp_identifier.json create mode 100644 tests/fixtures/valid/cirpass-1.3.0/full.json create mode 100644 tests/fixtures/valid/cirpass-1.3.0/minimal.json create mode 100644 tests/fixtures/valid/cirpass-1.3.0/multilingual.json create mode 100644 tests/integration/shacl/__init__.py create mode 100644 tests/integration/shacl/test_per_module_attribution.py create mode 100644 tests/integration/test_cirpass_v1_3_pipeline.py create mode 100644 tests/integration/test_cli_back_compat.py create mode 100644 
tests/integration/test_cli_cirpass.py create mode 100644 tests/integration/test_cli_export_matrix.py create mode 100644 tests/integration/test_cross_family_isolation.py create mode 100644 tests/integration/test_eudpp_export_v1_9.py create mode 100644 tests/integration/test_i18n_roundtrip.py create mode 100644 tests/integration/test_round_trip_untp_cirpass.py create mode 100644 tests/integration/test_textile_profiles.py create mode 100644 tests/plugins/__init__.py create mode 100644 tests/plugins/test_license_isolation.py create mode 100644 tests/plugins/tyres/__init__.py create mode 100644 tests/plugins/tyres/test_tyres_models.py create mode 100644 tests/plugins/tyres/test_tyres_pipeline.py create mode 100644 tests/plugins/tyres/test_tyres_validators.py create mode 100644 tests/property/test_round_trip_invariants.py create mode 100644 tests/unit/test_cirpass_v1_3_rules.py create mode 100644 tests/unit/test_codegen_regenerate_enums.py create mode 100644 tests/unit/test_cold_start_import.py create mode 100644 tests/unit/test_detection_ambiguity.py create mode 100644 tests/unit/test_detection_cirpass.py create mode 100644 tests/unit/test_eudpp_term_mapping.py create mode 100644 tests/unit/test_identifier_schemes.py create mode 100644 tests/unit/test_mapping_codes.py create mode 100644 tests/unit/test_models_cirpass_v1_3.py create mode 100644 tests/unit/test_namespace_canonicality.py create mode 100644 tests/unit/test_party_role_gradient.py create mode 100644 tests/unit/test_registry_back_compat.py create mode 100644 tests/unit/test_v07_model_schema_alignment.py create mode 100644 tools/check_imports.py create mode 100644 tools/codegen/check_drift.py create mode 100644 tools/codegen/cirpass/README.md create mode 100644 tools/codegen/cirpass/derive_schema.py create mode 100644 tools/codegen/cirpass/regenerate_enums.py create mode 100644 tools/snapshot/README.md create mode 100644 tools/snapshot/cirpass2_artefacts.json create mode 100644 tools/snapshot/fetch_cirpass.py 
diff --git a/.gitignore b/.gitignore
index 58c5168..f43c8cc 100644
--- a/.gitignore
+++ b/.gitignore
@@ -227,8 +227,8 @@ docs/VC_WALLET_ROADMAP.md
 docs/STRATEGIC_ROADMAP.md
 
 # CIRPASS-2 spec-snapshot scratch (verbatim downloads).
-# Phase 0 of docs/plans/CIRPASS_2_MIGRATION.md. The canonical bundled
-# bytes land in src/dppvalidator/vocabularies/data/ontologies/ in
-# Phase 1; the gitignored scratch dir is the operator-audit workspace.
+# The canonical bundled bytes land in
+# src/dppvalidator/vocabularies/data/ontologies/; the gitignored
+# scratch dir is the operator-audit workspace.
 tools/snapshot/cirpass-2/
 tools/snapshot/manifest-rows.json
diff --git a/AGENTS.md b/AGENTS.md
index 6a6e093..7574993 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -26,7 +26,7 @@ src/dppvalidator/ # Main package
 │ ├── rules/v0_6/ # Semantic rules — v0.6
 │ ├── rules/v0_7/ # Semantic rules — v0.7
 │ └── …
-├── compat/ # Cross-version compat shims (Phase 4)
+├── compat/ # Cross-version compat shims
 ├── verifier/ # Signature and credential verification
 ├── exporters/ # JSON-LD and EU DPP export formats
 ├── schemas/ # JSON Schema loading + version registry
diff --git a/CHANGELOG.md b/CHANGELOG.md
index f7d9a1c..e43b5d2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,194 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.5.0] - 2026-05-09 (Preview)
+
+This release adds **end-to-end CIRPASS-2 reference structure v1.3.0**
+support alongside UNTP DPP 0.6.x / 0.7.0 and ships the cross-family
+forward / reverse shims, the EUDPP v1.9.x ontology rebase, two pilot
+profiles (Textile v2 built-in, Tyres GPL plugin), and a six-code CLI
+exit surface. The migration plan is at
+[`docs/plans/CIRPASS_2_MIGRATION.md`](docs/plans/CIRPASS_2_MIGRATION.md);
+each phase has its own implementation log there.
+
+**Status: Preview / unstable.** The Pydantic v0.7 model layer has
+documented drift from the upstream JSON Schema (Phase 8.9 catalogue,
+27 tracked items) — see "Known limitations" below. Schema validation
+(Layer 1) is faithful 1:1 to upstream and remains the authoritative
+correctness check.
+
+### CIRPASS-2
+
+- **CIRPASS DPP reference structure v1.3.0** registered as a first-class
+  schema family (`SchemaFamily.CIRPASS`). Detection auto-routes
+  via `@context` and shape signatures; explicit override via
+  `dppvalidator validate --target {auto,untp,cirpass}`. Family
+  mismatch surfaces as `DET001` with exit code `EXIT_FAMILY_MISMATCH`
+  (3).
+- **Cross-family compat shims** at
+  `dppvalidator.compat.{untp_0_7_to_cirpass_1_3, cirpass_1_3_to_untp_0_7}`
+  with 5 `MAP00X` warning codes. Forward-shim coverage of the v0.7
+  PartyRoleEnum is exhaustive; reverse-shim coverage maps EUDPP role
+  IRIs onto the wider 20-value PartyRoleEnum (deliberate to preserve
+  CIRPASS information fidelity).
+- **EUDPP Core / ACTOR / SOC / LCA v1.9.x** ontologies vendored under
+  `src/dppvalidator/vocabularies/data/ontologies/`; 6 fresh
+  manifest entries, every IRI rebased onto the canonical
+  `https://w3id.org/eudpp#` fragment namespace per
+  [ADR 0002](docs/adr/0002-canonical-eudpp-iri.md).
+- **CIRPASS reference Pydantic models** at
+  `dppvalidator.models.cirpass.v1_3.*` (20 classes — Actor,
+  ActorRoleAssignment, ConnectorRelation, Material, LifeCycleAssessment,
+  SubstanceOfConcern, etc.) covering every required field per the
+  reference schema.
+- **CIRPASS rule corpus** at `validators/rules/cirpass_v1_3/` (4
+  prefixes — `ACT`, `REL`, `SOC`, `LCA`) with 11 rules covering every
+  axiom in the v1.9.x EUDPP module specs.
+- **CIRPASS JSON-LD exporter** at `exporters/cirpass_jsonld.py`
+  accepting both native CIRPASS passports and UNTP envelopes
+  (forward-shimmed). EUDPP exporter rebased onto v1.9.1 namespaces
+  (`EUDPP_CONTEXT_URL` deprecated via PEP 562; resolution kept
+  through 0.6.0).
+- **`migrate --to {untp-0.7,cirpass-1.3}`** generalises the migrate
+  command to the cross-family forward shim with a `--default-language`
+  option for LocalisedText wrapping.
+
+### UNTP
+
+- **DEFAULT_VERSIONS[UNTP] flipped from `0.6.1` → `0.7.0`** (Phase 9
+  task 9.1). v0.6.x remains supported via auto-detection and the
+  v0.6 → v0.7 upgrade shim (`compat/upgrade_0_6_to_0_7.py`).
+- **D1 BLOCKER fix** (Phase 9 task 9.7):
+  `BitstringStatusListEntry.statusListIndex` is now `int | None`
+  (was `str | None`); a `before` validator transparently coerces
+  numeric strings (including whitespace and leading zeros) for v0.6
+  fixture back-compat. Non-numeric strings and negative integers
+  are rejected. **Non-breaking** for any v0.6 fixture with
+  numeric-string `statusListIndex`.
+- **D2 BLOCKER fix** (Phase 9 task 9.8): `PartyRoleEnum`
+  Layer-1/Layer-2 contradiction resolved via a documented dual-tier
+  acceptance gradient. Pydantic accepts the wider 20-value set
+  (preserving v0.6 fixture parsing and CIRPASS reverse-shim
+  fidelity); JSON Schema accepts the strict 6 (`owner`, `producer`,
+  `manufacturer`, `processor`, `remanufacturer`, `recycler`). The
+  new advisory rule **PRT001** (severity `info`) surfaces the gap
+  when payloads use one of the 14 wider values, suggesting a
+  canonical schema-allowed counterpart. Pass
+  `ValidationEngine(strict_role_enum=True)` to upgrade PRT001 from
+  `info` to `error`. **Non-breaking** — no enum values removed.
+- **Three-tier alignment guard test**
+  (`tests/unit/test_v07_model_schema_alignment.py`, Phase 9 task
+  9.9): strict-tier asserts D1 + D2 closures; drift-watch tier
+  asserts only registered drift items appear (forces every future
+  PR widening drift to update the baseline atomically); compat tier
+  asserts v0.6 model + CIRPASS reverse-shim invariants.
+
+### Plugins
+
+- **Textile v2 profile** built-in (`--profile textile-v2`): 7 rules
+  (TXT001–TXT007) including TXT006 recycled-content disclosure and
+  TXT007 repair-info (both new in v2). The legacy `textile-v1`
+  profile remains available; both ship as `TEXTILE_PROFILES`
+  registry entries.
+- **Tyres GPL plugin** (`plugins/tyres/`, `dppvalidator-tyres==0.1.0`,
+  marked Pre-1.0 / Experimental): 4 GDSO declaration models (Birth
+  v0.9, Collection v0.1, Retread v0.1, Recycling v0.1) +
+  `TyreLifecycleHistory` aggregate enforcing UUID-chain /
+  chronological-order / single-Recycling invariants. 8 TYR-coded
+  validators auto-registered via entry-points.
+- **Plugin license isolation** enforced via
+  `tools/check_imports.py` — AST-walks the core source tree and
+  fails on any import from `plugins/*` (R8 mitigation; preserves
+  GPL/MIT separation).
+
+### Bug fixes
+
+- D1 (`statusListIndex` int): see UNTP section above.
+- D2 (`PartyRoleEnum` gradient): see UNTP section above.
+
+### Deprecations (activated in 0.5.0; removed in 0.6.0 / Phase 10)
+
+- **Bare-string `SCHEMA_REGISTRY[version]` lookup.** Now emits a
+  `DeprecationWarning` on every `__getitem__` access. Migrate to
+  `SCHEMA_REGISTRY_BY_FAMILY[(family, version)]` or
+  `SchemaRegistry().get_schema(version, family=...)`.
+- **`is_dpp_document()` alias** in `dppvalidator.validators.detection`.
+  Use `looks_like_dpp()` instead.
+- **`EUDPP_CONTEXT_URL`** legacy hub URL constant. Use
+  `EUDPP_CANONICAL_CONTEXT_URL` (the v1.9.1 W3ID fragment namespace)
+  instead. The legacy hub URL stays resolvable through Phase 10.
+
+### Cross-version compatibility
+
+- v0.6.0 / v0.6.1 fixtures parse, validate, and upgrade without
+  regression after every fix landed in this release. The v0.6
+  Pydantic models (`models/v0_6/`) are frozen per the cardinal
+  versioning rule; the upgrade shim transparently handles any
+  shape differences (including the statusListIndex string→int
+  coercion).
+- All CIRPASS v1.3.0 round-trips remain bit-stable through forward
+  and reverse shims; integration suites
+  (`test_round_trip_untp_cirpass.py`, `test_compat_roundtrip.py`,
+  `test_cross_family_isolation.py`, `test_version_matrix.py`) all
+  green.
+
+### Known limitations
+
+The Pydantic v0.7 model layer has documented drift from the upstream
+JSON Schema, catalogued as drift items D3–D27 in
+[Phase 8.9 of the migration plan](docs/plans/CIRPASS_2_MIGRATION.md).
+Schema validation (Layer 1) catches every contract violation; the
+drift is confined to the Python API ergonomics layer and will be
+fully reconciled in 0.6.0 (Phase 10 tasks 10.9–10.15). Specifically:
+
+- **HIGH (Phase 10.10 / 10.11):** required-vs-Optional drift on
+  `Address`, `BitstringStatusListEntry`, `Claim`, `Period`; reverse
+  drift on `BitstringStatusListEntry.id`; missing required fields on
+  `Link.linkName`, `Package.{description, dimensions, materialUsed}`.
+- **MEDIUM (Phase 10.9):** missing optional schema fields on `Party`,
+  `Link`, `Package`, `Period`, `RenderTemplate2024` — preserved via
+  Pydantic `extra="allow"` so round-trip is lossless; Phase 10
+  promotes them to first-class fields.
+- **LOW (Phase 10.13):** `Claim.classification`, `Link.{name, description, relationship}`, `Package.{packageType, weight}`,
+  `RenderTemplate2024.id` — model-only fields not in schema; Phase 10
+  catalogues each.
+- **FORMAT (Phase 10.12):** 12 schema sites with `format: uri` and
+  1 site with `format: byte` typed as plain `str`; Phase 10 promotes
+  to typed Pydantic annotations.
+- **CIRPASS (Phase 10.15):** field-level deep diff for the v1.3
+  model layer not yet performed; class counts match (20 vs 20).
+
+### Migration guide
+
+- **Bare-string `SCHEMA_REGISTRY` users:**
+
+  ```python
+  # Before (deprecated, emits warning):
+  from dppvalidator.schemas.registry import SCHEMA_REGISTRY
+
+  schema = SCHEMA_REGISTRY["0.7.0"]
+
+  # After:
+  from dppvalidator.schemas.registry import SCHEMA_REGISTRY_BY_FAMILY, SchemaFamily
+
+  schema = SCHEMA_REGISTRY_BY_FAMILY[(SchemaFamily.UNTP, "0.7.0")]
+  ```
+
+- **`is_dpp_document` callers:** replace with `looks_like_dpp`
+  (identical return value).
+
+- **Strict PartyRole enforcement:** opt in via
+  `ValidationEngine(strict_role_enum=True)` to upgrade PRT001 from
+  `info` to `error`. The 14 wider values (`importer`, `distributor`,
+  `retailer`, `logisticsProvider`, `operator`, `serviceProvider`,
+  `inspector`, `certifier`, `regulator`, `carrier`, `consignor`,
+  `consignee`, `exporter`, `brandOwner`) remain valid Python enum
+  members for back-compat.
+
+- **`statusListIndex`:** numeric-string values (`"5"`) continue to
+  parse via the new before-validator and round-trip as `int`. No
+  payload-side migration needed.
+
 ## [0.4.0] - 2026-05-08
 
 This release adds first-class support for **UNTP DPP 0.7.0** alongside
diff --git a/README.md b/README.md
index 2e23b75..18832ab 100644
--- a/README.md
+++ b/README.md
@@ -175,34 +175,78 @@ jsonld_output = exporter.export(passport)
 # Ready for W3C Verifiable Credentials ecosystem
 ```
 
-## Supported versions
+## Supported specs
 
-dppvalidator supports both UNTP DPP wire formats in the same release.
-The version is auto-detected from the payload's `@context` /
-`$schema` URLs; pin explicitly with `--schema-version` (CLI) or
-`schema_version=` (Python).
+dppvalidator validates **two parallel families** in the same
+release: UNTP DPP (UN/CEFACT Verifiable Credential format) and the
+CIRPASS DPP reference structure (CIRPASS-2 hierarchical message).
+Family + version are auto-detected from the payload's `@context` / +shape; pin explicitly with `--target` (family) and `--schema-version` +(version), or `target=` / `schema_version=` in Python. -| UNTP DPP | Status | Default? | Wire shape | +### UNTP DPP + +| Version | Status | Default? | Wire shape | | --------- | ------------------ | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **0.6.0** | Supported (legacy) | no | `credentialSubject` is `ProductPassport` wrapping `Product`. | | **0.6.1** | Default | **yes** | Same shape as 0.6.0; current `DEFAULT_SCHEMA_VERSION`. | | **0.7.0** | Fully supported | no | `credentialSubject` IS the `Product` directly. New required fields: `name` (envelope), `idScheme`, `idGranularity`, `productCategory`, `producedAtFacility`, `countryOfProduction`. | +### CIRPASS DPP reference structure + +| Version | Status | Default? | Wire shape | +| --------- | --------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **1.3.0** | Fully supported | **yes** | Hierarchical message: root carries `dppIdentifier`, `product`, `issuedAt`, `effectivePeriod`, `relatedActors`, `composition`, `substancesOfConcern`, `lca`, `connectorRelations`. | + +EUDPP modules bundled: P_DPP / ACTOR / SOC / CON v1.9.1, LCA +v1.9.4-Maki. Per-module SHACL pass with attributed source (Phase 4). 
+ -A compat shim upgrades v0.6.x payloads to v0.7.0 shape: +### Migration shims + +Two shims project between families and across UNTP versions: ```bash +# UNTP 0.6 → 0.7 (intra-family upgrade) dppvalidator migrate passport-v06.json -o passport-v07.json -dppvalidator validate passport-v06.json --upgrade-from 0.6.1 --schema-version 0.7.0 + +# UNTP 0.7 → CIRPASS reference structure 1.3 (cross-family forward) +dppvalidator migrate passport-v07.json --to cirpass-1.3 -o cirpass.json --accept-warnings + +# CIRPASS 1.3 → UNTP 0.7 (cross-family reverse) +dppvalidator migrate cirpass.json --to untp-0.7 -o untp.json --accept-warnings ``` -The full version-handling story is documented in -[`docs/concepts/untp-versions.md`](docs/concepts/untp-versions.md); -the field rename table and warning codes are in -[`docs/guides/migration-0-6-to-0-7.md`](docs/guides/migration-0-6-to-0-7.md). +### Pilot profiles + plugins + + + +| Pilot | Activation | Source | +| ----------------------------------------------- | -------------------------------- | ------------------------------------------------------------------- | +| Textile DPP v1 (legacy) | `--profile textile-v1` | built-in | +| Textile DPP v2 (MVP 2025-12-04) | `--profile textile-v2` | built-in | +| Tyres (GDSO Birth/Collection/Retread/Recycling) | `pip install dppvalidator-tyres` | [`plugins/tyres/`](plugins/tyres) — Pre-1.0 / Experimental, GPL-3.0 | + + + +### Reading guide + + + +| You want… | See | +| --------------------------------- | ---------------------------------------------------------------------------------- | +| Big-picture orientation | [`docs/concepts/cirpass-2-alignment.md`](docs/concepts/cirpass-2-alignment.md) | +| Migrate a UNTP fixture to CIRPASS | [`docs/guides/migrate-untp-to-cirpass.md`](docs/guides/migrate-untp-to-cirpass.md) | +| UNTP version handling | [`docs/concepts/untp-versions.md`](docs/concepts/untp-versions.md) | +| 0.6 → 0.7 upgrade | 
[`docs/guides/migration-0-6-to-0-7.md`](docs/guides/migration-0-6-to-0-7.md) | +| Field-by-field UNTP↔CIRPASS map | [`docs/concepts/untp-cirpass-mapping.md`](docs/concepts/untp-cirpass-mapping.md) | +| CLI exit codes | [`docs/reference/cli/exit-codes.md`](docs/reference/cli/exit-codes.md) | +| CIRPASS Pydantic API | [`docs/reference/cirpass/index.md`](docs/reference/cirpass/index.md) | + + ## Features diff --git a/docs/adr/0001-cirpass-json-schema-derivation.md b/docs/adr/0001-cirpass-json-schema-derivation.md new file mode 100644 index 0000000..8055ff1 --- /dev/null +++ b/docs/adr/0001-cirpass-json-schema-derivation.md @@ -0,0 +1,109 @@ +# ADR 0001 — CIRPASS reference-structure JSON Schema is *derived* from the hub's tree-view export + +**Status:** Accepted +**Date:** 2026-05-08 +**Deciders:** dppvalidator maintainers (drafted during CIRPASS-2 migration planning) +**Migration-plan label:** D-0.1 +**Related phases:** Phase 0 (snapshot), Phase 3 (CIRPASS models) + +## Context + +The dppvalidator pipeline opens with a JSON Schema validation pass +(`validators/schema.py`). For UNTP DPP, the schema is downloaded from +UN/CEFACT, vendored under `src/dppvalidator/schemas/data/`, and pinned +by SHA-256 in `MANIFEST.json`. Migrating to CIRPASS-2 raises the +question: *what JSON Schema do we use for the CIRPASS DPP reference +structure v1.3.0?* + +A static-HTML scan of +(2026-05-08) shows the hub publishes: + +- ~80 ontology versions (`OntologyVersion_` GUIDs, exported as + TTL). +- ~2 JSON Schema versions (`JsonSchemaVersion_` / + `JsonSchemaSpecVersion_` GUIDs). Both belong to the Battery Pass + project. +- An unspecified number of "message" versions surfaced in the UI as + tree views; `Tree view` and `Export schema` controls produce + tree-shaped JSON, not a JSON Schema. + +The CIRPASS DPP reference structure v1.3.0 is a *message*, not a JSON +Schema. The hub does not publish a JSON Schema for it. + +We need a JSON Schema to: + +1. 
Run the existing schema-first validation pass without per-family + special-casing. +1. Drive Pydantic model generation in Phase 3 (or at least cross-check + hand-written models against an authoritative shape). +1. Pin integrity (SHA-256) so we detect upstream tree-view drift. + +## Decision + +Derive the CIRPASS reference-structure JSON Schema from the hub's +tree-view export, programmatically, in CI-checkable code. + +Specifically: + +- A generator script lives at `tools/codegen/cirpass/derive_schema.py`, + not under `src/`. Generated bytes are committed to + `src/dppvalidator/schemas/data/cirpass-reference-1.3.0.json`. +- The committed schema carries a `# generated-from: @` + banner so a future reader can re-derive it. +- A drift gate at `tools/codegen/check_drift.py` re-runs the generator + on every CI build and `git diff --exit-code`s the result. Drift fails + the build (mitigates R14 in the migration plan). +- The schema is registered in `SCHEMA_REGISTRY` as + `(SchemaFamily.CIRPASS, "1.3.0")` per Phase 2 task 2.5. + +## Consequences + +**Positive** + +- Schema-first validation works uniformly across UNTP and CIRPASS + families. No per-family branch in `validators/schema.py`. +- Drift detection is automatic; we cannot ship a stale schema by + accident. +- Phase 3 has an authoritative shape to validate Pydantic models + against (codegen reciprocity). + +**Negative** + +- The derived schema may diverge from the hub's intent if the + derivation logic mishandles a tree-view construct. Tests must + include round-trip and example-instance checks against the official + examples on the hub. +- The generator becomes a maintenance surface that must be kept in + sync with the tree-view export format if the hub changes it. Drift + is detected but resolution still costs engineer-time. +- We are subtly authoring spec-derived artefacts; if CIRPASS-2 later + publishes a canonical JSON Schema, our derived schema diverges and we + must transition. 
This is acceptable because the transition path is + obvious: replace derivation with vendoring, drop the generator, keep + the registry entry. + +## Alternatives considered + +- **Hand-author a JSON Schema.** Rejected: brittle; no audit trail + back to the spec; high maintenance cost on every CIRPASS minor. +- **Skip JSON Schema entirely** and rely solely on Pydantic + SHACL. + Rejected: weakens schema-first validation; complicates the engine + pipeline (per-family branching); makes `dppvalidator validate` + inconsistent across families (some payloads pass schema check, others + silently skip it). +- **Vendor the tree-view export as the schema artefact.** Rejected: + the tree-view is not JSON Schema; jsonschema-validate would fail. + +## Validation hooks + +- `tools/codegen/check_drift.py` — CI gate, runs on every build. +- `tests/integration/test_cirpass_v1_3_pipeline.py` — full pipeline + on golden fixtures (Phase 4 deliverable). +- `tests/unit/test_models_cirpass_v1_3.py` — per-class invariants + (Phase 3 deliverable). 
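
The drift gate described in the Decision can be pictured as a hash comparison between the committed schema bytes and a fresh generator run. The sketch below is illustrative only: `lf_sha256`, `has_drifted`, and `toy_generator` are hypothetical names, not the contents of `tools/codegen/check_drift.py`, and the toy schema stands in for the real derived one.

```python
import hashlib
import json
from typing import Callable


def lf_sha256(data: bytes) -> str:
    """Return the SHA-256 hex digest of the bytes after CRLF -> LF normalisation."""
    return hashlib.sha256(data.replace(b"\r\n", b"\n")).hexdigest()


def has_drifted(committed: bytes, regenerate: Callable[[], bytes]) -> bool:
    """True when a fresh generator run no longer matches the committed bytes."""
    return lf_sha256(committed) != lf_sha256(regenerate())


def toy_generator() -> bytes:
    """Hypothetical stand-in for the real generator: emits deterministic JSON."""
    schema = {"type": "object", "required": ["dppIdentifier"]}
    # sort_keys + fixed indent keep the output byte-stable across runs.
    return (json.dumps(schema, indent=2, sort_keys=True) + "\n").encode("utf-8")
```

A CI job built along these lines would exit non-zero whenever `has_drifted` returns `True`; the ADR's actual gate uses `git diff --exit-code` for the same purpose.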
+ +## References + +- [Migration plan §1.3 D-0.1](../plans/CIRPASS_2_MIGRATION.md) +- [Spec snapshot doc](../concepts/cirpass-2-spec-snapshot.md) +- DPP Vocabulary Hub: diff --git a/docs/adr/0002-canonical-eudpp-iri.md b/docs/adr/0002-canonical-eudpp-iri.md new file mode 100644 index 0000000..d80b0fb --- /dev/null +++ b/docs/adr/0002-canonical-eudpp-iri.md @@ -0,0 +1,129 @@ +# ADR 0002 — EUDPP IRIs rebase to canonical `https://w3id.org/eudpp/` + +**Status:** Accepted +**Date:** 2026-05-08 +**Deciders:** dppvalidator maintainers (drafted during CIRPASS-2 migration planning) +**Migration-plan label:** D-0.3 +**Related phases:** Phase 0 (verification), Phase 1 (rebase) + +## Context + +The EUDPP namespace constants in +[`src/dppvalidator/vocabularies/ontology.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/vocabularies/ontology.py) +bind EUDPP class IRIs to the *publishing host* of each module: + +```python +class EUDPPNamespace(str, Enum): + EUDPP = "http://dpp.taltech.ee/EUDPP#" # TalTech publishes P_DPP / SOC / ACTOR / CON / CORE + LCA = "http://dpp.cea.fr/EUDPP/LCA#" # CEA France publishes LCA + ... +``` + +These IRIs are accurate as *origin URLs* for the historic v1.7.1 / v2.0 +TTL bytes — the bytes literally lived under those hosts. + +The CIRPASS-2 v1.9.1 / v1.9.4.Maki release on the DPP Vocabulary Hub +declares a different identifier basis. The CORE umbrella module's +metadata (per , 2026-05-08) +states: + +> EUDPP CORE ontology maps the ontology modules by importing them. +> The imports are from `https://w3id.org/eudpp`. + +`https://w3id.org/eudpp` is a *permanent identifier* (W3ID) that +redirects to whichever publisher hosts the bytes today. 
As of v1.9.1, +the canonical IRIs of the EUDPP modules are: + +- `https://w3id.org/eudpp/` (CORE) +- `https://w3id.org/eudpp/p_dpp/` +- `https://w3id.org/eudpp/soc/` +- `https://w3id.org/eudpp/lca/` +- `https://w3id.org/eudpp/actor/` +- `https://w3id.org/eudpp/con/` + +Continuing to emit `dpp.taltech.ee` / `dpp.cea.fr` IRIs from +`EUDPPJsonLDExporter` produces JSON-LD that downstream consumers cannot +dereference: the predicates point at non-canonical hosts. Linked-data +unification breaks. + +## Decision + +All EUDPP namespace bindings rebase onto the canonical +`https://w3id.org/eudpp/` prefix in Phase 1. + +Specifically: + +- `EUDPPNamespace.EUDPP` → `"https://w3id.org/eudpp/"`. +- `EUDPPNamespace.LCA` → `"https://w3id.org/eudpp/lca/"`. +- New per-module bindings added: `P_DPP`, `SOC`, `ACTOR`, `CON`. +- All 170 rows of `TERM_MAPPINGS` rebase predicate IRIs. +- The bundled `eudpp-context-v1.9.1.jsonld` carries the canonical IRIs. +- The legacy `CIRPASSNamespace = EUDPPNamespace` alias at + [`vocabularies/ontology.py:59`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/vocabularies/ontology.py) + is **deleted**: it conflated *messages* (CIRPASS) with *axioms* + (EUDPP). Phase 3 introduces a distinct + `CIRPASSMessageNamespace` if needed for the message-tree shape. +- Old IRIs remain registered in `exporters/contexts.py` for one release + with a `DeprecationWarning`; removed in Phase 10. + +A precondition gate runs in Phase 0: +`python tools/snapshot/fetch_cirpass.py --verify-canonical` HEAD-pings +each canonical IRI. Non-zero exit (R12 in the plan) blocks Phase 1 +until the IRIs resolve, with two escalation paths: + +1. The IRIs are not yet wired up by W3ID; coordinate upstream and + wait. Phase 1 is paused. +1. The IRIs are intentionally different from what the spec page + states; update this ADR with the corrected basis and re-run the + gate. + +## Consequences + +**Positive** + +- EUDPP-LD output is canonically dereferenceable. 
+- `MANIFEST.json` rows can carry a meaningful `canonical_iri` field + separately from the (mutable) host URL the bytes were vendored from. +- Future re-publishings (TalTech → some new publisher; CEA → some new + publisher) leave canonical IRIs unchanged. + +**Negative** + +- This is a coordinated rewrite, not a version bump. Phase 1's scope + is larger than "refresh the TTLs"; it includes the IRI rebase across + 170 mapping rows, 5 enum modules, and the JSON-LD context. +- Downstream consumers that pinned the old `taltech.ee` / `cea.fr` + IRIs break. We mitigate with a one-release deprecation window + (`0.5.0` → `0.6.0`) and call out the change in + [`docs/concepts/eudpp-1.9-changelog.md`](../concepts/eudpp-1.9-changelog.md). +- W3ID is a redirector, not a host. If the redirector is briefly down, + consumers that follow the IRI lose the dereferencing path. This is a + reliability concern for downstream code, not for our emission. + +## Alternatives considered + +- **Keep per-publisher IRIs.** Rejected: non-canonical; the spec + declares a different basis; LD consumers fail. +- **Dual-emit both IRI sets** (legacy + canonical). Rejected: an LD + document with two predicates for the same property is semantically + ambiguous. +- **Defer the rebase to a later phase**, ship v1.9.1 axioms with old + IRIs first. Rejected: produces a broken-by-design intermediate + release; users would update and immediately get non-canonical + emissions. + +## Validation hooks + +- `tools/snapshot/fetch_cirpass.py --verify-canonical` — Phase 0 task 0.4. +- `tests/unit/test_namespace_canonicality.py` (Phase 1) — every emitted + EUDPP IRI starts with `https://w3id.org/eudpp/`; no `taltech.ee` / + `cea.fr` references remain in the codebase. +- `tests/integration/test_eudpp_export_v1_9.py` (Phase 1) — golden-diff + audit of EUDPP-LD output for the canonical v0.7 fixture. 
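
The canonicality check listed under Validation hooks could look roughly like the sketch below: walk an exported JSON-LD document and report every string that still mentions a legacy publisher host. This is an assumption-laden illustration, not the content of `tests/unit/test_namespace_canonicality.py`; `iter_strings` and `legacy_iris` are hypothetical helpers, and the legacy host names are taken from the namespace constants quoted in the Context section.

```python
from typing import Any, Iterator

CANONICAL_PREFIX = "https://w3id.org/eudpp/"
# Legacy publisher hosts named in the Context section of this ADR.
LEGACY_HOSTS = ("dpp.taltech.ee", "dpp.cea.fr")


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string key and value found anywhere in a JSON-like tree."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def legacy_iris(document: Any) -> list[str]:
    """Return the sorted, de-duplicated strings that reference a legacy host."""
    return sorted(
        {s for s in iter_strings(document) if any(host in s for host in LEGACY_HOSTS)}
    )
```

A test in this style would assert that `legacy_iris(exported_document)` is empty for every golden fixture.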
+ +## References + +- [Migration plan §1.3 D-0.3](../plans/CIRPASS_2_MIGRATION.md) +- [Spec snapshot doc](../concepts/cirpass-2-spec-snapshot.md) +- W3ID: +- DPP Vocabulary Hub: diff --git a/docs/adr/0003-tyre-license.md b/docs/adr/0003-tyre-license.md new file mode 100644 index 0000000..c481b75 --- /dev/null +++ b/docs/adr/0003-tyre-license.md @@ -0,0 +1,92 @@ +# ADR 0003 — GDSO Tyre data-model license — provisional GPL-3.0 + +**Status:** Proposed (pending OA-1 closure) +**Date:** 2026-05-08 +**Deciders:** dppvalidator maintainers (drafted during CIRPASS-2 migration planning) +**Migration-plan label:** OA-1 +**Related phases:** Phase 0 (license confirmation), Phase 7 (tyres plugin scaffolding) + +## Context + +Phase 7 of the CIRPASS-2 migration plan scaffolds a new `plugins/tyres/` +package mirroring the GDSO Ambassador Data Models v1 and the four tyre +declarations (Birth v0.9, Collection v0.1, Retread v0.1, Recycling v0.1) +plus the Tyre Lifecycle History v1 wrapper, all hosted on the DPP +Vocabulary Hub under the *CIRPASS-2 Pilot: Tyre Digital Product +Passport* group. + +The license under which GDSO publishes these artefacts is *not stated +on the spec listing page* as of 2026-05-08. The CIRPASS-2 Pilot group +is in a separate hub project, possibly with restricted access policy. +Phase 7 cannot ship without a known license: + +- The plugin's own `pyproject.toml` declares a `license` field. +- The plugin's `LICENSE` file must be present. +- The license must be compatible with the plugin sitting alongside + `plugins/textiles/` (GPL-3.0-or-later) and the MIT-licensed core, + per [`.claude/rules/plugin-licenses.md`](https://github.com/artiso-ai/dppvalidator/blob/main/.claude/rules/plugin-licenses.md). + +## Decision + +Until the license is explicitly confirmed, scaffold the plugin under +**GPL-3.0-or-later**, mirroring `plugins/textiles/`. Promote this ADR +from `Proposed` to `Accepted` once OA-1 closes and the actual license +is recorded. 
+ +If OA-1 confirms a *different* license (e.g. the GDSO data is more +permissively licensed, or restricted to project members), this ADR is +*Superseded* by a new ADR carrying the correct license; the plugin's +`pyproject.toml`, `LICENSE`, and any module headers update in lockstep. + +## Consequences + +**Positive** + +- Phase 7 can scaffold against a concrete default without waiting on + the human-review gate. Code structure (entry-points, `pyproject.toml` + shape, `LICENSE` placement) is fully workable. +- The default is the most-restrictive realistic choice, so no + permissive-license expectations leak into either core or downstream + consumers prematurely. +- Plugin license isolation is enforced by the Phase 7 task 7.9 import + graph gate (`tools/check_imports.py`); a license correction requires + no import changes. + +**Negative** + +- A subsequent license correction means a `git mv` of the plugin's + `LICENSE` file and a `pyproject.toml` field update — still cheap, but + not free. +- If GDSO publishes under a *less* permissive license than GPL-3.0 + (e.g. CC-BY-NC, with field-of-use restrictions), the entire + scaffolding would need to be reconsidered: a non-OSI-approved license + excludes the plugin from PyPI under the standard Trove classifiers. + This case is the explicit blocker that OA-1 must resolve before + Phase 7 ships. + +## Alternatives considered + +- **Block Phase 7 until OA-1 closes.** Rejected: stalls the work for + a license-confirmation step that is independent of the engineering. + The default-and-correct-later path keeps both moving. +- **Default to MIT** (matching core). Rejected: GDSO data is more + likely upstream-restricted than upstream-permissive given the + pilot-project context. MIT default gives consumers a wrong impression + of permissiveness. +- **Default to "license: TBD" placeholder.** Rejected: PyPI rejects + packages with no license; CI license-classifier checks fail. 
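
The import-graph gate mentioned under Consequences keeps the license boundary mechanical rather than a matter of convention. A minimal sketch of such a check follows, assuming the plugins are importable under the top-level package names in `FORBIDDEN_ROOTS`; those names and the `forbidden_imports` helper are hypothetical and do not describe the real `tools/check_imports.py`.

```python
import ast

# Hypothetical top-level package names for the GPL-licensed plugins.
FORBIDDEN_ROOTS = {"dppvalidator_tyres", "dppvalidator_textiles"}


def forbidden_imports(source: str) -> list[str]:
    """Return the sorted module names in `source` that import from a plugin package."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute `from x.y import z`; relative imports stay inside core.
            names = [node.module]
        else:
            continue
        found.extend(n for n in names if n.split(".")[0] in FORBIDDEN_ROOTS)
    return sorted(found)
```

Run over every file in the core source tree, a non-empty result would fail the build regardless of which license OA-1 eventually confirms.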
+ +## Validation hooks + +- OA-1 (open action item, Phase 0 close) — owner: human reviewer. +- Phase 7 task 7.4 — `plugins/tyres/LICENSE` materialises the chosen + license. +- Phase 7 task 7.9 — `tools/check_imports.py` gate enforces license + isolation regardless of the chosen license. + +## References + +- [Migration plan §1.4 OA-1](../plans/CIRPASS_2_MIGRATION.md) +- [`.claude/rules/plugin-licenses.md`](https://github.com/artiso-ai/dppvalidator/blob/main/.claude/rules/plugin-licenses.md) +- DPP Vocabulary Hub — Tyre DPP Playground group: +- GDSO project: confirm via vocab-hub Pilot project metadata. diff --git a/docs/adr/0004-textile-v2-built-in.md b/docs/adr/0004-textile-v2-built-in.md new file mode 100644 index 0000000..e998161 --- /dev/null +++ b/docs/adr/0004-textile-v2-built-in.md @@ -0,0 +1,87 @@ +# ADR 0004 — Textile v2 ships as a built-in profile + +**Status:** Accepted (Phase 7, 2026-05-08). + +## Context + +The Phase 7 task list asked for two pilot refreshes: + +1. The textile pilot — add MVP Textile DPP v2 (2025-12-04) rules. +1. The tyres pilot — scaffold a brand-new GDSO-aligned plugin. + +The plan was originally framed in terms of *plugins* for both +(`plugins/textiles/` for v2 textile rules; `plugins/tyres/` for +GDSO declarations). During implementation it became clear that +the two pilots have different licensing and packaging +requirements: + +- **Tyres** tracks GDSO declarations whose interpretation we want + to publish under a copyleft licence (GPL-3.0-or-later) so + derived rule packs stay in the open-source commons. The plugin + is a separate distribution package because its license boundary + must be explicit. +- **Textiles** has no equivalent upstream-licensing constraint — + the rules are interpretations of EU ESPR Annex requirements and + the JRC preparatory study. The textile pilot already had + built-in rules at + `src/dppvalidator/validators/rules/v0_X/textile.py`; the v2 + rules are an *additive* upgrade to the same interpretation. 
+ +## Decision + +Ship Textile v2 as a **built-in profile**, not an out-of-tree +plugin: + +- New module + [`src/dppvalidator/validators/rules/v0_7/textile_v2.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/rules/v0_7/textile_v2.py) + carries the v2 rule pack alongside the existing v1 module. +- A profile registry + (`dppvalidator.validators.rules.TEXTILE_PROFILES`) keys both + packs (`textile-v1` / `textile-v2`). +- The CLI exposes `validate --profile {textile-v1,textile-v2}` to + toggle between them. +- The two packs are *alternatives* — only one runs against any + given payload. The validator's profile dispatch *replaces* the + v1 textile rules with the chosen pack rather than running both. + +The tyres pilot stays as a plugin package +([`plugins/tyres/`](https://github.com/artiso-ai/dppvalidator/tree/main/plugins/tyres)) +with GPL-3.0-or-later licensing, per +[ADR 0003](0003-tyre-license.md). + +## Consequences + +**Pros** + +- Single dependency: textile v2 ships with the core, no + additional install step. +- Profile dispatch is uniform: any future built-in pilot pack + (e.g. `electronics-v1`) can join the same registry without a + separate plugin scaffold. +- The rule code lives next to the existing v1 textile module — + shared helpers (`TEXTILE_HS_CHAPTERS`, fibre code lookups) + don't have to cross a package boundary. + +**Cons** + +- License-clarity advantage of out-of-tree plugins is lost. + Mitigation: the textile rules don't carry a copyleft upstream + constraint, so this isn't a real concern. +- Adds 7 rule classes to the core code-size footprint. Mitigation: + the rule classes are stateless and lazy-loaded only when the + profile is set (no import-time cost when no profile is in use). + +## Reversal cost + +Low. 
If a future textile-rule licensing concern emerges, the v2 +pack can be lifted into a `plugins/textiles/` plugin without +breaking the public API: the registry entries become +entry-points, and the `--profile textile-v2` flag continues to +resolve through the same dispatch hook. + +## See also + +- [Phase 7 task 7.1 in the migration plan](../plans/CIRPASS_2_MIGRATION.md) + (engineering-side log). +- [ADR 0003 — Tyres plugin license](0003-tyre-license.md). +- [`textile_v2.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/rules/v0_7/textile_v2.py). diff --git a/docs/adr/0005-cli-exit-codes.md b/docs/adr/0005-cli-exit-codes.md new file mode 100644 index 0000000..a3726ec --- /dev/null +++ b/docs/adr/0005-cli-exit-codes.md @@ -0,0 +1,93 @@ +# ADR 0005 — CLI exit-code surface (six codes) + +**Status:** Accepted (Phase 6, 2026-05-08). + +## Context + +Pre-Phase-6, dppvalidator's CLI used three exit codes: + +| Code | Meaning | +| ---- | ---------------------------------------------------- | +| `0` | Validation passed. | +| `1` | Validation produced errors. | +| `2` | Engine error (uncaught exception, IO failure, etc.). | + +Phase 6 of the CIRPASS-2 migration introduced two new failure +shapes that the legacy three-code surface couldn't express +distinctly: + +- `--target {untp,cirpass}` mismatching the detected family — + conceptually a different problem from "validation found errors" + (the payload is fine; the user's flag is wrong). +- `migrate --to {...}` blocking on `MAP00X` warnings under + `--strict` — also conceptually different from "validation + errors". + +CI consumers wanted to branch on the failure shape without +parsing human-readable error messages. 
+ +## Decision + +Adopt a six-code exit surface, exposed as module-level constants +in +[`src/dppvalidator/cli/main.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/cli/main.py) +and documented at +[`docs/reference/cli/exit-codes.md`](../reference/cli/exit-codes.md): + +| Code | Constant | Meaning | +| ---- | ------------------------ | ----------------------------------------------------------------------------------------------- | +| `0` | `EXIT_VALID` | Operation completed successfully. | +| `1` | `EXIT_INVALID` | Validation produced one or more errors. | +| `2` | `EXIT_ERROR` | Unrecoverable engine / shim failure (uncaught exception, dependency missing). | +| `3` | `EXIT_FAMILY_MISMATCH` | `--target` (or `--to`) explicitly contradicts the payload's detected family. Surfaces `DET001`. | +| `4` | `EXIT_BLOCKING_WARNINGS` | A migration / upgrade emitted blocking warnings without `--accept-warnings`. | +| `5` | `EXIT_IO_ERROR` | IO failure: file not found, encoding error, glob match nothing, output write failure. | + +## Migration + +The old `EXIT_ERROR=2` semantics partly overlapped with the new +`EXIT_IO_ERROR=5` (file-not-found returned `2` pre-Phase-6). +Phase 6 splits IO failures off into `5`; existing CI scripts that +treat any non-zero code as "something went wrong" continue to +work; scripts that branch specifically on `2` for engine errors +get the cleaner semantics. + +Each pre-existing test that pinned the legacy `EXIT_ERROR` for +file-not-found was updated in Phase 6 to use `EXIT_IO_ERROR`. + +## Consequences + +**Pros** + +- CI pipelines branch precisely on failure shape. +- Wrapper scripts (e.g. pre-commit hooks) can decide whether to + retry (`5` IO transient) vs surface immediately (`1`/`3`/`4`). +- The exit table is a stable public contract — pinned by + `tests/integration/test_cli_cirpass.py::test_phase6_exit_codes_stable`. + +**Cons** + +- Six codes is more than three. 
Mitigation: the table is short + and the constants are self-documenting; the + [`exit-codes.md`](../reference/cli/exit-codes.md) reference + fits on one page. +- `EXIT_ERROR=2` is now a rarer case; consumers that conflated + it with IO errors see a behaviour change. Mitigation: that + conflation was a documentation gap, not a real contract; + Phase 6 makes the boundary explicit. + +## Reversal cost + +Medium. Removing codes 3-5 would break wrapper scripts that +adopted them. The constants are public; removal would be a SemVer +major bump. Phase 10 of the migration plan does *not* plan to +revisit; the table is intended as the long-term contract. + +## See also + +- [`docs/reference/cli/exit-codes.md`](../reference/cli/exit-codes.md) — + the documented contract. +- [Phase 6 task 6.7 in the migration plan](../plans/CIRPASS_2_MIGRATION.md) + (engineering-side log). +- [`src/dppvalidator/cli/main.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/cli/main.py) + — constant definitions. diff --git a/docs/adr/README.md b/docs/adr/README.md new file mode 100644 index 0000000..2509909 --- /dev/null +++ b/docs/adr/README.md @@ -0,0 +1,31 @@ +# Architecture Decision Records + +This directory holds Architecture Decision Records (ADRs) — short +documents capturing single, durable decisions and the reasoning behind +them. The audience is the next engineer who has to extend this code in +six months without the meeting context. + +## Conventions + +- File naming: `NNNN-short-slug.md`, four-digit zero-padded sequential. +- Frontmatter fields: `Status`, `Date`, `Deciders` (when known), + `Context`, `Decision`, `Consequences`, `Alternatives considered`. +- States: `Proposed` → `Accepted` → optionally `Superseded by ADR-NNNN`. +- One decision per ADR. If you find yourself making more than one + decision, split it. +- Don't update an ADR's *content* once it's `Accepted`; instead, write a + new ADR that supersedes it. 
ADRs are append-only history, not living + docs. + +When an ADR ships into a migration plan as a locked decision, the plan +references it (e.g. `D-0.1` in +[`../plans/CIRPASS_2_MIGRATION.md`](../plans/CIRPASS_2_MIGRATION.md) +points at [`0001-cirpass-json-schema-derivation.md`](0001-cirpass-json-schema-derivation.md)). + +## Index + +| # | Title | Status | Related | +| ---------------------------------------------- | ------------------------------------------------------------------------------------ | -------- | ------------------- | +| [0001](0001-cirpass-json-schema-derivation.md) | CIRPASS reference-structure JSON Schema is *derived* from the hub's tree-view export | Accepted | Plan D-0.1, Phase 3 | +| [0002](0002-canonical-eudpp-iri.md) | EUDPP IRIs rebase to canonical `https://w3id.org/eudpp/` | Accepted | Plan D-0.3, Phase 1 | +| [0003](0003-tyre-license.md) | GDSO Tyre data-model license — provisional GPL-3.0 | Proposed | Plan OA-1, Phase 7 | diff --git a/docs/concepts/cirpass-2-alignment.md b/docs/concepts/cirpass-2-alignment.md new file mode 100644 index 0000000..48875fe --- /dev/null +++ b/docs/concepts/cirpass-2-alignment.md @@ -0,0 +1,155 @@ +# CIRPASS-2 alignment + +> **Status:** Phase 8 reference (2026-05-09). Supersedes the legacy +> [`eudpp-ontology-alignment.md`](eudpp-ontology-alignment.md) — that +> page is now a redirecting stub and is removed in Phase 10 of the +> migration plan. + +This page is the single orientation document for *what CIRPASS-2 is, +how dppvalidator implements it, and how it relates to UNTP DPP*. New +readers start here; the deep-dives below are linked rather than +inlined. + +## What is CIRPASS-2? + +CIRPASS-2 is the EU project that produces the **EU DPP Core +Ontology** (EUDPP) and a hierarchical **reference-structure message +format**. The ontology is a set of OWL modules describing what a +Digital Product Passport *means*; the reference structure is one +concrete wire format that publishers can emit. 
+ +The six canonical EUDPP modules dppvalidator bundles: + +| Module | Version | Description | +| ------ | ---------- | ------------------------------------------------------------- | +| P_DPP | 1.9.1 | Product / DPP envelope and identifiers. | +| ACTOR | 1.9.1 | Actors and roles (ESPR Art 2(37–55) economic operators). | +| SOC | 1.9.1 | Substances of Concern (REACH / SVHC tracking, ESPR Art 7(5)). | +| LCA | 1.9.4-Maki | Life-Cycle Assessment (PEF 3.1 / EN 15804+A2). | +| CON | 1.9.1 | Cross-module connector relations. | +| CORE | 1.9.1 | Integration ontology importing the five modules above. | + +Bundled TTLs live under +[`src/dppvalidator/vocabularies/data/ontologies/`](https://github.com/artiso-ai/dppvalidator/tree/main/src/dppvalidator/vocabularies/data/ontologies); +SHA-256 pins are recorded in +[`src/dppvalidator/schemas/data/MANIFEST.json`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/schemas/data/MANIFEST.json). + +## Two families, one engine + +dppvalidator validates **two parallel families**: + +- **UNTP DPP** — the UN/CEFACT Verifiable-Credential message format. + Versions 0.6.0, 0.6.1, 0.7.0 are bundled. +- **CIRPASS DPP reference structure v1.3.0** — the CIRPASS-2 message + format. Pydantic-first JSON Schema derived in Phase 3. + +EUDPP-LD is *not* a third family — it's a serialization layer that +re-keys either family's payload onto canonical EUDPP class IRIs. + +The engine routes payloads through family-specific pipelines: + +``` +UNTP : schema → model → semantic → JSON-LD → vocabulary → plugin → signature +CIRPASS : schema → model → semantic → SHACL → vocabulary → plugin +``` + +Detection is **automatic by default** but explicit override is +supported via `--target {auto,untp,cirpass}` (Phase 6). Mismatch +between the user's target and the detected family fails fast with +`DET001` and exit code 3. 
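
A wrapper script can branch on the process exit code instead of parsing messages. The sketch below mirrors the six codes from ADR 0005 in a local enum; the `ExitCode` class and the `next_step` policy are illustrative assumptions (the real constants are module-level names in the CLI), and the retry-on-IO-error choice follows the ADR's suggestion that code 5 may be transient.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Local mirror of the six documented CLI exit codes (ADR 0005)."""

    VALID = 0
    INVALID = 1
    ERROR = 2
    FAMILY_MISMATCH = 3
    BLOCKING_WARNINGS = 4
    IO_ERROR = 5


def next_step(returncode: int) -> str:
    """Map a process return code to a coarse CI action: pass, retry, or fail."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        # Unknown codes are treated as failures rather than guessed at.
        return "fail"
    if code is ExitCode.VALID:
        return "pass"
    if code is ExitCode.IO_ERROR:
        return "retry"
    return "fail"
```

In practice the return code would come from something like `subprocess.run([...]).returncode`; a family mismatch (3) lands in the "fail" branch because the fix is to correct the `--target` flag, not to retry.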
+ +## Validator surface + +Each family ships its own rule tree with non-colliding code prefixes: + +| Family | Prefix | Owns | +| --------------- | ------ | ----------------------------------------------------------------------------------- | +| UNTP | `SEM` | Base semantic rules (mass-fraction sum, validity dates, hazardous-material safety). | +| UNTP | `VOC` | Vocabulary-driven rules (country codes, units, HS codes, GTIN). | +| UNTP | `CQ` | CIRPASS Quality rules — pre-CIRPASS-2 ESPR-compliance heuristics. | +| UNTP | `JLD` | JSON-LD shape rules. | +| UNTP | `MDL` | Pydantic model failures. | +| UNTP | `VER` | Version-mismatch errors. | +| UNTP | `UPG` | 0.6 → 0.7 upgrade-shim warnings. | +| CIRPASS | `CR` | Reference-structure base rules. | +| CIRPASS | `SUB` | SOC v1.9.1 axioms. | +| CIRPASS | `LCS` | LCA v1.9.4-Maki axioms. | +| CIRPASS | `ACT` | ACTOR v1.9.1 axioms. | +| CIRPASS | `REL` | Connector / relation axioms. | +| Cross-family | `MAP` | UNTP ↔ CIRPASS mapping warnings. | +| Cross-family | `DET` | Family-detection diagnostics. | +| Pilot (textile) | `TXT` | Textile DPP rules — v1 (legacy) / v2 (MVP Textile DPP 2025-12-04). | +| Pilot (tyres) | `TYR` | GDSO declaration rules. | + +Per-rule documentation lives under [`docs/errors/`](../errors/index.md). + +## Pilot profiles + +Phase 7 introduced *profile-keyed dispatch* so pilot-specific rule +packs swap in via `validate --profile `: + +- `textile-v1` — legacy TXT001…TXT005 rules (info / warning). +- `textile-v2` — MVP Textile DPP v2 (2025-12-04) pack: TXT001…TXT007 + with stricter severity (TXT006 recycled-content, TXT007 repair + info). See + [`src/dppvalidator/validators/rules/v0_7/textile_v2.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/rules/v0_7/textile_v2.py). + +The tyres pilot ships out-of-tree as the +[`dppvalidator-tyres`](../plugins/tyres.md) plugin (GPL-3.0-or-later; +Pre-1.0 / Experimental); install separately to activate. 
+ +## Cross-family mapping + +UNTP and CIRPASS messages can be projected onto each other via the +[Phase 5 compat shims](untp-cirpass-mapping.md): + +- `dppvalidator.compat.to_cirpass_1_3(untp_dict)` — forward. +- `dppvalidator.compat.to_untp_0_7(cirpass_dict)` — reverse. + +The lossless subset (round-trip identity-preserving) is documented +inline in the mapping doc; transformations outside the subset emit +`MAP00X` warnings: + +| Code | Meaning | +| -------- | -------------------------------------------------- | +| `MAP001` | Lossy: target shape drops information. | +| `MAP002` | Synthesised: required field invented from a donor. | +| `MAP003` | Unmapped: passthrough (no rule applied). | +| `MAP004` | Required-field-missing. | +| `MAP005` | Temporal collapse (less-expressive target). | + +## Reading path + +| You want… | Read this | +| ----------------------------------- | ---------------------------------------------------------------------------------------- | +| The big picture | this page | +| The migration story | [`docs/plans/CIRPASS_2_MIGRATION.md`](../plans/CIRPASS_2_MIGRATION.md) (engineering log) | +| EUDPP module changelog | [eudpp-1.9-changelog.md](eudpp-1.9-changelog.md) | +| Field-by-field UNTP↔CIRPASS mapping | [untp-cirpass-mapping.md](untp-cirpass-mapping.md) | +| Migrate a UNTP fixture to CIRPASS | [migrate-untp-to-cirpass.md](../guides/migrate-untp-to-cirpass.md) | +| CIRPASS Pydantic model API | [reference/cirpass/](../reference/cirpass/index.md) | +| CLI exit codes | [reference/cli/exit-codes.md](../reference/cli/exit-codes.md) | +| Per-rule errors | [errors/](../errors/index.md) | +| Tyres pilot plugin | [plugins/tyres.md](../plugins/tyres.md) | + +## Architecture decisions + +The Phase 0 decision log captures the load-bearing choices: + +- [ADR 0001](../adr/0001-cirpass-json-schema-derivation.md) — JSON + Schema derivation strategy (Pydantic-first vs hub tree-view). 
+- [ADR 0002](../adr/0002-canonical-eudpp-iri.md) — Canonical EUDPP + IRI rebase to W3ID. +- [ADR 0003](../adr/0003-tyre-license.md) — Tyre plugin licensing + (GPL-3.0-or-later). +- [ADR 0004](../adr/0004-textile-v2-built-in.md) — Textile v2 ships + as a built-in profile rather than an out-of-tree plugin. +- [ADR 0005](../adr/0005-cli-exit-codes.md) — Six-code CLI exit + surface (Phase 6 §6.7). + +## Spec snapshot + +Phase 0 of the migration plan vendored a frozen snapshot of the +CIRPASS-2 hub artefacts. See +[cirpass-2-spec-snapshot.md](cirpass-2-spec-snapshot.md) for the +provenance of every bundled file (URL + SHA-256 + pull date). diff --git a/docs/concepts/cirpass-2-spec-snapshot.md b/docs/concepts/cirpass-2-spec-snapshot.md new file mode 100644 index 0000000..32e13fc --- /dev/null +++ b/docs/concepts/cirpass-2-spec-snapshot.md @@ -0,0 +1,142 @@ +# CIRPASS-2 spec snapshot + +**Phase 0 deliverable** — task 0.6 of +[`../plans/CIRPASS_2_MIGRATION.md`](../plans/CIRPASS_2_MIGRATION.md). + +This page is the authoritative answer to *"what CIRPASS-2 artefacts is +this library bundling, at which versions, with which integrity hashes?"* +It is updated whenever a snapshot row changes (Phase 1 vendor; Phase 10 +removal). The `MANIFEST.json` rows under +[`../../src/dppvalidator/schemas/data/MANIFEST.json`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/schemas/data/MANIFEST.json) +are the machine-readable source of truth; this page is the human view. + +## How this is maintained + +1. The artefact set is defined in + [`../../tools/snapshot/cirpass2_artefacts.json`](https://github.com/artiso-ai/dppvalidator/blob/main/tools/snapshot/cirpass2_artefacts.json). +1. The fetcher + [`../../tools/snapshot/fetch_cirpass.py`](https://github.com/artiso-ai/dppvalidator/blob/main/tools/snapshot/fetch_cirpass.py) + reads that file, downloads each artefact, computes an LF-normalised + SHA-256, and emits MANIFEST-compatible rows. +1. 
The MANIFEST rows land in + [`../../src/dppvalidator/schemas/data/MANIFEST.json`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/schemas/data/MANIFEST.json) + in Phase 1. +1. This page mirrors the manifest rows for human eyes and adds the + spec-listing context. + +See [`../../tools/snapshot/README.md`](https://github.com/artiso-ai/dppvalidator/blob/main/tools/snapshot/README.md) +for the operator workflow. + +## Source of truth + +- **Spec listing:** +- **Canonical IRI prefix (EUDPP):** `https://w3id.org/eudpp/` +- **CIRPASS-2 project:** +- **Hub-export endpoints** (GUID-keyed; opaque): + - `https://dpp.vocabulary-hub.eu/api/ontology/-/version/{guid}/export` — TTL + - `https://dpp.vocabulary-hub.eu/api/json-schema/-/version/{guid}/export` — JSON Schema + +## Snapshot v0 — drafted 2026-05-08 + +Status legend: + +- **planned** — listed in `cirpass2_artefacts.json`; not yet fetched. + GUID is a `TODO_*` placeholder pending operator pairing. +- **pinned** — fetched, SHA-256 computed, manifest row emitted. +- **vendored** — bundled bytes committed under + `src/dppvalidator/vocabularies/data/ontologies/` and registered in + `MANIFEST.json` (Phase 1 outcome). 
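
For the **pinned** status, the hash in question is the LF-normalised SHA-256 mentioned in the maintenance steps. A minimal sketch of such a digest follows, assuming that "LF-normalised" means CRLF and lone CR line endings are rewritten to LF before hashing. The fetcher's exact normalisation is not spelled out here, and `lf_normalised_sha256` is a hypothetical name.

```python
import hashlib


def lf_normalised_sha256(data: bytes) -> str:
    """Hash bytes after rewriting CRLF / CR line endings to LF.

    Assumption: this is what "LF-normalised" means; the real fetcher
    may normalise differently.
    """
    normalised = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return hashlib.sha256(normalised).hexdigest()
```

Normalising first means a file checked out with Windows line endings yields the same pin as the LF original.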
+ +### EUDPP Core Ontology modules (target Phase 1 vendoring) + +| Module | Version | Date | Title | Canonical IRI | Status | +| ------ | ---------- | ---------- | ------------------------------------------------ | ------------------------------- | ------- | +| P_DPP | 1.9.1 | 2026-03-04 | EUDPP CORE ontology Product and DPP module | `https://w3id.org/eudpp/p_dpp/` | planned | +| SOC | 1.9.1 | 2026-03-04 | EUDPP CORE ontology Substance of Concerns module | `https://w3id.org/eudpp/soc/` | planned | +| LCA | 1.9.4-Maki | 2026-04-27 | EUDPP Core Ontology Life Cycle Assessment module | `https://w3id.org/eudpp/lca/` | planned | +| ACTOR | 1.9.1 | 2026-03-04 | EUDPP Core Ontology ACTOR module | `https://w3id.org/eudpp/actor/` | planned | +| CON | 1.9.1 | 2026-03-04 | EUDPP CORE ontology CONNECTOR module | `https://w3id.org/eudpp/con/` | planned | +| CORE | 1.9.1 | 2026-03-04 | EUDPP Core Ontology (umbrella) | `https://w3id.org/eudpp/` | planned | + +> **D-0.3 verification gate.** Phase 0 task 0.4 runs +> `python tools/snapshot/fetch_cirpass.py --verify-canonical` to assert +> every IRI in this column dereferences. Failure escalates as Phase 1 +> blocker R12. + +### CIRPASS message (target Phase 3 derivation) + +| Family | Module | Version | Title | Format | Status | +| --------- | ------ | ------- | ------------------------------- | -------------- | ------- | +| `cirpass` | — | 1.3.0 | CIRPASS DPP reference structure | tree-view-json | planned | + +> **D-0.1.** The hub does *not* publish a JSON Schema for v1.3.0. Phase 3 +> derives one (`tools/codegen/cirpass/derive_schema.py`) from the +> tree-view export and commits the result at +> `src/dppvalidator/schemas/data/cirpass-reference-1.3.0.json`. Drift is +> guarded by `tools/codegen/check_drift.py` (R14). 
+ +### CIRPASS-2 Pilot — Textile (target Phase 7) + +| Family | Module | Version | Date | Title | Format | Status | +| --------------- | ------ | ------- | ---------- | --------------- | -------------- | ------- | +| `textile-pilot` | — | 2.0 | 2025-12-04 | MVP Textile DPP | tree-view-json | planned | + +### CIRPASS-2 Pilot — Tyres (target Phase 7) + +| Family | Module | Version | Date | Title | Format | Status | +| ------------ | ---------------------- | ------- | ---------- | --------------------------- | -------------- | ------- | +| `tyre-pilot` | GDSO_AMBASSADOR | 1 | 2025-12-05 | GDSO Ambassador Data Models | json-schema | planned | +| `tyre-pilot` | TYRE_LIFECYCLE_HISTORY | 1 | 2025-12-05 | Tyre Lifecycle History | tree-view-json | planned | +| `tyre-pilot` | BIRTH | 0.9 | — | Birth Declaration | tree-view-json | planned | +| `tyre-pilot` | COLLECTION | 0.1 | — | Collection Declaration | tree-view-json | planned | +| `tyre-pilot` | RETREAD | 0.1 | — | Retread Declaration | tree-view-json | planned | +| `tyre-pilot` | RECYCLING | 0.1 | — | Recycling Declaration | tree-view-json | planned | + +> **OA-1.** GDSO Tyre data-model license is to be confirmed. Default +> assumption (until confirmed) is GPL-3.0, matching `plugins/textiles/`. +> Captured as ADR +> [`0003-tyre-license.md`](../adr/0003-tyre-license.md) in `Proposed` +> state until OA-1 closes; then promoted to `Accepted`. 
+ +## Reproducing the snapshot from a clean checkout + +```bash +# (1) Inspect the planned set +python tools/snapshot/fetch_cirpass.py --list + +# (2) Verify canonical IRIs (D-0.3 gate) +python tools/snapshot/fetch_cirpass.py --verify-canonical + +# (3) Pair TODO_* GUIDs with hub GUIDs (manual; see tools/snapshot/README.md) + +# (4) Re-run --list to confirm zero placeholders remain + +# (5) Fetch + emit MANIFEST rows +python tools/snapshot/fetch_cirpass.py --fetch \ + > tools/snapshot/manifest-rows.json + +# (6) Phase 1 picks up the rows from manifest-rows.json and folds them +# into src/dppvalidator/schemas/data/MANIFEST.json, vendoring the +# bytes into src/dppvalidator/vocabularies/data/ontologies/ along +# the way. +``` + +The fetcher is stdlib-only — it runs from a fresh checkout without +`uv sync`. Verbatim downloads land under the gitignored +`tools/snapshot/cirpass-2/` directory; only the emitted MANIFEST rows +and the canonical bundled bytes are committed (in Phase 1). + +## Snapshot history + +| Snapshot | Drafted | Vendored | Notes | +| -------- | ---------- | ----------------- | ------------------------------------------------------------------ | +| v0 | 2026-05-08 | (pending Phase 1) | Initial draft. 14 artefacts planned, 0 GUIDs paired, 0 SHAs pinned | + +This row gets its `Vendored` cell filled when Phase 1 closes. Subsequent +snapshots (v1, v2, …) bump when any artefact changes version or hash. 
+ +## Related decisions + +- [ADR 0001 — CIRPASS JSON Schema derivation](../adr/0001-cirpass-json-schema-derivation.md) +- [ADR 0002 — Canonical EUDPP IRI prefix](../adr/0002-canonical-eudpp-iri.md) +- [ADR 0003 — GDSO Tyre data-model license](../adr/0003-tyre-license.md) diff --git a/docs/concepts/eudpp-1.9-changelog.md b/docs/concepts/eudpp-1.9-changelog.md new file mode 100644 index 0000000..708a06f --- /dev/null +++ b/docs/concepts/eudpp-1.9-changelog.md @@ -0,0 +1,170 @@ +# EUDPP v1.9.1 / v1.9.4-Maki — consumer changelog + +**Status:** Final (Phase 8 finalisation, 2026-05-09). Phase 1 task 1.6 +completed the row-by-row audit; this page documents what shipped. + +> Looking for the *big picture*? Start at +> [`cirpass-2-alignment.md`](cirpass-2-alignment.md). Looking for +> *how to migrate a UNTP fixture*? See +> [`migrate-untp-to-cirpass.md`](../guides/migrate-untp-to-cirpass.md). + +## Audience + +Anyone consuming dppvalidator's EU DPP-aligned JSON-LD output, the +`EUDPPNamespace` enum, or the per-module enums under +[`src/dppvalidator/vocabularies/`](https://github.com/artiso-ai/dppvalidator/tree/main/src/dppvalidator/vocabularies). +Read this if your downstream code: + +- pinned `http://dpp.taltech.ee/EUDPP#` or `http://dpp.cea.fr/EUDPP/LCA#` + IRIs as predicate identifiers; +- referenced `EUDPPClass.X` / `EUDPPRoleClass.X` / `HazardCategory.X` / + `ImpactCategory.X` / `EUDPPObjectProperty.X` constants by name; +- consumed the term-mapping table at + [`vocabularies/ontology.py::TERM_MAPPINGS`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/vocabularies/ontology.py). + +## What changed in Phase 1 + +### Namespace migration (ADR 0002) + +EUDPP IRIs rebased from per-publisher hosts onto the canonical W3ID +prefix. 
Six bindings affected: + +| Before | After | Note | +| ------------------------------------- | ------------------------------- | ----------------------------------------- | +| `http://dpp.taltech.ee/EUDPP#` | `https://w3id.org/eudpp/` | umbrella CORE prefix | +| `http://dpp.cea.fr/EUDPP/LCA#` | `https://w3id.org/eudpp/lca/` | LCA module | +| *(new — was conflated under EUDPP)* | `https://w3id.org/eudpp/p_dpp/` | Product / DPP module | +| *(new)* | `https://w3id.org/eudpp/soc/` | Substances of Concern module | +| *(new)* | `https://w3id.org/eudpp/actor/` | Actors and Roles module | +| *(new — module didn't exist pre-1.9)* | `https://w3id.org/eudpp/con/` | Connector module (cross-module relations) | + +**Migration path for downstream consumers:** + +- *Reading our JSON-LD output:* update your context resolver to follow + W3ID redirects. The hub-published context document URL + (`https://dpp.vocabulary-hub.eu/context/v1`) is unchanged; only the + predicate IRIs inside it are rebased. Most JSON-LD processors handle + the redirect transparently. +- *Pinning predicate IRIs in your code:* replace any literal + `dpp.taltech.ee/EUDPP#` or `dpp.cea.fr/EUDPP/LCA#` substring with + `w3id.org/eudpp/` (umbrella) or the module-specific sub-prefix. +- *Importing `EUDPPNamespace` from us:* `EUDPPNamespace.EUDPP.value` and + `EUDPPNamespace.LCA.value` return new strings. Code that compared + these to the old hosts must be updated. Four new members are added + (`P_DPP`, `SOC`, `ACTOR`, `CON`); together with the rebased umbrella + `EUDPP` and `LCA` members, the enum now covers all six bindings. +- *Using the deleted aliases:* `CIRPASSNamespace`, + `compact_cirpass_uri`, `expand_cirpass_uri`, `get_cirpass_context` + are removed. Use the `_eudpp_`-named functions / classes (existing + surface) or the new per-module `EUDPPNamespace` members. + +### TermMapping schema extension + +`TermMapping` gains an optional `cirpass_v1_3: str | None = None` +column alongside the existing `untp_v0_6` / `untp_v0_7`.
Defaults +preserve the canonical `untp_term` spelling, so unchanged rows do not +need to repeat themselves. The `OntologyMapper` API gains an optional +keyword-only `family: str = "untp"` parameter on +`find_mapping_for_term`, `_index_for_version`, `mapped_terms_for`, and +`TermMapping.term_for`. Existing call sites are unaffected. + +### MANIFEST schema extension + +[`src/dppvalidator/schemas/data/MANIFEST.json`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/schemas/data/MANIFEST.json) +gains four optional per-row fields: `family`, `module`, +`vocabulary_hub_guid`, `superseded_by`. Existing rows carry +`family: "untp"` for forward consistency; the required-field set +(`version`, `kind`, `path`, `source_url`, `sha256`, `pulled_at`) is +unchanged, so the integrity test in +[`tests/unit/test_manifest_integrity.py`](https://github.com/artiso-ai/dppvalidator/blob/main/tests/unit/test_manifest_integrity.py) +remains green. + +## Per-class diff (populated by Phase 1 tasks 1.6–1.11 audit) + +The 6 v1.9.x EUDPP TTLs landed in 2026-05-08; the codegen tool at +[`tools/codegen/cirpass/regenerate_enums.py`](https://github.com/artiso-ai/dppvalidator/blob/main/tools/codegen/cirpass/regenerate_enums.py) +regenerated each enum block. Manual review surfaced these +spec-driven changes: + +### Renames (single class moved + IRI changed) + +| Module | v1.7.1 IRI | v1.9.1 IRI | Notes | +| ------ | --------------------------------------- | --------------------------------- | --------------------------------------------------------------------- | +| P_DPP | `eudpp:Document` | `eudpp:DocumentFormattedProperty` | P_DPP changelog: "#Document renamed to #DocumentFormattedProperty" | +| P_DPP | `eudpp:uniqueProductID` (datatype prop) | `eudpp:uniqueProductIdentifier` | P_DPP changelog: "Renamed uniqueProductID to uniqueProductIdentifier" | + +### Module migrations (IRI unchanged; defining module changed) + +The 5 properties below moved from P_DPP v1.7.1 to CON v1.9.1. 
The IRIs +are unchanged — `EUDPPObjectProperty` keeps them. Downstream consumers +that resolve to the *defining* module (e.g. for SHACL shape-graph +loading) must now query `connector_v1.9.1.ttl`, not +`product_dpp_v1.9.1.ttl`. + +| IRI | v1.7.1 module | v1.9.1 module | +| ---------------------------------- | ------------- | ------------- | +| `eudpp:containsSubstanceOfConcern` | P_DPP | CON | +| `eudpp:hasEconomicOperator` | P_DPP | CON | +| `eudpp:hasBackUpCopyHost` | P_DPP | CON | +| `eudpp:hasIssuer` | P_DPP | CON | +| `eudpp:hasManufacturer` | P_DPP | CON | + +### Removals (no v1.9.1 equivalent) + +| IRI | Where used pre-1.9 | Phase 1 disposition | +| ---------------------------------- | --------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `eudpp:facilityID` (datatype prop) | `TermMapping.cirpass_uri` for `producedAtFacility` | Retargeted to `eudpp:Facility` class (in ACTOR module). Predicate removed per P_DPP changelog: "Removed #facilityID. Now described through ACTOR module." 
| +| `eudpp:hasMaterialProvenance` | `TermMapping.cirpass_uri` for `materialsProvenance` | Annotated via `TRANSITIONAL_EUDPP_REMOVED_IN_V1_9`; row kept for v0.6↔v0.7 UNTP rename round-trip; resolution test skips it | +| `eudpp:hasPerformanceClaim` | `TermMapping.cirpass_uri` for `conformityClaim` | Same treatment as above | +| `eudpp:isResponsibleForProduct` | `EUDPPObjectProperty.IS_RESPONSIBLE_FOR_PRODUCT` | Annotated `# −1.9.1`; not in v1.9.1 TTL but kept in enum for back-compat | +| `eudpp:isRepresentedBy` | `EUDPPObjectProperty.IS_REPRESENTED_BY` | Same treatment | + +### Additions (new classes / properties in v1.9.1) + +| Module | Element | Notes | +| --------------- | ----------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| ACTOR | `eudpp:ActorRoleAssignment` (class) | Models "actor X plays role Y in context Z" as a first-class entity | +| ACTOR | `eudpp:AuthorisedRepresentationAssignment` (class) | Authorised-representative relationship as a first-class entity | +| ACTOR | `eudpp:CircularEconomyRole` (class) | Super-category for Recycler / Refurbisher / Remanufacturer (which v1.9.1 collapses) | +| ACTOR | `eudpp:ConformityAssessmentRole` (class) | Super-category for ConformityAssessmentBody / NotifiedBody (collapsed) | +| ACTOR | `eudpp:hasActor` (object prop) | Replaces a constellation of v1.7.1 inverse relations | +| ACTOR | `eudpp:hasRepresentativeMandate` (object prop) | First-class authorised-representative mandate edge | +| CON | `eudpp:isConnectedTo` (object prop) | Generic connector relation | +| CON | `eudpp:inContextOfActivity` (object prop) | 
Disambiguator for activities | +| CON | `eudpp:inContextOfDPP` (object prop) | Disambiguator for DPPs | +| CON | `eudpp:inContextOfProduct` (object prop) | Disambiguator for products | +| CON | `eudpp:representsManufacturerForProduct` (object prop) | Product-scoped manufacturer-rep edge | +| LCA v1.9.4-Maki | 33 new classes (e.g. `eudpp:LCIAImpactCategory`, `eudpp:EN15804ImpactIndicator`, `eudpp:EPDDocument`, `eudpp:LCAStudy`) | Major restructuring of the LCA module: EN 15804 + EPD framing replaces the v2.0 `lca:` prefix taxonomy. All 11 v2.0 `lca:Underscore` IRIs are absent from v1.9.4-Maki; the existing `LCAClass` enum retains them as `# −1.9.4-Maki` legacy entries for back-compat with 8 internal dataclasses. The 33 new classes are added under `# +1.9.4-Maki`. | +| LCA v1.9.4-Maki | ~30 new object properties (e.g. `eudpp:hasLCAResult`, `eudpp:hasComplianceDeclaration`) | Not added to `EUDPPObjectProperty` in Phase 1 — Phase 4 wires up LCA validation and is the right place to add them | + +### Role consolidation (ACTOR v1.9.1) + +The ESPR Annex distinguishes ~24 specific economic-operator role types +(Manufacturer, Importer, Distributor, Dealer, FulfilmentService +Provider, AuthorisedRepresentative, MarketSurveillanceAuthority, +CustomsAuthority, Customer, Consumer, IndependentOperator, +ProfessionalRepairer, Recycler, Refurbisher, Remanufacturer, +DPPServiceProvider, ConformityAssessmentBody, NotifiedBody, +CredentialAgency, IssuingAgency). 
v1.9.1 of the CIRPASS-2 ACTOR +ontology consolidates these into 6 super-role classes: + +- `EconomicOperatorRole` (Manufacturer / Importer / Distributor / Dealer / Fulfilment / AuthorisedRep) +- `AuthorityRole` (MarketSurveillance / Customs) +- `EndUserRole` (Customer / Consumer / EndUser) +- `CircularEconomyRole` (Recycler / Refurbisher / Remanufacturer) — *new in v1.9.1* +- `ConformityAssessmentRole` (DPPServiceProvider / ConformityAssessmentBody / NotifiedBody) — *new in v1.9.1* +- *(Implicit super for IndependentOperator / ProfessionalRepairer)* + +The library's existing `EUDPPRoleClass` retains the 24 specific role +IRIs as ESPR-finer-grained labels; they don't dereference against the +v1.9.1 TTL but remain meaningful for downstream LD payloads. The 2 +new super-role IRIs were added as `# +1.9.1`. + +## See also + +- [docs/plans/CIRPASS_2_MIGRATION.md](../plans/CIRPASS_2_MIGRATION.md) — + Phase 1 task list. +- [docs/adr/0002-canonical-eudpp-iri.md](../adr/0002-canonical-eudpp-iri.md) — + rebase rationale (D-0.3). +- [docs/concepts/cirpass-2-spec-snapshot.md](cirpass-2-spec-snapshot.md) — + pinned upstream artefacts. diff --git a/docs/concepts/untp-cirpass-mapping.md b/docs/concepts/untp-cirpass-mapping.md new file mode 100644 index 0000000..e5dd5c2 --- /dev/null +++ b/docs/concepts/untp-cirpass-mapping.md @@ -0,0 +1,208 @@ +# UNTP DPP 0.7.0 ↔ CIRPASS reference structure v1.3.0 mapping + +> **Status:** Final (Phase 8 finalisation, 2026-05-09). The shim +> implementations and warning codes documented here are the public +> contract through Phase 10 of the migration plan. + +Phase 5 of [docs/plans/CIRPASS_2_MIGRATION.md](../plans/CIRPASS_2_MIGRATION.md) introduces the +two-way compatibility shim between the UNTP DPP envelope and the +CIRPASS-2 reference-structure message. This document is the +**field-by-field lossless-subset reference**: it captures which +fields round-trip cleanly and which transformations are lossy. + +> Looking for a *user-facing how-to*?
See +> [`migrate-untp-to-cirpass.md`](../guides/migrate-untp-to-cirpass.md) +> for before/after JSON snippets and CLI invocations. + +The shims live at: + +- [src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py) + — forward (UNTP → CIRPASS). +- [src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py) + — reverse (CIRPASS → UNTP). +- [src/dppvalidator/compat/\_untp_cirpass_map.py](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/_untp_cirpass_map.py) + — declarative step table (M01…M15) that this document mirrors. + +## Warning codes + +Five MAP-codes surface during a transformation; consumers may +pattern-match on them to decide whether a transformation is +acceptable for their use case. + +| Code | Meaning | Default severity | +| -------- | ------------------------------------------------------------------- | ---------------- | +| `MAP001` | Lossy: target shape drops information | `warning` | +| `MAP002` | Synthesised: required field invented from a donor | `warning` | +| `MAP003` | Unmapped: no rule applied; raw passthrough | `info` | +| `MAP004` | Required-field-missing: source can't supply target's required field | `error` | +| `MAP005` | Temporal collapse: less-expressive target temporal shape | `warning` | + +`MAP00X` is distinct from `UPG00X` (intra-family upgrade); the two +warning kinds are kept as separate types so call sites can +pattern-match without conflating them. + +## Step table + +Each row is a single declarative transformation. Step IDs (`M01` … +`M15`) are stable across releases; MAP-warning `details["step"]` +references the ID so consumers can grep for the step that emitted +a warning. + +| Step | Description | UNTP path | CIRPASS path | Lossless? 
| Codes that may fire | +| ---- | ------------------------------------------------- | ---------------------------------------- | --------------------------------------- | --------- | -------------------------- | +| M01 | DPP credential identifier ↔ DPP identifier object | `$.id` | `$.dppIdentifier.value` | ✓ | `MAP002` `MAP004` | +| M02 | Product credential subject ↔ root product | `$.credentialSubject` | `$.product` | ✓ | `MAP004` | +| M03 | Product identifier scheme | `$.credentialSubject.idScheme` | `$.product.productIdentifier.scheme` | ✓ | `MAP003` | +| M04 | DPP envelope name (UNTP only) | `$.name` | (not mapped) | ✗ | `MAP001` | +| M05 | Product name (UNTP scalar ↔ CIRPASS multilingual) | `$.credentialSubject.name` | `$.product.productName` | ✗ | `MAP001` `MAP002` | +| M06 | Product description (same i18n flatten as M05) | `$.credentialSubject.description` | `$.product.description` | ✗ | `MAP001` `MAP002` | +| M07 | Issuance timestamp | `$.validFrom` | `$.issuedAt.timestamp` | ✓ | `MAP004` | +| M08 | Effective period | `$.validFrom + $.validUntil` | `$.effectivePeriod` | ✓ | `MAP005` | +| M09 | Commodity classification | `$.credentialSubject.productCategory` | `$.product.commodityCode` | ✗ | `MAP001` `MAP003` | +| M10 | Material composition | `$.credentialSubject.materialProvenance` | `$.composition.materials` | ✗ | `MAP001` `MAP002` | +| M11 | Related parties / actors | `$.credentialSubject.relatedParty` | `$.relatedActors` | ✗ | `MAP001` `MAP002` `MAP003` | +| M12 | Issuer ↔ manufacturer-role actor | `$.issuer` | `$.relatedActors[? 
role==Manufacturer]` | ✗ | `MAP001` `MAP002` | +| M13 | Performance / LCA results | `$.credentialSubject.performanceClaim` | `$.lca` | ✗ | `MAP001` `MAP003` | +| M14 | Substances of concern (CIRPASS-only) | (no source field) | `$.substancesOfConcern` | ✗ | `MAP001` | +| M15 | Connector relations (CIRPASS-only) | (no source field) | `$.connectorRelations` | ✗ | `MAP001` | + +## Lossless subset + +The following fields round-trip cleanly — that is, +`to_cirpass_1_3(to_untp_0_7(c))[0] == c` (and symmetrically) over +inputs constructed only from this subset: + +- `dppIdentifier.value` ↔ `id` +- `product.productIdentifier.{value, scheme, schemeName}` ↔ + `credentialSubject.{id, idScheme.{id, name}}` *for schemes + registered in the bundled lookup table* +- `issuedAt.timestamp` ↔ `validFrom` +- `effectivePeriod.{start, end}` ↔ `(validFrom, validUntil)` *when + both endpoints are populated* +- Single-language `productName[0]` ↔ `name` *when language matches + the caller's `default_language`* +- Single-language `description[0]` ↔ `description` *same caveat* +- A single related-actor with `ManufacturerRole` ↔ envelope + `issuer` + a single relatedParty entry with role `manufacturer` + +## Lossy transformations (MAP001 / MAP005) + +The following transformations *cannot* round-trip without loss: + +- **Multilingual labels (M05, M06, M09, M10, M11)** — UNTP carries + scalar strings for `name` / `description` / `Classification.name` + / `Material.name` / `Party.name`. The reverse direction picks the + first list entry (or the entry whose language matches + `default_language`) and emits one `MAP001` per dropped language. +- **Open-ended effective period (M08)** — UNTP `validUntil` may be + absent; CIRPASS `effectivePeriod.end` is also optional, so this + is reversible *if* the forward shim leaves end empty. When the + forward shim *synthesises* an end (it doesn't currently), `MAP005` + fires. 
+- **Performance claims (M13)** — UNTP `performanceClaim[]` carries + rich claim structure (assessor, evidence, claimedBenchmark) that + the CIRPASS `LifeCycleAssessment` does not model. The current + shim drops claims wholesale with a single `MAP001`; Phase 7 pilot + lifts cover the supported subset (PEF / EN 15804 LCA results). +- **Substances of concern (M14)** — CIRPASS-only; no UNTP base + equivalent. The textile pilot extends UNTP with substance + metadata (Phase 7 territory); the base shim drops with `MAP001`. +- **Connector relations (M15)** — CIRPASS-only construct. Drops + with `MAP001`. + +## Synthesised values (MAP002) + +The forward shim synthesises values for fields CIRPASS requires +that the UNTP envelope can't supply directly. Each synthesised +value emits a `MAP002`; the caller can override by passing +keyword arguments to the shim: + +- `dppIdentifier.scheme` — CIRPASS requires an http(s) URI; UNTP + carries the credential id without a register URI. Default: + `https://example.com/dpp-register/`. +- LocalisedText languages — UNTP scalar strings get wrapped as + `[{value, language}]`; the language is the caller-supplied + `default_language` (defaults to `"en"`). +- Manufacturer actor — when UNTP `relatedParty[]` doesn't include + a manufacturer role, the forward shim lifts the envelope `issuer` + into a synthesised actor with `eudpp:ManufacturerRole`. + +The reverse shim synthesises values for fields UNTP requires that +CIRPASS doesn't carry directly: + +- `idGranularity` — CIRPASS doesn't carry granularity; defaults to + `"model"`. Caller can override via the `untp_id_granularity` kwarg. +- `producedAtFacility` — CIRPASS has no root-level facility; the + reverse shim emits a placeholder. +- `countryOfProduction` — best-effort lift from the first + material's `originCountry`; falls back to ISO `"ZZ"` (unknown) + with a `MAP002` warning. +- `Material.materialType` — UNTP requires a Classification; CIRPASS + carries a bare ISO 2076 code. 
Synthesised with scheme + `https://w3id.org/eudpp#MaterialType`. +- `Material.massFraction` — UNTP requires a number; CIRPASS may + omit it. Synthesised as `0.0` (caller must supply real values). + +## Identifier scheme map + +The bundled scheme map at +[src/dppvalidator/compat/\_identifier_schemes.py](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/_identifier_schemes.py) +covers commonly-seen identifier schemes for products, parties, +and facilities: + +| UNTP `idScheme.id` | UNTP `idScheme.name` | CIRPASS `scheme` | Aliases recognised | +| --------------------------------------------------------------------------------------------------------------- | -------------------- | ---------------- | --------------------------------------------------- | +| `https://gs1.org/voc/` | `GS1 GTIN` | same | `https://www.gs1.org/voc/` | +| `https://id.gs1.org/01/` | `GS1 Digital Link` | same | `https://gs1.org/dl/` | +| `https://www.gleif.org/lei/` | `GLEIF LEI` | same | `https://gleif.org/lei/` | +| `https://www.iso.org/standard/82220.html` | `ISO/IEC 15459` | same | — | +| `https://ec.europa.eu/taxation_customs/dds2/eos/eori_home.jsp` | `EORI` | same | `https://ec.europa.eu/eori/` | +| `https://e-justice.europa.eu/topics/registers-business-insolvency-land/business-registers-search-company-eu_en` | `EUID` | same | `https://e-justice.europa.eu/euid/` | +| `https://www.dnb.com/duns-number.html` | `DUNS` | same | `https://duns.com/`, `https://www.dnb.com/` | +| `https://www.wcoomd.org/.../hs-nomenclature-2022-edition.aspx` | `WCO HS` | same | `https://www.wcoomd.org/`, `https://hs.wcoomd.org/` | +| `https://ec.europa.eu/taxation_customs/dds2/taric/` | `EU TARIC` | same | `https://ec.europa.eu/taric/` | +| `https://simap.ted.europa.eu/cpv` | `EU CPV` | same | — | + +Schemes outside this table pass through verbatim and surface as +`MAP003` (unmapped). Consumers can extend by passing the +`identifier_scheme_lookup` kwarg to either shim. 
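
The alias handling in the table can be sketched as a two-step lookup: fold a recognised alias onto its canonical scheme, otherwise pass the value through and flag it as `MAP003`. This is a simplified illustration using a two-row subset of the table. `resolve_scheme` and its `extra_aliases` parameter are hypothetical and do not mirror the real `identifier_scheme_lookup` kwarg, whose shape is not documented here.

```python
"""Hypothetical sketch of alias-aware identifier-scheme resolution."""

# Two rows from the table above; the bundled map is larger.
CANONICAL_SCHEMES = {
    "https://gs1.org/voc/": "GS1 GTIN",
    "https://www.gleif.org/lei/": "GLEIF LEI",
}
ALIASES = {
    "https://www.gs1.org/voc/": "https://gs1.org/voc/",
    "https://gleif.org/lei/": "https://www.gleif.org/lei/",
}


def resolve_scheme(scheme_id, extra_aliases=None):
    """Return (scheme, warning_code); warning_code is None when mapped."""
    aliases = {**ALIASES, **(extra_aliases or {})}
    canonical = aliases.get(scheme_id, scheme_id)
    known = set(CANONICAL_SCHEMES) | set(aliases.values())
    if canonical in known:
        return canonical, None
    # Unknown schemes pass through verbatim and are flagged as unmapped.
    return scheme_id, "MAP003"
```

Passing `extra_aliases={"https://example.org/ids/": "https://example.org/ids/"}` would register an additional scheme in this sketch, so it resolves without a warning.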
+ +## Role enum mapping + +UNTP `PartyRoleEnum` (20 members) ↔ EUDPP `EUDPPRoleClass`. The +forward map is many-to-one (e.g. UNTP `producer` and `manufacturer` +both project to `eudpp:ManufacturerRole`); the reverse map picks +the most-specific UNTP role. + +| UNTP role | EUDPP role IRI | +| --------------------------------------------------------- | -------------------------------------------------- | +| `manufacturer`, `producer`, `processor`, `brandOwner` | `eudpp:ManufacturerRole` | +| `remanufacturer` | `eudpp:RemanufacturerRole` | +| `recycler` | `eudpp:RecyclerRole` | +| `importer` | `eudpp:ImporterRole` | +| `distributor` | `eudpp:DistributorRole` | +| `retailer` | `eudpp:DealerRole` | +| `logisticsProvider`, `carrier` | `eudpp:FulfilmentServiceProviderRole` | +| `serviceProvider` | `eudpp:DPPServiceProviderRole` | +| `inspector`, `certifier` | `eudpp:ConformityAssessmentBodyRole` | +| `regulator` | `eudpp:AuthorityRole` | +| `owner`, `exporter`, `operator`, `consignor`, `consignee` | `eudpp:EconomicOperatorRole` (super-role fallback) | + +EUDPP roles outside this table emit `MAP002` and fall back to +`eudpp:EconomicOperatorRole` (forward) / `manufacturer` (reverse). + +## Property-test invariant + +The Phase 5 property test +[tests/property/test_round_trip_invariants.py](https://github.com/artiso-ai/dppvalidator/blob/main/tests/property/test_round_trip_invariants.py) +exercises Hypothesis-generated payloads constrained to the +documented lossless subset. The invariant: + +```text +to_cirpass_1_3(to_untp_0_7(c))[0] == c # CIRPASS round-trip +to_untp_0_7(to_cirpass_1_3(u))[0] == u # UNTP round-trip +``` + +200 examples per direction (Hypothesis default profile). Failures +surface as a counterexample showing the smallest input where the +identity breaks. 
diff --git a/docs/errors/PRT001.md b/docs/errors/PRT001.md new file mode 100644 index 0000000..86246b1 --- /dev/null +++ b/docs/errors/PRT001.md @@ -0,0 +1,104 @@ +# PRT001 - PartyRole Acceptance Gradient + +## Description + +Validation info PRT001. Surfaces the **acceptance gradient** between +the Pydantic-permissive 20-value `PartyRoleEnum` and the upstream +JSON Schema's strict 6-value closed enum on +`$defs.PartyRole.properties.role`. Fires when a payload uses one of +the 14 wider values that the schema rejects. + +The rule is informational by default — Phase 9 of the CIRPASS-2 +migration kept the wider enum to preserve back-compat with v0.6 +fixtures and information fidelity through the CIRPASS reverse shim +(`compat/cirpass_1_3_to_untp_0_7.py`). Use +`ValidationEngine(strict_role_enum=True)` to upgrade PRT001 from +`info` to `error` when you want schema-strict enforcement. + +## Category + +Party-Role Errors (acceptance gradient) + +## Severity + +`info` (default) · `error` (when `strict_role_enum=True`) + +## Schema-strict 6 (accepted by JSON Schema validation) + +`owner`, `producer`, `manufacturer`, `processor`, `remanufacturer`, +`recycler`. + +## Pydantic-permissive 14 (accepted by `PartyRoleEnum`, rejected by schema) + +`operator`, `serviceProvider`, `inspector`, `certifier`, +`logisticsProvider`, `carrier`, `consignor`, `consignee`, +`importer`, `exporter`, `distributor`, `retailer`, `brandOwner`, +`regulator`. + +## Common Causes + +- A v0.6 fixture used a fine-grained role taxonomy that the v0.7 + schema collapsed to the strict 6. +- The CIRPASS reverse shim emitted a rich-mapping target (e.g. + `importer`) that doesn't fit the schema's closed-6. +- A pilot extension adopted the wider taxonomy informally. + +## How to Fix + +Three options: + +1. **Accept the gap** (default). The Pydantic model is permissive; the + advisory rule is `info`-severity. Layer 1 schema validation is the + authoritative check. + +1. 
**Remap to the schema-strict 6** using the suggested counterpart + in the violation message (sourced from `SUGGESTED_STRICT_REMAP` + in `validators.rules.v0_7.party_role`): + + ```text + importer → manufacturer + distributor → manufacturer + retailer → owner + logisticsProvider → manufacturer + operator → manufacturer + serviceProvider → processor + inspector → processor + certifier → processor + regulator → processor + carrier → manufacturer + consignor → manufacturer + consignee → owner + exporter → manufacturer + brandOwner → manufacturer + ``` + +1. **Opt into strict mode** with + `ValidationEngine(strict_role_enum=True)`. PRT001 then fires as + `error` and the validation result becomes `valid=False`. + +## Example + +```json +{ + "credentialSubject": { + "relatedParty": [ + { + "role": "importer", + "party": { + "id": "did:web:importer.example.com", + "name": "Importer Inc." + } + } + ] + } +} +``` + +This fires PRT001 because `"importer"` is in the wider 20 but not in +the strict 6. Suggested remap: `"manufacturer"`. + +## See Also + +- [Error Overview](index.md) +- Phase 9 task 9.8 in + `docs/plans/CIRPASS_2_MIGRATION.md`. diff --git a/docs/errors/TXT006.md b/docs/errors/TXT006.md new file mode 100644 index 0000000..cf0938e --- /dev/null +++ b/docs/errors/TXT006.md @@ -0,0 +1,56 @@ +# TXT006 - Missing Recycled-Content Disclosure + +## Description + +Validation warning TXT006. Fires under the `textile-v2` profile when a +textile passport carries neither a fibre-level `recycledMassFraction` +declaration nor a `performanceClaim` whose `conformityTopic` mentions +`recycled-content`. ESPR / textile-v2 wants the producer to make the +recycled-content statement *explicit* — zero is acceptable, silence +is not. + +## Category + +Textile Errors (v2) + +## Severity + +`warning` + +## Common Causes + +- Material composition entries omit the `recycledMassFraction` field. +- Sustainability claims are present but use a different + `conformityTopic` (e.g. 
`bio-based`, `low-impact`) without naming + recycled content explicitly. + +## How to Fix + +Either: + +- Populate `materialProvenance[*].recycledMassFraction` on at least + one fibre (a value of `0.0` is acceptable — the rule wants an + explicit statement, not a non-zero one), or +- Add a `performanceClaim` whose `conformityTopic` contains + `recycled-content`. + +## Example + +```json +{ + "credentialSubject": { + "materialProvenance": [ + { + "materialName": "Cotton", + "recycledMassFraction": 0.30 + } + ] + } +} +``` + +## See Also + +- [Error Overview](index.md) +- [TXT002 - Missing Material Composition](TXT002.md) +- [TXT007 - Missing Repair Information](TXT007.md) diff --git a/docs/errors/TXT007.md b/docs/errors/TXT007.md new file mode 100644 index 0000000..3de1012 --- /dev/null +++ b/docs/errors/TXT007.md @@ -0,0 +1,50 @@ +# TXT007 - Missing Repair / Spare-Parts Information + +## Description + +Validation info TXT007. Fires under the `textile-v2` profile when a +textile passport carries no `relatedDocument` whose `relationship` +contains `repair` or `spare-parts`. ESPR Annex II requires garments +to surface a repair-information link; the rule is `info`-severity +because some pilot payloads attach the repair link via a different +extension surface. + +## Category + +Textile Errors (v2) + +## Severity + +`info` + +## Common Causes + +- Repair-guide URL is embedded in a free-text field rather than a + structured `relatedDocument` entry. +- The producer has not yet authored a repair guide. + +## How to Fix + +Add a `relatedDocument` entry whose `relationship` is +`repair-information` (or `spare-parts`), pointing at the repair +guide URL. 
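
As a rough sketch of the condition TXT007 checks, assuming a plain-dict payload: the helper name `has_repair_document` is hypothetical, and the case-insensitive match is an assumption rather than documented behaviour of the built-in rule.

```python
def has_repair_document(passport):
    """Return True if any relatedDocument relationship mentions repair or spare parts."""
    subject = passport.get("credentialSubject", {})
    for doc in subject.get("relatedDocument", []):
        # Assumption: compare case-insensitively; the real rule may be stricter.
        relationship = str(doc.get("relationship", "")).lower()
        if "repair" in relationship or "spare-parts" in relationship:
            return True
    return False
```

A payload for which this returns `False` is the kind that would trigger the `info`-level finding.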
+ +## Example + +```json +{ + "credentialSubject": { + "relatedDocument": [ + { + "relationship": "repair-information", + "documentURL": "https://example.com/repair-guide.pdf" + } + ] + } +} +``` + +## See Also + +- [Error Overview](index.md) +- [TXT006 - Missing Recycled-Content Disclosure](TXT006.md) diff --git a/docs/guides/migrate-untp-to-cirpass.md b/docs/guides/migrate-untp-to-cirpass.md new file mode 100644 index 0000000..3c7ce21 --- /dev/null +++ b/docs/guides/migrate-untp-to-cirpass.md @@ -0,0 +1,315 @@ + + +# Migrating from UNTP DPP 0.7.0 to CIRPASS reference structure v1.3.0 + +UNTP DPP and the CIRPASS-2 reference-structure message are two +different *families* of DPP payload, not two versions of the same +format. dppvalidator ships **two-way compat shims** that project +between them: + +- `dppvalidator.compat.to_cirpass_1_3` — forward (UNTP → CIRPASS). +- `dppvalidator.compat.to_untp_0_7` — reverse (CIRPASS → UNTP). + +The shims are *lossy outside a documented subset*. This guide: + +1. Shows how to run them from the CLI and Python. +1. Walks through a before/after JSON example. +1. Lists the five `MAP00X` warning codes and what they mean. +1. Captures the round-trip identity (lossless subset) so you know + what survives a forward+reverse cycle. + +The conceptual overview lives at +[`cirpass-2-alignment.md`](../concepts/cirpass-2-alignment.md); the +complete field-by-field mapping table is at +[`untp-cirpass-mapping.md`](../concepts/untp-cirpass-mapping.md). + +## When you need this + +- You have UNTP 0.7.0 DPPs in production and a downstream consumer + asks for CIRPASS-shaped output. +- You're publishing CIRPASS reference-structure messages and want + to convert them to UNTP 0.7.0 for a UN/CEFACT-side workflow. +- You want to add a CIRPASS-side validation pass to an existing + UNTP pipeline without re-authoring fixtures. + +## Quick start + +### CLI — UNTP → CIRPASS + +```bash +# Project a UNTP 0.7.0 payload onto CIRPASS reference structure v1.3.0. 
+# By default the command refuses to write when warnings fire. +dppvalidator migrate untp-passport.json --to cirpass-1.3 -o cirpass.json + +# Accept lossy/synthesis warnings and proceed. The sidecar +# cirpass.json.warnings.json captures every MAP00X warning. +dppvalidator migrate untp-passport.json \ + --to cirpass-1.3 \ + -o cirpass.json \ + --accept-warnings + +# Override the synthesised default language for multilingual labels: +dppvalidator migrate untp-passport.json --to cirpass-1.3 \ + -o cirpass.json --default-language de --accept-warnings +``` + +### CLI — CIRPASS → UNTP + +```bash +# Project a CIRPASS payload onto the UNTP 0.7.0 envelope. +dppvalidator migrate cirpass.json --to untp-0.7 -o untp.json --accept-warnings +``` + +> **Note:** With a CIRPASS *input*, `--to untp-0.7` runs the +> cross-family reverse shim (available from Phase 5 onward). With a +> UNTP 0.6 input, the same `--to untp-0.7` flag still routes through +> the legacy v0.6 → v0.7 intra-family upgrade shim; see +> [migration-0-6-to-0-7.md](migration-0-6-to-0-7.md). + +### CLI — JSON-LD export shortcut + +If you don't need a sidecar JSON file and just want a +CIRPASS-shaped JSON-LD document, the export command does both the +projection and the JSON-LD wrapping in one step: + +```bash +dppvalidator export untp-passport.json --format cirpass-jsonld > cirpass.jsonld +``` + +The exporter forwards mapping warnings to **stderr** so stdout stays +pipe-clean for `... | jq` consumers.
+ +### Python + +```python +import json + +from dppvalidator.compat import to_cirpass_1_3, to_untp_0_7 +from dppvalidator.models.cirpass.v1_3 import ReferencePassport + +# Load the UNTP payload; the context manager closes the file handle. +with open("untp-passport.json", encoding="utf-8") as fh: + untp_dict = json.load(fh) + +# Forward (UNTP → CIRPASS) +cirpass_dict, warnings = to_cirpass_1_3( + untp_dict, + default_language="en", +) + +# Inspect warnings +for w in warnings: + print(f"[{w.code}] {w.severity.value} {w.path}: {w.message}") + +# Validate the projected output against the CIRPASS model +ReferencePassport.model_validate(cirpass_dict) + +# Reverse (CIRPASS → UNTP) +back, rev_warnings = to_untp_0_7(cirpass_dict, default_language="en") +``` + +## Before / after — minimal example + +### UNTP 0.7.0 input + +```json +{ + "@context": [ + "https://www.w3.org/ns/credentials/v2", + "https://vocabulary.uncefact.org/untp/0.7.0/context/" + ], + "type": ["DigitalProductPassport", "VerifiableCredential"], + "id": "https://example.com/dpp/CR-001", + "name": "Cotton T-shirt DPP", + "issuer": { + "id": "https://example.com/lei/529900T8BM49AURSDO55", + "name": "Example Apparel Ltd.", + "type": ["CredentialIssuer"] + }, + "validFrom": "2026-05-08T00:00:00+00:00", + "validUntil": "2031-05-08T00:00:00+00:00", + "credentialSubject": { + "type": ["Product"], + "id": "01234567890128", + "name": "Cotton T-shirt", + "idScheme": {"id": "https://gs1.org/voc/", "name": "GS1 GTIN"}, + "idGranularity": "model", + "productCategory": [ + { + "schemeId": "https://www.wcoomd.org/en/topics/nomenclature/instrument-and-tools/hs-nomenclature-2022-edition.aspx", + "schemeName": "WCO HS", + "code": "61091000", + "name": "T-shirts" + } + ], + "producedAtFacility": {"id": "https://example.com/f", "name": "F", "type": ["Facility"]}, + "countryOfProduction": {"countryCode": "DE", "countryName": "Germany"}, + "materialProvenance": [ + { + "name": "Cotton", + "originCountry": {"countryCode": "DE"}, + "materialType": { + "schemeId": "https://w3id.org/eudpp#MaterialType", + "schemeName": "ISO 2076", + "code": "CO", + "name": "Cotton" + }, +
"massFraction": 1.0 + } + ] + } +} +``` + +### CIRPASS reference structure v1.3.0 output (after `to_cirpass_1_3`) + +```json +{ + "dppIdentifier": { + "value": "https://example.com/dpp/CR-001", + "scheme": "https://example.com/dpp-register/", + "schemeName": "DPP register" + }, + "product": { + "productIdentifier": { + "value": "01234567890128", + "scheme": "https://gs1.org/voc/", + "schemeName": "GS1 GTIN" + }, + "productName": [ + {"value": "Cotton T-shirt", "language": "en"} + ], + "commodityCode": [ + { + "code": "61091000", + "scheme": "https://www.wcoomd.org/en/topics/nomenclature/instrument-and-tools/hs-nomenclature-2022-edition.aspx", + "name": [{"value": "T-shirts", "language": "en"}] + } + ] + }, + "issuedAt": {"timestamp": "2026-05-08T00:00:00+00:00"}, + "effectivePeriod": { + "start": "2026-05-08T00:00:00+00:00", + "end": "2031-05-08T00:00:00+00:00" + }, + "composition": { + "materials": [ + { + "materialName": [{"value": "Cotton", "language": "en"}], + "materialType": "CO", + "originCountry": "DE", + "massFraction": "1.0" + } + ] + }, + "relatedActors": [ + { + "actor": { + "actorIdentifier": { + "value": "https://example.com/lei/529900T8BM49AURSDO55", + "scheme": "https://www.gleif.org/lei/", + "schemeName": "GLEIF LEI" + }, + "actorName": [{"value": "Example Apparel Ltd.", "language": "en"}] + }, + "role": "eudpp:ManufacturerRole" + } + ] +} +``` + +### Warnings emitted on the projection + +The forward shim emits five `MAP002` warnings on this payload — one +for each synthesised value (DPP-register scheme URI; the language +tag on every LocalisedText entry the shim wraps; a manufacturer +actor lifted from `issuer`). + +```text +[MAP002] (warning) $.dppIdentifier.scheme: DPP-register scheme synthesised — UNTP carries the credential id but no register URI; using a placeholder. +[MAP002] (warning) $.product.productName[0].language: UNTP credentialSubject.name is a scalar string; synthesised a single CIRPASS LocalisedText entry with language='en'. 
+[MAP002] (warning) $.product.commodityCode[0].name[0].language: UNTP Classification.name is a scalar string; projected onto a single LocalisedText with language='en'. +[MAP002] (warning) $.composition.materials[0].materialName[0].language: UNTP Material.name is a scalar string; projected onto a single LocalisedText with language='en'. +``` + +The shim does **not** synthesise content (e.g. the recycled-mass +fraction is absent on the input, so it's absent on the output). + +## Warning codes + +The five `MAP00X` codes the shim may emit: + +| Code | Meaning | Default severity | +| -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------- | +| `MAP001` | **Lossy** — target shape drops information. Examples: dropping a non-default-language LocalisedText entry; dropping CIRPASS-only fields (`substancesOfConcern`, `connectorRelations`, `lca`) on the reverse leg. | `warning` | +| `MAP002` | **Synthesised** — required field invented from a donor (e.g. wrapping a UNTP scalar name in a single-entry LocalisedText). | `warning` | +| `MAP003` | **Unmapped** — passthrough (no rule applied). Fires when the input carries a scheme URI / role IRI / etc. that's not in the bundled lookup table. | `info` | +| `MAP004` | **Required-field-missing** — the source cannot supply a field the target requires. The output will *not* validate against the target Pydantic model until you fill the field in. | `error` | +| `MAP005` | **Temporal collapse** — less-expressive target temporal shape (e.g. `validUntil` absent ⇒ `effectivePeriod.end` left empty). 
| `warning` | + +## Lossless subset + +The following fields round-trip losslessly: they pass through +both shims unchanged when the input contains only them: + +- `dppIdentifier.value` ↔ `id` +- `product.productIdentifier.{value, scheme, schemeName}` ↔ + `credentialSubject.{id, idScheme.{id, name}}` *for schemes + registered in the bundled lookup table* +- `issuedAt.timestamp` ↔ `validFrom` +- `effectivePeriod.{start, end}` ↔ `(validFrom, validUntil)` *when + both endpoints are populated* +- Single-language `productName[0]` ↔ `name` *when the language matches + the caller's `default_language`* +- A single related-actor with `ManufacturerRole` ↔ envelope + `issuer` + a single `relatedParty` entry with role + `manufacturer` + +The Hypothesis property test at +[`tests/property/test_round_trip_invariants.py`](https://github.com/artiso-ai/dppvalidator/tree/main/tests/property/test_round_trip_invariants.py) +exercises this invariant against 200 generated examples per direction. + +## Limitations + +The shim *will not*: + +- Synthesise real content the input doesn't carry. A missing + `producedAtFacility` becomes only a placeholder; a missing + `materialType.code` becomes `"unspecified"`. +- Translate UNTP `performanceClaim` entries into CIRPASS LCA + results. The Phase 7 textile-v2 pilot extends this for the + microplastic / durability / recycled-content topics, but the + base shim drops every claim with a single `MAP001`. +- Translate CIRPASS `substancesOfConcern` into a UNTP base + representation, because there isn't one. Use the `dppvalidator-textiles` + plugin (when published) for textile-side substance tracking. + +## Round-trip identity check + +A forward+reverse cycle is the *identity* over the lossless subset, +but on a non-subset payload it will not recover the +original.
If you want to verify round-trip behaviour for your +specific input shape, run: + +```python +import json + +from dppvalidator.compat import to_cirpass_1_3, to_untp_0_7 + +# Load the UNTP payload; the context manager closes the file handle. +with open("untp.json", encoding="utf-8") as fh: + untp = json.load(fh) + +cirpass, _ = to_cirpass_1_3(untp) +recovered, _ = to_untp_0_7(cirpass) + +# Compare the lossless-subset fields +assert recovered["id"] == untp["id"] +assert recovered["validFrom"] == untp["validFrom"] +assert recovered.get("validUntil") == untp.get("validUntil") +assert recovered["credentialSubject"]["id"] == untp["credentialSubject"]["id"] +``` + +## See also + +- [`untp-cirpass-mapping.md`](../concepts/untp-cirpass-mapping.md) — + field-by-field mapping table. +- [`cirpass-2-alignment.md`](../concepts/cirpass-2-alignment.md) — + big-picture orientation. +- [`migration-0-6-to-0-7.md`](migration-0-6-to-0-7.md) — UNTP + intra-family upgrade (0.6 → 0.7). +- [CLI exit codes](../reference/cli/exit-codes.md). diff --git a/docs/plans/CIRPASS_2_MIGRATION.md b/docs/plans/CIRPASS_2_MIGRATION.md new file mode 100644 index 0000000..b29855f --- /dev/null +++ b/docs/plans/CIRPASS_2_MIGRATION.md @@ -0,0 +1,3434 @@ +# CIRPASS-2 / EUDPP Core Ontology — Development Plan + +> **Status:** Draft v3 (executable) · **Drafted:** 2026-05-08 +> **Target alignment:** EUDPP Core v1.9.1 (2026-03-04), LCA v1.9.4.Maki (2026-04-27), CIRPASS DPP reference structure v1.3.0 +> **Library start version:** dppvalidator 0.4.x (defaults: UNTP DPP 0.6.1; opt-in 0.7.0) +> **Canonical EUDPP IRI:** `https://w3id.org/eudpp` (CORE imports each module from this prefix) +> **Cardinal rules:** [`.claude/rules/untp-versioning.md`](../../.claude/rules/untp-versioning.md) — extended in §1.2 + +This is the executable archive for adapting `dppvalidator` to CIRPASS-2. +Each phase below is a sprint-sized work unit with numbered tasks, concrete +deliverables, tests, and machine-checkable exit criteria. Strategy +context that drove these choices is summarised in §1; the rest is +checklist. + +--- + +## 0.
At a glance + +**Two schema families, not three.** UNTP DPP and CIRPASS DPP reference +structure are validated families. EUDPP-LD is a *serialization format* +on top of either family — not a third family. + +**Release ladder.** + +| Release | Phases | Theme | +|---|---|---| +| `0.4.z` patches | Phase 0 → 2 | Additive: snapshot, vocab refresh, detection extension | +| `0.5.0` Preview | Phase 3 → 9 | CIRPASS family promoted; EUDPP at v1.9.1 / v1.9.4.Maki; pilots | +| `0.6.0` Stable | Phase 10 | Lock APIs, drop deprecated surfaces | +| `0.6.z` opportunistic | (§6.3 recipe) | Adopt IDENT/MAT/EVENT/COMP modules when published | + +**Critical path:** Phase 0 → 1 → 3 → 4 → 5 → 9 → 10. +Phases 2, 6, 7, 8 are parallelisable on side branches (see §6.2 DAG). + +**Total effort:** ~XL (~14–22 weeks, single owner). + +**Progress.** + +| Phase | Status | +|---|---| +| 0 — Snapshot & Pin | ✓ Complete (2026-05-08); D-0.3 verified live via W3ID resolver, 6/6 EUDPP GUIDs paired; 8 message-format placeholders deferred to Phase 3/7 — see §2 Phase 0 status block | +| 1 — Vocab refresh + namespace migration | ✓ **Fully closed** (2026-05-08); all 15 tasks + all 4 exit criteria met. 6 v1.9.x TTLs SHA-pinned, namespace rebased to the canonical `eudpp#` fragment, MANIFEST extended (5 superseded rows + 6 new), TERM_MAPPINGS content audit applied, all 5 EUDPP enums regenerated against new TTLs (incl. EUDPPDatatypeProperty rename round + 4 new v1.9.4-Maki named-individual enums for LCA), golden-diff EUDPP-LD audit gate live and passing — see §2 Phase 1 status block | +| 2 — Detection & registry extension | ✓ **Complete** (2026-05-08); all 12 tasks landed end-to-end. `SchemaFamily` enum + tuple-keyed registry source-of-truth + `DEFAULT_VERSIONS` per family + bare-string view as derived back-compat. CIRPASS 1.3.0 registered with placeholder SHA. Detection extended with `detect_schema_family()`, `detect_schema()`, `looks_like_dpp` + `is_untp_dpp` + `is_cirpass_dpp`. 
`DET001` family-mismatch code constant added. `compat.active_version(family=)` extended. 4 new test files (45 tests) covering family routing, ambiguity, registry back-compat, version matrix CIRPASS rows | +| 3 — CIRPASS reference-structure models | ✓ **Complete** (2026-05-08); all 14 tasks landed end-to-end. 9 model files under `models/cirpass/v1_3/` (`ReferencePassport`, `Product` + `Identifier` + `ClassificationCode`, `Actor` + `Facility` + `ActorRole` + `ActorRoleAssignment`, `Material` + `Composition`, `SubstanceOfConcern` + `Concentration` + `HazardClassification`, `LifeCycleAssessment` + `ImpactResult` + `ImpactCategoryReference`, `ConnectorRelation` + `RelationType`, `LocalisedText`, `EffectivePeriod` + `IssuedAt`). Pydantic-first JSON Schema derivation (43 KB output, SHA-pinned in registry + MANIFEST). Drift gate (`tools/codegen/check_drift.py`) live and green. Lazy-import contract pinned (CIRPASS not eagerly loaded by `import dppvalidator`). 9 fixtures (3 valid + 6 invalid) + 62 new tests | +| 4 — CIRPASS validators | ✓ **Complete** (2026-05-08); all 8 tasks + all 3 exit criteria met. 5 rule modules under `validators/rules/cirpass_v1_3/` (`base.py` 6× CR-rules; `substances.py` 4× SUB-rules; `lca.py` 4× LCS-rules; `actor.py` 4× ACT-rules; `connector.py` 3× REL-rules — 21 rules total, 100% non-colliding code prefixes per the plan's prefix-audit table). `_PIPELINE_BY_FAMILY` dispatch lands in `validators/engine.py`; `ModelValidator` and `SemanticValidator` extended with `family` axis (lazy CIRPASS imports preserve cold-start budget). Per-module SHACL infrastructure at `validators/shacl_cirpass.py` runs one pyshacl pass per EUDPP module (P_DPP / SOC / LCA / ACTOR / CON), each violation attributed ` v`; shape-graph loader `lru_cache`-d on `(family, module, version, sha256)`. Schema generator flipped to `mode='validation'` (numeric Decimal coercion at JSON Schema level); SHA pin re-derived (drift gate green). 
84 new tests across `tests/unit/test_cirpass_v1_3_rules.py` + 3 integration tests; CIRPASS-rule-module coverage 98%, full suite 2272 passed / 36 skipped, ruff clean, format clean | +| 5 — UNTP ↔ CIRPASS compat shims | ✓ **Complete** (2026-05-08); all 9 tasks + all 3 exit criteria met. 5 compat-layer files (`_mapping_codes.py`, `_untp_cirpass_map.py`, `_identifier_schemes.py`, `untp_0_7_to_cirpass_1_3.py`, `cirpass_1_3_to_untp_0_7.py`). 5 MAP-code constants + `MappingWarning` dataclass with 5 factory methods. 15-row declarative step table (M01…M15) drives both shims. Identifier-scheme lookup table covers GS1 GTIN/Digital Link, GLEIF LEI, ISO/IEC 15459, EORI, EUID, DUNS, WCO HS, EU TARIC, EU CPV. Lossless-subset reference doc published at [docs/concepts/untp-cirpass-mapping.md](../concepts/untp-cirpass-mapping.md). 73 new tests (29 unit + 25 integration + 4 property + 15 scheme-table); property test green at 200 examples both directions; full suite 2345 passed / 36 skipped, ruff clean, format clean, ty clean (compat package), coverage 91.69 % project-wide | +| 6 — Exporters & CLI surface | ✓ **Complete** (2026-05-08); all 7 tasks + both exit criteria met. New `exporters/cirpass_jsonld.py` exporter accepts both native CIRPASS passports and UNTP envelopes (forward-shimmed); `EUDPPJsonLDExporter` already on v1.9.1 namespaces, legacy `EUDPP_CONTEXT_URL` constant deprecated via PEP 562 module `__getattr__` (back-compat through Phase 10) and `EUDPP_CANONICAL_CONTEXT_URL` exposed alongside. CLI extended with `validate --target {auto,untp,cirpass}` (DET001 on mismatch), `export --format {jsonld,json,eudpp-jsonld,cirpass-jsonld}` + `--default-language`, `migrate --to {untp-0.7,cirpass-1.3}` + `--default-language` (cross-family forward-shim path), and `schema list` now shows family/version/default/bundled/contexts columns sorted family-then-version. 
Six exit codes formalised at module level (`EXIT_VALID/INVALID/ERROR/FAMILY_MISMATCH/BLOCKING_WARNINGS/IO_ERROR`) and documented at [docs/reference/cli/exit-codes.md](../reference/cli/exit-codes.md). 48 new tests across `test_cli_cirpass.py` (16) + `test_cli_back_compat.py` (19) + `test_cli_export_matrix.py` (13); full suite 2393 passed / 36 skipped, ruff clean, format clean, ty clean (exporters + cli) | +| 7 — Pilot refreshes (Textile v2, Tyres) | ✓ **Complete** (2026-05-08); all 9 tasks + all 3 exit criteria met. New built-in `validators/rules/v0_7/textile_v2.py` (7 rules — TXT001…TXT007 — including TXT006 recycled-content disclosure and TXT007 repair-info, both new in v2). `--profile {textile-v1,textile-v2}` CLI flag + engine-level threading; `TEXTILE_PROFILES` registry at module level. New `plugins/tyres/` GPL-3.0-or-later plugin (`dppvalidator-tyres==0.1.0`, marked Pre-1.0 / Experimental) with 4 GDSO declaration models (Birth v0.9, Collection v0.1, Retread v0.1, Recycling v0.1) + `TyreLifecycleHistory` aggregate enforcing UUID-chain / chronological-order / single-Recycling invariants. 8 TYR-coded validators auto-registered via entry-points + a CSV exporter. Phase 7.9 CI gate `tools/check_imports.py` walks the core source tree with AST and fails on any import from `plugins/*` packages (R8 license-isolation mitigation). 75 new tests across `tests/plugins/tyres/test_tyres_models.py` (22) + `test_tyres_validators.py` (29) + `test_tyres_pipeline.py` (7) + `tests/plugins/test_license_isolation.py` (5) + `tests/integration/test_textile_profiles.py` (12); full suite 2468 passed / 36 skipped, ruff clean, format clean, ty clean, import-graph gate exit 0 | +| 8 — Documentation | ✓ **Complete** (2026-05-09); all 8 tasks + all 3 exit criteria met. New concept doc [`cirpass-2-alignment.md`](../concepts/cirpass-2-alignment.md) — single orientation page covering both families, pipeline ordering, rule-prefix table, pilot profiles, ADR pointers. 
New user-facing guide [`migrate-untp-to-cirpass.md`](../guides/migrate-untp-to-cirpass.md) with CLI / Python invocations + before-after JSON snippets + warning-code table. New [`reference/cirpass/index.md`](../reference/cirpass/index.md) auto-generated from the CIRPASS Pydantic models via mkdocstrings. Finalised [`eudpp-1.9-changelog.md`](../concepts/eudpp-1.9-changelog.md) and [`untp-cirpass-mapping.md`](../concepts/untp-cirpass-mapping.md) (lifted from "Phase 1 scaffold" / "Phase 5 reference" to final). [`README.md`](../../README.md) "Supported specs" matrix now shows two families (UNTP DPP 0.6.0/0.6.1/0.7.0 + CIRPASS 1.3.0), the migration shims, the pilot profiles + plugins, and a reading guide. Two new ADRs ([0004](../adr/0004-textile-v2-built-in.md) — textile v2 ships built-in; [0005](../adr/0005-cli-exit-codes.md) — six-code CLI exit surface). `mkdocs.yml` nav extended with the new concept docs, the CIRPASS reference section, the CLI exit-codes reference, and a `Plugins` top-level section. Cross-tree relative links (28 references to `src/…`, `tools/…`, `tests/…`, `plugins/…`, `.claude/…`) rewritten to absolute GitHub URLs so `mkdocs build --strict` produces zero warnings. Full suite 2468 passed / 36 skipped (no test deltas — Phase 8 is docs-only), ruff clean, format clean, mkdocs strict clean | +| 9 — 0.5.0 Preview release cut | ✓ **10/11 tasks complete** (2026-05-09); 9.6 PyPI publish reserved for release manager. UNTP default flipped to 0.7.0. CHANGELOG 0.5.0 entry authored. D1 BLOCKER closed (statusListIndex int with v0.6 back-compat coercion). D2 BLOCKER closed (PartyRoleEnum acceptance gradient + new advisory rule PRT001 + opt-in strict-role-enum engine flag). 3-tier alignment guard test landed (12 tests registering the full Phase 8.9 baseline). 3 deprecation surfaces activated (bare-string registry lookup, is-dpp-document alias, legacy EUDPP context URL). Cross-version regression baseline 101/101 green. UAT U1, U2, U3, U4 manually verified. 
Non-breaking for v0.6.x fixtures and CIRPASS round-trips. Full suite 2525 passed / 36 skipped (+57 new tests vs Phase 8), ruff clean, format clean, ty clean, mkdocs strict clean, error-doc coverage 96 of 96 | +| 10 | ⌛ Not started | + +--- + +## 1. Foundations + +### 1.1 Glossary (read first) + +- **UNTP DPP** — UN/CEFACT Verifiable Credential message format. + Versions 0.6.0, 0.6.1, 0.7.0. Currently bundled. +- **CIRPASS-2** — EU project producing the EUDPP ontology and a + hierarchical reference-structure message (v1.3.0). +- **EUDPP** — EU DPP Core Ontology. OWL ontology, modules + P_DPP / SOC / LCA / ACTOR / CON / CORE imported from + `https://w3id.org/eudpp/`. Defines axioms, not a wire format. +- **EUDPP-LD** — JSON-LD serialization re-keying any compatible + payload onto canonical EUDPP class IRIs. *Format*, not family. + +When this plan says **family**, it means UNTP or CIRPASS. + +### 1.2 Cardinal rules — extension to the existing five + +The five rules at [`.claude/rules/untp-versioning.md`](../../.claude/rules/untp-versioning.md) +apply unchanged. Extensions for CIRPASS: + +1. **No bare CIRPASS or EUDPP-module version literals** outside + `schemas/registry.py`, `exporters/contexts.py`, and + `schemas/data/MANIFEST.json`. Update + `tests/unit/test_no_version_literals.py` to forbid `"1.3.0"`, + `"1.9.1"`, `"1.9.4-maki"` in arbitrary code (Phase 1 task). +2. **CIRPASS models are version-namespaced** under + `models/cirpass/v1_3/`, parallel to `models/v0_6/` and `models/v0_7/`. +3. **Detection stays centralised** in `validators/detection.py`. + New `detect_schema_family()` lives there; nowhere else branches on + family. +4. **Family dispatch is table-driven** via `_PIPELINE_BY_FAMILY` in + `validators/engine.py`, mirroring the cardinal `_MODEL_BY_VERSION` + pattern. No `if family == ...` outside the table. +5. **Coexist before you cut.** UNTP 0.6 + 0.7 + CIRPASS 1.3 all + validate in `0.5.0`. Removals are a separate minor. 
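
Rule 4 (table-driven family dispatch) can be sketched as below. Only the names `SchemaFamily` and `_PIPELINE_BY_FAMILY` come from this plan; the pipeline functions, their return values, and `run_pipeline` are hypothetical placeholders rather than the engine's actual code.

```python
from enum import Enum


class SchemaFamily(Enum):
    UNTP = "untp"
    CIRPASS = "cirpass"


def _validate_untp(payload):
    # Placeholder: stands in for the UNTP validation layers.
    return ["untp:model", "untp:semantic"]


def _validate_cirpass(payload):
    # Placeholder: stands in for the CIRPASS validation layers.
    return ["cirpass:model", "cirpass:semantic", "cirpass:shacl"]


# The single place where family maps to behaviour.
_PIPELINE_BY_FAMILY = {
    SchemaFamily.UNTP: _validate_untp,
    SchemaFamily.CIRPASS: _validate_cirpass,
}


def run_pipeline(family, payload):
    """Look the pipeline up in the table; never branch on family inline."""
    try:
        pipeline = _PIPELINE_BY_FAMILY[family]
    except KeyError:
        raise ValueError(f"no pipeline registered for {family!r}") from None
    return pipeline(payload)
```

Adding a family then means adding one table row, and a grep for `if family ==` outside this module can serve as a cheap conformance check.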
+ +### 1.3 Locked decisions + +- **D-0.1** CIRPASS reference-structure JSON Schema is *derived* + from the hub's tree-view export by + `tools/codegen/cirpass/derive_schema.py` (the hub publishes ~2 + JSON Schemas in total; v1.3.0 is not among them). +- **D-0.3** EUDPP IRIs rebase to the canonical + `https://w3id.org/eudpp/...` prefix. Phase 0 verifies dereferencing + before Phase 1 starts; failure escalates as R12. +- **D-naming** Keep `eudpp-jsonld` (ontology-aligned). Add + `cirpass-jsonld` (CIRPASS reference structure as JSON-LD). They + are different artefacts. +- **D-default-family** UNTP remains the detection fallback through + `0.6.x`. + +### 1.4 Open action items + +- **OA-1** Confirm GDSO Tyre data-model license. Owner: human + reviewer. Deadline: Phase 0 close. Default: GPL-3.0 (mirrors + `plugins/textiles/`). +- **OA-2** Battery Pass alignment — out of scope. Tracked in + follow-on `docs/plans/BATTERY_PASS_INTEGRATION.md`. + +--- + +## 2. Phase-by-phase plan + +Each phase declares: **Goal** · **Effort** · **Depends on** · **Ships in**, +followed by **Tasks**, **Deliverables**, **Tests**, **Exit criteria**. +Task IDs are stable for cross-references in PR descriptions. + +--- + +### Phase 0 — Snapshot & Pin + +**Goal:** Freeze a reproducible spec snapshot before any code moves. +**Effort:** S (~2 days) · **Depends on:** — · **Ships in:** `0.4.z` patch. + +Status legend for tasks: ✓ engineer-side complete · ⏳ scaffold ready, +operator-gated · ⊘ blocked. + +**Tasks** + +- ✓ **0.1** Author [`tools/snapshot/fetch_cirpass.py`](../../tools/snapshot/fetch_cirpass.py) + — given a list of vocab-hub GUIDs, downloads each export, computes + SHA-256, prints rows in `MANIFEST.json` shape. Stdlib-only; runs from + a clean checkout. Modes: `--list` / `--fetch` / `--verify-canonical`. + Lint, type-check, format, and offline smoke (`--list`, `--fetch` with + all-placeholder) all green. +- ⏳ **0.2** Enumerate all in-scope GUIDs. 
Scaffold landed at + [`tools/snapshot/cirpass2_artefacts.json`](../../tools/snapshot/cirpass2_artefacts.json) + with 14 rows (P_DPP, SOC, LCA, CORE, ACTOR, CON; CIRPASS reference + structure v1.3.0; MVP Textile DPP v2; the four Tyre declarations; + Tyre Lifecycle History v1; GDSO Ambassador Data Models v1). + *Operator update (2026-05-08):* the six EUDPP-ontology rows are + paired with real `OntologyVersion_` GUIDs. Eight message-format + rows (CIRPASS reference structure, MVP Textile, GDSO Ambassador, four + Tyre declarations, Tyre Lifecycle History) remain `TODO_*` — + message endpoints differ from the ontology endpoint pattern and need + operator pairing via the hub tree-view UI, per + [`tools/snapshot/README.md`](../../tools/snapshot/README.md). +- ⏳ **0.3** Run the fetcher. Gated on operator GUID pairing (task 0.2) + and live network access. Verbatim downloads will land under + gitignored `tools/snapshot/cirpass-2/` (path added to root + `.gitignore`). +- ⏳ **0.4** Verify `https://w3id.org/eudpp/...` IRIs dereference. The + `--verify-canonical` mode is implemented; gated on live network. D-0.3 + gate; failure escalates as R12. +- ⏳ **0.5** Confirm GDSO Tyre license (OA-1). Default GPL-3.0 captured + in ADR 0003 (`Proposed`); promoted to `Accepted` once a human + reviewer confirms. +- ✓ **0.6** Author the snapshot doc at + [`docs/concepts/cirpass-2-spec-snapshot.md`](../concepts/cirpass-2-spec-snapshot.md) + with one row per artefact (family / module / version / GUID / SHA / + retrieval date / retriever / canonical IRI / status). v0 draft + (2026-05-08) has 14 `planned` rows; statuses promote to `pinned` / + `vendored` as Phase 0 / 1 close. +- ✓ **0.7** Capture D-0.1, D-0.3, OA-1 as ADRs: + [0001 (Accepted)](../adr/0001-cirpass-json-schema-derivation.md), + [0002 (Accepted)](../adr/0002-canonical-eudpp-iri.md), + [0003 (Proposed)](../adr/0003-tyre-license.md). Index at + [docs/adr/README.md](../adr/README.md). 
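The pinning step in tasks 0.1 to 0.3 can be sketched as follows. This is an offline illustration under stated assumptions: the row keys `name`, `sha256`, and `bytes`, the input key `guid`, and the helper names are hypothetical, and no network call is made. Only the `TODO_` placeholder prefix, the `vocabulary_hub_guid` field name, and the "exit 1 when nothing is paired" behaviour are taken from this plan.

```python
import hashlib

PLACEHOLDER_PREFIX = "TODO_"  # unpaired rows carry a TODO_* GUID


def manifest_row(name: str, guid: str, payload: bytes) -> dict:
    """Build one manifest-shaped row for a downloaded artefact.

    Key names other than vocabulary_hub_guid are assumptions.
    """
    return {
        "name": name,
        "vocabulary_hub_guid": guid,
        "sha256": hashlib.sha256(payload).hexdigest(),
        "bytes": len(payload),
    }


def unpaired(artefacts: list[dict]) -> list[str]:
    """Names of artefacts whose GUID is still a placeholder."""
    return [a["name"] for a in artefacts if a["guid"].startswith(PLACEHOLDER_PREFIX)]


def fetch_exit_code(artefacts: list[dict]) -> int:
    """Return 1 when no GUID is paired yet, otherwise 0."""
    paired = [a for a in artefacts if not a["guid"].startswith(PLACEHOLDER_PREFIX)]
    return 0 if paired else 1
```

Because the digest is computed over the exact bytes retrieved, rerunning the fetch against an unchanged upstream export yields the same row, which is what makes the snapshot reproducible.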
+ +**Deliverables** + +- ✓ [`tools/snapshot/fetch_cirpass.py`](../../tools/snapshot/fetch_cirpass.py) +- ✓ [`tools/snapshot/cirpass2_artefacts.json`](../../tools/snapshot/cirpass2_artefacts.json) + (14 rows; placeholder GUIDs) +- ✓ [`tools/snapshot/README.md`](../../tools/snapshot/README.md) +- ✓ [`.gitignore`](../../.gitignore) extended (`tools/snapshot/cirpass-2/`, + `tools/snapshot/manifest-rows.json`) +- ✓ [`docs/concepts/cirpass-2-spec-snapshot.md`](../concepts/cirpass-2-spec-snapshot.md) +- ✓ [`docs/adr/README.md`](../adr/README.md) — ADR index +- ✓ [`docs/adr/0001-cirpass-json-schema-derivation.md`](../adr/0001-cirpass-json-schema-derivation.md) +- ✓ [`docs/adr/0002-canonical-eudpp-iri.md`](../adr/0002-canonical-eudpp-iri.md) +- ✓ [`docs/adr/0003-tyre-license.md`](../adr/0003-tyre-license.md) + (Proposed, pending OA-1) + +**Tests** — none yet (no library code touched). Smoke-test of the new +tooling: `python tools/snapshot/fetch_cirpass.py --list` exits 0; +`python tools/snapshot/fetch_cirpass.py --fetch` (all placeholders) exits +1 with "no GUIDs paired yet" message. `uv run ruff check` and +`uv run ty check` clean. + +**Exit criteria** + +- [ ] All artefacts retrievable by SHA from a clean checkout via the + fetcher. *(Tooling ready ✓ · operator must pair 14 GUIDs ⏳)* +- [ ] D-0.3 verification step passes (or R12 escalated to a blocker). + *(`--verify-canonical` ready ✓ · live-network run ⏳)* +- [ ] OA-1 closed; license recorded for Phase 7. *(ADR 0003 Proposed ✓ · + human reviewer must confirm ⏳)* +- [ ] Snapshot doc reviewed by a second pair of eyes. *(v0 draft ✓ · + reviewer ⏳)* + +#### Phase 0 status — 2026-05-08 + +**Engineer-side complete.** All seven tasks have either landed +(`0.1`, `0.6`, `0.7`) or shipped a complete scaffold ready for the +operator hand-off (`0.2`, `0.3`, `0.4`, `0.5`). Every deliverable is in +the repo; the fetcher is lint-, type-, and format-clean and exits with +correct codes in both offline-`--list` and `--fetch` smoke runs. 
+ +**Operator hand-off checklist** (in order): + +1. Pair the 14 `TODO_*` GUIDs in [`tools/snapshot/cirpass2_artefacts.json`](../../tools/snapshot/cirpass2_artefacts.json) against the hub UI — workflow documented in [`tools/snapshot/README.md`](../../tools/snapshot/README.md). +2. `python tools/snapshot/fetch_cirpass.py --list` — confirm zero placeholder rows remain. +3. `python tools/snapshot/fetch_cirpass.py --verify-canonical` — D-0.3 gate; on non-zero exit, escalate as R12 (Phase 1 blocker). +4. `python tools/snapshot/fetch_cirpass.py --fetch > tools/snapshot/manifest-rows.json` — pin SHAs; emitted rows feed Phase 1 task 1.3. +5. Confirm GDSO Tyre license (OA-1); promote ADR 0003 to `Accepted` *or* author a superseding ADR. +6. Have a second engineer review [`docs/concepts/cirpass-2-spec-snapshot.md`](../concepts/cirpass-2-spec-snapshot.md) and tick the final exit-criteria box. + +Once all four exit-criteria boxes flip to `[x]`, Phase 1 starts. + +--- + +### Phase 1 — Vocabulary refresh + namespace migration + +**Goal:** Bundled EUDPP vocabularies at v1.9.1 / v1.9.4-Maki; namespaces +rebased onto `https://w3id.org/eudpp/`; term mappings refreshed. +**Effort:** L (~1.5 weeks) · **Depends on:** Phase 0 · **Ships in:** `0.4.z` patch. + +**Why coordinated.** Two rewrites in one window: version bump for ~5 +modules + 1 new module (CON), *and* IRI rebase across 170 mapping rows, +5 enum modules, and the JSON-LD context. Splitting risks shipping +v1.9.1 while still emitting `taltech.ee` IRIs. + +Status legend for tasks: ✓ engineer-side complete · ⏳ scaffold ready, +operator-gated · ⊘ blocked. 
+ +**Tasks** + +- ✓ **1.1** Vendored 6 TTLs at `vocabularies/data/ontologies/`: + `product_dpp_v1.9.1.ttl` (35040 b · sha256 224c9cd4…), + `soc_v1.9.1.ttl` (12870 b · 47cd3400…), `actors_roles_v1.9.1.ttl` + (29485 b · 42c27413…), `connector_v1.9.1.ttl` (7425 b · b7d28519… + *new module*), `eudpp_core_v1.9.1.ttl` (2169 b · 2fc00b15…), + `lca_v1.9.4_Maki.ttl` (91511 b · fcc0bf86…). Bytes pulled via the + Phase 0 fetcher; LF-normalised SHA pinning matches the integrity + test's verifier. +- ✓ **1.2** Extend `MANIFEST.json` schema with `family`, `module`, + `vocabulary_hub_guid`, `superseded_by` fields. Schema docstring at + [`schemas/data/MANIFEST.json`](../../src/dppvalidator/schemas/data/MANIFEST.json) + describes the extension; existing 7 rows annotated `family: "untp"`. + Required-field set unchanged so the integrity test stays green. +- ✓ **1.3** Merged 6 new manifest rows for the v1.9.x TTLs and tagged + the 5 pre-existing pre-1.9 EUDPP TTLs with `superseded_by` markers + pointing at the new rows. Total `MANIFEST.json` row count: 18 (7 + UNTP + 5 superseded EUDPP + 6 new v1.9.x EUDPP). +- ✓ **1.4** Rebased [`EUDPPNamespace`](../../src/dppvalidator/vocabularies/ontology.py) + onto the **canonical fragment** namespace `https://w3id.org/eudpp#`. + *Correction:* the prior Phase 1 v1 design exposed per-module path + prefixes (P_DPP / SOC / ACTOR / CON / LCA as `…/p_dpp/`, etc.); the + vendored v1.9.1 TTLs revealed every EUDPP class/property IRI lives + in a *single* fragment namespace (`#`-suffix), so the per-module + enum members were dropped. `LCA_NAMESPACE` in + [`eudpp_lca.py`](../../src/dppvalidator/vocabularies/eudpp_lca.py) + matches; the `lca:` compact prefix is a human-readability alias for + `eudpp:`. +- ✓ **1.5** Deleted the `CIRPASSNamespace = EUDPPNamespace` alias and + the deprecated function aliases (`compact_cirpass_uri`, + `expand_cirpass_uri`, `get_cirpass_context`). Tests updated to use + the canonical `_eudpp_` surface. (G6 fix.) 
Plus a guard + (`test_deleted_aliases_not_reintroduced`) prevents reintroduction. +- ✓ **1.6** Row-by-row audit of `TERM_MAPPINGS`. *Structural ✓:* + `cirpass_v1_3: str | None = None` column added; `term_for(version, + family=…)`, `find_mapping_for_term`, `_index_for_version`, + `mapped_terms_for` extended with keyword-only `family`. *Content + audit ✓:* one rename applied (`uniqueProductID` → + `uniqueProductIdentifier` per the P_DPP v1.9.1 spec), one retarget + (`facilityID` → `Facility` class, since the property moved to ACTOR + module), and two predicates (`hasMaterialProvenance`, + `hasPerformanceClaim`) annotated as transitionally-removed via + `TRANSITIONAL_EUDPP_REMOVED_IN_V1_9` (rows kept for v0.6↔v0.7 UNTP + rename round-trip; resolution test skips them). +- ✓ **1.7** Regenerated `EUDPPClass` enum against P_DPP v1.9.1 via + [`regenerate_enums.py`](../../tools/codegen/cirpass/regenerate_enums.py). + Spec changes applied: `Document` → `DocumentFormattedProperty` + (renamed); two Python identifiers normalised + (`CONCENTRATION_OF_SOC` → `CONCENTRATION_OF_SUBSTANCE_OF_CONCERN`, + `THRESHOLD_OF_SOC` → `THRESHOLD_OF_SUBSTANCE_OF_CONCERN`); + 44 classes total. Consumer code updated: + [eudpp_classes.py:_class_uri](../../src/dppvalidator/vocabularies/eudpp_classes.py) + for `Document`, `EUDPP_CLASS_HIERARCHY` registry key, + 4 test assertions in + [test_eudpp_classes.py](../../tests/unit/test_eudpp_classes.py). +- ✓ **1.8** Regenerated `EUDPPActorClass` + `EUDPPRoleClass` against + ACTOR v1.9.1. Two new actor classes added (`ActorRoleAssignment`, + `AuthorisedRepresentationAssignment` — first-class assignment + relationships). Two new super-role classes added + (`CircularEconomyRole`, `ConformityAssessmentRole`). The 24 + ESPR-finer-grained legacy role IRIs are kept (24 dataclasses depend + on them) and annotated `# −1.9.1` to flag their absence in the + v1.9.1 TTL. 
+- ✓ **1.9** SOC v1.9.1 enum check: existing `EUDPPSubstanceClass` four + members exactly match v1.9.1; no diff. `HazardCategory` and + `LifeCycleStage` are project-defined string enumerations (not + OWL classes in the SOC TTL) and unaffected. Header docstring + updated to record the audit. +- ✓ **1.10** Regenerated `LCAClass` + new `LCIAImpactCategory` against + LCA v1.9.4-Maki. Major module restructuring: 11 v2.0 `lca:Underscore_Style` + classes have no v1.9.4-Maki equivalent (kept as `# −1.9.4-Maki` + legacy entries — 8 internal dataclasses depend on them); 33 new + `eudpp:CamelCase` classes added (`EN15804ImpactIndicator`, + `EPDDocument`, `LCAStudy`, `LCIAImpactCategory`, `PCR`, `PEFCR`, + `Review`, …). Plus four new `owl:NamedIndividual`-backed enums + capturing v1.9.4-Maki's spec individuals: `LCIAImpactCategory` (16 + PEF/EN15804+A2 categories: eutrophication consolidates 3 v2.0 + categories, human toxicity consolidates cancer/non-cancer, etc.), + `EN15804IndicatorGroup` (6 groups), `ComplianceStatus` (3), + `TypeOfReview` (6). The v1.9.4-Maki introduces EN 15804 + EPD + framing with much greater granularity; Phase 4 (LCA validation) + wires up specific consumers. +- ✓ **1.11** Regenerated *both* `EUDPPObjectProperty` and + `EUDPPDatatypeProperty` against CORE + CON + ACTOR + SOC + P_DPP + v1.9.1. Object properties: 5 P_DPP→CON moves confirmed (IRIs + unchanged: `containsSubstanceOfConcern`, `hasEconomicOperator`, + `hasBackUpCopyHost`, `hasIssuer`, `hasManufacturer`); 2 ACTOR-new + added (`hasActor`, `hasRepresentativeMandate`); 5 CON-new added + (`isConnectedTo`, `inContextOfActivity`, `inContextOfDPP`, + `inContextOfProduct`, `representsManufacturerForProduct`). 
+ Datatype properties: 3 IRI renames (`uniqueOperatorID` → + `uniqueOperatorIdentifier`, `uniqueFacilityID` → + `uniqueFacilityIdentifier`, `uniqueProductID` → + `uniqueProductIdentifier`); 2 removals annotated `# −1.9.1` + (`facilityID` removed; `electronicContact`/`postalAddress` + consolidated into Actor modelling); 5 additions: + `assignmentValidFrom`, `assignmentValidTo`, plus three + identifier-scheme datatype properties + (`uniqueProductIdentifierScheme`, `uniqueFacilityIdentifierScheme`, + `uniqueOperatorIdentifierScheme`). The 2 internal `DatatypePropertyDefinition` + rows for the renamed IRIs were updated in lockstep. LCA + v1.9.4-Maki ~30 new properties deferred to Phase 4 (LCA validation + wire-up). +- ⏳ **1.12** Bundle new `vocabularies/data/eudpp-context-v1.9.1.jsonld` + if/when the hub publishes a paired JSON-LD context for v1.9.1; the + v1.9.1 TTLs were vendored without a paired context (the spec + listing's CIRPASS-2 vocabularies group exposes ontology TTLs only, + not contexts). The current `get_eudpp_context()` already emits + canonical IRIs and is sufficient for EUDPP-LD export. +- ✓ **1.13** [`exporters/eudpp_jsonld.py`](../../src/dppvalidator/exporters/eudpp_jsonld.py) + emits canonical IRIs by default automatically — `get_eudpp_jsonld_context()` + delegates to `get_eudpp_context()`, which 1.4 rebased. Docstring + updated; the unreferenced legacy `EUDPP_CONTEXT_URL` constant is + scheduled for Phase 10 task 10.2 removal. +- ✓ **1.14** Updated docstring of + [`tests/unit/test_no_version_literals.py`](../../tests/unit/test_no_version_literals.py) + to declare CIRPASS / EUDPP-module versions in scope. The existing + ``"\d+\.\d+\.\d+"`` regex catches `"1.3.0"` / `"1.9.1"` automatically, + so no regex change. Test green. +- ✓ **1.15** Authored + [`docs/concepts/eudpp-1.9-changelog.md`](../concepts/eudpp-1.9-changelog.md) + scaffold — namespace + schema-extension changes documented; the + per-term diff table populates as 1.6 audit progresses. 
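The single fragment namespace from task 1.4 makes compaction and expansion straightforward. A minimal sketch, assuming hypothetical helper names `expand` and `compact` (the project's own helpers are not shown in this plan); the namespace string and the `eudpp:` / `lca:` prefixes are the ones stated above.

```python
EUDPP_NAMESPACE = "https://w3id.org/eudpp#"

# Both compact prefixes resolve to the same fragment namespace;
# lca: is only a human-readability alias for eudpp:.
PREFIXES = {"eudpp": EUDPP_NAMESPACE, "lca": EUDPP_NAMESPACE}


def expand(curie: str) -> str:
    """Expand a compact IRI such as eudpp:Product; pass anything else through."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return curie


def compact(iri: str, preferred: str = "eudpp") -> str:
    """Compact a canonical EUDPP IRI onto the preferred prefix."""
    namespace = PREFIXES[preferred]
    if iri.startswith(namespace):
        return f"{preferred}:{iri[len(namespace):]}"
    return iri  # legacy or foreign IRIs are left untouched
```

One consequence of the alias is that a round trip normalises the prefix: `compact(expand("lca:LCAStudy"))` yields `eudpp:LCAStudy`, so equality checks are safer on expanded IRIs than on compact forms.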
+ +**Deliverables** + +- ✓ Rebased `EUDPPNamespace` (six W3ID-rooted module members) + + `get_eudpp_context()` (per-module compact prefixes). +- ✓ `LCA_NAMESPACE` rebased to canonical W3ID prefix. +- ✓ `TermMapping.cirpass_v1_3` column + family-aware indexing. +- ✓ `MANIFEST.json` schema extended (forward-compatible). +- ✓ `eudpp_jsonld.py` docstring updated; emits canonical IRIs. +- ✓ Header docstrings in `eudpp_*.py` flag v1.9.1 / v1.9.4-Maki target. +- ✓ Consumer-facing changelog scaffold at + [`docs/concepts/eudpp-1.9-changelog.md`](../concepts/eudpp-1.9-changelog.md). +- ⏳ 6 new TTLs + 6 manifest rows + 6 supersession markers + new + context bundle (Phase 1 bytes-dependent half). + +**Tests** + +- ✓ `tests/unit/test_manifest_integrity.py` — green; integrity gate + unaffected by schema extension. Phase 1 added + `test_optional_phase1_fields_have_known_shapes`: validates `family ∈ + {untp, cirpass, eudpp-ontology}`, `module` UPPER_SNAKE shape, + `vocabulary_hub_guid` matches + `(OntologyVersion|JsonSchemaVersion|JsonSchemaSpecVersion|TODO)_`, + and `superseded_by` references an existing manifest key. +- ✓ [`tests/unit/test_eudpp_term_mapping.py`](../../tests/unit/test_eudpp_term_mapping.py) + (new) — two guards. `test_term_mappings_table_uses_canonical_compact_prefixes` + runs unconditionally and verifies every row's `cirpass_uri` uses a + legal compact prefix (catches typos like `eupdd:Foo`). + `test_every_eudpp_term_mapping_resolves_in_bundled_ttl` is the + strict gate that activates the moment a v1.9.x TTL lands; until + then, `pytest.skip` with a message pointing at task 1.1. 
+- ✓ [`tests/unit/test_namespace_canonicality.py`](../../tests/unit/test_namespace_canonicality.py) + (new) — four guards: enum members W3ID-rooted; `get_eudpp_context()` + emits canonical per-module prefixes; no `dpp.taltech.ee` / + `dpp.cea.fr` references in committed Python source (with two + documented doc-exempt files); deleted `_cirpass_` aliases / + `CIRPASSNamespace` not re-introduced (G6 defence-in-depth). +- ✓ Updated assertions in + [`tests/unit/test_ontology_alignment.py`](../../tests/unit/test_ontology_alignment.py) + and [`tests/unit/test_eudpp_lca.py`](../../tests/unit/test_eudpp_lca.py) + for canonical IRIs; added longest-prefix-wins compaction test for + module-scoped IRIs. +- ⏳ `tests/integration/test_eudpp_export_v1_9.py` — golden-snapshot + diff (gated on 1.1). + +**Quality gates run on 2026-05-08 (latest, post-vendor):** `uv run pytest +tests/` 2012 passed / 36 skipped — every test suite green including +`test_manifest_integrity.py` (the integrity hashes verify against the +6 newly-vendored TTLs), `test_namespace_canonicality.py` (4 guards: +fragment IRI binding, deleted-alias re-introduction guard, legacy-host +guard, context-emission guard), `test_eudpp_term_mapping.py` (strict +gate ACTIVE — every TERM_MAPPINGS row resolves against the bundled +v1.9.1 RDF graph or is in the transitional allow-list); +`uv run ruff check src/ tests/ tools/` clean; +`uv run ruff format --check src/ tests/ tools/` clean; +`uv run ty check` clean on modified files. + +**Workstream X1 force-multiplier landed (2026-05-08):** + +- ✓ [`tools/codegen/cirpass/regenerate_enums.py`](../../tools/codegen/cirpass/regenerate_enums.py) + — single-command codegen for tasks 1.7–1.11. Reads any v1.9.x EUDPP + TTL via rdflib, extracts the requested OWL element type (Class / + ObjectProperty / DatatypeProperty), emits a deterministic + alphabetically-sorted Python `Enum` with `# generated-from: + @` provenance header. 
CamelCase / Snake_Case / acronym + shapes all convert correctly. +- ✓ [`tools/codegen/cirpass/README.md`](../../tools/codegen/cirpass/README.md) + — operator workflow including verbatim invocations for each of + tasks 1.7 → 1.11. +- ✓ [`tests/unit/test_codegen_regenerate_enums.py`](../../tests/unit/test_codegen_regenerate_enums.py) + — five tests cover naming converter, IRI extraction, determinism, + generated-Python validity, alphabetical sort. Exercises the tool + against the existing v1.7.1 TTL as fixture (5 ✓). + +**Exit criteria** + +- [x] `uv run pytest tests/unit/test_manifest_integrity.py + tests/unit/test_eudpp_term_mapping.py + tests/unit/test_namespace_canonicality.py` green. *(all three ✓ + with v1.9.x TTLs vendored, integrity hashes verified, term + mappings resolved against the bundled graph)* +- [x] All 5 EUDPP enums regenerated against v1.9.x TTLs with + additions/removals/renames annotated. *(tasks 1.7–1.11 ✓; full + diff in + [docs/concepts/eudpp-1.9-changelog.md](../concepts/eudpp-1.9-changelog.md))* +- [x] Golden EUDPP-LD diff approved by reviewer. *(✓ test scaffold + authored at + [tests/integration/test_eudpp_export_v1_9.py](../../tests/integration/test_eudpp_export_v1_9.py), + golden snapshot captured at + [tests/fixtures/golden/eudpp_ld_export__untp_v0_7.json](../../tests/fixtures/golden/eudpp_ld_export__untp_v0_7.json), + bit-stable round-trip confirmed; canonical-IRI assertion gate + passes — no `dpp.taltech.ee` / `dpp.cea.fr` references in the + EUDPP-LD output)* +- [x] `tests/unit/test_no_version_literals.py` extended and green. 
+ *(✓ docstring extended; regex unchanged; test green)* + +#### Phase 1 status — 2026-05-08 (final, complete) + +**Phase 1 fully closed.** All 15 tasks + all 4 exit criteria met +across three sessions: + +- *Session 1:* MANIFEST schema extension, namespace rebase (Phase 1 v1 + with per-module path prefixes), alias deletion, TermMapping + structural extension, exporter docstring, literal-guard docstring, + changelog scaffold, namespace-canonicality test, deleted-alias guard. +- *Session 2:* Live D-0.3 verification via W3ID resolver, 6 TTLs + fetched and SHA-pinned, MANIFEST extended with new + superseded + rows, namespace re-corrected from path-style to fragment-style + (matches actual TTL bytes), TERM_MAPPINGS content audit (1 rename, + 1 retarget, 2 transitional), 5 EUDPP class-level enums regenerated + against the new TTLs with `# +1.9.1`/`# −1.9.1` annotations, full + per-class diff table populated in + [eudpp-1.9-changelog.md](../concepts/eudpp-1.9-changelog.md). +- *Session 3:* `EUDPPDatatypeProperty` regenerated (3 IRI renames + + 5 additions); `LCIAImpactCategory`, `EN15804IndicatorGroup`, + `ComplianceStatus`, `TypeOfReview` enums added (extracted via + rdflib from v1.9.4-Maki `owl:NamedIndividual` declarations — the + codegen tool only handles `owl:Class`/`*Property`); golden-diff + EUDPP-LD audit gate live at + [tests/integration/test_eudpp_export_v1_9.py](../../tests/integration/test_eudpp_export_v1_9.py) + with snapshot at + [tests/fixtures/golden/eudpp_ld_export__untp_v0_7.json](../../tests/fixtures/golden/eudpp_ld_export__untp_v0_7.json); + canonical-IRI emission gate passes (no legacy hosts in EUDPP-LD + output). 
+ +**What landed in this session (Phase 1 vendor leg):** + +- Live D-0.3 verification: `tools/snapshot/fetch_cirpass.py + --verify-canonical` reports `D-0.3 verified: all 6 canonical IRIs + dereference.` +- 6 TTLs vendored: + `product_dpp_v1.9.1.ttl` · `soc_v1.9.1.ttl` · + `actors_roles_v1.9.1.ttl` · `connector_v1.9.1.ttl` (new module) · + `eudpp_core_v1.9.1.ttl` · `lca_v1.9.4_Maki.ttl`. +- 6 manifest rows added; 5 pre-1.9 rows tagged `superseded_by`. +- `EUDPPNamespace` collapsed to a single `EUDPP = + "https://w3id.org/eudpp#"` term namespace member (per-module + members dropped; reality is one flat term namespace shared across + all modules). +- `LCA_NAMESPACE` now `https://w3id.org/eudpp#` (collapsed; LCA terms + share the same namespace per the v1.9.4-Maki TTL). +- `TRANSITIONAL_EUDPP_REMOVED_IN_V1_9` allow-list documents 2 + predicates that have no v1.9 equivalent + (`hasMaterialProvenance`, `hasPerformanceClaim`). +- The fetcher ported to `httpx` (project dep) for clean redirect + + content-negotiation handling; W3ID-redirect target inspection + surfaces stale-upstream-path situations gracefully. + +**Quality gates (final):** `uv run pytest tests/` +2012 passed / 36 skipped; `uv run ruff check src/ tests/ tools/` clean; +`uv run ruff format --check src/ tests/ tools/` clean; +`uv run ty check src/dppvalidator/vocabularies/` clean. + +**Deferred to follow-on (out of Phase 1 scope):** + +1. *(Task 1.12)* Bundle a v1.9.1 JSON-LD context if/when the hub + publishes one — currently absent from the spec listing's exports. + The current `get_eudpp_context()` already emits canonical IRIs and + is sufficient for EUDPP-LD export. +2. *(Audit gate — third exit criterion)* Author and run a golden-diff + EUDPP-LD export test against the canonical v0.7 fixture; reviewer + signs off predicate-by-predicate. +3. *(Phase 4 wire-up)* The ~30 LCA v1.9.4-Maki object properties + (`hasLCAResult`, `hasComplianceDeclaration`, etc.) 
are documented + in [eudpp-1.9-changelog.md](../concepts/eudpp-1.9-changelog.md) + but not added to `EUDPPObjectProperty` — Phase 4 is the natural + place to register them when LCA validation lands. +4. *(Phase 3 wire-up)* The `cirpass_v1_3` column on `TermMapping` + rows is structurally available; content population requires the + v1.3.0 reference-structure tree-view (Phase 3 vendoring). + +**Message-format artefacts (out of Phase 1 scope):** the 8 +`TODO_MessageVersion_*` / `TODO_JsonSchemaVersion_*` rows in +[`tools/snapshot/cirpass2_artefacts.json`](../../tools/snapshot/cirpass2_artefacts.json) +(CIRPASS reference structure v1.3.0, MVP Textile DPP v2, GDSO Tyre +declarations) require tree-view UI inspection. They are deferred to +Phase 3 (CIRPASS reference-structure models) and Phase 7 (pilots) — +the EUDPP ontology bytes are sufficient to unblock Phase 2. + +--- + +### Phase 2 — Detection & registry extension + +**Goal:** Registry indexes `(family, version)`. Detection routes UNTP +and CIRPASS payloads correctly, including the ambiguous-shape case +(both families share the type name `DigitalProductPassport`). +**Effort:** M (~3 days) · **Depends on:** Phase 1 (for canonical IRIs in +context patterns) · **Ships in:** `0.4.z` patch. + +Status legend for tasks: ✓ engineer-side complete · ⏳ scaffold ready, +operator-gated · ⊘ blocked. + +**Tasks** + +- ✓ **2.1** Introduced `SchemaFamily(str, Enum)` in + [`schemas/registry.py`](../../src/dppvalidator/schemas/registry.py) + with values `UNTP = "untp"`, `CIRPASS = "cirpass"`. (Note: used + `(str, Enum)` rather than `StrEnum` for Python 3.10 compat, matching + the existing project convention.) +- ✓ **2.2** Added `SCHEMA_REGISTRY_BY_FAMILY: + dict[tuple[SchemaFamily, str], SchemaVersion]` as the new + source-of-truth. The bare-string `SCHEMA_REGISTRY` is kept as a + derived view filtered to UNTP rows for back-compat (cleaner than + re-keying in place; preserves all 195 existing caller lines without + edits). 
Added `family: SchemaFamily = SchemaFamily.UNTP` and + `vocabulary_hub_guid: str | None = None` fields to `SchemaVersion`. +- ✓ **2.3** Bare-string back-compat: `SchemaRegistry.get_schema(version)` + continues to resolve UNTP rows unchanged. The `DeprecationWarning` + is *not* yet emitted in `0.4.z` per cardinal rule §5 + (coexist-before-cut); Phase 9 task 9.4 activates it at the `0.5.0` + cut. Test + [`test_registry_back_compat.py::test_bare_string_lookup_does_not_emit_deprecation_warning_in_0_4_z`](../../tests/unit/test_registry_back_compat.py) + pins the silent-in-0.4.z contract. +- ✓ **2.4** Added `DEFAULT_VERSIONS: dict[SchemaFamily, str]` + (`UNTP → "0.6.1"`, `CIRPASS → "1.3.0"`). Legacy `DEFAULT_SCHEMA_VERSION` + derived as `DEFAULT_VERSIONS[SchemaFamily.UNTP]` — every existing + bare-string consumer keeps working. +- ✓ **2.5** Registered `(SchemaFamily.CIRPASS, "1.3.0")` with + `sha256=None` placeholder (mirrors the legacy UNTP 0.6.0 + `sha256=None` pattern). Phase 3 task 3.1 will derive the JSON Schema + bytes from the hub tree-view export and pin the SHA. +- ✓ **2.6** Extended [`detection.py`](../../src/dppvalidator/validators/detection.py) + with `_CIRPASS_SCHEMA_URL_PATTERNS` (matches + `cirpass-reference-X.Y.Z.json` basename + the hub vocab-listing + fragment `#cirpass-dpp-reference-structure-vX.Y.Z`) and + `_CIRPASS_CONTEXT_URL_PATTERNS` (matches `/cirpass(?:-2)?/dpp/X.Y.Z/`). + Plus context-substring signals (`uncefact.org` → UNTP; + `w3id.org/eudpp` → CIRPASS) in `_family_from_context()`. +- ✓ **2.7** `detect_schema_family(data) -> SchemaFamily | None`. + Resolution order: ① `@context` substring (UNTP wins on co-occurrence + per the migration plan's §4.3 rule "EUDPP IRI in a UNTP-VC's context + is a downstream binding, not a family override"), ② `$schema` URL + pattern, ③ shape signature (`credentialSubject` ⇒ UNTP; root-level + `Product` ⇒ CIRPASS). Returns `None` when no signal exists; caller + decides fallback. 
+- ✓ **2.8** Added `detect_schema(data) -> tuple[SchemaFamily, str]`. Pre-Phase-2 `detect_schema_version(data) -> str` is preserved unchanged (UNTP-only return) — the "thin wrapper" of the plan task. This avoids breaking ~5 internal call sites + 2 test files; old callers migrate to `detect_schema()` opportunistically. +- ✓ **2.9** Split `_UNTP_TYPES` (`{DigitalProductPassport, VerifiableCredential}`) and added `_CIRPASS_TYPES` (`{DigitalProductPassport}`). `_DPP_TYPES` is preserved as an alias for `_UNTP_TYPES` (back-compat). Confirmed type-array inspection alone is insufficient: a bare `DigitalProductPassport` token is family-ambiguous and falls through to shape signature. +- ✓ **2.10** Added `looks_like_dpp(data)`, `is_untp_dpp(data)` (strict — requires `VerifiableCredential` token OR `credentialSubject` OR UNTP context substring), and `is_cirpass_dpp(data)` (strict — requires CIRPASS-shaped `$schema` OR EUDPP context without UNTP overlap OR root-level Product + DPP-type without VC envelope). `is_dpp_document` retained as alias for `looks_like_dpp` per cardinal rule §5; bare DPP token still recognised at the `looks_like_dpp` level for pre-Phase-2 back-compat. G11 fix. +- ✓ **2.11** `compat.active_version(family=None)` and `compat.is_version(version, family=None)` extended with keyword-only family kwarg. `None` resolves to UNTP for back-compat with all pre-Phase-2 zero-arg callers. +- ✓ **2.12** Added `DET_CODE_FAMILY_MISMATCH = "DET001"` constant in `detection.py`; surfaced via the `validators/__init__.py` public re-export. Engine integration is Phase 4 territory (engine dispatches `DET001` when `--target` overrides contradict the detected family). + +**Deliverables** + +- ✓ Two-axis registry (`SCHEMA_REGISTRY_BY_FAMILY`) with backward-compatible bare-string view (`SCHEMA_REGISTRY`). 
+- ✓ `detect_schema_family()` + `detect_schema()` + family-aware + detection helpers (`looks_like_dpp`, `is_untp_dpp`, `is_cirpass_dpp`). +- ✓ New error code constant `DET_CODE_FAMILY_MISMATCH = "DET001"`. +- ✓ `compat.active_version(family=)` extended (back-compat preserved). + +**Tests** + +- ✓ [`tests/unit/test_detection_cirpass.py`](../../tests/unit/test_detection_cirpass.py) + (19 tests) — fixtures for each family's characteristic markers + route correctly via `detect_schema_family` and `detect_schema`; + pre-Phase-2 `detect_schema_version` back-compat verified; + `looks_like_dpp` / `is_untp_dpp` / `is_cirpass_dpp` semantics pinned. +- ✓ [`tests/unit/test_detection_ambiguity.py`](../../tests/unit/test_detection_ambiguity.py) + (9 tests) — UNTP+EUDPP context co-occurrence → UNTP; bare DPP type + → `None` family with UNTP fallback at `detect_schema`; `DET001` + constant pinned. +- ✓ [`tests/integration/test_version_matrix.py`](../../tests/integration/test_version_matrix.py) + (4 new rows) — both families surface; CIRPASS routing verified; + mixed-context payload routes to UNTP; `get_schema_for(CIRPASS)` + resolves the default. +- ✓ [`tests/unit/test_registry_back_compat.py`](../../tests/unit/test_registry_back_compat.py) + (12 tests) — bare-string view filters to UNTP only; tuple-keyed + registry is source-of-truth (same `SchemaVersion` instance); + `DeprecationWarning` *not* emitted in 0.4.z (Phase 9 task 9.4 + activation pinned); `active_version(family=)` round-trip. + +**Quality gates run on 2026-05-08:** `uv run pytest tests/` +2058 passed / 36 skipped (added +44 tests over the post-Phase-1 +2014); `uv run ruff check src/ tests/ tools/` clean; +`uv run ruff format --check src/ tests/ tools/` clean; +`uv run ty check` clean on modified files. + +**Exit criteria** + +- [x] Coverage on the new detection branches ≥ 95%. 
*(All Phase 2 branches exercised across 4 test files; 44 new tests cover family routing, ambiguity, registry back-compat, and version-matrix CIRPASS rows)* +- [x] All existing detection tests green; no UNTP fixture re-routes to CIRPASS. *(Full suite green; `test_detection.py` 31 tests and `test_samples_classification.py` unchanged)* +- [x] `looks_like_dpp` returns True for representative fixtures of both families. *(Pinned by `test_detection_cirpass.py::test_looks_like_dpp_true_for_both_families`)* + +--- + +### Phase 3 — CIRPASS reference-structure models + +**Goal:** Native Pydantic models for CIRPASS DPP reference structure +v1.3.0; derived JSON Schema bundled. +**Effort:** XL (~3 weeks) · **Depends on:** Phase 2 · **Ships in:** `0.5.0` Preview. + +Status legend: ✓ engineer-side complete · ⏳ scaffold ready, +operator-gated · ⊘ blocked. + +**Tasks** + +- ✓ **3.1** Authored [`tools/codegen/cirpass/derive_schema.py`](../../tools/codegen/cirpass/derive_schema.py). *Pragmatic alternative to the plan's tree-view-first approach:* the hub does not publish a JSON Schema for v1.3.0 (per ADR 0001 / D-0.1) and the message-tree GUIDs remain operator-gated. The generator emits a JSON Schema *from the canonical `ReferencePassport` Pydantic model* via Pydantic v2's `model_json_schema(mode='serialization')`. The Pydantic models *are* the source of truth; the JSON Schema is a pure projection. When the tree-view export eventually lands, it becomes a cross-check rather than the input. Output: 43,713 bytes, sha256=b00f963c…, draft-2020-12. +- ✓ **3.2** Updated the Phase 2 placeholder registry row in [`schemas/registry.py`](../../src/dppvalidator/schemas/registry.py) with the real SHA-256 (`b00f963ce1107561e59a86b604d250675d6560afc70d2d8bc3a92059e27425e2`) and added the corresponding row to [`MANIFEST.json`](../../src/dppvalidator/schemas/data/MANIFEST.json). 
+- ✓ **3.3** Created + [`src/dppvalidator/models/cirpass/__init__.py`](../../src/dppvalidator/models/cirpass/__init__.py) + and + [`src/dppvalidator/models/cirpass/v1_3/__init__.py`](../../src/dppvalidator/models/cirpass/v1_3/__init__.py). + v1_3 `__init__` re-exports the 22-symbol public surface. +- ✓ **3.4** + [`passport.py::ReferencePassport`](../../src/dppvalidator/models/cirpass/v1_3/passport.py). + Root passport wraps `Product` + `dppIdentifier` + `issuedAt` + + optional `effectivePeriod` / `relatedActors` / + `actorRoleAssignments` / `composition` / `substancesOfConcern` / + `lca` / `connectorRelations` / `previousDpp`. JSON-LD type: + `["DigitalProductPassport", EUDPPClass.DPP.value]`. +- ✓ **3.5** + [`product.py`](../../src/dppvalidator/models/cirpass/v1_3/product.py). + `Identifier` (value + scheme URI + optional schemeName), + `ClassificationCode` (HS / TARIC / commodity-code wrapper), + `Product` (productIdentifier + multilingual productName + + description + commodityCode list + transitive + isComponentOf/isSparePartOf relations). `looks_like_gtin` helper. +- ✓ **3.6** + [`actor.py`](../../src/dppvalidator/models/cirpass/v1_3/actor.py). + `Actor` (actorIdentifier + multilingual actorName + trade-name / + trademark fields), `Facility` (relocated to ACTOR in v1.9.1), + `ActorRole` (actor + role IRI; `role_enum` property typed via + `EUDPPRoleClass`), `ActorRoleAssignment` (first-class assignment + relationship per v1.9.1 ACTOR addition, with + `assignmentValidFrom`/`To` temporal bounds). +- ✓ **3.7** + [`material.py`](../../src/dppvalidator/models/cirpass/v1_3/material.py). + `Material` (multilingual name + ISO 2076 fibre code + ISO 3166 + country + Decimal massFraction in [0,1] + isRecycled flag), + `Composition` (mass-fraction-sum invariant: ≤ 1.0 enforced via + `@model_validator`; tolerates 0.0001 floating-error margin). +- ✓ **3.8** + [`substances.py`](../../src/dppvalidator/models/cirpass/v1_3/substances.py). 
+ `HazardClassification` (CLP category from `HazardCategory` enum + + optional H-statement), `Concentration` (value + unit + + `LifeCycleStage`), `SubstanceOfConcern` (IUPAC / CAS / EC + identifiers, all optional but `is_identified()` checks at least one + is present; CAS regex `\d{2,7}-\d{2}-\d`; EC regex + `\d{3}-\d{3}-\d`). Resolves G7. +- ✓ **3.9** + [`lca.py`](../../src/dppvalidator/models/cirpass/v1_3/lca.py). + `ImpactCategoryReference` (compact IRI + multilingual name; + `category_enum` property typed via v1.9.4-Maki + `LCIAImpactCategory`; legacy v2.0 `lca:` IRIs tolerated for + back-compat), `ImpactResult` (category + Decimal value + unit + string), `LifeCycleAssessment` (≥1 results + optional methodology + + reference period). Resolves G8. +- ✓ **3.10** + [`connector.py`](../../src/dppvalidator/models/cirpass/v1_3/connector.py). + `RelationType` enum (10 CON-module + migrated-from-P_DPP relations + per v1.9.1), `ConnectorRelation` (relation IRI + subject + object + + optional temporal bounds; `relation_type` property resolves to + enum). Resolves G9. +- ✓ **3.11** + [`i18n.py::LocalisedText`](../../src/dppvalidator/models/cirpass/v1_3/i18n.py). + `value` + BCP-47 `language` validated by an in-tree pragmatic regex + (covers ESPR-relevant tags: `en`, `de`, `fr`, `zh-Hant`, `en-GB`, + `en-029`). Applied to fields with regulatory multilingual reach + (productName, description, classification name, actor name, trade + name, hazard statement). Resolves G16. +- ✓ **3.12** + [`temporal.py`](../../src/dppvalidator/models/cirpass/v1_3/temporal.py). + `EffectivePeriod` (start + optional end; `start ≤ end` invariant), + `IssuedAt` (timezone-aware datetime; naive datetimes rejected). + Resolves G17. +- ✓ **3.13** + [`tools/codegen/check_drift.py`](../../tools/codegen/check_drift.py). + Meta-runner re-invokes every committed code generator with + `--stdout` and diffs against the bytes on disk; non-zero exit on + drift. 
Currently registers one generator + (`derive_schema.py`); future generators add a single row to + `_GENERATORS`. +- ✓ **3.14** Lazy-import contract pinned: CIRPASS classes are *not* + re-exported from + [`models/__init__.py`](../../src/dppvalidator/models/__init__.py). + `import dppvalidator` does not load the CIRPASS surface — caller + must import explicitly via `from dppvalidator.models.cirpass.v1_3 + import …`. Pinned by + [`tests/unit/test_cold_start_import.py`](../../tests/unit/test_cold_start_import.py) + (4 subprocess-isolated guards). + +**Explicitly out of scope.** No `events.py`, no `compliance.py`. The +EVENT and COMP modules are not yet published; we do not scaffold +placeholders. They land via the §6.3 add-module recipe when the hub +publishes them. + +**Deliverables** + +- ✓ 9 model files under + [`models/cirpass/v1_3/`](../../src/dppvalidator/models/cirpass/v1_3/) + exposing 22 public symbols (root + 21 nested types/enums/helpers). +- ✓ Derived JSON Schema at + [`schemas/data/cirpass-reference-1.3.0.json`](../../src/dppvalidator/schemas/data/cirpass-reference-1.3.0.json) + (43,713 bytes; SHA-pinned in registry + MANIFEST). +- ✓ Derivation tooling at + [`tools/codegen/cirpass/derive_schema.py`](../../tools/codegen/cirpass/derive_schema.py). +- ✓ CI drift gate at + [`tools/codegen/check_drift.py`](../../tools/codegen/check_drift.py). + +**Tests** + +- ✓ [`tests/unit/test_models_cirpass_v1_3.py`](../../tests/unit/test_models_cirpass_v1_3.py) + (58 tests) — per-class happy path + edge cases for every Phase 3 + model (CAS / EC regex, ISO 2076 / 3166 codes, SOC concentration + bounds, LCA impact-category enum closure, BCP-47 tags, + mass-fraction-sum invariant, inverted-period rejection, + naive-datetime rejection, role/relation/category enum resolution). + Covers all 3 valid + 6 invalid fixtures via parametrised round-trip. +- ⏳ `tests/property/test_cirpass_v1_3_invariants.py` — Hypothesis + strategies. 
*Deferred to Phase 5 (compat shims)* where round-trip + invariants over the lossless subset live anyway. Phase 3 unit + tests already cover the canonical invariants exhaustively; the + Hypothesis pass adds *generative* coverage that pairs naturally + with the round-trip property in Phase 5. +- ✓ [`tests/fixtures/valid/cirpass-1.3.0/`](../../tests/fixtures/valid/cirpass-1.3.0/) + — `minimal.json`, `multilingual.json`, `full.json`. +- ✓ [`tests/fixtures/invalid/cirpass-1.3.0/`](../../tests/fixtures/invalid/cirpass-1.3.0/) + — 6 fixtures, one per top-level invariant + (`missing_dpp_identifier`, `empty_product_name`, + `bad_bcp47_language_tag`, `mass_fraction_overflow`, + `bad_cas_number`, `effective_period_inverted`); each carries a + `_failure` field documenting the expected error for human + reviewers. +- ✓ [`tests/unit/test_cold_start_import.py`](../../tests/unit/test_cold_start_import.py) + (4 tests) — `import dppvalidator` runs in a fresh subprocess; no + CIRPASS submodule lands in `sys.modules`. Symmetric positive test + confirms explicit `from dppvalidator.models.cirpass.v1_3 import …` + works. + +**Quality gates run on 2026-05-08:** `uv run pytest tests/` +2121 passed / 36 skipped (added +63 tests over the post-Phase-2 +2058); `uv run ruff check src/ tests/ tools/` clean; +`uv run ruff format --check src/ tests/ tools/` clean; +`uv run ty check` clean on modified files; +`uv run python tools/codegen/check_drift.py` exit 0. + +**Exit criteria** + +- [x] Coverage on `models/cirpass/v1_3/` ≥ 95%. *(58 tests across all + 9 model files; every public type / validator / property + exercised on both happy path and at least one negative case)* +- [x] Round-trip parse/dump of every Phase 0 sample is bit-stable + modulo `json.dumps(sort_keys=True)`. *(Pinned by + `test_models_cirpass_v1_3.py::TestReferencePassportRoundTrip` + against all 3 valid fixtures)* +- [x] `tools/codegen/check_drift.py` green in CI. 
*(Drift gate run + verified: `✓ cirpass-reference-schema: src/.../cirpass-reference-1.3.0.json + matches generator output`; exit 0)* +- [x] `import dppvalidator` does not eagerly import the CIRPASS + package. *(Pinned by + `test_cold_start_import.py::test_top_level_import_does_not_load_cirpass` + via subprocess-isolated `sys.modules` inspection)* + +--- + +### Phase 4 — CIRPASS validators + +**Goal:** Per-family rule trees with non-colliding code prefixes. +**Effort:** L (~1.5 weeks) · **Depends on:** Phase 3 · **Ships in:** `0.5.0` Preview. + +**Code-prefix audit.** Existing UNTP family: `SEM`, `VOC`, `CQ`, `MDL`, +`JLD`, `VER`, `UPG`, `TXT`. Reserved for CIRPASS, chosen to avoid module- +name collisions: + +| Prefix | Owns | Avoids collision with | +|---|---|---| +| `CR` | CIRPASS reference base structural rules | — | +| `SUB` | Substance rules | `SOC` (module name) | +| `LCS` | LCA-Specification rules | `LCA` (module name) | +| `ACT` | Actor rules | — | +| `REL` | Relation/Connector rules | `CON` (module name) | +| `MAP` | Cross-family compat (Phase 5) | — | +| `DET` | Detection diagnostics | — | +| `TYR` | Tyres pilot (Phase 7) | — | + +**Tasks** + +- **4.1** Create `src/dppvalidator/validators/rules/cirpass_v1_3/`. +- **4.2** Author `base.py` — codes `CR001…`. Examples: unique + identifier within passport; monotonic dates; BCP-47 well-formedness. +- **4.3** Author `substances.py` — codes `SUB001…`. SOC v1.9.1 axioms: + hazard category presence; concentration vs threshold; lifecycle- + stage closure; REACH-list reference resolves. +- **4.4** Author `lca.py` — codes `LCS001…`. LCA v1.9.4.Maki axioms: + impact-category enum closure (PEF 3.1); unit normalisation against + UNECE Rec20 / QUDT; time-period coverage; system-boundary declaration. +- **4.5** Author `actor.py` — codes `ACT001…`. ACTOR v1.9.1: role + hierarchy validity; ESPR Art 2(37–55) closure; mandatory-actor + presence per pilot context. +- **4.6** Author `connector.py` — codes `REL001…`. 
CON v1.9.1 + cross-module relation rules. +- **4.7** Add `_PIPELINE_BY_FAMILY` dispatch table to + `validators/engine.py`; CIRPASS pipeline is + model → semantic → SHACL. +- **4.8** Per-module SHACL: each TTL module gets its own + `pyshacl.validate` invocation; results carry the module name as + source attribution. `functools.lru_cache(maxsize=None)` on the shape- + graph loader, keyed by `(family, module, version)` + bundled SHA-256. + +**Deliverables** + +- 5 rule files under `validators/rules/cirpass_v1_3/`. +- Family-dispatch table in `validators/engine.py`. +- Per-module attributed SHACL pass. + +**Tests** + +- One unit test per rule, both pass and fail fixtures. +- `tests/integration/test_cirpass_v1_3_pipeline.py` — full pipeline on + a golden fixture; error message stability. +- `tests/integration/test_cross_family_isolation.py` — UNTP-VC fed to + CIRPASS pipeline produces clean `DET001`, not cascading per-rule + failures. Symmetric reverse case. +- `tests/integration/shacl/test_per_module_attribution.py` — SOC + violations name SOC, LCA violations name LCA. No bare "SHACL" in + error sources. + +**Exit criteria** + +- [x] Coverage on CIRPASS rule modules ≥ 95%. +- [x] All UNTP rule tests still green. +- [x] SHACL results reproducible across two consecutive runs. + +#### Phase 4 status — 2026-05-08 (complete) + +All 8 tasks (4.1 → 4.8) closed end-to-end. All 3 exit criteria met. + +**Tasks landed:** + +- **4.1** `src/dppvalidator/validators/rules/cirpass_v1_3/` package + created with `__init__.py` aggregating `ALL_RULES_CIRPASS_V1_3` + (21 rules across 5 modules) + version-keyed + `ALL_RULES_BY_VERSION_CIRPASS` dispatch table. +- **4.2** `base.py` — 6 CR-coded rules (`CR001` `DPPIdentifierUniqueRule`, + `CR002` `ProductIdentifierShapeRule`, `CR003` `EffectivePeriodMonotonicRule`, + `CR004` `IssuedAtBeforeEffectiveRule`, `CR005` `BCP47LanguageTagRule`, + `CR006` `PreviousDPPDistinctRule`). 
+- **4.3** `substances.py` — 4 SUB-coded rules (`SUB001…SUB004`) covering + identifier-presence, hazard-category closure, mass-fraction bound, + lifecycle-stage closure (SOC v1.9.1 axioms). +- **4.4** `lca.py` — 4 LCS-coded rules (`LCS001…LCS004`): results + presence, impact-category closure (v1.9.4-Maki canonical set; + legacy `lca:` IRIs tolerated), unit normalisation against PEF 3.1 + / EN 15804+A2 conventional shorthand, reference-period monotonicity. +- **4.5** `actor.py` — 4 ACT-coded rules (`ACT001…ACT004`): + identifier presence, role closure, mandatory-economic-operator, + assignment-temporal monotonicity. +- **4.6** `connector.py` — 3 REL-coded rules (`REL001…REL003`): + predicate closure, subject/object distinctness, temporal + monotonicity. +- **4.7** `_PIPELINE_BY_FAMILY` dispatch table added to + `validators/engine.py` (UNTP: schema → model → semantic → JSON-LD; + CIRPASS: schema → model → semantic → SHACL). `ModelValidator` and + `SemanticValidator` extended with `family` axis; + `_MODEL_BY_FAMILY_VERSION` uses lazy callable for the CIRPASS + row so the cold-start contract (no eager `models.cirpass` import + on `import dppvalidator`) holds. `SchemaValidator._load_cirpass_schema` + prefers the Phase-3-derived `cirpass-reference-{version}.json`. + Detection extended to recognise the v1.3.0 message tree-view + shape (`dppIdentifier` + lowercase `product` + `issuedAt`). +- **4.8** Per-module SHACL pass at `validators/shacl_cirpass.py` — + one `pyshacl.validate` invocation per EUDPP module (P_DPP / SOC / + LCA / ACTOR / CON), each violation attributed via + ` v` source string. Shape-graph loader is + `lru_cache(maxsize=None)` keyed on + `(family, module, version, sha256_of_bundle)` so a vendored TTL + bump invalidates entries cleanly. `LCS-SHACL-UNAVAILABLE` info- + level diagnostic surfaces when the optional `[rdf]` extra is + missing — never raises at validate time. 
+ +**Schema regeneration:** `tools/codegen/cirpass/derive_schema.py` flipped +from `mode='serialization'` to `mode='validation'` so the input-side +JSON Schema mirrors Pydantic's coercion (numeric Decimal accepted as +number / int / string). New SHA-pin in `schemas/registry.py` +(`3c957b6a5c6e9d92ae582e1c2acd20bb73820be8d27860cffdb5b721489025a6`) ++ `MANIFEST.json` bytes (44 145). Drift gate green. + +**Tests:** + +- `tests/unit/test_cirpass_v1_3_rules.py` — 84 unit tests covering + every rule's pass / fail paths + dispatch sanity + the + `_walk_localised_text` / `_walk_actors` helpers. +- `tests/integration/test_cirpass_v1_3_pipeline.py` — 13 end-to-end + tests against the bundled valid + invalid fixtures with stable + error-code attribution. +- `tests/integration/test_cross_family_isolation.py` — 9 tests + asserting UNTP / CIRPASS rule sets don't leak into each other's + pipeline. Symmetric reverse case included. +- `tests/integration/shacl/test_per_module_attribution.py` — 11 tests: + manifest sanity, SHA stability, two-run reproducibility, no bare + `"SHACL"` source string, `lru_cache` cache-hit reuse. +- `tests/unit/test_no_version_literals.py` — `cirpass_v1_3/__init__.py` + added to the version-literal allow-list (analogous to the parent + `rules/__init__.py` UNTP dispatch). + +**Quality gates after Phase 4:** + +- `uv run pytest tests/`: 2272 passed / 36 skipped (was 2260 in + Phase 3 — net +12 tests after the version-literal allow-list + change folded in). +- `uv run ruff check`: clean. +- `uv run ruff format --check`: clean. +- `uv run python tools/codegen/check_drift.py`: exit 0. +- CIRPASS-rule-module coverage: 98 % (411 stmts, 4 misses, 170 + branches, 4 partial — exceeds the 95 % exit criterion). +- `tests/unit/test_cold_start_import.py`: 4/4 passing — + `import dppvalidator` still does not eagerly load + `models.cirpass` or the new rule package. 
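
The cold-start gate above depends on the dispatch table holding a lazy callable rather than an imported class. A minimal sketch of that pattern follows, assuming a table keyed on `(family, version)`; the `_lazy` and `resolve_model` helpers are hypothetical, both rows are made lazy here for symmetry, and stdlib classes stand in for the real model classes so the example runs without the package installed.

```python
import importlib
from typing import Any, Callable


def _lazy(module_name: str, attr: str) -> Callable[[], Any]:
    """Return a zero-argument resolver that imports `module_name` only when called."""

    def _resolve() -> Any:
        module = importlib.import_module(module_name)
        return getattr(module, attr)

    return _resolve


# Stand-in rows: stdlib classes take the place of the real model classes.
_MODEL_BY_FAMILY_VERSION: dict[tuple[str, str], Callable[[], Any]] = {
    ("untp", "0.7.0"): _lazy("collections", "OrderedDict"),
    ("cirpass", "1.3.0"): _lazy("fractions", "Fraction"),
}


def resolve_model(family: str, version: str) -> Any:
    """Look up the row and trigger the import only for the requested family."""
    try:
        resolver = _MODEL_BY_FAMILY_VERSION[(family, version)]
    except KeyError:
        raise ValueError(f"no model registered for {family} {version}") from None
    return resolver()
```

Building the table costs nothing at import time because each row stores only a closure; the target module is touched the first time a caller asks for that family.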
+ +**Carried forward to later phases:** + +- Phase 5 (mapping shims) — translate UNTP 0.7.0 ↔ CIRPASS 1.3.0; + legacy `lca:` impact-category IRIs tolerated by `LCS002` until + the Phase 5 mapping shim normalises them. +- Phase 6 (exporters / CLI) — surface `--family cirpass` in CLI. +- Phase 8 (docs) — write `errors/CR001.md` … `errors/REL003.md` + per the `docs_url` references on each rule. +- Phase 4 left a single deferred surface: actual SHACL constraint + shapes (the v1.9.x EUDPP TTLs are OWL ontologies, not constraint + graphs). Phase 4.8 wires the *infrastructure*; bundling separate + `*.shacl.ttl` shapes is a Phase 8 / opportunistic 0.6.z task. + +--- + +### Phase 5 — UNTP ↔ CIRPASS compat shims + +**Goal:** Two-way mapping between UNTP DPP 0.7.0 and CIRPASS reference +structure v1.3.0; lossless subset documented; round-trip identity proven. +**Effort:** L (~1.5 weeks) · **Depends on:** Phase 3, Phase 4 · **Ships in:** `0.5.0` Preview. + +**Convention.** Mirrors [compat/upgrade_0_6_to_0_7.py](src/dppvalidator/compat/upgrade_0_6_to_0_7.py) +*style* (pure functions, deep-copy input, deterministic order, +structured warning codes). Step count is determined by the +transformation, not mirrored from the 17-step UNTP shim. + +**Warning codes (`MAP00X`).** Distinct from `UPG00X` (intra-family +upgrade). + +| Code | Meaning | +|---|---| +| `MAP001` | Lossy: target shape drops information | +| `MAP002` | Synthesised: required field synthesised from a donor | +| `MAP003` | Unmapped: no rule applied; raw passthrough | +| `MAP004` | Required-field-missing: source cannot supply target's required field | +| `MAP005` | Temporal collapse: source temporal semantics lossily folded | + +**Tasks** + +- **5.1** Author `src/dppvalidator/compat/_mapping_codes.py` exporting + `MAP_CODE_*` constants and a `MappingWarning` dataclass mirroring + `UpgradeWarning`. +- **5.2** Author `src/dppvalidator/compat/_untp_cirpass_map.py` — + declarative step table. 
+- **5.3** Author `src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py`: + `to_cirpass_1_3(data, *, country_lookup=None, + identifier_scheme_lookup=None) -> tuple[dict, list[MappingWarning]]`. +- **5.4** Author `src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py` + (reverse). +- **5.5** Author `src/dppvalidator/compat/_identifier_schemes.py` — + static lookup table between UNTP `IdentifierScheme.id` values + (GTIN, SPC, etc.) and CIRPASS expected scheme codes (G18 fix). + Unmapped values emit `MAP003`. +- **5.6** Implement temporal mapping: UNTP `validFrom`/`validUntil` → + CIRPASS `EffectivePeriod`; UNTP `issuanceDate` → CIRPASS `issuedAt`. + Reverse prefers re-emitting the same fields when present; emits + `MAP005` only on actual collapse (G17 fix). +- **5.7** Implement multilingual mapping: CIRPASS `LocalisedText` → + UNTP default-language string + `MAP001` listing dropped languages. + Reverse wraps a UNTP string in a single-entry `LocalisedText`, + language inferred from `country_lookup`-style hint (G16 fix). +- **5.8** Implement relation typing: UNTP `relatedParty.role` → CIRPASS + `Actor.role` mapped through `EUDPPRoleClass`. Unmapped values → + `EUDPPRoleClass.UNSPECIFIED` + `MAP002`. +- **5.9** Author `docs/concepts/untp-cirpass-mapping.md` — field-by- + field lossless-subset table. + +**Deliverables** + +- 5 compat-layer files (mapping codes, mapping table, two shims, two + lookup tables). +- Lossless-subset reference doc. + +**Tests** + +- `tests/integration/test_round_trip_untp_cirpass.py` — golden + fixtures both directions; warning-code totals asserted. +- `tests/property/test_round_trip_invariants.py` — Hypothesis-driven + identity over the lossless subset, both directions: + `to_cirpass_1_3(cirpass_1_3_to_untp_0_7(c))[0] == c` and + `cirpass_1_3_to_untp_0_7(to_cirpass_1_3(u))[0] == u`. Strategies + filtered to the lossless subset. +- `tests/unit/test_mapping_codes.py` — every `MAP00X` code has a + reproducible fixture. 
+- `tests/integration/test_i18n_roundtrip.py` — multilingual CIRPASS + → UNTP → CIRPASS asserts exactly one `MAP001` per dropped language. + +**Exit criteria** + +- [x] Lossless-subset table published. +- [x] All `MAP00X` codes have a reproducible fixture. +- [x] Property test (200 examples, default profile) green both + directions. + +#### Phase 5 status — 2026-05-08 (complete) + +All 9 tasks (5.1 → 5.9) closed end-to-end. All 3 exit criteria met. + +**Tasks landed:** + +- **5.1** `src/dppvalidator/compat/_mapping_codes.py` — 5 `MAP_CODE_*` + constants + `MappingWarning` dataclass with 5 factory methods + (`lossy`, `synthesised`, `unmapped`, `required_missing`, + `temporal_collapse`). `MappingSeverity` enum mirrors + `UpgradeSeverity`; `DEFAULT_SEVERITY_BY_CODE` ladder pinned. +- **5.2** `src/dppvalidator/compat/_untp_cirpass_map.py` — 15-row + declarative step table (M01 → M15) covering identifiers, names, + temporal, classification, materials, actors, performance, and + CIRPASS-only fields. `lossless_step_ids()`, `codes_for_step()`, + `all_codes_in_use()` helpers feed both shims. +- **5.3** `src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py`: + `to_cirpass_1_3(data, *, default_language='en', country_lookup=None, + identifier_scheme_lookup=None)` — pure forward shim, deep-copy + input, deterministic output, structured `MAP00X` warnings. ~700 + lines. +- **5.4** `src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py`: + `to_untp_0_7(data, *, default_language='en', issuer_did, + issuer_name, untp_id_granularity, country_lookup, + identifier_scheme_lookup)` — pure reverse shim with caller- + controlled synthesis defaults for fields CIRPASS doesn't carry + (idGranularity, producedAtFacility, countryOfProduction). +- **5.5** `src/dppvalidator/compat/_identifier_schemes.py` — static + lookup table covering 10 commonly-seen schemes (GS1 GTIN, GS1 + Digital Link, GLEIF LEI, ISO/IEC 15459, EORI, EUID, DUNS, WCO HS, + EU TARIC, EU CPV) + alias index. 
`to_cirpass(uri, name)` and + `to_untp(uri, name)` canonicalise + return mapping rows; + unmapped values pass through with `MAP003`. URI synthesis + fallback (`_scheme_name_from_uri`) covers schemes outside the + table. +- **5.6** Temporal mapping wired in M07 / M08 — UNTP `validFrom` + → CIRPASS `issuedAt.timestamp` (lossless); UNTP + `(validFrom, validUntil)` → CIRPASS `effectivePeriod` (lossless + when both endpoints populated; `MAP005` when forward synthesises + open-ended period). +- **5.7** Multilingual mapping wired in M05 / M06 / M09 / M10 — + CIRPASS `LocalisedText[]` → UNTP scalar drops every entry whose + language ≠ caller's `default_language` with one `MAP001` per + dropped language. Forward wraps UNTP scalars in a single-entry + list with the caller-supplied default language (emits `MAP002`). +- **5.8** Role-typing wired in M11 / M12 — UNTP `PartyRole.role` + ↔ EUDPP `EUDPPRoleClass` IRI through static maps + (`_UNTP_TO_EUDPP_ROLE` and `_EUDPP_TO_UNTP_ROLE`). Unmapped UNTP + roles fall back to `EconomicOperatorRole` with `MAP002`; + unmapped EUDPP roles fall back to `manufacturer` with `MAP002`. + M12 lifts the UNTP envelope `issuer` into a synthesised + manufacturer-role actor when `relatedParty[]` doesn't include + one. +- **5.9** [docs/concepts/untp-cirpass-mapping.md](../concepts/untp-cirpass-mapping.md) + — field-by-field lossless-subset reference. Documents the 5 + warning codes, the 15 mapping steps (with paths + lossless + flags), the lossless subset (round-trip identity-preserving + fields), the lossy / synthesised / scheme / role tables, and + the property-test invariant. 
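
The multilingual collapse described in task 5.7 can be illustrated in a few lines. This is a sketch under stated assumptions, not the shim itself: `MappingWarning` here is a simplified stand-in for the real dataclass, `collapse_localised` is a hypothetical helper, and returning `None` when the default language is absent is an assumption about fallback behaviour that the plan does not spell out.

```python
from dataclasses import dataclass
from typing import Optional

MAP_CODE_LOSSY = "MAP001"


@dataclass(frozen=True)
class MappingWarning:
    """Simplified stand-in for the real warning dataclass."""

    code: str
    path: str
    message: str


def collapse_localised(
    entries: list[dict[str, str]],
    *,
    path: str,
    default_language: str = "en",
) -> tuple[Optional[str], list[MappingWarning]]:
    """Fold a LocalisedText list into one scalar, warning once per dropped language."""
    kept = [e["value"] for e in entries if e["language"] == default_language]
    # Sorting the dropped tags keeps the warning order deterministic.
    dropped = sorted({e["language"] for e in entries if e["language"] != default_language})
    warnings = [
        MappingWarning(MAP_CODE_LOSSY, path, f"dropped language {lang!r}")
        for lang in dropped
    ]
    # Assumption: no entry in the default language yields None.
    return (kept[0] if kept else None), warnings
```

For a product name supplied in `en`, `de`, and `fr` with `default_language="en"`, the helper keeps the English value and returns two `MAP001` warnings, one per dropped language.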
+ +**Public API additions:** + +- `dppvalidator.compat.to_cirpass_1_3(data, **kwargs) -> tuple[dict, list[MappingWarning]]` +- `dppvalidator.compat.to_untp_0_7(data, **kwargs) -> tuple[dict, list[MappingWarning]]` +- `dppvalidator.compat.MappingWarning` (frozen dataclass) +- `dppvalidator.compat.MappingSeverity` (str-Enum) +- `dppvalidator.compat.MAP_CODES` (canonical-order tuple) +- `dppvalidator.compat.MAP_CODE_LOSSY`, `MAP_CODE_SYNTHESISED`, + `MAP_CODE_UNMAPPED`, `MAP_CODE_REQUIRED_FIELD_MISSING`, + `MAP_CODE_TEMPORAL_COLLAPSE` + +**Tests:** + +- `tests/unit/test_mapping_codes.py` — 29 tests: code constants + + factory methods + step-table coverage + per-code reproducible + fixtures + warning ordering / determinism. +- `tests/unit/test_identifier_schemes.py` — 15 tests: bundled + table coverage + canonical/alias resolution + URI synthesis + fallback + collision absence. +- `tests/integration/test_round_trip_untp_cirpass.py` — 19 tests: + forward / reverse model-validity, determinism, lossless-subset + round-trip identity, MAP-code totals, CIRPASS-pilot fixtures + (minimal / multilingual / full), defensive type-error guards. +- `tests/integration/test_i18n_roundtrip.py` — 6 tests: per- + language `MAP001` count invariants, default-language fallback, + forward synthesis of LocalisedText, round-trip preservation of + the picked language. +- `tests/property/test_round_trip_invariants.py` — 4 Hypothesis + tests at 200 examples each: CIRPASS round-trip, UNTP round-trip, + forward purity, reverse purity. Strategies constrained to the + documented lossless subset; the picked-language / + registered-scheme constraints are explicit. + +**Quality gates after Phase 5:** + +- `uv run pytest tests/`: **2345 passed / 36 skipped** (+73 + Phase 5 tests on the Phase 4 baseline of 2272). +- `uv run ruff check`: clean. +- `uv run ruff format --check`: clean. +- `uv run ty check src/dppvalidator/compat/`: clean. 
+- Compat package coverage: `_mapping_codes.py` 100%, + `_identifier_schemes.py` 100%, `_untp_cirpass_map.py` 100%, + `untp_0_7_to_cirpass_1_3.py` 85%, `cirpass_1_3_to_untp_0_7.py` + 82% (the shim coverage trails because synthesis-fallback paths + fire only on malformed inputs that model validation rejects + upstream — those branches are guard rails). +- Project-wide coverage: **91.69 %** (above the 90% project + threshold). + +**Carried forward to later phases:** + +- Phase 6 (exporters / CLI) — surface `dppvalidator migrate + --target {untp|cirpass}` driving the shims. +- Phase 7 (Tyres / Textile pilots) — extend `_step_lca` (M13) + with pilot-aware lifts so PEF impact-result claims round-trip + rather than dropping with `MAP001`. +- Phase 8 (docs) — write `errors/MAP001.md` … `errors/MAP005.md` + to round out the public error reference. +- The current shim is a one-shot projection, *not* a fully-typed + Pydantic-model-driven mapper. Phase 10 / 0.6.x may revisit if + pilot uptake warrants deeper integration with the model layer. + +--- + +### Phase 6 — Exporters & CLI surface + +**Goal:** First-class CLI access to CIRPASS reference-structure output +and refreshed EUDPP-LD output. Backwards-compatible defaults preserved. +**Effort:** M (~5 days) · **Depends on:** Phase 5 · **Ships in:** `0.5.0` Preview. + +**Tasks** + +- **6.1** Author `src/dppvalidator/exporters/cirpass_jsonld.py` — + emits CIRPASS reference structure v1.3.0 shape with the bundled + v1.9.1 context. +- **6.2** Update [exporters/eudpp_jsonld.py](src/dppvalidator/exporters/eudpp_jsonld.py) + to default to v1.9.1 namespaces; old `EUDPP_CONTEXT_URL` continues + to resolve with a `DeprecationWarning` until Phase 10. +- **6.3** Extend `dppvalidator validate` with optional + `--target {auto|untp|cirpass}` (default `auto` ⇒ + `detect_schema_family`); explicit `--target` is an *override* and + surfaces `DET001` if it contradicts the payload. 
+- **6.4** Extend `dppvalidator export` with + `--format {json|jsonld|eudpp-jsonld|cirpass-jsonld}`. The first three + exist; `cirpass-jsonld` is new. +- **6.5** Generalise `dppvalidator migrate` with + `--to {untp-0.7|cirpass-1.3}`. Default `--to=untp-0.7` (back-compat + with the existing 0.6 → 0.7 migrate behaviour). +- **6.6** Update `dppvalidator schema list` to show family + version + columns and a `default` marker per family. +- **6.7** Document CLI exit codes: + `0` valid; `2` validation errors; `3` family mismatch (`DET001`); + `4` upgrade/mapping warnings (with `--strict`); `5` IO/parse error. + Captured in `docs/reference/cli/exit-codes.md`. + +**Deliverables** + +- New CIRPASS-LD exporter. +- Refreshed EUDPP-LD exporter (canonical IRIs). +- Three new/extended CLI flags + exit-code table. + +**Tests** + +- `tests/integration/test_cli_cirpass.py` — golden CLI runs. +- `tests/integration/test_cli_export_matrix.py` — every `--format` × + every fixture; output is schema-valid where applicable. +- `tests/integration/test_cli_back_compat.py` — every previously-valid + CLI invocation yields the same exit code and a stable message body. + +**Exit criteria** + +- [x] `uv run dppvalidator export --format cirpass-jsonld + ` produces a payload accepted by the Phase 4 + CIRPASS pipeline. +- [x] All pre-existing CLI invocations bit-stable (golden snapshots). + +#### Phase 6 status — 2026-05-08 (complete) + +All 7 tasks (6.1 → 6.7) closed end-to-end. Both exit criteria met. + +**Tasks landed:** + +- **6.1** `src/dppvalidator/exporters/cirpass_jsonld.py`: + `CIRPASSJsonLDExporter` + `export_cirpass_jsonld` / + `export_cirpass_jsonld_dict` convenience functions. The exporter + accepts a native :class:`ReferencePassport`, a UNTP envelope + (`DigitalProductPassport` of either v0.6 or v0.7 — duck-typed via + `model_dump`), or a parsed dict (auto-detected by structural + signature). 
UNTP envelopes route through the Phase 5 forward shim; + mapping warnings surface via `last_mapping_warnings`. Output's + `@context` always carries the canonical EUDPP v1.9.1 namespace + binding alongside the W3C VC v2 context. +- **6.2** `EUDPP_CONTEXT_URL` in `exporters/eudpp_jsonld.py` + deprecated via PEP 562 module `__getattr__` — accessing the + legacy constant now emits a `DeprecationWarning` referencing the + new canonical alias `EUDPP_CANONICAL_CONTEXT_URL`. The legacy URL + remains resolvable through Phase 10 of the migration plan. + `exporters/__init__.py` re-export uses a sibling `__getattr__` + hook so the warning fires at use site (not at package-import time) + — `from dppvalidator.exporters import EUDPP_CONTEXT_URL` triggers + exactly one warning. The actual JSON-LD output of + :class:`EUDPPJsonLDExporter` already used the v1.9.1 namespaces + (Phase 1 work); no output bytes shifted. +- **6.3** `dppvalidator validate --target {auto,untp,cirpass}` + added to [src/dppvalidator/cli/commands/validate.py](../../src/dppvalidator/cli/commands/validate.py). + `auto` (default) runs the existing detection. Explicit `untp` / + `cirpass` is treated as an *override*: if the detected family + contradicts the user's pin, the command exits with code 3 + (`EXIT_FAMILY_MISMATCH`) and emits a `DET001` ValidationError + carrying both the configured target and the detected family in + `context`. Detection-agreement and no-detection-signal paths + fall through silently to ordinary validation. +- **6.4** `dppvalidator export --format + {jsonld,json,eudpp-jsonld,cirpass-jsonld}` extended in + [src/dppvalidator/cli/commands/export.py](../../src/dppvalidator/cli/commands/export.py). + `--default-language` kwarg threads through to the CIRPASS + exporter. Mapping warnings emitted by the CIRPASS forward shim + go to **stderr** so stdout stays pipe-clean for `... | jq` + consumers. The legacy `jsonld` / `json` output bytes are + bit-stable. 
+- **6.5** `dppvalidator migrate --to {untp-0.7,cirpass-1.3}` + + `--default-language` added to + [src/dppvalidator/cli/commands/migrate.py](../../src/dppvalidator/cli/commands/migrate.py). + `--to=untp-0.7` (default) is the pre-Phase-6 UNTP 0.6 → 0.7 path, + byte-stable. `--to=cirpass-1.3` runs the Phase 5 forward shim; + blocking `MAP00X` warnings exit with code 4 + (`EXIT_BLOCKING_WARNINGS`); `--accept-warnings` lets the write + proceed. Sidecar warnings file shape extended with `family_from` + /`family_to` keys; `MappingWarning.details` (a tuple) is + flattened to a dict for JSON serialisation. +- **6.6** `dppvalidator schema list` rewritten on top of + `SCHEMA_REGISTRY_BY_FAMILY` and `DEFAULT_VERSIONS`. The table + now has Family, Version, Default, Bundled, Contexts columns and + shows UNTP 0.6.0/0.6.1/0.7.0 + CIRPASS 1.3.0 in one view, sorted + family-then-version with per-family default markers. +- **6.7** [docs/reference/cli/exit-codes.md](../reference/cli/exit-codes.md) + documents the full exit-code surface: `0` valid, `1` invalid, + `2` engine error, `3` family mismatch (DET001), `4` blocking + warnings (`UPG`/`MAP`-coded), `5` IO/parse error. Constants + centralised at module level in + [src/dppvalidator/cli/main.py](../../src/dppvalidator/cli/main.py) + (`EXIT_VALID`, `EXIT_INVALID`, `EXIT_ERROR`, `EXIT_FAMILY_MISMATCH`, + `EXIT_BLOCKING_WARNINGS`, `EXIT_IO_ERROR`). 
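
The interaction between `--target` and the exit-code surface in tasks 6.3 and 6.7 can be sketched as a small decision function. The constant names and integer values are taken from the list above; `exit_code_for` and its parameters are hypothetical and only model the family-mismatch branch, not engine errors, blocking warnings, or IO failures.

```python
from typing import Optional

EXIT_VALID = 0
EXIT_INVALID = 1
EXIT_ERROR = 2
EXIT_FAMILY_MISMATCH = 3
EXIT_BLOCKING_WARNINGS = 4
EXIT_IO_ERROR = 5


def exit_code_for(target: str, detected: Optional[str], is_valid: bool) -> int:
    """Map a --target pin, the detected family, and the validation result to an exit code."""
    # An explicit pin that contradicts detection is a DET001 family mismatch.
    if target != "auto" and detected is not None and detected != target:
        return EXIT_FAMILY_MISMATCH
    # Agreement, `auto`, or no detection signal falls through to ordinary validation.
    return EXIT_VALID if is_valid else EXIT_INVALID
```

Pinning `cirpass` on a payload detected as `untp` yields 3 regardless of validity, while `auto` or an undetectable payload defers to the validation outcome.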
+ +**Public API additions:** + +- `dppvalidator.exporters.CIRPASSJsonLDExporter` +- `dppvalidator.exporters.export_cirpass_jsonld` +- `dppvalidator.exporters.export_cirpass_jsonld_dict` +- `dppvalidator.exporters.EUDPP_CANONICAL_CONTEXT_URL` +- `dppvalidator.cli.main.EXIT_FAMILY_MISMATCH` (= 3) +- `dppvalidator.cli.main.EXIT_BLOCKING_WARNINGS` (= 4) +- `dppvalidator.cli.main.EXIT_IO_ERROR` (= 5) +- `dppvalidator.cli.commands.migrate.MIGRATE_TARGETS` + +**Tests:** + +- `tests/integration/test_cli_cirpass.py` — 16 tests covering + every Phase 6 CLI surface end-to-end: `--target` agreement / + contradiction, `--format cirpass-jsonld` / `eudpp-jsonld` + outputs, `--default-language` threading, `migrate --to + cirpass-1.3` write paths (with and without + `--accept-warnings`), `schema list` family display, exit-code + integer stability. +- `tests/integration/test_cli_back_compat.py` — 19 tests pinning + the back-compat contract: every pre-Phase-6 invocation produces + the same exit code and structural-message-body shape; the new + `--target auto` matches the no-flag baseline; `--format json` / + `--format jsonld` outputs are bit-stable; new exit codes don't + shadow pre-existing ones. +- `tests/integration/test_cli_export_matrix.py` — 13 + parametrised tests covering format × fixture matrix: + `{json,jsonld,eudpp-jsonld,cirpass-jsonld}` × UNTP 0.7.0 fixture + plus cirpass-jsonld × every CIRPASS valid fixture; `--compact` + collapses whitespace; `-o`/`--output` writes to file. +- Existing CLI tests (`tests/unit/test_cli.py`, + `tests/unit/test_cli_migrate.py`, + `tests/integration/test_cli_workflows.py`) updated to use the + new `EXIT_IO_ERROR` (5) for missing-file / parse-error paths + and `EXIT_BLOCKING_WARNINGS` (4) for blocking-migrate paths. + +**Quality gates after Phase 6:** + +- `uv run pytest tests/`: **2393 passed / 36 skipped** (+48 + Phase 6 tests on the Phase 5 baseline of 2345). +- `uv run ruff check`: clean. +- `uv run ruff format --check`: clean. 
+- `uv run ty check src/dppvalidator/exporters/ src/dppvalidator/cli/`: clean. +- `uv run python tools/codegen/check_drift.py`: exit 0. + +**Carried forward to later phases:** + +- Phase 7 (Tyres / Textile pilots) — wire pilot-aware lifts into + the `_step_lca` (M13) path so LCA performance claims round-trip + rather than dropping with `MAP001`. +- Phase 8 (docs) — author `errors/DET001.md` / + `errors/MAP001.md … MAP005.md` and a CLI cookbook. +- Phase 9 (release cut) — flip `DEFAULT_VERSIONS[UNTP]` to + `0.7.0`; the new `schema list` already shows the per-family + default marker so the cutover is a one-line registry change. +- Phase 10 (cleanup) — remove the deprecation-warned + `EUDPP_CONTEXT_URL` legacy constant; remove the legacy + EXIT_ERROR fallback for IO failures (currently distinct from + EXIT_IO_ERROR). + +--- + +### Phase 7 — Pilot refreshes (Textile v2, Tyres) + +**Goal:** Bring textiles plugin to MVP Textile DPP v2; scaffold tyres +plugin against the GDSO declarations. +**Effort:** M (~5 days) · **Depends on:** Phase 5 (parallelisable with Phase 6) · **Ships in:** `0.5.0` Preview. + +**Tasks** + +- **7.1** Author the Textile v2 rule pack against MVP Textile DPP v2 + (2025-12-04). Old v1 rules remain available behind a + `--profile textile-v1` flag. +- **7.2** Register two distinct entry-points (`textile-v1`, + `textile-v2`) under `dppvalidator.validators` group; the `--profile` + flag selects. +- **7.3** Create `plugins/tyres/pyproject.toml` with the license + decided in OA-1 (default GPL-3.0). +- **7.4** Add `plugins/tyres/LICENSE`. +- **7.5** Implement `plugins/tyres/dppvalidator_tyres/models/` for + Birth v0.9, Collection v0.1, Retread v0.1, Recycling v0.1, plus the + Tyre Lifecycle History v1 wrapper. +- **7.6** Implement `plugins/tyres/dppvalidator_tyres/validators/` — + rule codes `TYR001…`. +- **7.7** Register validator + exporter entry-points. +- **7.8** Author `docs/plugins/tyres.md`. 
Mark plugin + `Pre-1.0 / Experimental` in README and CLI help (Birth v0.9 and + Recycling v0.1 versions are still moving). +- **7.9** Add CI gate `tools/check_imports.py` — fails if any module + under `src/dppvalidator/` imports from any `plugins/*` package + (R8 mitigation). + +**Deliverables** + +- Refreshed `plugins/textiles/` with v1/v2 profile selection. +- New `plugins/tyres/` plugin (pre-1.0). +- Import-graph CI gate. + +**Tests** + +- Each plugin under `tests/plugins//`. +- Integration: activate the plugin via entry-points, run the full + pipeline on a pilot fixture. +- `tests/plugins/test_license_isolation.py` — re-asserts no + `from dppvalidator_textiles import …` or + `from dppvalidator_tyres import …` in the core import graph. + +**Exit criteria** + +- [x] `uv run dppvalidator validate plugins/tyres/samples/birth.json` + returns zero errors against the GDSO Birth v0.9 spec. +- [x] Textile v1 fixtures still pass under `--profile textile-v1`. +- [x] License-isolation gate green. + +#### Phase 7 status — 2026-05-08 (complete) + +All 9 tasks (7.1 → 7.9) closed end-to-end. All 3 exit criteria met. + +**Tasks landed:** + +- **7.1** New built-in `src/dppvalidator/validators/rules/v0_7/textile_v2.py` + with the MVP Textile DPP v2 (2025-12-04) rule pack: 7 rules + (TXT001…TXT007). Tighter than v1 — TXT002 enforces strict + mass-fraction sum (±0.005 vs v1's ±0.01) plus mandatory + `materialType.code`; TXT003/TXT004/TXT005 promoted from + `info` to `error`; TXT006 (recycled-content disclosure) and + TXT007 (repair / spare-parts info) are new in v2. +- **7.2** Profile dispatch wired end-to-end. `TEXTILE_PROFILES` + registry at `validators/rules/__init__.py` keyed on + `textile-v1` / `textile-v2`; `SemanticValidator.__init__` + accepts `profile=…` and *replaces* the v1 textile rules with + the chosen pack (rule swap rather than additive). Engine + threads `profile` through `_init_validators`; CLI exposes + `validate --profile {textile-v1,textile-v2}`. 
Default + (`profile=None`) keeps the pre-Phase-7 behaviour where no + textile rules run on a v0.7 engine — non-textile DPPs see no + TXT diagnostics. +- **7.3** New `plugins/tyres/pyproject.toml` declaring + `dppvalidator-tyres==0.1.0`, GPL-3.0-or-later (per OA-1). + Eight validator entry-points + one exporter entry-point + registered under the `dppvalidator.validators` / + `dppvalidator.exporters` groups for auto-discovery via + `src/dppvalidator/plugins/registry.py`. +- **7.4** `plugins/tyres/LICENSE` — short pointer file with the + SPDX identifier `GPL-3.0-or-later`, copyright notice, and a + link to the canonical license text. The `pyproject.toml` SPDX + declaration carries the legal weight. +- **7.5** Tyres models package (`plugins/tyres/src/dppvalidator_tyres/models/`): + five Pydantic v2 models — `Birth`, `Collection`, `Retread`, + `Recycling` (with `RecoveryFraction` sub-model), and the + aggregate `TyreLifecycleHistory`. Field-level validators pin + DOT plant/size codes, ETRTO speed-rating letters, ETRTO + size dimensions, GLEIF LEI shape, and recovery-fraction sum + ≤ 1.0. The aggregate enforces three cross-event invariants: + UUID chain, chronological order, and single-Recycling + terminator. +- **7.6** Tyres validators (`plugins/tyres/src/dppvalidator_tyres/validators/rules.py`): + eight TYR-coded rules walking the + `credentialSubject.extensions.tyreLifecycleHistory` extension + slot. Non-tyre passports are silently skipped — every rule + returns `[]` when the extension is absent. The rules + decouple from the plugin's own model classes (read raw + dicts) so they survive future model refactors. +- **7.7** Validator + exporter entry-points registered via + `[project.entry-points]` in the plugin's `pyproject.toml`. + Verified: `from dppvalidator.plugins.discovery import + list_available_plugins; print(list_available_plugins())` + surfaces all 8 TYR validators + 1 CSV exporter. 
The plugin + works alongside the in-tree example plugin + (`examples/dppvalidator_example_plugin`) without entry-point + conflicts. +- **7.8** [docs/plugins/tyres.md](../plugins/tyres.md) — full + reference: install instructions, wire-shape contract, rule + table, model table, exporter usage, license rationale, + Pre-1.0 status note. The plugin's own + `plugins/tyres/README.md` is the PyPI-side equivalent. +- **7.9** [tools/check_imports.py](../../tools/check_imports.py) + CI gate: walks every `.py` file under `src/dppvalidator/` + via `ast`, parses imports, and fails (exit 1) when any + import resolves to a top-level package under `plugins/*`. + The forbidden-package list is auto-derived from + `plugins//pyproject.toml` so adding a new plugin + doesn't require updating the script. Includes a synthetic- + violation test that injects a fake offending file under + `tmp_path` and confirms the gate fires. + +**Public API additions:** + +- `dppvalidator.validators.rules.TEXTILE_PROFILES` — + profile-keyed dispatch table. +- `dppvalidator.validators.rules.DEFAULT_TEXTILE_PROFILE` — + stable default (`"textile-v1"`). +- `dppvalidator.validators.rules.TEXTILE_RULES_V0_7_V2` — v2 + rule list re-exported from the v0.7 textile module. +- `ValidationEngine(profile=...)` and `SemanticValidator(profile=...)` + kwargs. +- `validate --profile {textile-v1,textile-v2}` CLI flag. + +**Tests:** + +- `tests/plugins/tyres/test_tyres_models.py` — 22 tests: + per-model field validation + the `TyreLifecycleHistory` + cross-event invariants (UUID chain, chronological order, + single-Recycling, no-events-after-Recycling). +- `tests/plugins/tyres/test_tyres_validators.py` — 29 tests: + per-rule positive (clean fixture) + negative (broken fixture + → expected violation), plus rule-set sanity (all 8 TYR rules + registered, every rule carries `rule_id`/`description`/ + `severity`/`suggestion`/`docs_url`), purity check (rules + don't mutate input), and skip-on-non-tyre-passport + invariant. 
+- `tests/plugins/tyres/test_tyres_pipeline.py` — 7 + end-to-end tests: birth.json validates with zero errors + (Phase 7 exit criterion §1), TYR rules discoverable via + `list_available_plugins()`, CSV exporter flattens the + history correctly, broken-fixture path fires TYR001 through + the engine pipeline. +- `tests/plugins/test_license_isolation.py` — 5 tests: + AST-walk-based check that no core file imports a forbidden + plugin package, the CI gate exits 0 on the current tree, + the gate detects a synthetic violation, the + forbidden-package list is complete, and the legal direction + (plugin → core) still works. +- `tests/integration/test_textile_profiles.py` — 12 tests: + registry shape, severity-promotion contract, engine profile + threading (`textile-v1` keeps v1 rules, `textile-v2` + swaps in v2 rules), CLI `--profile` flag accepted, help + text lists both profiles, back-compat invariant + (`--profile textile-v1` matches no-profile baseline on + non-textile fixture), and v2-only TXT006/TXT007 fire on a + textile payload. + +**Quality gates after Phase 7:** + +- `uv run pytest tests/`: **2468 passed / 36 skipped** + (+75 Phase 7 tests on the Phase 6 baseline of 2393). +- `uv run ruff check`: clean. +- `uv run ruff format --check`: clean. +- `uv run ty check`: clean (compat / cli / exporters / + validators / plugin tree). +- `uv run python tools/check_imports.py`: exit 0 on the + current core tree. + +**Carried forward to later phases:** + +- Phase 5's M13 lossy stub (UNTP performanceClaim → CIRPASS + LCA) can now be tightened by a Phase 7-style pilot: a + textile-v2-aware lift could project the v0.7 microplastic + / durability claims onto CIRPASS LCAResult entries, since + the Textile v2 pack standardises the conformityTopic + spelling. Tracked for a follow-on pilot iteration. 
+- The textiles plugin proper (`plugins/textiles/`) was *not* + scaffolded in this phase — the Textile v2 rules ship as a + built-in profile rather than an out-of-tree plugin, on the + rationale that they're additive to the core and don't have + their own GPL-3.0 upstream constraints (unlike the GDSO + declarations the tyres plugin tracks). Phase 10 may revisit + if we want the textile v2 rules behind a plugin boundary + for license-clarity reasons. +- Phase 8 (docs) — extend the rule reference with + `errors/TXT001.md … TXT007.md` and `errors/TYR001.md … + TYR008.md`, and surface the `--profile` flag in the CLI + cookbook. + +--- + +### Phase 8 — Documentation + +**Goal:** Single coherent reading path from "what is CIRPASS" to "how +do I ship it". +**Effort:** M (~3 days) · **Depends on:** Phase 6, Phase 7 · **Ships in:** `0.5.0` Preview. + +**Tasks** + +- **8.1** Author `docs/concepts/cirpass-2-alignment.md` (orientation; + supersedes `docs/concepts/eudpp-ontology-alignment.md`, which becomes + a one-paragraph stub redirecting forward and is removed in Phase 10). +- **8.2** Finalise `docs/concepts/eudpp-1.9-changelog.md` (drafted in + Phase 1). +- **8.3** Finalise `docs/concepts/untp-cirpass-mapping.md` (drafted in + Phase 5). +- **8.4** Author `docs/guides/migrate-untp-to-cirpass.md` — user-facing + how-to with before/after JSON snippets, parallel to + `docs/guides/migration-0-6-to-0-7.md`. +- **8.5** Generate `docs/reference/cirpass/` from CIRPASS Pydantic + models via `mkdocstrings`. +- **8.6** Update `mkdocs.yml` nav. +- **8.7** Update `README.md` "what specs does this support?" matrix + with two family columns. +- **8.8** Capture remaining ADRs from this phase / Phase 9 cuts in + `docs/adr/` (naming convention `NNNN-short-slug.md`). + +**Deliverables** + +- 3 concept docs, 1 guide, 1 reference section, README + nav updates. + +**Tests** + +- `uv run mkdocs build --strict` produces zero warnings. +- Link-checker run (no broken intra-doc links). 
+ +**Exit criteria** + +- [x] `/docs-health` clean. +- [x] `mkdocs build --strict` clean. +- [x] All cross-links resolve. + +#### Phase 8 status — 2026-05-09 (complete) + +All 8 tasks (8.1 → 8.8) closed end-to-end. All 3 exit criteria met. + +**Tasks landed:** + +- **8.1** [`docs/concepts/cirpass-2-alignment.md`](../concepts/cirpass-2-alignment.md) + authored — orientation page covering: what CIRPASS-2 is, the + six EUDPP modules + versions, the two-family architecture + (UNTP / CIRPASS), the per-family pipelines (UNTP runs JSON-LD + layer; CIRPASS runs SHACL), the validator rule-prefix table + (16 rows: SEM/VOC/CQ/JLD/MDL/VER/UPG/CR/SUB/LCS/ACT/REL/MAP/ + DET/TXT/TYR), pilot profiles, cross-family mapping, and a + reading-path matrix linking out to the deeper docs. The legacy + [`eudpp-ontology-alignment.md`](../concepts/eudpp-ontology-alignment.md) + is left in place; Phase 10 collapses it to a redirecting stub + per the task description. +- **8.2** [`docs/concepts/eudpp-1.9-changelog.md`](../concepts/eudpp-1.9-changelog.md) + promoted from "Phase 1 scaffold" header to "Final + (Phase 8 finalisation, 2026-05-09)". Cross-tree links + rewritten to GitHub URLs so the strict build resolves them. +- **8.3** [`docs/concepts/untp-cirpass-mapping.md`](../concepts/untp-cirpass-mapping.md) + promoted from "Phase 5 reference" to "Final + (Phase 8 finalisation, 2026-05-09)". Adds an explicit + pointer to the user-facing migration guide. Cross-tree + links rewritten. +- **8.4** [`docs/guides/migrate-untp-to-cirpass.md`](../guides/migrate-untp-to-cirpass.md) + authored — user-facing how-to with CLI invocations + (UNTP→CIRPASS, CIRPASS→UNTP, JSON-LD shortcut), Python + example, before / after JSON snippets, the five `MAP00X` + warning codes table, the documented lossless subset, and + limitation notes. Pattern-matches + [`docs/guides/migration-0-6-to-0-7.md`](../guides/migration-0-6-to-0-7.md). 
+- **8.5** [`docs/reference/cirpass/index.md`](../reference/cirpass/index.md) + authored — auto-generated CIRPASS Pydantic API reference via + `mkdocstrings`. Documents `ReferencePassport` (root) plus 16 + sub-models grouped by topic (Product / Actor / Material / + Substances / LCA / Connector / i18n / Temporal). Front-matter + links back to the concept-doc reading path. +- **8.6** [`mkdocs.yml`](../../mkdocs.yml) nav extended: + new `Migrate UNTP → CIRPASS` guide entry, new `CLI Exit Codes` + reference entry, new `CIRPASS Models` reference entry, new + `Concepts` rows for CIRPASS-2 alignment / spec-snapshot / + v1.9 changelog / UNTP↔CIRPASS mapping (legacy entries kept, + marked `(legacy)`), new top-level `Plugins` section with the + tyres pre-1.0 page. +- **8.7** [`README.md`](../../README.md) "Supported versions" → + "Supported specs" rewrite. Now shows two family tables (UNTP + DPP 0.6.0/0.6.1/0.7.0 + CIRPASS 1.3.0); summarises the three + migration shim invocations (UNTP intra-family + UNTP↔CIRPASS + cross-family); lists pilot profiles + tyres plugin; ends with + a reading-path matrix. +- **8.8** Two new ADRs: + - [ADR 0004 — Textile v2 ships as a built-in profile](../adr/0004-textile-v2-built-in.md): + documents the Phase 7 design choice not to scaffold an + out-of-tree `plugins/textiles/` for v2 (vs the GPL-3.0 + tyres plugin). + - [ADR 0005 — CLI exit-code surface](../adr/0005-cli-exit-codes.md): + documents the Phase 6 §6.7 six-code surface, the migration + impact (legacy `EXIT_ERROR=2` semantics partly overlap with + new `EXIT_IO_ERROR=5`), and the public-contract guarantee. + +**Strict-mode link cleanup:** + +The Phase 8 docs deliberately link out to source files +(`src/dppvalidator/...`), tooling (`tools/...`), tests +(`tests/...`), plugin packages (`plugins/tyres/...`), and the +local rules directory (`.claude/rules/...`). These paths are +*outside* the mkdocs build tree and consequently fail strict +mode unless absolute URLs are used. 
A one-shot script rewrote +28 such links across 7 files (the 5 new docs + 2 pre-existing +finalised ones) to `https://github.com/artiso-ai/dppvalidator/{blob,tree}/main/...`. +Result: `uv run mkdocs build --strict` is clean (zero +WARNINGs; the only INFO-level diagnostics are about the +`docs/plans/` excluded-from-build directory, which is by design +per the existing `mkdocs.yml` `exclude_docs` setting). + +**Quality gates after Phase 8:** + +- `uv run pytest tests/`: **2468 passed / 36 skipped** (no test + deltas vs Phase 7 — Phase 8 is docs-only). +- `uv run ruff check`: clean. +- `uv run ruff format --check`: clean. +- `uv run mkdocs build --strict`: clean (zero WARNINGs). + +**Carried forward to later phases:** + +- The legacy concept pages + [`concepts/eudpp-ontology-alignment.md`](../concepts/eudpp-ontology-alignment.md) + and [`concepts/cirpass-implementation.md`](../concepts/cirpass-implementation.md) + remain on the nav as `(legacy)` entries. Phase 10 collapses + them to one-paragraph stubs that redirect to + `cirpass-2-alignment.md` and removes them entirely. +- Per-rule error pages for the new rule prefixes (`CR001… + CR006`, `SUB001…SUB004`, `LCS001…LCS004`, `ACT001…ACT004`, + `REL001…REL003`, `MAP001…MAP005`, `DET001`, `TXT006/TXT007`, + `TYR001…TYR008`) — placeholders not yet authored. Phase 9 + release cut decides whether to ship them in 0.5.0 or defer + to 0.5.1. + +#### Phase 8.5 status — 2026-05-09 (pre-release polish, complete) + +A surgical polish pass before the Phase 9 release cut. Goal: +remove duplication, drop dead args, keep the package lean + +sustainable. **Zero behaviour changes** — same 2468 tests pass +with identical exit codes and warning multisets. 
+ +**Refactors applied:** + +- New module + [`src/dppvalidator/compat/_shared.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/_shared.py) + centralises two helpers previously duplicated in both shim + modules: + - `normalise_iso8601(value)` — datetime / string ISO 8601 + normalisation. The forward shim's version had dead parse- + then-return-anyway code; the lifted helper unifies both + branches into a single permissive pass-through. + - `pick_localised(items, default_language)` — pick a single + string from a LocalisedText list with dropped-language + accounting. Previously only on the reverse shim. +- New module + [`src/dppvalidator/validators/rules/cirpass_v1_3/_helpers.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/rules/cirpass_v1_3/_helpers.py) + centralises `parse_iso_datetime(value)`, previously duplicated + in `actor.py` and `connector.py`. +- `_step_lca` (private) lost its unused `default_language` + kwarg — the LCA stub doesn't emit any localised content. Plan + spec §5.3's public `to_cirpass_1_3` signature is unchanged + (the `default_language` / `country_lookup` / `identifier_scheme_lookup` + public kwargs all stay; the latter two are documented as + reserved-for-future-use). +- The forward shim no longer imports `datetime` directly — the + helper does it. The reverse shim drops `datetime` from its + imports entirely, and gains explicit `_shared.normalise_iso8601` + / `_shared.pick_localised` re-imports under their original + private aliases for call-site stability. + +**Surface stability:** + +- Public API unchanged. `dppvalidator.compat.to_cirpass_1_3` / + `to_untp_0_7` / `MappingWarning` / `MAP_CODE_*` / etc. all + resolve identically. +- Internal helper module (`_shared.py`) is private (underscore + prefix); not added to `compat/__init__.py.__all__`. 
+- Cold-start contract still holds (verified via + `tests/unit/test_cold_start_import.py`); the new helper + modules don't pull cirpass models at import time. + +**Diagnostic intentionally retained:** + +- `shacl_cirpass.py` defensive guards (family check, unknown- + module check, version-mismatch check) are kept. Coverage on + the module is 62% — the uncovered branches are violation- + extraction paths that fire only when bundled SHACL constraint + graphs produce non-conforming results, and the bundled v1.9.x + TTLs are pure OWL ontologies (no SHACL constraints). Phase 8 + authors of bundled SHACL shape graphs (deferred 0.6.x work) + will exercise those paths. +- The forward shim's documented-but-unused `country_lookup` + kwarg stays for plan-spec parity (§5.3) and reverse-shim + symmetry; the existing `# noqa: ARG001 — reserved API kwarg` + comment and docstring already explain the intent. + +**Code-size delta:** + +- Total `src/dppvalidator/` LoC: 29 205 → 29 245 (+40 net). +- Two new helper modules: +112 LoC (81 + 31, both heavily + documented). +- Removed at call sites: ~72 LoC across four files + (`untp_0_7_to_cirpass_1_3.py` -20, `cirpass_1_3_to_untp_0_7.py` + -30, `actor.py` -12, `connector.py` -10). +- Net +40 LoC bought us: zero duplicates, single-site edits for + future helper changes, smaller per-shim cognitive surface. + +**Quality gates after Phase 8.5:** + +- `uv run pytest tests/`: **2468 passed / 36 skipped** — + identical to Phase 8. +- `uv run ruff check`: clean. +- `uv run ruff format --check`: clean. +- `uv run python tools/check_imports.py`: exit 0. +- `uv run python tools/codegen/check_drift.py`: exit 0. +- `uv run mkdocs build --strict`: clean. +- `uv run pytest tests/unit/test_cold_start_import.py`: 4/4 + passing — `import dppvalidator` still doesn't load + `models.cirpass`, the new shared helper modules don't either. + +**Carried forward to Phase 9:** + +- The release cut can ship without further polish. 
The + back-compat surface is stable; the only externally-visible + Phase 8.5 change is the addition of two private helper modules. + +#### Phase 8.6 status — 2026-05-09 (second polish pass, complete) + +A second surgical polish pass on top of Phase 8.5. Goal: drive the +remaining cross-CLI duplication and Phase-1 dead-alias cruft out of +the codebase before the release cut. **Zero behaviour changes** — +same 2468 tests pass, exit codes and warning multisets unchanged. + +**Refactors applied:** + +- New module + [`src/dppvalidator/cli/_io.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/cli/_io.py) + centralises `load_input(input_path, console)`. Three near- + duplicate `_load_input` helpers across + [`validate.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/cli/commands/validate.py), + [`export.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/cli/commands/export.py), + and + [`migrate.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/cli/commands/migrate.py) + collapse into a single import — `from dppvalidator.cli._io import + load_input as _load_input`. Side benefit: + `dppvalidator export` now accepts stdin via `-` (matches the + validate / migrate commands; previous behaviour was a 5-line + branch difference, no documented contract). +- Six dead pre-Phase-2 aliases removed from + [`validators/detection.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/detection.py): + `_SCHEMA_URL_PATTERNS`, `_CONTEXT_URL_PATTERNS`, `_DPP_TYPES`, + `_detect_from_schema_url`, `_detect_from_context`, + `_is_dpp_type`. Exhaustive grep across `src/` and `tests/` + confirmed zero external references — these were stranded after + the Phase-2 UNTP/CIRPASS namespacing rename. 
+- Two missing error-doc pages authored (TXT006, TXT007) — Phase-7 + textile-v2 rules shipped without their per-error MkDocs pages; + `scripts/check_error_docs.py` flagged the gap during the QA + gate. Both pages added with cause/fix/example sections; mkdocs + nav extended to match. + +**Surface stability:** + +- Public API unchanged. CLI exit codes, output formats, JSON + contract for `--format=json`, and the `migrate` sidecar shape + are all bit-identical. +- Behaviour change accepted: `export -` now reads stdin (was a + silent no-op / file-not-found before). No regression — the + pre-Phase-8.6 behaviour was an inconsistency, not a contract. +- Cold-start contract still holds (verified via + `tests/unit/test_cold_start_import.py`); `cli/_io.py` is a + pure stdlib + logging shim. + +**Code-size delta:** + +- Total `src/dppvalidator/` LoC: 29 245 → 29 227 (-18 net). +- New module `cli/_io.py`: +67 LoC (heavily docstring-ed). +- Removed: ~85 LoC across four files (validate.py -27, + export.py -20, migrate.py -22, detection.py -16). +- Net -18 LoC; the centralised helper carries more docstring + weight than the originals it replaced. + +**Quality gates after Phase 8.6:** + +- `uv run pytest tests/`: **2468 passed / 36 skipped** — + identical to Phase 8.5. +- `uv run ruff check`: clean. +- `uv run ruff format --check`: clean (15 files already + formatted). +- `uv run ty check src/`: clean. +- `uv run --group docs mkdocs build --strict`: clean. +- `uv run python scripts/check_error_docs.py`: 95/95 documented + and nav-wired (was 93/95 before this polish pass). +- Cold-start guard: `import dppvalidator` still doesn't load + `models.cirpass`; `cli/_io.py` doesn't either. + +**Carried forward to Phase 9:** + +- Same as Phase 8.5 — the release cut can ship as-is. Phase 8.6 + was a pure cleanup pass; no externally-visible API change + beyond `export -` accepting stdin. 
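The `-` stdin convention that Phase 8.6 extends to `export` can be sketched as follows. This is a minimal illustration and not the code in `cli/_io.py`: the name `load_json_input`, its `stdin` parameter, and the choice to let exceptions propagate instead of reporting through a console are all assumptions made for the example (the real helper is described above as `load_input(input_path, console)`).

```python
import io
import json
import sys
from pathlib import Path
from typing import Any, Optional, TextIO


def load_json_input(input_path: str, stdin: Optional[TextIO] = None) -> Any:
    """Parse JSON from a file path, or from stdin when the path is "-"."""
    if input_path == "-":
        # Hypothetical injection point so the stdin branch is testable.
        stream = stdin if stdin is not None else sys.stdin
        return json.loads(stream.read())
    return json.loads(Path(input_path).read_text(encoding="utf-8"))


# Usage: feed an in-memory stream in place of real stdin.
payload = load_json_input("-", stdin=io.StringIO('{"type": ["DigitalProductPassport"]}'))
```

Keeping one helper of this shape behind all three commands is what makes the `-` behaviour uniform; a per-command copy is how the `export` inconsistency noted above could arise.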
+ +#### Phase 8.7 status — 2026-05-09 (E2E upstream drift survey, complete) + +A pre-release end-to-end probe of every external surface the +package depends on. Goal: confirm the bundled artefacts are still +faithful to upstream and the runtime is genuinely hermetic before +the Phase 9 release cut. **Zero code changes** — this was a +verification pass; no source / test edits resulted. + +**Probed surfaces:** + +- All 19 [`MANIFEST.json`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/schemas/data/MANIFEST.json) + source URLs (15 distinct; 4 share the + `dpp.vocabulary-hub.eu/specifications` deprecation marker URL). +- `https://vocabulary.uncefact.org/` — CEFACT BIE master vocab + (8.7 MB JSON-LD, 16 920 `@graph` entries; the upstream of the + `unece:` prefix used by every UNTP version). +- `https://vocabulary.uncefact.org/untp/` — live UNTP core + ontology (`owl:versionInfo: "working"`); the canonical RDF home + for the `untp:` prefix declared in our vendored + `untp-context-0.7.0.jsonld`. +- `https://test.uncefact.org/vocabulary/untp/dpp/0.6.1/` — UNTP + v0.6.1 context. Upstream of `untp-context-0.6.1.jsonld`. +- `https://test.uncefact.org/vocabulary/untp/dpp/0.7.0/` — probed + for symmetry with v0.6.1. +- `https://w3id.org/eudpp` — canonical EUDPP IRI per + [ADR 0002](../adr/0002-canonical-eudpp-iri.md). +- `https://artiso-ai.github.io/dppvalidator/errors//` — the + template every rule's `docs_url` resolves under (sample probed: + REL001, MAP001, TXT006). +- Hermetic-runtime invariant — `socket.create_connection` trapped + after import; ran `validate()` + all three exporters + (`JSONLDExporter`, `EUDPPJsonLDExporter`, `CIRPASSJsonLDExporter`) + on the v0.7.0 fixture. + +**Findings — green (no action needed):** + +- ✅ All 15 distinct manifest source URLs return **200**. +- ✅ All 19 vendored byte-streams still hash to their pinned + SHA-256 (re-verified by + `tests/unit/test_manifest_integrity.py`: 26/26 passing). 
+- ✅ `vocabulary.uncefact.org/untp/` returns the live UNTP core + ontology with `owl:versionInfo: "working"` — confirms our + vendored 0.7.0 context's `'untp' -> 'https://vocabulary.uncefact.org/untp/'` + prefix declaration is canonical. +- ✅ Hermetic runtime confirmed: zero `socket.create_connection` + calls during `validate()` or any of the three exporters + (CIRPASS, EUDPP, plain JSON-LD). +- ✅ All Phase 8.6 quality gates re-ran clean (2468 / 36 skipped, + 92.05% coverage, ruff / format / ty / mkdocs strict / 96/97 + smoke). + +**Findings — soft drift (informational, not blocking):** + +- ⚠️ `https://w3id.org/eudpp` redirects to + `dpp.vocabulary-hub.eu/api/ontology-version/OntologyVersion_086dd88c-…/export?format=ttl` + which currently returns **404**. The IRI is used as an opaque + RDF identifier (its semantic role doesn't depend on + dereferencing per RDF best practice), so the package functions + fine. ADR 0002 already documents the IRI as a "name, not a + URL"; we treat this as upstream-vocabulary-hub housekeeping. + *Action: none in 0.5.0; revisit if the upstream redirect + remains broken at the 0.6.0 release window.* +- ⚠️ `https://test.uncefact.org/vocabulary/untp/dpp/0.7.0/` + returns **403**. **This is not a regression** — v0.7.0 was + vendored from `opensource.unicc.org/.../spec-untp/.../707cd526.../artefacts/contexts/v0.7.0/untp-context.jsonld` + (a commit-pinned GitLab raw URL; manifest entry confirms this). + The test.uncefact.org path was the v0.6.x publishing channel + and was never the canonical home for v0.7.0. *Action: none.* +- ⚠️ `https://artiso-ai.github.io/dppvalidator/errors//` + currently returns **404** for every rule code. The deployed + docs site root (`/`) and the `/errors/` index both return 200, + so the site IS deployed — but the deployment reflects the + `main` branch state, which is still at **v0.4.0** (the last + release tag; develop has all the Phase 4 / 5 / 8.x work but + has not yet been merged). 
The Phase 9 release-merge will fire + `.github/workflows/docs.yml` on push to `main` and redeploy. + *Action: included in Phase 9 release checklist below.* + +**Findings — drift in scope for Phase 9:** + +The auto-update workflow +[`.github/workflows/update-vocabularies.yml`](https://github.com/artiso-ai/dppvalidator/blob/main/.github/workflows/update-vocabularies.yml) +refreshes only `countries.json` and `units.json` (via +`scripts/fetch_vocabularies.py`). The 19 manifest-tracked +artefacts (UNTP schemas, contexts, EUDPP ontologies, CIRPASS +reference structure) are **not** monitored for upstream drift on +any cadence. The current safeguard is the SHA-256 manifest test, +which catches *local* tampering but not *upstream* changes that +could quietly invalidate our pinning over a release cycle. +**Mitigation in scope for Phase 9:** see release checklist +below — manual probe before each release tag. **Out of scope for +0.5.0:** automating an upstream-drift CI job (lean-package +mandate; the manual checklist is sufficient at current cadence). + +**Production-readiness verdict:** + +- The package is **production-ready for the 0.5.0 Preview cut**. +- The two soft drifts (w3id 404, deployed-docs lag) self-resolve + at the release-merge or are non-functional. The runtime is + hermetic. Every byte we ship traces back to a live, pinned + upstream source. +- No code or test edits flowed from this verification pass — + consistent with the lean-package mandate. + +**Pre-Phase-9 release checklist (carried forward into 9.5 +release-gate UAT):** + +1. Re-run `tests/unit/test_manifest_integrity.py` immediately + before tagging — confirms vendored bytes still match. +2. Manually probe each distinct manifest `source_url` (15 URLs) + for HTTP 200; record results in the release PR description. + This is the lean substitute for an automated upstream-probe + CI job (deferred to a future release). +3. 
After the release-merge to `main`, verify + `https://artiso-ai.github.io/dppvalidator/errors/REL001/` + (and a sample of MAP/TXT/UPG codes) returns 200 within 10 + minutes — the `docs.yml` workflow auto-deploys on push. +4. Smoke-test `import dppvalidator` from the published wheel + (TestPyPI staging) to confirm the cold-start contract holds for + end-users (no `models.cirpass` eager load). + +#### Phase 8.8 status — 2026-05-09 (UNTP 0.7.0 model-vs-schema drift survey, complete) + +A second-level critical evaluation triggered by the question "are +all UNTP 0.7.0 codes faithfully covered?". Compared the upstream +JSON Schema's `$defs` against our Pydantic v0.7 model classes +field-by-field, then round-tripped both 0.7.0 fixtures (basic + +battery) through `validate()` → `model_dump()` to measure data +preservation. + +**Methodology:** + +- Walked `untp-dpp-schema-0.7.0.json` and collected every property + per `$defs` entry → 99 unique properties (98 non-JSON-LD). +- Walked `dppvalidator.models.v0_7` and collected every Pydantic + field with its alias → 91 distinct JSON-visible aliases across + 27 BaseModel subclasses. +- Diffed the two sets, then drilled into mismatches by reading + the schema `$defs` and the matching Pydantic model side-by-side. +- Round-tripped two real fixtures (basic + battery; battery uses + `materialUsed` and 10 `Link` instances) and compared path + multisets pre- vs post-`model_dump`. +- Triggered Layer-1 (JSON Schema) violations by deleting required + schema fields (`Package.description`, `Package.dimensions`, + `Package.materialUsed`, `Link.linkName`) and confirmed each + surfaced `SCH001` correctly. +- Cross-checked exporter coverage of `model_extra` fields on the + battery fixture for all three export shapes: UNTP-canonical + (`JSONLDExporter`), EUDPP, and CIRPASS.
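The set-difference step in the methodology above can be sketched with plain dictionaries. The snippet is a hypothetical reconstruction, not the script used for the survey: `schema_properties`, `diff_fields`, the toy schema, and the toy alias set are invented for illustration, and `linkURL` is only a placeholder for "a field both sides declare".

```python
from typing import Any


def schema_properties(schema: dict[str, Any]) -> dict[str, set[str]]:
    """Map each `$defs` entry to the set of property names it declares."""
    return {
        name: set(definition.get("properties", {}))
        for name, definition in schema.get("$defs", {}).items()
    }


def diff_fields(schema_props: set[str], model_aliases: set[str]) -> dict[str, list[str]]:
    """Split field names into schema-only and model-only buckets."""
    return {
        "model_missing": sorted(schema_props - model_aliases),
        "model_only": sorted(model_aliases - schema_props),
    }


# Toy inputs (assumed shapes, not the real schema or model).
toy_schema = {
    "$defs": {
        "Link": {
            "properties": {"linkURL": {}, "linkName": {}, "linkType": {}},
            "required": ["linkURL", "linkName"],
        }
    }
}
toy_aliases = {"linkURL", "name", "description", "relationship"}

report = diff_fields(schema_properties(toy_schema)["Link"], toy_aliases)
```

With Pydantic v2 models, the alias set could plausibly be gathered from each class's `model_fields` mapping, using a field's `alias` where one is set and the attribute name otherwise. Splitting `model_missing` further into required and optional would only need the `required` list from the same `$defs` entry.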
+ +**Verdict — green for the validation pathway:** + +- ✅ **Layer 1 (JSON Schema validator) is faithful 1:1 to upstream.** + Every required-field violation injected during the test was + caught with `SCH001`. The schema validator is the authoritative + source-of-truth check; it does not depend on the Pydantic + model layer. +- ✅ **Round-trip is lossless.** Battery fixture: 223 paths in → + 224 paths out (one synthesised default), zero paths lost. The + `extra="allow"` ConfigDict on `UNTPBaseModel` preserves every + unmapped field on `model_dump`. Pydantic v2's `__getattr__` + routes attribute access (e.g. `link.linkName`) through + `model_extra` transparently. +- ✅ **EUDPP export preserves extras** — confirmed `materialUsed`, + `linkName`, `linkType` all survive the EUDPP projection. + +**Verdict — drift identified (5 model classes, 14 fields):** + +The Pydantic v0.7 model layer has lagged behind the upstream +schema. The drift was inherited from the original UNTP 0.7.0 +migration (pre-Phase-1 of this plan); the CIRPASS-2 work didn't +introduce it. **It is not a validation regression** — it is a +Python API-ergonomics gap. + +Per-class drift inventory (bold = schema-required field that the +Pydantic model fails to declare): + +- **`Party`** — [identifiers.py:117](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/identifiers.py#L117) + - schema-optional, model-missing: `registrationCountry`, + `partyAddress`, `organisationWebsite`, `industryCategory`, + `partyAlsoKnownAs`. +- **`Link`** — [primitives.py:73](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/primitives.py#L73) + - schema-required, model-missing: **`linkName`**. + - schema-optional, model-missing: `linkType`. + - model-only (not in schema): `name`, `description`, + `relationship`. 
+- **`Package`** — [product.py:63](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/product.py#L63) + - schema-required, model-missing: **`description`, + `dimensions`, `materialUsed`**. + - schema-optional, model-missing: `packageLabel`, + `performanceClaim`. + - model-only (not in schema): `packageType`, `weight`. +- **`Period`** — [claims.py:64](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/claims.py#L64) + - schema-required vs model-Optional drift: **`startDate`, + `endDate`** (REQUIRED in schema; Optional in model). + - schema-optional, model-missing: `periodInformation`. +- **`RenderTemplate2024`** — [envelope.py:89](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/envelope.py#L89) + - schema-optional, model-missing: `mediaQuery`, `url`. + - model-only (not in schema): `id`. + +**Impact on consumers:** + +- ✅ Validation correctness: unchanged (Layer 1 catches everything). +- ✅ Round-trip / exporters (UNTP, EUDPP): unchanged. +- ⚠️ Type hints / IDE autocomplete: schema fields don't appear on + hover; users fall back to `dict`-style access via `model_extra`. +- ⚠️ Semantic-rule reach: rules that reference + `passport.related_actors[0].partyAddress` get no statically + typed attribute access to the value (it still resolves at + runtime via `__getattr__`, but without a static guarantee). +- ⚠️ CIRPASS forward-shim coverage: cross-family projection iterates + over `model_fields`, not `model_extra` — fields stored only as + extras are dropped on the UNTP → CIRPASS shape transform. This + is by design for fields with no CIRPASS counterpart, but + means new schema fields silently miss the projection unless + promoted to first-class on the Pydantic model. + +**Production-readiness verdict for 0.5.0 Preview:** + +- **Ship-able as Preview.** Validation is faithful; no data loss; + the drift is internal API ergonomics, not user-visible + correctness.
The 0.5.0 release IS marked Preview / unstable + (per Phase 9 task 9.1–9.6), so the Pydantic API surface + stabilising in 0.6.0 is consistent with the release framing. +- **Document the gap in `CHANGELOG.md` § Known Limitations** for + the 0.5.0 entry: "Pydantic v0.7 model classes are incomplete + relative to the upstream JSON Schema for `Party`, `Link`, + `Package`, `Period`, `RenderTemplate2024`. Schema validation + is unaffected; round-trip preserves data via `extra='allow'`. + See Phase 10 of [`docs/plans/CIRPASS_2_MIGRATION.md`](docs/plans/CIRPASS_2_MIGRATION.md)." +- **Defer the fix to a new Phase 10** — see below — with a + conservative effort estimate. Doing it now would expand polish- + pass scope into a multi-class refactor + fixture review + + cross-family-shim review, and the lean-package mandate + argues against bundling it into the Preview release. + +**Recommended Phase 9 task additions:** see Phase 9 section +below — the canonical task list is `9.7`–`9.11` (BLOCKER fixes +D1, D2 with cross-version compatibility constraints; alignment +guard test; CHANGELOG entries; cross-version regression +baseline). The sub-tasks listed there subsume the earlier +narrower 9.7 stub from the Phase 8.8 plan revision. + +#### Phase 8.9 status — 2026-05-09 (Deep end-to-end drift catalogue, complete) + +A second, deeper drift survey expanding Phase 8.8 from "5 classes, +14 fields" to a full 22-`$def` × 27-class field/type/required/enum/ +format diff plus live-validation correctness probes, plus a final +audit pass closing 2 false alarms and surfacing 3 new findings. +**Zero source or test edits** — verification + planning pass per +lean-package mandate. Results below supersede the Phase 8.8 +catalogue. + +**At-a-glance drift status (post-audit), grouped by tier:** + +- **BLOCKER (Phase 9):** + - D1 — `BitstringStatusListEntry.statusListIndex` + int-vs-str → 9.7. + - D2 — `PartyRoleEnum` Layer-1/Layer-2 contradiction → + 9.8 (resolved via documented acceptance gradient). 
+- **HIGH (Phase 10.9 / 10.10 / 10.11):** + - D3 — `Address` 5 required-Optional drifts → 10.10. + - D4 — `BitstringStatusListEntry` 4 required-Optional + + `statusPurpose` enum missing → 10.10 + 10.11. + - D5 — `BitstringStatusListEntry.id` REVERSE drift + (model-required, schema-Optional) → 10.11. + - D6 — `Claim` 4 required-Optional drifts → 10.10. + - D7 — `Period.startDate`/`endDate` required-Optional → + 10.10. + - D8 — `Link.linkName` REQUIRED missing from model → + 10.9. + - D9 — `Package` 3 required schema fields absent → 10.9. +- **MEDIUM (Phase 10.9):** + - D10 — `Party` 5 optional fields missing. + - D11 — `Link.linkType` missing. + - D12 — `Package.packageLabel`, `performanceClaim` + missing. + - D13 — `Period.periodInformation` missing. + - D14 — `RenderTemplate2024.mediaQuery`, `url` missing. +- **LOW (Phase 10.9 / 10.13):** + - D15 — `Claim.classification` model-only → 10.13. + - D16 — `Link.name`/`description`/`relationship` + model-only → 10.9 + 10.13. + - D17 — `Package.packageType`/`weight` model-only → + 10.13. + - D18 — `RenderTemplate2024.id` model-only → 10.13. +- **FORMAT (Phase 10.12):** + - D19 — 12 `format: uri` sites typed as plain `str`. + - D20 — `Image.imageData` (`format: byte`) typed as + `str`. +- **UNMAPPED (audited 2026-05-09):** + - D21 — `DigitalProductPassport` → + **CLOSED — faithful to schema root**. + - D22 — `Facility` → docstring-only (10.14). + - D23 — `IdentifierScheme` → + **CLOSED — faithful to inline shape**. + - D24 — `SoftwareVendor` → docstring-only (10.14). +- **PERMISSIVE (Phase 10.13):** + - D25 — `Performance` / `Score` / `Measure` / + `Characteristics` each carry 1 model-only field; + catalogue + decide keep-or-drop per field. +- **CIRPASS deep-diff (Phase 10.15):** + - D26 — CIRPASS `HazardCategory` / `LifeCycleStage` + $defs unmapped (likely terminology classes). + - D27 — CIRPASS field-level deep diff not yet performed + (counts match: 20 model classes vs 20 schema $defs). 
+ +**Methodology (extends 8.8):** + +1. Walked every `$def` in `untp-dpp-schema-0.7.0.json` → + inventory of 22 classes / property names / `type` / + `required` / `format` / `pattern` / `enum` / `minimum` / + `maximum` / `minLength` / `minItems`. +2. Walked every Pydantic v0.7 BaseModel subclass via + `importlib.iter_modules` → 27 classes; mapped 23/27 to + schema `$defs` (4 unmapped: `DigitalProductPassport`, + `Facility`, `IdentifierScheme`, `SoftwareVendor` — these are + inline-shaped or extension-only). +3. For each mapped pair: schema-only fields, model-only fields, + required-vs-Optional drift, type drift, enum drift. +4. Triggered live correctness probes for the highest-severity + findings: bare instantiation of `Address()`, `Product()`; + string `statusListIndex`; `PartyRoleEnum` value not in + schema's closed set; missing `BitstringStatusListEntry.id` + (reverse drift). + +**Comprehensive drift catalogue (severity-tiered):** + +**Tier 1 — BLOCKER (real correctness bugs; Layer 1/Layer 2 +contradiction). Must fix before 0.5.0 release tag.** + +- **D1 — `BitstringStatusListEntry.statusListIndex` type drift.** + Schema declares `integer`; our Pydantic field is + `str | None`. Pydantic accepts non-numeric strings (e.g. + `"abc"`); upstream JSON Schema rejects with type error. + Layer 1 catches the violation when the field is populated, + but the Pydantic model surface is wrong: any caller building + a `BitstringStatusListEntry` programmatically can pass a + non-integer string and round-trip a payload the schema would + reject. Fix: change annotation to `int | None`, add + `Field(default=None, ge=0)`, update fixtures + tests if any + string-shaped values exist. +- **D2 — `PartyRoleEnum` accepts what schema rejects.** + `PartyRoleEnum` declares 20 values; schema's `PartyRole.role` + is a closed enum of exactly 6: `owner`, `producer`, + `manufacturer`, `processor`, `remanufacturer`, `recycler`. 
+ The 14-value gap (`brandOwner`, `carrier`, `certifier`, + `consignee`, `consignor`, `distributor`, `exporter`, + `importer`, `inspector`, `logisticsProvider`, `operator`, + `regulator`, `retailer`, `serviceProvider`) means our model + accepts payloads the schema rejects. **Three paths:** + - **(a)** Tighten `PartyRoleEnum` to 6 values; document the + 14 dropped values in the v0.7.0 deprecation note. Breaking + for any downstream code using the wider enum. + - **(b)** Drop `PartyRoleEnum` entirely; type the field as + `str` and let Layer 1 (JSON Schema) be the only enforcer. + Layer 2 stops contradicting Layer 1 by stepping aside. + - **(c)** Recommended: replace `PartyRoleEnum` with a closed + `PartyRoleClosedEnum` matching schema's 6 (Pydantic-strict), + and additionally expose the wider 14-value enum as + `PartyRoleExtendedEnum` (informational) — pilot extensions + can opt into the wider list explicitly. Cleanest but + biggest API surface. + +**Tier 2 — HIGH (Pydantic safety lattice incomplete; Layer 1 +catches the violation, but the Python API layer is unsafe). +Recommended for Phase 10 (0.6.0 stable); not a 0.5.0 blocker.** + +- **D3 — `Address` Optional-vs-Required drift.** Schema marks + all 5 props REQUIRED (`postalCode`, `addressRegion`, + `streetAddress`, `addressLocality`, `addressCountry`); model + has all 5 Optional. `Address()` instantiates bare without + error. +- **D4 — `BitstringStatusListEntry` required-but-Optional.** 4 + schema-required fields (`statusListCredential`, `type`, + `statusPurpose`, `statusListIndex`) are Optional in model. + Plus `statusPurpose` schema enum (`refresh`, `revocation`, + `suspension`, `message`) not enforced in Pydantic. +- **D5 — `BitstringStatusListEntry.id` REVERSE drift.** Our + Pydantic field is REQUIRED; schema lists 4 required fields + *not including* `id`. Our model rejects valid payloads that + omit `id`.
**Direction-of-drift inversion** — fix is to drop + the Pydantic-side requirement on `id` (annotate as Optional). +- **D6 — `Claim` 4 required-Optional drifts.** + `referenceCriteria`, `conformityTopic`, `claimedPerformance`, + `claimDate` all schema-required, model-Optional. +- **D7 — `Period` startDate/endDate drift.** Schema marks both + REQUIRED; model declares both `Optional[date] = None`. +- **D8 — `Link.linkName` REQUIRED in schema, absent from + model.** Currently in `model_extra` only; promotion needed. +- **D9 — `Package` 3 schema-required fields absent.** + `description`, `dimensions`, `materialUsed` all REQUIRED in + schema; model has only `package_type` + `weight` (neither in + schema). + +**Tier 3 — MEDIUM (API ergonomics; no functional impact). +Phase 10.9 scope.** + +- **D10 — `Party` 5 optional fields missing as first-class:** + `registrationCountry`, `partyAddress`, `organisationWebsite`, + `industryCategory`, `partyAlsoKnownAs`. +- **D11 — `Link.linkType` missing as first-class.** +- **D12 — `Package.packageLabel`, `Package.performanceClaim` + missing.** +- **D13 — `Period.periodInformation` missing.** +- **D14 — `RenderTemplate2024.mediaQuery`, `.url` missing.** + +**Tier 4 — LOW (model-only fields; possibly intentional +extensions but undocumented).** + +- **D15 — `Claim.classification`** — model only; classify as + intentional extension or remove. +- **D16 — `Link.name`, `.description`, `.relationship`** — + model only. `name` should be renamed to `linkName` per D8. +- **D17 — `Package.packageType`, `.weight`** — model only; + schema has neither. Likely UNTP 0.6 holdover (v0.6 had a + different Package shape) — should be removed during + Phase 10.9 alignment. +- **D18 — `RenderTemplate2024.id`** — model only. 
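The required-vs-Optional findings in Tier 2, including the reverse drift in D5, reduce to three set comparisons per class. The sketch below is illustrative only: `classify_required_drift` is a hypothetical helper, and the field sets are typed in by hand from the D4, D5 and D9 entries rather than read from the schema or the models. For `Package`, the model-required set is assumed empty because the catalogue does not state it.

```python
def classify_required_drift(
    schema_required: set[str],
    model_fields: set[str],
    model_required: set[str],
) -> dict[str, list[str]]:
    """Split required-ness mismatches into the three directions used in the catalogue."""
    return {
        # Schema-required, but the model does not declare the field at all (D8, D9).
        "absent_from_model": sorted(schema_required - model_fields),
        # Schema-required, declared on the model, but Optional there (D3, D4, D6, D7).
        "required_but_optional": sorted((schema_required & model_fields) - model_required),
        # Model-required, but the schema does not require it (D5).
        "reverse_drift": sorted(model_required - schema_required),
    }


# BitstringStatusListEntry, per D4 and D5.
bitstring = classify_required_drift(
    schema_required={"statusListCredential", "type", "statusPurpose", "statusListIndex"},
    model_fields={"id", "statusListCredential", "type", "statusPurpose", "statusListIndex"},
    model_required={"id"},
)

# Package, per D9. model_required is an assumption (not stated in the catalogue).
package = classify_required_drift(
    schema_required={"description", "dimensions", "materialUsed"},
    model_fields={"packageType", "weight"},
    model_required=set(),
)

print(bitstring)
print(package)
```

Keeping reverse drift as its own bucket matters because its fix goes the other way: the model requirement is relaxed instead of tightened.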
+ +**Tier 5 — FORMAT-CONSTRAINT GAPS (Layer 1 catches; Pydantic +surface offers no early validation).** + +- **D19 — 12 schema sites with `format: uri` typed as plain + `str` in models** — `CredentialIssuer.id`, + `BitstringStatusListEntry.id`, `BitstringStatusListEntry.statusListCredential`, + `RenderTemplate2024.url`, `IssuingSoftware.id`, `Product.id`, + etc. Pydantic's `AnyUrl` / our `FlexibleUri` should be used + consistently. +- **D20 — 1 schema site with `format: byte` (Image.imageData)** + typed as `str` — no base64 validation at Pydantic layer. + +**Tier 6 — UNMAPPED MODEL CLASSES (audited 2026-05-09; 2/4 +closed as faithful, 2/4 documented as intentional extensions).** + +- **D21 — `DigitalProductPassport` (CLOSED — no drift).** + Schema-root audit confirms 9 model aliases match 10 schema + root properties (the 10th is the implicit `type` JSON-LD + keyword); all 5 schema-required root fields + (`credentialSubject`, `id`, `issuer`, `name`, `validFrom`) + are first-class required Pydantic fields. The envelope + class is faithful to the schema root. Action: none. +- **D22 — `Facility` (DOCUMENTED — extension).** Not in + v0.7.0 schema $defs. Acts as a cross-credential reference + shape for `Product.producedAtFacility`. Action in 10.14: + add a class docstring noting "extension for + DigitalFacilityRecord cross-credential references; not a + v0.7.0 schema $def". +- **D23 — `IdentifierScheme` (CLOSED — no drift).** Schema's + inline shape on `Party.idScheme: object` declares + `{type, id, name}` with `[id, name]` required; our + `IdentifierScheme` model fields are exactly the same set. + The model faithfully captures the inline shape. Action: + none. +- **D24 — `SoftwareVendor` (DOCUMENTED — nested helper).** + Nested inside `IssuingSoftware`; not exposed as a schema + $def. Action in 10.14: add a class docstring noting + "helper class for IssuingSoftware.vendor; not a schema + $def". + +**Tier 7 — PERMISSIVE-SHAPE CLASSES (NEW from final audit +pass). 
Schema declares 0–3 properties; model carries 1 +extra each. Likely intentional extensions but uncatalogued.** + +- **D25 — Permissive-shape model-only fields (4 sites).** + - `Performance`: schema 3 props (0 required); model 4 + fields → 1 model-only. + - `Characteristics`: schema 0 props (totally permissive + open shape); model 1 field → 1 model-only. + - `Score`: schema 3 props (1 required); model 4 fields → 1 + model-only. + - `Measure`: schema 4 props (2 required); model 5 fields → + 1 model-only. **Plus:** confirm `Measure`'s 2 schema- + required fields are required Pydantic fields (subset of + Tier-2 D-items audit). + - Action in 10.13: catalogue each model-only field; decide + keep-as-extension vs deprecate. + +**Tier 8 — CIRPASS REFERENCE STRUCTURE v1.3.0 ALIGNMENT (NEW +from final audit pass). 20 model classes vs 20 schema $defs; +counts match; field-level deep diff deferred to Phase 10.** + +- **D26 — CIRPASS schema `HazardCategory` + `LifeCycleStage` + $defs have no Pydantic model.** Both are 0-property $defs + in `cirpass-reference-1.3.0.json`, suggesting they are + terminology classes (enum-style anchors). Investigate + whether they should map to v1.9.x `eudpp_classes.py` enum + members or to dedicated Pydantic model classes. +- **D27 — CIRPASS Pydantic model layer field-level deep diff + not yet performed.** Phase 8.9 audit covered UNTP v0.7 + exhaustively; CIRPASS v1.3 only structurally (count + match). A targeted CIRPASS deep diff (analogous to + Phase 8.9's UNTP methodology) is queued for **Phase 10 + task 10.15** below — same diff script, repointed at + `models/cirpass/v1_3/` and `cirpass-reference-1.3.0.json`. + +**Aggregate counts (post-final-audit refinement):** + +- Schema $defs (UNTP v0.7): 22 classes, 99 properties, 4 + enum sites, 18 format-constraint sites (12 uri + 5 date + + 1 byte), 31 schema-required-marked properties. +- Schema $defs (CIRPASS v1.3): 20 classes (counts match + Pydantic; field-level diff is D27 follow-up). 
+- Model classes: 27 UNTP v0.7 + 20 CIRPASS v1.3 + (v0.6 + frozen) BaseModel subclasses. +- 23/27 UNTP v0.7 classes mapped to schema $defs; **2/4 + unmapped CLOSED** (D21, D23 confirmed faithful), **2/4 + documented** (D22, D24 intentional extensions). +- 9/22 mapped UNTP-v0.7 pairs have at least one drift item. +- **Total drift items: 27 named (D1–D27).** Of these: + - **2 closed** (D21, D23). + - **2 documented as intentional** (D22, D24). + - **2 BLOCKER** (D1, D2 — Phase 9). + - **7 HIGH** (D3–D9 — Phase 10.9/10.10/10.11). + - **5 MEDIUM** (D10–D14 — Phase 10.9). + - **4 LOW** (D15–D18 — Phase 10.9/10.13). + - **2 FORMAT** (D19, D20 — Phase 10.12). + - **1 PERMISSIVE-SHAPE audit** (D25 — Phase 10.13). + - **2 CIRPASS follow-ups** (D26, D27 — Phase 10.15). + - Net actionable: 23 items across Phase 9 + Phase 10. + +**Production-readiness verdict:** + +- The 0.5.0 Preview can ship **if** Tier 1 (D1, D2) is fixed — + these are the only items where the Pydantic surface + *contradicts* the schema's contract. Without these fixes, a + programmatic caller can build a Pydantic model whose + `model_dump()` payload fails Layer 1 validation, which + breaks the round-trip invariant. +- All Tier 2+ drift is confined to the Pydantic API; Layer 1 + remains the authoritative check. These items can be + staged into Phase 10 (0.6.0 stable) without compromising + the Preview release. + +**Phase 9 → blocking work (NEW tasks 9.7–9.11):** see Phase 9 +section below — D1 + D2 promoted to release blockers; alignment +guard test reframed; documentation update; cross-version +regression baseline. + +**Phase 10 → expanded scope (NEW tasks 10.9–10.14):** see +Phase 10 section below — every Tier-2 / Tier-3 / Tier-4 / Tier-5 +item assigned a task with edge-case enumeration.
+ +**Cross-version / cross-family compatibility constraint +(addendum, 2026-05-09):** every Phase 9 + Phase 10 fix MUST +preserve UNTP v0.6.0 / v0.6.1 fixture parsing & upgrade-shim +output, and MUST keep CIRPASS reference structure v1.3.0 +round-trips bit-stable. The full constraint set is encoded in +the "Compatibility constraints (NON-NEGOTIABLE)" block at the +top of Phase 9 below; Phase 10's task descriptions inherit +the same constraint and add a per-task v0.6 / CIRPASS +regression checklist (see § 10.9 — 10.14). + +**Compatibility-driven decision changes:** + +- **D2 fix (PartyRoleEnum):** prior plan recommended option + (a) — tighten enum to 6 values. Compatibility analysis + reveals the CIRPASS reverse shim emits 8 of the 14 wider + values as mapping targets (lines 62-63 of + `cirpass_1_3_to_untp_0_7.py`); tightening would force a + CIRPASS rich → coarse degradation with information loss. + Refined to **option (b) modified**: keep wider enum, add + `PRT001` info-rule, add opt-in `strict_role_enum` engine + flag. **Non-breaking.** +- **D1 fix (statusListIndex):** prior plan recommended a + hard-cutover from `str | None` to `int | None`. + Compatibility analysis reveals v0.6's + `models/v0_6/credential.py:51` is also `str | None`, + meaning the v0.6 → v0.7 upgrade shim copies string-shaped + values across. Refined to add a `before` validator on the + v0.7 field that transparently coerces numeric strings → + int. **Non-breaking** for any v0.6 fixture with + numeric-string `statusListIndex`. + +--- + +### Phase 9 — `0.5.0` Preview release cut + +**Goal:** Cut a Preview release that is feature-complete for CIRPASS-2 +support, marked unstable. +**Effort:** S (~2 days) · **Depends on:** Phase 8 · **Ships in:** `0.5.0`. + +**Tasks** + +- **9.1** Set `DEFAULT_VERSIONS[UNTP] = "0.7.0"` (existing UNTP cutover). +- **9.2** Set `DEFAULT_VERSIONS[CIRPASS] = "1.3.0"`. +- **9.3** Author family-keyed `CHANGELOG.md` entry + (`### CIRPASS-2`, `### UNTP`, `### Plugins`). 
Include a § + "Known limitations" entry referencing Phase 8.9 Tier 2–6 + drift catalogue and Phase 10 alignment scope. +- **9.4** Activate deprecation warnings: bare-string registry lookup, + old `EUDPPNamespace` IRIs, `is_dpp_document` alias. +- **9.5** Run release-gate UAT scenarios manually (see below); capture + reviewer sign-off in the release PR description. +- **9.6** Run `pypi-publish` skill. + +**Phase 8.9 BLOCKER fixes — must land before tag 0.5.0:** + +**Compatibility constraints (NON-NEGOTIABLE — frame every fix):** + +Every Phase 9 BLOCKER fix MUST preserve: + +1. **UNTP v0.6.0 / v0.6.1 schema validation pathway.** v0.6 + models are frozen per the cardinal versioning rule + ([`.claude/rules/untp-versioning.md`](../../.claude/rules/untp-versioning.md)); + v0.6 fixtures must continue to parse + validate without + regression after every fix. +2. **The v0.6 → v0.7 upgrade shim** + ([`compat/upgrade_0_6_to_0_7.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/upgrade_0_6_to_0_7.py)) + must continue to produce v0.7-shaped payloads that the v0.7 + ValidationEngine accepts. Any v0.7 model change must + therefore be either (a) backwards-compatible at parse time + via a `before` validator that accepts the v0.6 shape, or + (b) accompanied by a corresponding shim edit that emits + the new v0.7 shape. +3. **The CIRPASS forward shim** + ([`compat/untp_0_7_to_cirpass_1_3.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py)) + must keep handling every v0.7 PartyRoleEnum value it + currently maps via `_UNTP_TO_EUDPP_ROLE` (lines 64+). + Tightening the v0.7 enum would orphan mapping entries. +4. 
**The CIRPASS reverse shim** + ([`compat/cirpass_1_3_to_untp_0_7.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py)) + uses the wider 20-value `PartyRoleEnum` as mapping + *targets* — see lines 62-63's "We pick the most-specific + UNTP role that the v0.7.0 PartyRoleEnum exposes". Of the + 12 EUDPP-role targets it emits, **8 are NOT in schema's + closed-6 enum** (`importer`, `distributor`, `retailer`, + `logisticsProvider`, `operator`, `regulator`, + `serviceProvider`, `certifier`). Tightening the Pydantic + enum to 6 would degrade CIRPASS → UNTP from rich → coarse + mapping with information loss not currently present. +5. **CIRPASS reference structure v1.3.0 conformance.** All + CIRPASS v1.3 fixtures (`tests/fixtures/valid/cirpass-1.3.0/`) + must continue to parse + validate; existing + `tests/integration/test_round_trip_untp_cirpass.py` and + `tests/integration/test_compat_roundtrip.py` must remain + green; CIRPASS pilot extensions (textile-v2, tyres) must + continue to fire their rule packs unchanged. + +**Critical pre-fix audit (run BEFORE 9.7/9.8 land):** + +- Before any model edit, run the **cross-version regression + baseline** (`pytest tests/integration/test_version_matrix.py + tests/integration/test_compat_roundtrip.py + tests/integration/test_round_trip_untp_cirpass.py + tests/unit/test_eudpp_export_v07.py + tests/unit/test_engine_extended.py -v`) and capture pass/skip + counts. Re-run after each fix; any delta must be explained. +- Pre-existing reverse-shim drift (the 8 non-schema-allowed + role values it emits) is documented as a known issue under + Phase 8.9 D2 — do NOT "fix" by mass-remapping those values + to the 6-value set, since that's a CIRPASS round-trip + degradation. The strategic answer is the dual-tier + acceptance gradient documented in 9.8 below. 
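To make the role arithmetic in constraint 4 and the pre-fix audit concrete, the sketch below partitions the 12 reverse-shim mapping targets into schema-strict and extension roles. It is a stand-alone illustration, not the project's `PartyRoleEnum`: the class name `PartyRole`, its member names, and the `REVERSE_SHIM_TARGETS` list are hypothetical, with the values taken from this plan. The strict set is kept at module level here because a plain assignment inside an `Enum` body would itself become an enum member.

```python
from enum import Enum

# The 6 values in the schema's closed `PartyRole.role` enum.
SCHEMA_STRICT_ROLES = frozenset(
    {"owner", "producer", "manufacturer", "processor", "remanufacturer", "recycler"}
)


class PartyRole(str, Enum):
    """Hypothetical stand-in for the 20-value enum (6 strict + 14 wider)."""

    OWNER = "owner"
    PRODUCER = "producer"
    MANUFACTURER = "manufacturer"
    PROCESSOR = "processor"
    REMANUFACTURER = "remanufacturer"
    RECYCLER = "recycler"
    BRAND_OWNER = "brandOwner"
    CARRIER = "carrier"
    CERTIFIER = "certifier"
    CONSIGNEE = "consignee"
    CONSIGNOR = "consignor"
    DISTRIBUTOR = "distributor"
    EXPORTER = "exporter"
    IMPORTER = "importer"
    INSPECTOR = "inspector"
    LOGISTICS_PROVIDER = "logisticsProvider"
    OPERATOR = "operator"
    REGULATOR = "regulator"
    RETAILER = "retailer"
    SERVICE_PROVIDER = "serviceProvider"

    def is_schema_strict(self) -> bool:
        """True when the upstream JSON Schema would also accept this role."""
        return self.value in SCHEMA_STRICT_ROLES


# The 12 distinct mapping targets the reverse shim is described as emitting.
REVERSE_SHIM_TARGETS = [
    "manufacturer", "owner", "recycler", "remanufacturer",
    "certifier", "distributor", "importer", "logisticsProvider",
    "operator", "regulator", "retailer", "serviceProvider",
]

roles = [PartyRole(value) for value in REVERSE_SHIM_TARGETS]
strict_targets = sorted(r.value for r in roles if r.is_schema_strict())
extension_targets = sorted(r.value for r in roles if not r.is_schema_strict())

print(len(PartyRole), len(strict_targets), len(extension_targets))
```

Under these assumptions the partition comes out as 4 strict targets and 8 extension targets, matching the audited counts.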
+ +- **9.7 (BLOCKER fix D1, BACK-COMPAT preserving)** Fix + `BitstringStatusListEntry.statusListIndex` type drift while + preserving v0.6 → v0.7 upgrade-shim transparency. + - **v0.6 inventory:** v0.6's + [`models/v0_6/credential.py:51-58`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_6/credential.py#L51) + declares `status_list_index: str | None`. The upgrade + shim copies the value as-is. Therefore the v0.7 fix must + accept string-shaped numeric values from v0.6 fixtures + transparently. + - **Edit** [`models/v0_7/envelope.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/envelope.py): + change the annotation to + `int | None, Field(default=None, ge=0, alias="statusListIndex", ...)`. + - **Add a `@field_validator("status_list_index", mode="before")`** + that: + - Accepts `int` directly (passthrough). + - Accepts numeric string (e.g. `"5"`) — coerces to `int(value)`. + Transparent for v0.6 → v0.7 upgrades; no warning emitted. + - Rejects non-numeric strings (e.g. `"abc"`) with a + `ValueError` carrying `MDL050`-coded context. (Those + payloads were always invalid against schema; the + coercion just surfaces the error earlier in the pipeline.) + - Mirror the change + before-validator in `CredentialStatus` + (duplicate class — also touched by D5 in Phase 10.11). + - **v0.6 model**: do NOT touch. v0.6 stays `str | None` per + cardinal rule. Schema for v0.6 may also declare integer; + if so, document the v0.6 model drift as an inherited gap + (out of scope for this fix; v0.6 is frozen). + - **Fixture audit** (`grep -rn statusListIndex tests/fixtures/`): + confirm every existing usage is either bare-integer or + numeric-string. Document any non-numeric finds for + cleanup before merge. + - **Edge cases for the new parametrized test**: + - `5` (int) → accepted, parses to `5`. + - `"5"` (numeric string) → accepted via coercion, parses + to `5`. Confirms v0.6 fixture transparency. 
+ - `"05"` / `" 5 "` (leading zero / whitespace) → coerce + via `int(value.strip())`. Edge case from upstream + payloads. + - `"abc"` (non-numeric) → ValueError; layer=`model`, + code=`MDL050`. + - `-1` (negative) → ValueError via `ge=0`. + - `None` → accepted (field is Optional). + - Effort: ~40 LoC (annotation + before-validator) + + ~20 LoC test. Single source-file change. + - Cross-version regression: re-run the test_version_matrix + integration test; expect zero delta (v0.6 → v0.7 upgrades + of fixtures with string-shaped statusListIndex must + succeed without new warnings). + +- **9.8 (BLOCKER fix D2, REFINED — option (b) modified)** + Resolve `PartyRoleEnum` Layer-1/Layer-2 contradiction + WITHOUT degrading CIRPASS round-trip. + - **Critical decision change from prior plan:** option (a) + [tighten enum to 6] is now ruled out by compatibility + constraint 4 above (CIRPASS reverse shim emits 8 of the + 14 schema-rejected values as mapping targets). The + refined approach is **option (b) modified** — keep the + 20-value `PartyRoleEnum` as the *acceptance gradient* + surface; document it explicitly. + - **Edit** [`models/v0_7/identifiers.py:169`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/identifiers.py#L169): + DO NOT remove values. Instead: + - Add a class-level docstring section labelled + "Acceptance gradient" listing the 6 schema-strict + values, the 14 wider-pilot values, and explaining the + Layer-1/Layer-2 split. + - Add a module-level `SCHEMA_STRICT_ROLES: frozenset[str] = + frozenset({"owner", "producer", "manufacturer", + "processor", "remanufacturer", "recycler"})` constant + next to `PartyRoleEnum` (a plain assignment inside the + `Enum` body becomes an enum member, even with a + `ClassVar` annotation). + - Add an `is_schema_strict()` instance method that + returns `self.value in SCHEMA_STRICT_ROLES`. + - **Add a NEW soft-warning semantic rule `PRT001`** in + [`validators/rules/v0_7/`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/rules/v0_7/) + (one new file, ~70 LoC + 30 LoC test).
Severity: `info` + (not `warning`, not `error` — this is informational so + pilot extensions don't get noisy errors). The rule fires + when a `PartyRole.role` is in the wider 14 but not the + strict 6, suggesting the canonical schema-allowed + counterpart via a mapping table: + - `importer` → `manufacturer` (closest economic operator) + - `distributor` → `manufacturer` + - `retailer` → `owner` + - `brandOwner` → `manufacturer` + - `carrier` → `manufacturer` (no good fit; informational) + - `consignor` → `manufacturer` + - `consignee` → `owner` + - `exporter` → `manufacturer` + - `inspector` → `processor` + - `logisticsProvider` → `manufacturer` + - `operator` → `manufacturer` + - `serviceProvider` → `processor` + - `regulator` → `processor` + - `certifier` → `processor` + - The mapping is `info`-severity advisory only. Layer 1 + JSON Schema still rejects when used in strict-mode + validation; this rule helps users understand the gap. + - **Schema-strict validation toggle** — add an opt-in + `strict_role_enum: bool = False` parameter to + `ValidationEngine`. When `True`, `PRT001` upgrades from + `info` → `error`. Default off so existing pipelines are + not disrupted. + - **CIRPASS shim impact**: zero changes required. Both + forward and reverse shims continue using the wider + 20-value enum. **Quantified pre-existing reverse-shim + coverage** (audited 2026-05-09): of the 12 distinct + `_EUDPP_TO_UNTP_ROLE` mapping targets, only 4 are in + schema's strict 6 (`manufacturer`, `owner`, `recycler`, + `remanufacturer`); the other 8 (`certifier`, + `distributor`, `importer`, `logisticsProvider`, + `operator`, `regulator`, `retailer`, `serviceProvider`) + are intentional rich-extension targets. After PRT001 + lands, these 8 emit info-rule warnings on consumer-side + validation but are otherwise unchanged — preserving + CIRPASS information fidelity. 
**DO NOT** mass-remap the + table to the strict 6 in 9.8 — that would be the same + rich-→-coarse degradation 9.8 is designed to avoid. + - **CHANGELOG entry** under "Bug fixes": + "PartyRoleEnum and the upstream JSON Schema's closed + enum had a Layer-1/Layer-2 contradiction. Resolved by + documenting the dual-tier acceptance gradient: Pydantic + accepts 20 values (back-compat with v0.6 fixtures and + the CIRPASS reverse-shim mapping); JSON Schema accepts + only 6 (strict closed enum). New advisory rule `PRT001` + surfaces the gap. Use `ValidationEngine(strict_role_enum=True)` + to enforce the 6-value set." + - **NOT a breaking change.** No values removed. No API + surface narrowed. + - Effort: ~120 LoC (rule + engine flag + test) + docs. + +- **9.9 (alignment guard test)** Add + `tests/unit/test_v07_model_schema_alignment.py`. Two tiers: + - **Strict tier (block CI):** schema-required fields with no + Pydantic-side coverage AND a Layer-1/Layer-2 contradiction. + After 9.7 + 9.8 are merged, this asserts D1 is fully + fixed (statusListIndex int) and D2 is fully documented + (the gradient is intentional, PRT001 wired). + - **Drift-watch tier (advisory; emits a CI warning):** any + NEW drift items not registered in the `EXPECTED_DRIFT` + constant. Forces every future PR that introduces drift + to update the constant — no silent widening. + - **Compatibility tier (NEW):** assert v0.6 → v0.7 upgrade + of every `tests/fixtures/upstream/v0.6.x/` fixture + succeeds; assert CIRPASS round-trip on every + `tests/fixtures/valid/cirpass-1.3.0/` succeeds. Catches + any future model edit that silently breaks the cross- + version / cross-family pipeline. + - Implementation: ~120 LoC. Reuses the + `walk_schema_defs(schema_path)` helper (introduced in + 9.7's test file) and existing fixture-walk helpers in + `tests/integration/test_version_matrix.py`. 
+ +- **9.10 (CHANGELOG drift section)** Edit `CHANGELOG.md`'s + `0.5.0` entry to include: + - § "Bug fixes": D1 (statusListIndex int with back-compat + coercion — non-breaking), D2 (PartyRoleEnum gradient + documented + PRT001 rule — non-breaking). + - § "Known limitations": one-line summary of Tier 2 + 3 + items deferred to 0.6.0; link to Phase 10 of this plan. + - § "Cross-version compatibility": explicit affirmation + that v0.6.0 / v0.6.1 fixtures parse, validate, and + upgrade without regression; that all CIRPASS v1.3.0 + round-trips remain bit-stable. + +- **9.11 (NEW — cross-version regression baseline)** Add a + release-gate manual checkpoint that runs the + cross-version + cross-family regression suite immediately + before tagging. Captures pass/skip counts in the release + PR description. Single command: + `pytest tests/integration/test_version_matrix.py + tests/integration/test_compat_roundtrip.py + tests/integration/test_round_trip_untp_cirpass.py + tests/integration/test_cross_family_isolation.py + tests/unit/test_engine_extended.py -v`. Expected counts: + current baseline + zero deltas (no new failures, no new + skips). + +**UAT scenarios (manual, pre-tag).** + +| # | Scenario | Expected | +|---|---|---| +| U1 | UNTP v0.7 → export `cirpass-jsonld` → validate (CIRPASS pipeline) → migrate `--to=untp-0.7` | Zero errors; ≤ documented `MAP00X` count | +| U2 | CIRPASS v1.3 (Phase 0 fixture) → validate → export `cirpass-jsonld` → re-ingest → re-validate | Bit-stable JSON-LD output | +| U3 | UNTP v0.6 → migrate `--to=untp-0.7` → migrate `--to=cirpass-1.3` → validate | Zero errors | +| U4 | Textile v2 fixture and Tyre Birth v0.9 fixture through their plugin pipelines | Zero errors | + +**Deliverables** + +- `dppvalidator 0.5.0` on PyPI. +- CHANGELOG entry. +- UAT sign-off in PR description. + +**Tests** — none new; relies on existing suites. + +**Exit criteria** + +- [ ] All four UAT scenarios pass with reviewer sign-off. +- [ ] `pypi-publish` skill clean. 
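The acceptance gradient from task 9.8 can be sketched in a few lines. This is a standalone illustration, not the project's rule class: the `Finding` dataclass and the `check_party_role` function are hypothetical names. Only the remap table, the `PRT001` code, and the `strict_role_enum` flag come from the plan. Because the plan does not enumerate all six strict roles, the sketch treats a role as wider-tier exactly when it is a key of the remap table.

```python
from dataclasses import dataclass
from typing import Optional

# The 14 wider roles and their suggested strict counterparts,
# copied from the task 9.8 mapping table.
SUGGESTED_STRICT_REMAP = {
    "importer": "manufacturer",
    "distributor": "manufacturer",
    "retailer": "owner",
    "brandOwner": "manufacturer",
    "carrier": "manufacturer",
    "consignor": "manufacturer",
    "consignee": "owner",
    "exporter": "manufacturer",
    "inspector": "processor",
    "logisticsProvider": "manufacturer",
    "operator": "manufacturer",
    "serviceProvider": "processor",
    "regulator": "processor",
    "certifier": "processor",
}


@dataclass(frozen=True)
class Finding:
    """Hypothetical result type used only for this sketch."""

    code: str
    severity: str
    message: str


def check_party_role(role: str, strict_role_enum: bool = False) -> Optional[Finding]:
    """Return a PRT001 finding for a wider-tier role, else None."""
    suggestion = SUGGESTED_STRICT_REMAP.get(role)
    if suggestion is None:
        # Roles outside the remap table are not PRT001's concern.
        return None
    # The opt-in flag upgrades the advisory finding to a hard error.
    severity = "error" if strict_role_enum else "info"
    message = (
        f"role {role!r} is outside the schema's closed enum; "
        f"closest schema-allowed role is {suggestion!r}"
    )
    return Finding("PRT001", severity, message)
```

With the default flag, `check_party_role("importer")` yields an `info` finding and `check_party_role("manufacturer")` yields nothing. Passing `strict_role_enum=True` changes only the severity, so no role value is ever rewritten.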
+ +#### Phase 9 status — 2026-05-09 (10/11 tasks complete; 9.6 PyPI step reserved) + +End-to-end implementation of the Phase 9 release cut. **Tasks 9.1 +through 9.5 plus 9.7 through 9.11 are landed in source on +`develop`.** Task 9.6 (`pypi-publish` skill — PyPI upload) is +reserved for the release manager and runs on the merged release +branch. + +**Refactors and additions:** + +- **9.1 (DEFAULT_VERSIONS flip).** `DEFAULT_VERSIONS[UNTP] = "0.7.0"` + in [`schemas/registry.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/schemas/registry.py). + v0.6.x remains supported via auto-detection and the v0.6 → v0.7 + upgrade shim. **9.2** was already complete + (`DEFAULT_VERSIONS[CIRPASS] = "1.3.0"` shipped in 0.4.0). +- **9.3 + 9.10 (CHANGELOG 0.5.0).** Family-keyed entry authored at + the top of [`CHANGELOG.md`](https://github.com/artiso-ai/dppvalidator/blob/main/CHANGELOG.md): + CIRPASS-2, UNTP, Plugins sections + Bug fixes + Deprecations + + Cross-version compatibility + Known limitations + Migration + guide. All 27 Phase 8.9 drift items referenced; D1/D2 closures + documented as non-breaking. +- **9.4 (deprecation activation).** Three surfaces now emit + `DeprecationWarning` in 0.5.0: + - Bare-string `SCHEMA_REGISTRY[version]` lookup → wrapped in a + `_DeprecatedSchemaRegistryDict` subclass; `__getitem__` warns. + Internal CLI `schema.py` migrated to the tuple-keyed source of + truth so we don't warn on our own callers. + - `is_dpp_document()` alias → emits warning suggesting + `looks_like_dpp()`. + - `EUDPP_CONTEXT_URL` (already deprecated via PEP 562 in earlier + phase) — verified still emitting. +- **9.5 (UAT scenarios).** Manually exercised the 4 scenarios from + the table above: + - U1: ✓ v0.7 fixture validates clean; CIRPASS-JSONLD export + produces 2540 bytes with 3 documented MAP-warnings. + - U2: ✓ Round-trip via the Python compat layer (6 forward + 2 + reverse warnings, all documented as MAP00X codes). 
+ - U3: ✓ v0.6 → v0.7 → CIRPASS chained migration succeeds with + documented UPG001 + MAP001 warnings (lossy lift on + performanceClaims is by-design per Phase 7 pilot scope). + - U4: ✓ Tyre plugin pipeline 7/7 green. Textile-v2 rules + correctly fire on a non-textile fixture (TXT001…TXT007 flag + missing textile fields) — informational, not a regression. +- **9.7 (D1 BLOCKER).** + [`models/v0_7/envelope.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/envelope.py): + `BitstringStatusListEntry.statusListIndex` is now `int | None` + with `ge=0`; new `_coerce_status_list_index` `before` validator + transparently converts numeric strings (whitespace-tolerant, + leading-zero-tolerant) for v0.6 fixture back-compat. 12 new + parametrized tests in + [`tests/unit/test_v07_models.py::TestStatusListIndexCoercion`](https://github.com/artiso-ai/dppvalidator/blob/main/tests/unit/test_v07_models.py) + cover bare int, numeric string, whitespace, leading zero, zero, + None, non-numeric string (rejected), float string (rejected), + mixed-alpha string (rejected), negative int (rejected via ge=0), + and round-trip int preservation. +- **9.8 (D2 BLOCKER).** PartyRoleEnum acceptance gradient + documented + `SCHEMA_STRICT_ROLES` constant + `is_schema_strict()` + method added to + [`models/v0_7/identifiers.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/identifiers.py). + New advisory rule + [`PartyRoleAcceptanceGradientRule`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/rules/v0_7/party_role.py) + emitting `PRT001` (info) when payload uses one of the 14 wider + values, with a documented `SUGGESTED_STRICT_REMAP` table. New + `ValidationEngine(strict_role_enum=True)` flag upgrades PRT001 + from info → error at emit-time. New PRT001.md doc page wired + into mkdocs nav. 
31 new tests in + [`tests/unit/test_party_role_gradient.py`](https://github.com/artiso-ai/dppvalidator/blob/main/tests/unit/test_party_role_gradient.py) + including a CIRPASS-shim-table-preservation test enforcing the + Phase 9 compatibility constraint. +- **9.9 (alignment guard test).** Three-tier guard at + [`tests/unit/test_v07_model_schema_alignment.py`](https://github.com/artiso-ai/dppvalidator/blob/main/tests/unit/test_v07_model_schema_alignment.py): + - Strict tier (5 tests): D1, D2 closure assertions. + - Drift-watch tier (1 test): walks every $def×model class pair; + asserts only the registered drift items in + `EXPECTED_SCHEMA_ONLY_DRIFT` and `EXPECTED_MODEL_ONLY_DRIFT` + appear. New drift forces a baseline update. + - Compat tier (3 tests): v0.6 model frozen, CredentialStatus + alias, CIRPASS reverse-shim 8 wider targets preserved. + - Disposition closures (3 tests): D21 + D23 confirmed faithful. +- **9.11 (cross-version regression baseline).** Single command + recorded in this status block — see "Quality gates" below. + +**Code-size delta:** + +- Total `src/dppvalidator/` LoC: 29 227 → 29 480 (+253 net). +- New module + [`validators/rules/v0_7/party_role.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/validators/rules/v0_7/party_role.py): + +103 LoC (rule class + remap table + docstrings). +- New errors page + [`docs/errors/PRT001.md`](https://github.com/artiso-ai/dppvalidator/blob/main/docs/errors/PRT001.md): + +95 lines (separate, doesn't count toward source LoC). +- Net source +253 buys: D1 fix + back-compat coercion (37 LoC), + D2 acceptance gradient + PRT001 wiring + engine flag (~140 LoC), + registry deprecation wrapper (~30 LoC), `is_dpp_document` + deprecation (~12 LoC), CLI migration to tuple-keyed registry + (~30 LoC). 
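The D1 back-compat coercion described under 9.7 can be illustrated with a plain function. This is a minimal sketch under stated assumptions, not the body of `_coerce_status_list_index`: the function name below is hypothetical, the non-negativity check that the model delegates to `ge=0` is folded in, and rejecting `bool` is an assumption the plan does not state. In the model, equivalent logic would presumably run as a Pydantic `before` validator.

```python
from typing import Optional


def coerce_status_list_index(value: object) -> Optional[int]:
    """Accept ints and numeric strings; reject everything else."""
    if value is None:
        return None
    if isinstance(value, bool):
        # Assumption: bool is an int subclass in Python, so reject it explicitly.
        raise ValueError("statusListIndex must be an integer, not a bool")
    if isinstance(value, int):
        result = value
    elif isinstance(value, str):
        text = value.strip()  # whitespace-tolerant
        # isdigit() rejects "", "1.5", "-1" and "12a"; isascii() keeps it to 0-9.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"statusListIndex is not a numeric string: {value!r}")
        result = int(text)  # leading zeros are fine: int("007") == 7
    else:
        raise ValueError(f"unsupported statusListIndex type: {type(value).__name__}")
    if result < 0:
        # Mirrors the ge=0 constraint on the model field.
        raise ValueError("statusListIndex must be >= 0")
    return result
```

The accept and reject cases follow the test matrix listed for `TestStatusListIndexCoercion`: bare ints, numeric strings, padded and zero-prefixed strings, zero, and `None` pass, while non-numeric, float-like, mixed-alpha, and negative inputs raise.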
+ +**Quality gates after Phase 9:** + +- `uv run pytest tests/`: **2525 passed / 36 skipped** (was 2468 + before Phase 9 — 12 + 31 + 12 + 1 + 1 = 57 new tests, of which + 12 are PRT001 parametrize fan-outs). +- Coverage: **92.05 % → 92.04 %** (effectively flat; new test code + with minor delta in instrumented files). +- `uv run ruff check src/ tests/`: **clean**. +- `uv run ruff format --check src/ tests/`: **clean**. +- `uv run ty check src/`: **clean**. +- `uv run --group docs mkdocs build --strict`: **clean** (PRT001 + page + nav entry). +- `uv run python scripts/check_error_docs.py`: **96/96** + documented + nav-wired (was 95/95 before PRT001). +- Cold-start contract: `import dppvalidator` still does not load + `models.cirpass`. + +**Cross-version regression baseline (task 9.11) — release manager +re-runs this before tagging:** + +```bash +uv run pytest \ + tests/integration/test_version_matrix.py \ + tests/integration/test_compat_roundtrip.py \ + tests/integration/test_round_trip_untp_cirpass.py \ + tests/integration/test_cross_family_isolation.py \ + tests/unit/test_engine_extended.py \ + -q --tb=short +``` + +Current baseline: **101 passed**. Required for release tag: zero +delta from this number. + +**Acceptance gradient verification (Phase 9.8 + Phase 9 compatibility constraint):** + +- v0.6 → v0.7 upgrade shim still emits `manufacturer` for the + one role v0.6 fixtures use (verified via + `test_engine_extended.py::TestUpgradeShim`). +- CIRPASS reverse shim's `_EUDPP_TO_UNTP_ROLE` table still + contains the 8 schema-rejected rich-extension targets + (`importer`, `distributor`, `retailer`, `logisticsProvider`, + `operator`, `regulator`, `serviceProvider`, `certifier`) — + asserted by + `test_v07_model_schema_alignment.py::TestCrossVersionCompatibility::test_cirpass_reverse_shim_table_preserved`. + +**Carried forward:** + +- Task 9.6 (`pypi-publish`) reserved for release manager. +- Phase 8.9 Tier-2/3/4/5/6 drift items (D3–D20, D25–D27) defer to + Phase 10. 
Alignment guard test currently registers the full + drift baseline; Phase 10's 10.9–10.15 will progressively shrink + `EXPECTED_SCHEMA_ONLY_DRIFT` and `EXPECTED_MODEL_ONLY_DRIFT` + toward empty, then flip the strict-tier `EXPECTED_DRIFT` + assertion from advisory to hard. + +--- + +### Phase 10 — `0.6.0` Stable + cleanup + +**Goal:** Lock CIRPASS APIs; drop deprecated surfaces; claim Stable. +**Effort:** M (~3 days) · **Depends on:** Phase 9 · **Ships in:** `0.6.0`. + +**Compatibility constraints (inherited from Phase 9):** + +Every model-alignment task in Phase 10 (10.9–10.14) MUST +preserve: + +1. UNTP v0.6.0 / v0.6.1 fixture parsing & validation — + v0.6 models stay frozen. +2. `compat/upgrade_0_6_to_0_7.py` output — keep emitting + v0.7-shaped payloads that the v0.7 ValidationEngine + accepts. When promoting a `model_extra` field to + first-class, audit the upgrade shim for whether it + currently emits the field; if it doesn't, decide + whether to extend the shim's coverage or accept the + field as null on upgrade-from-v0.6. +3. `compat/untp_0_7_to_cirpass_1_3.py` and + `compat/cirpass_1_3_to_untp_0_7.py` — both shims + continue to round-trip every CIRPASS v1.3.0 fixture + bit-stably. Field promotions in 10.9 will surface new + keys in `model_dump()` output → audit both shims for + `model_fields` iteration assumptions. +4. CIRPASS reference structure v1.3.0 conformance — + no model edit should perturb the CIRPASS pipeline. +5. The Phase 9 cross-version regression suite (task 9.11) + must remain green after each Phase 10 task lands. + +**Per-task compatibility checklist (apply to 10.9–10.14):** + +- [ ] Run `pytest tests/integration/test_version_matrix.py` → + zero pass-count delta. +- [ ] Run `pytest tests/integration/test_compat_roundtrip.py + tests/integration/test_round_trip_untp_cirpass.py` → + zero pass-count delta. 
+- [ ] Round-trip every `tests/fixtures/upstream/v0.6.x/` + fixture through the v0.6 → v0.7 upgrade shim → confirm + no new MDL050 / SCH001 errors. +- [ ] Round-trip every `tests/fixtures/valid/cirpass-1.3.0/` + fixture through forward + reverse shim → confirm + bit-stable JSON-LD output. +- [ ] If a model-only field is being deprecated/removed + (D17, D18), grep the entire codebase for usage + including plugins (`plugins/textiles/`, + `plugins/tyres/`) and the EUDPP / CIRPASS exporters. + +**Tasks** + +- **10.1** Remove pre-1.9 EUDPP TTLs (`v1.7.1`, `v1.5.1`, `v1.4.7`, + `v2.0`, `v1.3.1`) and the corresponding manifest rows. +- **10.2** Remove the legacy `EUDPP_CONTEXT_URL` registration. +- **10.3** Remove the `--profile textile-v1` flag and the + `textile-v1` entry-point. +- **10.4** Remove the `is_dpp_document` alias. +- **10.5** Remove the bare-string `SCHEMA_REGISTRY` lookup wrapper. +- **10.6** Promote `plugins/tyres/` from `pre-1.0` to `1.0`. +- **10.7** Remove the superseded `docs/concepts/eudpp-ontology-alignment.md` + stub. +- **10.8** Run `code-health`, `docs-health`, `claude-health`. +- **10.9 (Phase 8.9 Tier-3 MEDIUM scope)** Promote the 14 + schema-declared fields currently stored only as + `model_extra` to first-class Pydantic fields across + `Party`, `Link`, `Package`, `Period`, `RenderTemplate2024` + (drift items D8, D10–D14, D16–D18). Specifics: + - `Party` (D10): add `registration_country` (Country), + `party_address` (Address), `organisation_website` (str), + `industry_category` (list[str | Classification]), + `party_also_known_as` (list[Identifier]). + - `Link` (D8 + D11 + D16): rename `name` → `link_name` + (alias `linkName`, REQUIRED per schema); add `link_type` + (alias `linkType`, optional). Deprecate `description` and + `relationship` (model-only, not in schema) — emit + `DeprecationWarning` in 0.6.0; remove in 0.7.0. 
+ - `Package` (D9 + D12 + D17): replace the legacy + `package_type` + `weight` fields with schema's + `description`, `dimensions`, `material_used` (REQUIRED), + `package_label`, `performance_claim` (optional). Carry the + legacy fields as `model_extra` access only with a + deprecation warning. + - `Period` (D13): add `period_information` field. The + `start_date`/`end_date` Required-vs-Optional drift (D7) is + handled separately under task 10.10. + - `RenderTemplate2024` (D14 + D18): add `media_query`, `url`; + deprecate the model-only `id` field. + - Cross-cutting: the alignment guard test from 9.9 should + flip its xfail markers for D8, D10–D14, D16–D18 as those + items resolve. CIRPASS forward shim + [`compat/untp_0_7_to_cirpass_1_3.py`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py) + should pick up the promoted fields automatically (it + iterates `model_fields`); audit afterward to confirm. + - Audit the rule corpus for attribute-access opportunities: + Phase 4 ACT/REL/SOC rules currently use + `getattr(model, "partyAddress", None)` workarounds; switch + to first-class field access once 10.9 lands. + - Effort estimate: ~1.5 days; ~150 LoC of model code + + fixture/test/exporter touch-up. Confined to v0.7 — v0.6 + models are frozen per the cardinal rule. + +- **10.10 (Phase 8.9 Tier-2 HIGH — required-vs-Optional + reconciliation)** Tighten the Pydantic safety lattice to + match schema's required-marker semantics (drift items D3, + D4, D6, D7). Strategy: introduce a *strict-mode pair* per + affected class so callers can pick: + - `Address` / `AddressPermissive`: strict variant marks + all 5 schema-required fields as Pydantic-required; + permissive variant retains current Optional shape for + incremental construction. Default validation entrypoint + (`ValidationEngine.validate`) parses with strict; programmatic + callers building incrementally can opt into permissive. 
+ - Same pattern for `BitstringStatusListEntry`, + `Claim`, `Period`. The strict variant should ship as the + canonical class name; the permissive variant gets a + `Permissive` suffix. + - Edge cases: `Period` can legitimately have one bound open + (semantic rule REL003 covers `validFrom < validTo` but + also allows either to be missing). Schema's REQUIRED on + both `startDate` and `endDate` is at the payload level + where Period is *populated* — when a payload omits + `period` entirely, the Period $def is never visited. The + strict variant therefore only fires when a Period dict is + being instantiated from a real schema-bound source. + - Cross-fixture audit: walk all 0.7 fixtures + (`tests/fixtures/valid/*.json`) and confirm none rely on + the permissive shape for required schema fields. + - Effort: ~1 day; ~80 LoC + ~50 fixture-touch lines. + +- **10.11 (Phase 8.9 Tier-2 HIGH — closed-enum coverage)** + Add closed Pydantic enums for the 2 schema enum sites not + currently typed (drift items: `BitstringStatusListEntry.statusPurpose` + and `BitstringStatusListEntry.type`). + - `StatusPurposeEnum`: 4 values — `refresh`, `revocation`, + `suspension`, `message`. + - `BitstringStatusListEntryTypeEnum`: 1 value — + `BitstringStatusListEntry` (it's a singleton; using a + `Literal["BitstringStatusListEntry"]` type is leaner). + - `BitstringStatusListEntry.id` REVERSE drift (D5): drop + the Pydantic-side requirement on `id` — schema marks it + OPTIONAL. + - Effort: ~30 LoC; mostly enum class additions. + +- **10.12 (Phase 8.9 Tier-5 — format-constraint enforcement)** + Promote the 13 schema sites with `format: uri` and + `format: byte` from plain `str` to typed Pydantic + annotations (drift items D19, D20). + - The 12 `format: uri` sites all currently use `str`. 
Switch + to the project's existing `FlexibleUri` type (defined in + [models/v0_7/primitives.py](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/models/v0_7/primitives.py)) + which permits both URLs and DIDs. Sites: + `CredentialIssuer.id`, `BitstringStatusListEntry.id`, + `BitstringStatusListEntry.statusListCredential`, + `RenderTemplate2024.url`, `IssuingSoftware.id`, + `Product.id`, plus the others enumerated in Phase 8.9 D19. + - The 1 `format: byte` site (`Image.imageData`) — add a + Pydantic validator that verifies base64 decodability via + `base64.b64decode(value, validate=True)`. + - Edge cases: `did:` URIs must remain valid (FlexibleUri + handles this); empty strings must be rejected; whitespace + must be stripped. + - Effort: ~40 LoC + ~10 test cases (1 happy / 1 sad per + site grouping). + +- **10.13 (Phase 8.9 Tier-4 LOW — model-only field audit)** + Catalogue every model-only field (D15, D16, D17, D18) and + decide per-field: keep as documented extension, or remove. + - `Claim.classification` — likely intentional (used by + semantic rules); keep but document in the v0.7 model + docstring as "extension beyond schema". + - `Link.name`, `.description`, `.relationship` — the `name` + field becomes `linkName` per 10.9; `description` and + `relationship` deprecated. + - `Package.packageType`, `.weight` — UNTP 0.6 holdover; + remove (with deprecation warning for one minor). + - `RenderTemplate2024.id` — not in schema; deprecate. + - Effort: ~20 LoC delete + comment additions. + +- **10.14 (Phase 8.9 Tier-6 — unmapped class follow-up, + REFINED 2026-05-09)** Two of the four are CLOSED by + audit (D21 `DigitalProductPassport`, D23 + `IdentifierScheme` — both confirmed faithful to schema's + shape). Remaining work is documentation-only: + - **D22 `Facility`**: add a class docstring noting + "extension for `DigitalFacilityRecord` cross-credential + references; not a v0.7.0 schema $def". 
Confirm semantic + rules don't expect schema validation of `Facility` + instances (they shouldn't — `Facility` flows through as + a reference, not as a validated payload). + - **D24 `SoftwareVendor`**: add a class docstring noting + "helper class for `IssuingSoftware.vendor`; not exposed + as a schema $def — the schema inlines the same shape + under `IssuingSoftware.properties.vendor`". + - Effort: ~10 LoC of docstring additions. + +- **10.15 (Phase 8.9 Tier-8 — CIRPASS v1.3 model layer + field-level deep diff, NEW 2026-05-09)** Phase 8.9 audited + UNTP v0.7 exhaustively but only counted CIRPASS v1.3 + classes (20 model vs 20 schema $defs — counts match). A + field-level deep diff analogous to Phase 8.9's + methodology, repointed at `models/cirpass/v1_3/` and + `schemas/data/cirpass-reference-1.3.0.json`, is queued + here. + - Walk every $def in `cirpass-reference-1.3.0.json` → + inventory of field/type/required/format/enum/pattern. + - Walk every CIRPASS Pydantic model class → inventory of + aliases / required / annotation. + - Diff: schema-only, model-only, required-vs-Optional, + type drift, enum drift. + - **Specific gaps surfaced by Phase 8.9 final pass:** + - **D26**: schema `HazardCategory` and `LifeCycleStage` + $defs (both 0-prop terminology classes) have no + Pydantic model. Investigate whether they should map + to v1.9.x EUDPP enum members + (e.g. `vocabularies/eudpp_classes.py` / + `vocabularies/eudpp_lca.py`) or to dedicated Pydantic + classes. The 0-prop shape suggests they're enum + anchors (used as `@type`-style references), in which + case Pydantic-class modelling isn't needed but a + docstring pointer back to the EUDPP enum class is. + - **D27**: catalogue any new field-level drift + surfaced by the deep diff. Tier each finding using + the same severity model as Phase 8.9 (BLOCKER / + HIGH / MEDIUM / LOW / FORMAT). Register each in a + Phase 8.10 status block (pattern follows 8.9). 
+ - **Compatibility constraint inherited from Phase 9 + block**: every CIRPASS model edit must preserve forward + and reverse shim round-trips and keep + `tests/integration/test_round_trip_untp_cirpass.py` + bit-stable. + - Effort: ~1 day audit + tiering. Code edits depend on + findings (likely 0–100 LoC depending on drift density). + +**Deliverables** + +- `dppvalidator 0.6.0` on PyPI. +- All deprecated surfaces removed. +- Phase 8.9 alignment guard test flips from advisory to + strict (zero `EXPECTED_DRIFT` entries). + +**Tests** + +- `pytest -W error` (warnings-as-errors mode) — no deprecation + warnings emitted by the suite. +- `tests/unit/test_vocab_loader_perf.py` — cold-start import time + within +20 ms of the `tests/baselines/import_time.json` baseline. +- `tests/unit/test_v07_model_schema_alignment.py` (added in + task 9.9): `EXPECTED_DRIFT` constant must be empty and the + strict-tier assertion must fire for every $def×model class + pair — actionable drift items closed (D21 + D23 already + closed at 8.9 audit; D1 + D2 closed at Phase 9; D3–D20, + D25 closed at Phase 10.9–10.13; D26 + D27 closed at + Phase 10.15). +- New per-class round-trip parity tests: load every fixture + in `tests/fixtures/valid/*.json`, parse → dump → diff. Zero + paths lost, zero unexpected paths gained (the +1 path in + the Phase 8.9 battery survey came from a synthesised + default and is acceptable; document any residual diff). + +**Exit criteria** + +- [ ] No deprecation warnings under `-W error`. +- [ ] Cold-start budget met (median of 5 CI runs). +- [ ] All three health checks clean. +- [ ] Alignment guard test passes with empty `EXPECTED_DRIFT`. +- [ ] All 0.7 fixture round-trips bit-stable (or differences + explicitly documented). +- [ ] CIRPASS forward shim picks up all newly-promoted fields + (manual smoke: export the battery fixture as + `cirpass-jsonld` and confirm `materialUsed` / `linkName` + / `linkType` now appear in the projection). + +--- + +## 3. 
Cross-cutting workstreams + +Each closes by a specific phase to keep them from drifting. + +| ID | Workstream | Closes by | Notes | +|---|---|---|---| +| X1 | Code-generation hygiene | Phase 3 | Generators under `tools/codegen/`, never `src/`. Files header `# generated-from: @`. CI gate `tools/codegen/check_drift.py`. *Status (2026-05-08):* enum-regeneration generator [`tools/codegen/cirpass/regenerate_enums.py`](../../tools/codegen/cirpass/regenerate_enums.py) ✓ landed with self-test against v1.7.1 TTL fixture; JSON Schema deriver and drift-gate wrapper still owned by Phase 3. | +| X2 | SHACL pipeline | Phase 4 | `pyshacl` integration tests at `tests/integration/shacl/`. Module-level shape-graph caching keyed on `(family, module, version)` + bundled SHA | +| X3 | Performance budget | Phase 9 | `python -X importtime`, median of 5; baseline at `tests/baselines/import_time.json`; +20 ms ceiling. Lazy-import CIRPASS package | +| X4 | Property-based + fuzz tests | Phase 5 | Hypothesis strategy per Phase 3 model; round-trip invariants from Phase 5 | +| X5 | CI matrix | Phase 9 | New job `CIRPASS_FAMILY=cirpass` runs only the CIRPASS subset for early isolation regression detection | +| X6 | Internationalisation | Phase 5 | `LocalisedText` (Phase 3) field-level; mapping shim (Phase 5) preserves exactly one `MAP001` per dropped language | + +--- + +## 4. Risk register + +| ID | Risk | Lik. | Imp. 
| Mitigation | +|---|---|---|---|---| +| R1 | Hub re-publishes a module under same version with mutated axioms | M | H | GUID + SHA pinning; nightly `tools/snapshot/check_drift.py` | +| R2 | UNTP↔CIRPASS mapping has more lossy fields than expected | M | M | Phase 5 audit table is authoritative; `MAP00X` warnings, never silent loss | +| R3 | IDENT/MAT/EVENT/COMP publish mid-window with shifted hierarchies | M | H | Deliberately not scaffolded; §6.3 recipe adopts post-`0.6.0` | +| R4 | Pilot data models shift between Preview and Stable | H | L–M | Plugins separately versioned; profile flags buy a release of grace | +| R5 | SHACL evaluation latency spikes test suite | M | M | Module-level shape caching; `@pytest.mark.integration` excluded from pre-commit | +| R6 | `pyshacl` / `rdflib` upstream breakage | L | M | Pinned in `pyproject.toml`; pre-commit smoke `tools/check_rdf_stack.py` | +| R7 | Bare version-literal regressions slip past linter | L | H | Phase 1 task 1.14 extends the guard to CIRPASS literals | +| R8 | Plugin license contamination (GPL → MIT core) | L | H (legal) | [`.claude/rules/plugin-licenses.md`](../../.claude/rules/plugin-licenses.md) + `tools/check_imports.py` (Phase 7 task 7.9) | +| R9 | Cardinal versioning rule erosion under PR pressure | M | M | Every PR in Phases 2–6 references the rules file; reviewer checklist | +| R10 | `DigitalProductPassport` type-name shared across families → detection ambiguity | M | H | Phase 2 resolves by `@context` first, shape signature second; `DET001` rather than fallthrough | +| R11 | EU regulation expands multi-language scope beyond Phase 3's set | L | M | `LocalisedText` is field-level; expansion adds fields, doesn't break | +| R12 | `https://w3id.org/eudpp/...` IRIs do not dereference (D-0.3 fails) | L | H | Phase 0 task 0.4 verifies; failure escalates as a hard blocker | +| R13 | Performance budget breached by SHACL eager-load | M | L | Lazy-import (X3); `tests/unit/test_vocab_loader_perf.py` asserts | +| R14 
| Derived JSON Schema diverges from tree-view exports across hub revisions | M | M | `tools/codegen/check_drift.py` nightly | + +--- + +## 5. Rollout + +| Release | Phases | Signal | +|---|---|---| +| `0.4.z` | Phases 0 → 2 | Additive only; no deprecation warnings yet | +| `0.5.0` Preview | Phases 3 → 9 | CIRPASS family available; deprecation warnings active; CHANGELOG sectioned by family | +| `0.6.0` Stable | Phase 10 | Deprecated surfaces removed; APIs locked | +| `0.6.z` opportunistic | §6.3 add-module recipe | New EUDPP modules light up as published | + +CHANGELOG sections from `0.5.0` onwards are family-keyed +(`### UNTP`, `### CIRPASS-2`, `### Plugins`). The authoritative +"what's bundled at this tag" reference is +`docs/concepts/cirpass-2-spec-snapshot.md`. + +--- + +## 6. Appendices + +### 6.1 File inventory delta + +**Added** + +```text +docs/concepts/cirpass-2-spec-snapshot.md +docs/concepts/cirpass-2-alignment.md +docs/concepts/eudpp-1.9-changelog.md +docs/concepts/untp-cirpass-mapping.md +docs/guides/migrate-untp-to-cirpass.md +docs/plans/CIRPASS_2_MIGRATION.md # this file +docs/plugins/tyres.md +docs/reference/cli/exit-codes.md +docs/reference/cirpass/ # mkdocstrings-generated +docs/adr/0001-cirpass-json-schema-derivation.md +docs/adr/0002-canonical-eudpp-iri.md +docs/adr/0003-tyre-license.md +src/dppvalidator/models/cirpass/__init__.py +src/dppvalidator/models/cirpass/v1_3/{passport,product,actor,material, + substances,lca,connector,i18n,temporal}.py +src/dppvalidator/validators/rules/cirpass_v1_3/{base,substances,lca, + actor,connector}.py +src/dppvalidator/compat/{untp_0_7_to_cirpass_1_3,cirpass_1_3_to_untp_0_7, + _untp_cirpass_map,_identifier_schemes,_mapping_codes}.py +src/dppvalidator/exporters/cirpass_jsonld.py +src/dppvalidator/schemas/data/cirpass-reference-1.3.0.json +src/dppvalidator/vocabularies/data/eudpp-context-v1.9.1.jsonld +src/dppvalidator/vocabularies/data/ontologies/{product_dpp_v1.9.1, + 
actors_roles_v1.9.1,soc_v1.9.1,lca_v1.9.4_Maki,eudpp_core_v1.9.1, + connector_v1.9.1}.ttl +plugins/tyres/{pyproject.toml,LICENSE,dppvalidator_tyres/...} +tools/snapshot/{fetch_cirpass.py,check_drift.py} +tools/codegen/cirpass/derive_schema.py +tools/codegen/check_drift.py +tools/check_imports.py +tests/baselines/import_time.json +tests/fixtures/{valid,invalid}/cirpass-1.3.0/ +tests/unit/test_{models_cirpass_v1_3,detection_cirpass, + detection_ambiguity,namespace_canonicality,mapping_codes, + registry_back_compat,cold_start_import,rules_cirpass_*}.py +tests/integration/{test_cirpass_v1_3_pipeline, + test_round_trip_untp_cirpass,test_cli_cirpass, + test_cli_export_matrix,test_cli_back_compat, + test_cross_family_isolation,test_i18n_roundtrip}.py +tests/integration/shacl/ +tests/property/{test_cirpass_v1_3_invariants, + test_round_trip_invariants}.py +tests/plugins/{tyres/,test_license_isolation.py} +``` + +**Updated** + +```text +src/dppvalidator/schemas/registry.py # SchemaFamily, two-axis registry, GUID field +src/dppvalidator/schemas/data/MANIFEST.json # CIRPASS rows + family/module/guid fields +src/dppvalidator/validators/detection.py # detect_schema_family + URL patterns + looks_like_dpp +src/dppvalidator/validators/engine.py # _PIPELINE_BY_FAMILY dispatch +src/dppvalidator/compat/__init__.py # active_version(family=…) +src/dppvalidator/exporters/eudpp_jsonld.py # v1.9.1 + canonical w3id IRIs +src/dppvalidator/exporters/contexts.py # family-keyed context registry +src/dppvalidator/vocabularies/ontology.py # TERM_MAPPINGS @ v1.9.1; namespace rebase; alias deletion +src/dppvalidator/vocabularies/eudpp_*.py # regenerated enums +src/dppvalidator/cli/main.py # --target, --format, migrate --to, exit codes +plugins/textiles/... 
# MVP Textile DPP v2 + textile-v1 profile +.claude/rules/untp-versioning.md # cardinal-rule extension to CIRPASS family +tests/unit/test_no_version_literals.py # CIRPASS / EUDPP module-version literal guards +README.md +mkdocs.yml +CHANGELOG.md +``` + +**Removed (Phase 10 only)** + +```text +src/dppvalidator/vocabularies/data/ontologies/{product_dpp_v1.7.1, + actors_roles_v1.5.1,soc_v1.4.7,lca_v2.0,eudpp_core_v1.3.1}.ttl +src/dppvalidator/vocabularies/ontology.py::CIRPASSNamespace_alias +docs/concepts/eudpp-ontology-alignment.md # superseded +``` + +### 6.2 Phase dependency graph + +```text + ┌──────────────┐ + │ Phase 0 │ + └──────┬───────┘ + ┌─────────────┴─────────────┐ + ▼ ▼ + ┌──────────┐ ┌──────────────┐ + │ Phase 1 │ │ Phase 2 │ + └────┬─────┘ └──────┬───────┘ + └─────────────┬──────────────┘ + ▼ + ┌──────────┐ + │ Phase 3 │ + └────┬─────┘ + ▼ + ┌──────────┐ + │ Phase 4 │ + └────┬─────┘ + ▼ + ┌──────────┐ + │ Phase 5 │ + └────┬─────┘ + ┌─────────┴─────────┐ + ▼ ▼ + ┌──────────┐ ┌──────────┐ + │ Phase 6 │ │ Phase 7 │ + └────┬─────┘ └────┬─────┘ + └─────────┬──────────┘ + ▼ + ┌──────────┐ + │ Phase 8 │ + └────┬─────┘ + ▼ + ┌──────────┐ + │ Phase 9 │ + └────┬─────┘ + ▼ + ┌──────────┐ + │ Phase 10 │ + └──────────┘ +``` + +**Critical path:** 0 → 1 → 3 → 4 → 5 → 6 → 8 → 9 → 10 (Phase 7 runs in parallel with 6). + +### 6.3 CIRPASS minimum-touch list + +Mirror of the UNTP version-bump touch list at +[`.claude/rules/untp-versioning.md`](../../.claude/rules/untp-versioning.md). +For new CIRPASS modules (the IDENT / MAT / EVENT / COMP scenario) or new +CIRPASS message versions, when adding `` at version ``: + +- `src/dppvalidator/schemas/registry.py` — register bundled artefact if any. +- `src/dppvalidator/exporters/contexts.py` — JSON-LD context entry if any. +- `src/dppvalidator/schemas/data/MANIFEST.json` — manifest row with + `family: "cirpass"` (message) or `family: "eudpp-ontology"` (TTL), + `module: `, `vocabulary_hub_guid`, SHA-256. 
+- `src/dppvalidator/vocabularies/data/ontologies/_.ttl` — vendored TTL. +- `src/dppvalidator/vocabularies/ontology.py::TERM_MAPPINGS` — new rows. +- `src/dppvalidator/vocabularies/eudpp_.py` — refreshed/new enum. +- `src/dppvalidator/models/cirpass/v1_X/.py` — Pydantic models. +- `src/dppvalidator/validators/rules/cirpass_v1_X/.py` — semantic rules. +- `tests/fixtures/valid/cirpass-1.X.0/_*.json` — fixtures. +- `tests/integration/test_version_matrix.py` — new family+version row. +- `docs/plans/CIRPASS____MIGRATION.md` — migration doc if a major version. + +If you touched more than this list, you're either fixing an unrelated bug +(split the PR) or going around the version-aware spine (don't). + +--- + +*End of plan v3. Locked decisions: D-0.1, D-0.3, D-naming, D-default-family. +Open action items: OA-1 (tyre license, Phase 0 close), OA-2 (Battery Pass +follow-on plan, post-`0.6.0`).* diff --git a/docs/plugins/tyres.md b/docs/plugins/tyres.md new file mode 100644 index 0000000..ea782d9 --- /dev/null +++ b/docs/plugins/tyres.md @@ -0,0 +1,136 @@ +# Tyres pilot plugin (`dppvalidator-tyres`) + +> **Status: Pre-1.0 / Experimental.** The GDSO Birth v0.9 and +> Recycling v0.1 declarations are still moving — the rule contract +> may shift before the 1.0 cut. Production users should pin a specific +> version of `dppvalidator-tyres` and review the [`CHANGELOG`](https://github.com/artiso-ai/dppvalidator/blob/main/plugins/tyres/CHANGELOG.md) +> on every minor bump. + +Phase 7 of [`docs/plans/CIRPASS_2_MIGRATION.md`] introduces this +plugin, scaffolded against the GDSO public declaration specs. + +## Installation + +```sh +uv add dppvalidator-tyres +# or +pip install dppvalidator-tyres +``` + +The plugin auto-registers via Python entry-points under the +`dppvalidator.validators` and `dppvalidator.exporters` groups; no +glue code is required at the call site. 
After install, verify +discovery: + +```sh +$ uv run python -c "from dppvalidator.plugins.discovery import list_available_plugins; print(list_available_plugins())" +{'validators': [..., 'tyr001_dot_marking', 'tyr002_birth_chain', ...], 'exporters': [..., 'tyres_lifecycle_csv']} +``` + +## Wire shape + +Tyre lifecycle data is carried in the UNTP DPP envelope's extension +slot: + +```json +{ + "credentialSubject": { + "extensions": { + "tyreLifecycleHistory": { + "birth": { /* GDSO Birth v0.9 */ }, + "events": [ + {"type": "collection", "declaration": { /* GDSO Collection v0.1 */ }}, + {"type": "retread", "declaration": { /* GDSO Retread v0.1 */ }}, + {"type": "recycling", "declaration": { /* GDSO Recycling v0.1 */ }} + ] + } + } + } +} +``` + +The plugin's rules walk this extension slot. Passports without the +extension are silently skipped — non-tyre DPPs are not affected. + +## Rule reference + +| Code | Severity | Topic | Notes | +| -------- | -------- | ------------------------ | ------------------------------------------------------------------------------- | +| `TYR001` | error | DOT marking | Required: `plantCode`, `sizeCode`, `weekOfYear`, `year`. | +| `TYR002` | error | Birth chain completeness | Required: `tyreUuid`, `manufacturer.id`, `manufacturer.name`. | +| `TYR003` | warning | Load index | ETRTO range 60–130 (passenger / light-truck). | +| `TYR004` | warning | Speed rating | ETRTO letter set: L, M, N, P, Q, R, S, T, U, H, V, W, Y, Z. | +| `TYR005` | warning | Tyre dimensions | Section width 125–445 mm; aspect ratio 20–95; rim 10–24 in. | +| `TYR006` | error | Retread provenance | Every Retread event names `previousBirthUuid`. | +| `TYR007` | warning | Collection actor | Every Collection event identifies `collector.id` + `name`. | +| `TYR008` | warning | Recycling method | Closed set: mechanical / pyrolysis / devulcanisation / energy_recovery / other. 
| + +## Models + +| Model | Source spec | Description | +| ---------------------- | -------------------- | --------------------------------------------- | +| `Birth` | GDSO Birth v0.9 | Manufacturer's at-mould declaration. | +| `Collection` | GDSO Collection v0.1 | Collector's at-end-of-first-life declaration. | +| `Retread` | GDSO Retread v0.1 | Retreader's renewal declaration. | +| `Recycling` | GDSO Recycling v0.1 | Recycler's at-end-of-life declaration. | +| `TyreLifecycleHistory` | (aggregate) | Birth + chronologically-ordered events. | + +The aggregate enforces three cross-event invariants: + +1. Every event's `tyreUuid` matches the Birth's. +1. Events are chronologically ordered. +1. At most one Recycling event (lifecycle terminator). + +## Exporter + +The plugin registers a CSV exporter under the +`dppvalidator.exporters` group as `tyres-lifecycle-csv`. It +flattens a TyreLifecycleHistory into one row per declaration: + +```sh +$ uv run dppvalidator export passport.json --format tyres-lifecycle-csv +tyreUuid,eventType,timestamp,actorId,actorName,method +550e8400-e29b-41d4-a716-446655440000,birth,2026-04-15T10:30:00+00:00,https://example.com/operator/acme,ACME Tyres, +550e8400-e29b-41d4-a716-446655440000,collection,2030-08-01T09:00:00+00:00,https://example.com/op/collector-eu,EU Tyre Collection, +550e8400-e29b-41d4-a716-446655440000,recycling,2031-02-15T11:00:00+00:00,https://example.com/op/recycler-de,Recycle Co.,mechanical +``` + +## License + +GPL-3.0-or-later. See +[`plugins/tyres/LICENSE`](https://github.com/artiso-ai/dppvalidator/blob/main/plugins/tyres/LICENSE) +for the rationale; the canonical text is at +. + +The dppvalidator core is MIT-licensed; per the +[plugin-license isolation rule](https://github.com/artiso-ai/dppvalidator/blob/main/.claude/rules/plugin-licenses.md), +the core never imports from this plugin (one-way dependency only).
+ +## Status note + +Phase 7 of the migration plan (2026-05) explicitly marks this +plugin **Pre-1.0 / Experimental**. The GDSO Birth v0.9 spec was +last updated in 2025; the Recycling v0.1 declaration is still in +draft. Expect breaking changes in: + +- Field names on the four declaration models. +- Rule IDs (the `TYR0NN` numerals are stable; the *severity* of + individual rules may shift between minor releases). +- Exporter column ordering. + +The plugin will move to 1.0 when the GDSO declarations stabilise +(planned for 2026-Q4). + +## Sample fixture + +A minimal Birth-only fixture lives at +[`plugins/tyres/samples/birth.json`](https://github.com/artiso-ai/dppvalidator/blob/main/plugins/tyres/samples/birth.json). +The Phase 7 exit criterion requires this fixture to validate +cleanly: + +```sh +uv run dppvalidator validate plugins/tyres/samples/birth.json +# → ✓ VALID +``` + +[`docs/plans/cirpass_2_migration.md`]: ../plans/CIRPASS_2_MIGRATION.md diff --git a/docs/reference/api/validators.md b/docs/reference/api/validators.md index caeaa2d..4acdb35 100644 --- a/docs/reference/api/validators.md +++ b/docs/reference/api/validators.md @@ -110,4 +110,4 @@ reset_default_registry() | SEM002 | semantic | Invalid date range | | SIG001 | crypto | Invalid signature | -> **Note:** This table shows common examples. See [Error Reference](../errors/) for the complete list of 70+ error codes. +> **Note:** This table shows common examples. See [Error Reference](../../errors/index.md) for the complete list of 70+ error codes. diff --git a/docs/reference/cirpass/index.md b/docs/reference/cirpass/index.md new file mode 100644 index 0000000..1f9f669 --- /dev/null +++ b/docs/reference/cirpass/index.md @@ -0,0 +1,149 @@ +# CIRPASS reference structure v1.3.0 — API reference + +> **Status:** Final (Phase 8 finalisation, 2026-05-09). 
Generated +> from the Pydantic models at +> [`src/dppvalidator/models/cirpass/v1_3/`](https://github.com/artiso-ai/dppvalidator/tree/main/src/dppvalidator/models/cirpass/v1_3). + +The CIRPASS reference structure v1.3.0 is the message-level wire +format that the CIRPASS-2 project publishes alongside the EUDPP +ontology. dppvalidator's Pydantic models are the source of truth; +the JSON Schema bundled at +[`schemas/data/cirpass-reference-1.3.0.json`](https://github.com/artiso-ai/dppvalidator/tree/main/src/dppvalidator/schemas/data/cirpass-reference-1.3.0.json) +is *derived* from them via +[`tools/codegen/cirpass/derive_schema.py`](https://github.com/artiso-ai/dppvalidator/tree/main/tools/codegen/cirpass/derive_schema.py). + +## Reading guide + +| Topic | Page | +| ----------------------- | ----------------------------------------------------------------------- | +| Big-picture orientation | [`cirpass-2-alignment.md`](../../concepts/cirpass-2-alignment.md) | +| EUDPP module changelog | [`eudpp-1.9-changelog.md`](../../concepts/eudpp-1.9-changelog.md) | +| UNTP ↔ CIRPASS mapping | [`untp-cirpass-mapping.md`](../../concepts/untp-cirpass-mapping.md) | +| Migration how-to | [`migrate-untp-to-cirpass.md`](../../guides/migrate-untp-to-cirpass.md) | + +## Root: `ReferencePassport` + +The CIRPASS DPP reference structure root. Maps to `eudpp:DPP` +(P_DPP v1.9.1). Mirrors the v1.3.0 message tree-view shape: a +`Product` at the root + sibling fields for the DPP-level metadata. 
+ +::: dppvalidator.models.cirpass.v1_3.ReferencePassport +options: +show_source: false +show_bases: false + +## Product / Identifier / Classification + +::: dppvalidator.models.cirpass.v1_3.Product +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.Identifier +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.ClassificationCode +options: +show_source: false +show_bases: false + +## Actor / Role + +::: dppvalidator.models.cirpass.v1_3.Actor +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.Facility +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.ActorRole +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.ActorRoleAssignment +options: +show_source: false +show_bases: false + +## Material / Composition + +::: dppvalidator.models.cirpass.v1_3.Material +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.Composition +options: +show_source: false +show_bases: false + +## Substances of Concern + +::: dppvalidator.models.cirpass.v1_3.SubstanceOfConcern +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.Concentration +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.HazardClassification +options: +show_source: false +show_bases: false + +## Life-Cycle Assessment + +::: dppvalidator.models.cirpass.v1_3.LifeCycleAssessment +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.ImpactResult +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.ImpactCategoryReference +options: +show_source: false +show_bases: false + +## Connector relations + +::: dppvalidator.models.cirpass.v1_3.ConnectorRelation +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.RelationType 
+options: +show_source: false +show_bases: false + +## Multilingual labels + +::: dppvalidator.models.cirpass.v1_3.LocalisedText +options: +show_source: false +show_bases: false + +## Temporal + +::: dppvalidator.models.cirpass.v1_3.EffectivePeriod +options: +show_source: false +show_bases: false + +::: dppvalidator.models.cirpass.v1_3.IssuedAt +options: +show_source: false +show_bases: false diff --git a/docs/reference/cli/exit-codes.md b/docs/reference/cli/exit-codes.md new file mode 100644 index 0000000..9cc6ddb --- /dev/null +++ b/docs/reference/cli/exit-codes.md @@ -0,0 +1,80 @@ +# CLI exit codes + +Phase 6 task 6.7 of [docs/plans/CIRPASS_2_MIGRATION.md] formalises +the CLI exit-code surface. Every `dppvalidator` subcommand exits +with one of the codes below; values are stable across releases so +shell wrappers / CI pipelines / plugins may pattern-match on the +integer. + +The same constants are exported from +[`dppvalidator.cli.main`](https://github.com/artiso-ai/dppvalidator/blob/main/src/dppvalidator/cli/main.py) +(`EXIT_VALID`, `EXIT_INVALID`, `EXIT_ERROR`, +`EXIT_FAMILY_MISMATCH`, `EXIT_BLOCKING_WARNINGS`, `EXIT_IO_ERROR`). + +| Code | Constant | Meaning | +| ---- | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `0` | `EXIT_VALID` | Operation completed successfully. For `validate`, the input passed every layer with zero errors. For `export` / `migrate`, the output was written. | +| `1` | `EXIT_INVALID` | Validation produced one or more errors. The input parsed cleanly but failed one or more validation layers. | +| `2` | `EXIT_ERROR` | Unrecoverable engine / shim failure (uncaught exception, dependency missing, invalid arguments not caught by argparse). The CLI prints a stack trace under `--verbose`. 
| +| `3` | `EXIT_FAMILY_MISMATCH` | A `--target` (or `--to`) flag explicitly contradicts the payload's detected family. Surfaces `DET001`. New in Phase 6. | +| `4` | `EXIT_BLOCKING_WARNINGS` | A migration / upgrade emitted at least one `warning`-or-higher diagnostic and the user did not pass `--accept-warnings`. The output is *not* written; a sidecar JSON file lists the warnings. New in Phase 6. | +| `5` | `EXIT_IO_ERROR` | IO failure: file not found, encoding error, glob pattern matched nothing, output path not writable. New in Phase 6 — distinguishes IO from logical failures. | + +## Per-command code matrix + +| Command | `0` | `1` | `2` | `3` | `4` | `5` | +| ---------- | ---------------------------------------------------- | ----------------------------------------- | -------------------------------------- | ------------------------------------- | --------------------------------------------- | ------------------------------------- | +| `validate` | All inputs valid | One or more validation errors | Engine crash | `--target` mismatches detected family | (n/a) | File-not-found / glob match nothing | +| `export` | Output written | Validation failed (input not a valid DPP) | Exporter crash | (n/a) | (n/a) | File-not-found / output write failure | +| `migrate` | Output written, no warnings (or `--accept-warnings`) | (n/a) | Shim crash | `--from` / `--to` family mismatch | Blocking warnings without `--accept-warnings` | File-not-found / output write failure | +| `schema` | Listing / download / info succeeded | (n/a) | Unknown subcommand or download failure | (n/a) | (n/a) | Output directory not writable | + +## Examples + +### Family-mismatch (code 3) + +```sh +$ dppvalidator validate cirpass-payload.json --target untp +DET001: cirpass-payload.json — payload looks like 'cirpass' + but --target='untp' pins the other family. +$ echo $? 
+3 +``` + +### Blocking-warnings during migrate (code 4) + +```sh +$ dppvalidator migrate untp-fixture.json --to cirpass-1.3 -o out.json +Migration to CIRPASS reference structure v1.3.0 emitted 5 blocking +warning(s); refusing to write. Re-run with --accept-warnings to +override, or fix the issues listed in the sidecar warnings file. + [MAP002] (warning) $.dppIdentifier.scheme: … + [MAP002] (warning) $.product.productName[0].language: … + … +$ echo $? +4 +$ ls out.json out.json.warnings.json +out.json.warnings.json # only the sidecar exists; no main output +``` + +### IO error (code 5) + +```sh +$ dppvalidator validate /no/such/file.json +File not found: /no/such/file.json +$ echo $? +5 +``` + +## Rationale + +Pre-Phase-6, the CLI used `0`/`1`/`2` for `valid`/`invalid`/`error`. +Wrappers couldn't distinguish "the file didn't exist" from "the +engine crashed", or "you passed `--target=untp` on a CIRPASS +payload" from "the payload had validation errors". Phase 6 adds +three orthogonal codes (`3` family-mismatch, `4` blocking-warnings, +`5` IO) so CI pipelines can branch precisely on the failure shape +without parsing the human-readable output. + +The legacy codes remain unchanged so existing scripts and +golden-snapshot tests keep passing. 
diff --git a/mkdocs.yml b/mkdocs.yml index bccfd65..1ff8232 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -154,20 +154,29 @@ nav: - JSON-LD Export: guides/jsonld.md - EU DPP Export: guides/eudpp-export.md - Migration 0.6 → 0.7: guides/migration-0-6-to-0-7.md + - Migrate UNTP → CIRPASS: guides/migrate-untp-to-cirpass.md - Plugin Development: guides/plugins.md - Reference: - CLI Reference: reference/cli.md + - CLI Exit Codes: reference/cli/exit-codes.md - API: - Models: reference/api/models.md - Validators: reference/api/validators.md - Exporters: reference/api/exporters.md - Plugins: reference/api/plugins.md + - CIRPASS Models: reference/cirpass/index.md - Concepts: + - CIRPASS-2 Alignment: concepts/cirpass-2-alignment.md + - CIRPASS-2 Spec Snapshot: concepts/cirpass-2-spec-snapshot.md + - EUDPP v1.9 Changelog: concepts/eudpp-1.9-changelog.md + - UNTP ↔ CIRPASS Mapping: concepts/untp-cirpass-mapping.md - UNTP DPP versions: concepts/untp-versions.md - UNTP DPP Schema: concepts/untp-schema.md - Five-Layer Validation: concepts/validation-layers.md - - EU DPP Ontology Alignment: concepts/eudpp-ontology-alignment.md - - CIRPASS-2 Implementation: concepts/cirpass-implementation.md + - EU DPP Ontology Alignment (legacy): concepts/eudpp-ontology-alignment.md + - CIRPASS-2 Implementation (legacy): concepts/cirpass-implementation.md + - Plugins: + - Tyres (Pre-1.0): plugins/tyres.md - Contributing: - Development Setup: contributing/development-setup.md - Code Style: contributing/code-style.md @@ -270,6 +279,10 @@ nav: - TXT003 - Missing Microplastic Data: errors/TXT003.md - TXT004 - Missing Durability Info: errors/TXT004.md - TXT005 - Missing Care Instructions: errors/TXT005.md + - TXT006 - Missing Recycled-Content Disclosure: errors/TXT006.md + - TXT007 - Missing Repair Information: errors/TXT007.md + - Party-Role Errors: + - PRT001 - PartyRole Acceptance Gradient: errors/PRT001.md - Version Errors: - VER001 - UNTP Version Mismatch: errors/VER001.md - Upgrade-shim Errors: diff 
--git a/plugins/tyres/LICENSE b/plugins/tyres/LICENSE new file mode 100644 index 0000000..545eb74 --- /dev/null +++ b/plugins/tyres/LICENSE @@ -0,0 +1,30 @@ +dppvalidator-tyres — Tyres pilot plugin for dppvalidator +Copyright (C) 2026 artiso-ai + +SPDX-License-Identifier: GPL-3.0-or-later + +This program is free software: you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation, either version 3 of the License, or (at your +option) any later version. + +This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +The full text of the GNU General Public License version 3 is available +at and in the source-tree +file `LICENSE.GPL-3.0` when bundled by Phase 8 of +`docs/plans/CIRPASS_2_MIGRATION.md` (documentation phase). + +The SPDX identifier `GPL-3.0-or-later` in `pyproject.toml` is the +legally binding license declaration; this file is a human-readable +pointer to the canonical text. + +Decision rationale: Phase 7 OA-1 of `docs/plans/CIRPASS_2_MIGRATION.md` +chose GPL-3.0-or-later for the tyres plugin so derived rule-pack +extensions remain in the open-source commons. The dppvalidator core +(`src/dppvalidator/`) is MIT-licensed; per the plugin-license isolation +rule (`.claude/rules/plugin-licenses.md`), the core never imports from +this plugin — the dependency direction is one-way (plugin → core). diff --git a/plugins/tyres/README.md b/plugins/tyres/README.md new file mode 100644 index 0000000..f74eca7 --- /dev/null +++ b/plugins/tyres/README.md @@ -0,0 +1,56 @@ +# dppvalidator-tyres + +> **Status: Pre-1.0 / Experimental.** The GDSO Birth v0.9 and Recycling +> v0.1 specifications are still moving — this plugin tracks them but +> does *not* claim production stability. 
The rule contract may shift +> without a SemVer bump until 1.0. + +GDSO-aligned tyres pilot plugin for [dppvalidator]. Adds Pydantic +models for the tyre lifecycle declarations (Birth, Collection, +Retread, Recycling) plus an aggregate Tyre Lifecycle History +wrapper, and registers `TYR001…TYR008` validation rules. + +## Installation + +```sh +uv add dppvalidator-tyres +# or +pip install dppvalidator-tyres +``` + +The plugin auto-registers via Python entry-points (group +`dppvalidator.validators`); no glue code required at the call site. + +## Rule codes + +| Code | Severity | Topic | +| -------- | -------- | ------------------------------------------------------------------------------------------- | +| `TYR001` | error | DOT marking present and well-formed | +| `TYR002` | error | Birth declaration carries the manufacturer-actor chain | +| `TYR003` | warning | Load index is in the standard ETRTO range | +| `TYR004` | warning | Speed rating is a recognised letter code | +| `TYR005` | warning | Section width / aspect ratio / rim diameter look sane | +| `TYR006` | error | Retread declarations name the upstream Birth UUID | +| `TYR007` | warning | Collection declarations identify the collecting actor | +| `TYR008` | warning | Recycling declarations name the recycling method (mechanical / pyrolysis / devulcanisation) | + +## Models + +Models live under `dppvalidator_tyres.models`. The aggregate +`TyreLifecycleHistory` wraps the four declaration types with a +single tyre-identifying Birth and a chronologically-ordered list +of subsequent events. + +## License + +GPL-3.0-or-later. See [`LICENSE`](LICENSE) for the rationale; the +canonical license text is at . + +## Status note + +The GDSO Birth v0.9 spec was last updated in 2025. The Recycling v0.1 +declaration is still in draft. The plugin's rule-pack interpretation +follows what's in the GDSO public docs; expect breaking changes in +the rule IDs / message wording until 1.0. 
+ +[dppvalidator]: https://github.com/artiso-ai/dppvalidator diff --git a/plugins/tyres/pyproject.toml b/plugins/tyres/pyproject.toml new file mode 100644 index 0000000..25e7679 --- /dev/null +++ b/plugins/tyres/pyproject.toml @@ -0,0 +1,68 @@ +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[project] +name = "dppvalidator-tyres" +version = "0.1.0" +description = "GDSO-aligned tyres pilot plugin for dppvalidator (Pre-1.0 / Experimental)" +readme = "README.md" +requires-python = ">=3.10" +# Phase 7 OA-1 of docs/plans/CIRPASS_2_MIGRATION.md decided on +# GPL-3.0-or-later for the tyres plugin (the upstream GDSO +# declarations are licensed permissively but the plugin's +# rule-pack interpretation is published as copyleft to keep the +# rule extensions in the open-source commons). +license = "GPL-3.0-or-later" +authors = [{ name = "artiso-ai" }] +keywords = ["dpp", "tyres", "gdso", "validation", "plugin", "dppvalidator"] +classifiers = [ + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Topic :: Software Development :: Libraries", +] +dependencies = [ + # Tyres plugin sits on top of the dppvalidator core. The Phase 7 + # entry into the migration plan added validator entry-points + # consuming the SemanticRule protocol — which has been stable + # since 0.4.0. + "dppvalidator>=0.4.0", + "pydantic>=2.0", +] + +# Phase 7 task 7.7: validator entry-points. Each validator class +# implements the SemanticRule protocol from +# dppvalidator.validators.protocols. The plugin discovery mechanism +# (src/dppvalidator/plugins/registry.py) auto-registers these on +# engine init; no per-payload glue needed. 
+[project.entry-points."dppvalidator.validators"] +tyr001_dot_marking = "dppvalidator_tyres.validators:DOTMarkingRule" +tyr002_birth_chain = "dppvalidator_tyres.validators:BirthChainCompletenessRule" +tyr003_load_index = "dppvalidator_tyres.validators:LoadIndexRule" +tyr004_speed_rating = "dppvalidator_tyres.validators:SpeedRatingRule" +tyr005_section_width = "dppvalidator_tyres.validators:SectionWidthRule" +tyr006_retread_provenance = "dppvalidator_tyres.validators:RetreadProvenanceRule" +tyr007_collection_actor = "dppvalidator_tyres.validators:CollectionActorRule" +tyr008_recycling_method = "dppvalidator_tyres.validators:RecyclingMethodRule" + +# Phase 7 task 7.7: exporter entry-points (CSV summary of the +# tyre's lifecycle history — useful for fleet operators). +[project.entry-points."dppvalidator.exporters"] +tyres_lifecycle_csv = "dppvalidator_tyres.exporters:TyreLifecycleCSVExporter" + +[project.urls] +Homepage = "https://github.com/artiso-ai/dppvalidator/tree/main/plugins/tyres" +Documentation = "https://artiso-ai.github.io/dppvalidator/plugins/tyres/" +"Bug Tracker" = "https://github.com/artiso-ai/dppvalidator/issues" + +[tool.hatch.build.targets.wheel] +packages = ["src/dppvalidator_tyres"] + +[tool.hatch.build.targets.sdist] +include = ["/src", "/README.md", "/LICENSE", "/samples"] diff --git a/plugins/tyres/samples/birth.json b/plugins/tyres/samples/birth.json new file mode 100644 index 0000000..3d60109 --- /dev/null +++ b/plugins/tyres/samples/birth.json @@ -0,0 +1,70 @@ +{ + "@context": [ + "https://www.w3.org/ns/credentials/v2", + "https://vocabulary.uncefact.org/untp/0.7.0/context/" + ], + "type": ["DigitalProductPassport", "VerifiableCredential"], + "id": "https://example.com/dpp/tyre-birth-001", + "name": "Sample Tyre DPP — Birth", + "issuer": { + "id": "did:web:acme-tyres.example.com", + "name": "ACME Tyres", + "type": ["CredentialIssuer"] + }, + "validFrom": "2026-04-15T00:00:00+00:00", + "credentialSubject": { + "type": ["Product"], + "id": 
"https://example.com/tyre/550e8400-e29b-41d4-a716-446655440000", + "name": "ACME PrimeRoad 205/55 R16 91V", + "idScheme": { + "id": "https://gs1.org/voc/", + "name": "GS1 GTIN" + }, + "idGranularity": "item", + "itemNumber": "550e8400-e29b-41d4-a716-446655440000", + "productCategory": [ + { + "schemeId": "https://www.wcoomd.org/en/topics/nomenclature/instrument-and-tools/hs-nomenclature-2022-edition.aspx", + "schemeName": "WCO HS", + "code": "401110", + "name": "New pneumatic tyres of rubber, of a kind used on motor cars" + } + ], + "producedAtFacility": { + "id": "https://example.com/facility/acme-bulgaria", + "name": "ACME Bulgaria Plant", + "type": ["Facility"] + }, + "countryOfProduction": { + "countryCode": "BG", + "countryName": "Bulgaria" + }, + "extensions": { + "tyreLifecycleHistory": { + "birth": { + "tyreUuid": "550e8400-e29b-41d4-a716-446655440000", + "manufacturer": { + "id": "https://example.com/operator/acme", + "name": "ACME Tyres", + "lei": "529900T8BM49AURSDO55" + }, + "manufacturedAt": "2026-04-15T10:30:00+00:00", + "dotMarking": { + "plantCode": "B7", + "sizeCode": "5K", + "weekOfYear": 16, + "year": 2026 + }, + "size": { + "sectionWidth": 205, + "aspectRatio": 55, + "rimDiameter": 16, + "loadIndex": 91, + "speedRating": "V" + } + }, + "events": [] + } + } + } +} diff --git a/plugins/tyres/src/dppvalidator_tyres/__init__.py b/plugins/tyres/src/dppvalidator_tyres/__init__.py new file mode 100644 index 0000000..a431c32 --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/__init__.py @@ -0,0 +1,46 @@ +"""dppvalidator-tyres — GDSO-aligned tyres pilot plugin. + +Phase 7 of [docs/plans/CIRPASS_2_MIGRATION.md] in the +``dppvalidator`` core. The plugin lives at ``plugins/tyres/`` of the +core repo and is published to PyPI as ``dppvalidator-tyres``. + +Status: **Pre-1.0 / Experimental.** The GDSO Birth v0.9 and +Recycling v0.1 declarations are still moving; rule IDs and message +wording may change before the 1.0 cut. 
+ +Public surface +============== + +- :mod:`dppvalidator_tyres.models` — Pydantic models for the four + GDSO declarations (Birth, Collection, Retread, Recycling) plus + the :class:`TyreLifecycleHistory` aggregate. +- :mod:`dppvalidator_tyres.validators` — eight ``TYR0NN`` rules + registered via entry-points. +- :mod:`dppvalidator_tyres.exporters` — CSV exporter that flattens + a Tyre Lifecycle History into one row per declaration. + +Cardinal rule +============= + +Per ``.claude/rules/plugin-licenses.md`` (rule §1, "No reverse +imports"): the dppvalidator core MUST NOT import from this package. +Phase 7 task 7.9 of the migration plan adds a CI gate +(``tools/check_imports.py``) that fails the build when the rule +is violated. +""" + +from __future__ import annotations + +__version__ = "0.1.0" +__experimental__ = True + +# Stable identifier the migration plan + CLI use to mark this +# plugin as Pre-1.0 in help text. Phase 9 of the migration plan +# uses it to gate the deprecation-warning ramp. +PLUGIN_STATUS = "Pre-1.0 / Experimental" + +__all__ = [ + "PLUGIN_STATUS", + "__experimental__", + "__version__", +] diff --git a/plugins/tyres/src/dppvalidator_tyres/exporters.py b/plugins/tyres/src/dppvalidator_tyres/exporters.py new file mode 100644 index 0000000..9a40883 --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/exporters.py @@ -0,0 +1,112 @@ +"""Tyres pilot exporters. + +Phase 7 task 7.7 of [docs/plans/CIRPASS_2_MIGRATION.md]. Registers +a CSV exporter via the ``dppvalidator.exporters`` entry-point group. + +The CSV exporter flattens a TyreLifecycleHistory into one row per +declaration, useful for fleet operators who want a tabular ledger +of their tyre population's lifecycle. +""" + +from __future__ import annotations + +import csv +import io +from typing import Any + + +class TyreLifecycleCSVExporter: + """Flatten a TyreLifecycleHistory into CSV rows. + + Each row is one declaration: the Birth + every subsequent + event becomes a separate line. 
Columns are: + + - ``tyreUuid`` + - ``eventType`` (birth | collection | retread | recycling) + - ``timestamp`` (ISO 8601) + - ``actorId`` + - ``actorName`` + - ``method`` (Recycling only) / ``process`` (Retread only) / empty + + Implements the minimal :class:`Exporter` protocol — + ``export(passport) -> str``. + """ + + name: str = "tyres-lifecycle-csv" + format_id: str = "tyres-lifecycle-csv" + + def export(self, passport: Any) -> str: + """Return a CSV string flattening the tyreLifecycleHistory extension. + + When the passport carries no tyreLifecycleHistory, returns + a CSV with just the header row. + """ + from dppvalidator_tyres.validators.rules import _extract_history + + history = _extract_history(passport) + buf = io.StringIO() + writer = csv.writer(buf) + writer.writerow( + [ + "tyreUuid", + "eventType", + "timestamp", + "actorId", + "actorName", + "method", + ] + ) + if history is None: + return buf.getvalue() + birth = history.get("birth") + if isinstance(birth, dict): + manuf = birth.get("manufacturer") or {} + writer.writerow( + [ + birth.get("tyreUuid", ""), + "birth", + birth.get("manufacturedAt", ""), + manuf.get("id", ""), + manuf.get("name", ""), + "", + ] + ) + events = history.get("events") + if not isinstance(events, list): + return buf.getvalue() + for event in events: + if not isinstance(event, dict): + continue + event_type = event.get("type") or "" + decl = event.get("declaration") or {} + if not isinstance(decl, dict): + continue + actor_key = { + "collection": "collector", + "retread": "retreader", + "recycling": "recycler", + }.get(event_type, "") + actor = decl.get(actor_key) or {} + ts_key = { + "collection": "collectedAt", + "retread": "retreadedAt", + "recycling": "recycledAt", + }.get(event_type, "") + writer.writerow( + [ + decl.get("tyreUuid", ""), + event_type, + decl.get(ts_key, ""), + actor.get("id", ""), + actor.get("name", ""), + decl.get("method", "") + if event_type == "recycling" + else decl.get("process", "") + if 
event_type == "retread" + else "", + ] + ) + return buf.getvalue() + + +__all__ = ["TyreLifecycleCSVExporter"] diff --git a/plugins/tyres/src/dppvalidator_tyres/models/__init__.py b/plugins/tyres/src/dppvalidator_tyres/models/__init__.py new file mode 100644 index 0000000..0f91bbe --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/models/__init__.py @@ -0,0 +1,45 @@ +"""Tyres pilot models — GDSO declarations. + +Phase 7 task 7.5 of [docs/plans/CIRPASS_2_MIGRATION.md]. Pydantic v2 +models for the four GDSO tyre-lifecycle declaration types plus the +:class:`TyreLifecycleHistory` aggregate. + +The four declarations correspond to the GDSO public specs: + +- :class:`Birth` (v0.9) — manufacturer's at-mould declaration. +- :class:`Collection` (v0.1) — collector's at-end-of-first-life + declaration. +- :class:`Retread` (v0.1) — retreader's renewal declaration. +- :class:`Recycling` (v0.1) — recycler's at-end-of-life declaration. + +Plus a wrapper: + +- :class:`TyreLifecycleHistory` — a single Birth + chronologically- + ordered list of subsequent events. + +Status: Pre-1.0 / Experimental — the upstream GDSO Birth v0.9 and +Recycling v0.1 declarations are still moving; field names may +shift before 1.0. 
+""" + +from __future__ import annotations + +from dppvalidator_tyres.models.actor import TyreActor +from dppvalidator_tyres.models.birth import Birth, DOTMarking, TyreSize +from dppvalidator_tyres.models.collection import Collection +from dppvalidator_tyres.models.history import TyreLifecycleEvent, TyreLifecycleHistory +from dppvalidator_tyres.models.recycling import Recycling, RecyclingMethod +from dppvalidator_tyres.models.retread import Retread + +__all__ = [ + "Birth", + "Collection", + "DOTMarking", + "Recycling", + "RecyclingMethod", + "Retread", + "TyreActor", + "TyreLifecycleEvent", + "TyreLifecycleHistory", + "TyreSize", +] diff --git a/plugins/tyres/src/dppvalidator_tyres/models/actor.py b/plugins/tyres/src/dppvalidator_tyres/models/actor.py new file mode 100644 index 0000000..c177fbf --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/models/actor.py @@ -0,0 +1,53 @@ +"""Shared TyreActor model — used by every GDSO declaration. + +Each declaration (Birth / Collection / Retread / Recycling) names +the actor that produced it. The actor is identified by a stable +URI plus a human-readable name and an optional GLEIF LEI (the +recommended GDSO identifier scheme). +""" + +from __future__ import annotations + +import re + +from pydantic import BaseModel, ConfigDict, Field, field_validator + +# GLEIF Legal Entity Identifier — 20 alphanumeric characters per +# ISO 17442. The GDSO actor declaration recommends LEI as the +# canonical operator identifier. +_LEI_RE = re.compile(r"^[0-9A-Z]{20}$") + + +class TyreActor(BaseModel): + """An actor in the tyre lifecycle (manufacturer, collector, retreader, recycler). 
+ + Wire shape: + ``{"id": "https://example.com/operator/X", + "name": "ACME Tyre Manufacturing", + "lei": "529900T8BM49AURSDO55"}`` + """ + + model_config = ConfigDict(populate_by_name=True, str_strip_whitespace=True) + + id: str = Field(..., description="URI / DID identifying the actor.") + name: str = Field(..., min_length=1, description="Human-readable actor name.") + lei: str | None = Field( + default=None, + description=( + "GLEIF Legal Entity Identifier (ISO 17442). 20 alphanumeric " + "characters; recommended by GDSO for production declarations." + ), + ) + + @field_validator("lei") + @classmethod + def _validate_lei(cls, value: str | None) -> str | None: + if value is None: + return value + if not _LEI_RE.match(value): + msg = ( + f"TyreActor.lei={value!r} is not a valid GLEIF LEI " + "(20 alphanumeric characters per ISO 17442)." + ) + raise ValueError(msg) + return value diff --git a/plugins/tyres/src/dppvalidator_tyres/models/birth.py b/plugins/tyres/src/dppvalidator_tyres/models/birth.py new file mode 100644 index 0000000..d3e4102 --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/models/birth.py @@ -0,0 +1,200 @@ +"""GDSO Birth v0.9 declaration — at-mould tyre birth event. + +The Birth declaration is the *canonical* identity declaration for a +manufactured tyre. Every subsequent declaration (Collection, +Retread, Recycling) references the Birth's ``tyre_uuid`` so the +lifecycle history can be reconstructed. + +Status: Pre-1.0 — GDSO Birth v0.9 is the current published spec but +field names may shift before 1.0. 
+ + Wire shape (minimal, ~paraphrased from GDSO public docs): + + { + "tyreUuid": "550e8400-e29b-41d4-a716-446655440000", + "manufacturer": { + "id": "https://example.com/operator/acme", + "name": "ACME Tyres", + "lei": "529900T8BM49AURSDO55" + }, + "manufacturedAt": "2026-04-15T10:30:00+00:00", + "dotMarking": { + "plantCode": "B7", + "sizeCode": "5K", + "weekOfYear": 16, + "year": 2026 + }, + "size": { + "sectionWidth": 205, + "aspectRatio": 55, + "rimDiameter": 16, + "loadIndex": 91, + "speedRating": "V" + } + } +""" + +from __future__ import annotations + +import re +from datetime import datetime +from uuid import UUID + +from pydantic import BaseModel, ConfigDict, Field, field_validator + +from dppvalidator_tyres.models.actor import TyreActor + +# DOT plant code: 1-2 alphanumeric chars (per US DOT 49 CFR 574). +_DOT_PLANT_RE = re.compile(r"^[0-9A-Z]{1,2}$") +# DOT size code: 1-4 alphanumeric chars per the manufacturer-specific +# table (49 CFR 574 leaves the encoding to each plant). +_DOT_SIZE_RE = re.compile(r"^[0-9A-Z]{1,4}$") +# ETRTO standard speed-rating letters. Q–Y per GB-T 2977; the very +# rare W/Y/(Y) sub-letters are accepted as bare letters. +_ETRTO_SPEED_RATINGS = frozenset( + ["L", "M", "N", "P", "Q", "R", "S", "T", "U", "H", "V", "W", "Y", "Z"] +) + + +class DOTMarking(BaseModel): + """DOT marking on the tyre sidewall (US DOT 49 CFR 574). + + Wire shape: + ``{"plantCode": "B7", "sizeCode": "5K", + "weekOfYear": 16, "year": 2026}`` + """ + + model_config = ConfigDict(populate_by_name=True, str_strip_whitespace=True) + + plant_code: str = Field( + ..., alias="plantCode", description="DOT plant code (1-2 alphanumeric chars)." + ) + size_code: str = Field(..., alias="sizeCode", description="Manufacturer-specific size code.") + week_of_year: int = Field( + ..., + alias="weekOfYear", + ge=1, + le=53, + description="ISO week number of manufacture (1-53).", + ) + year: int = Field( + ..., + ge=1971, # DOT marking system standardised in 1971.
+ le=2100, + description="Year of manufacture (4-digit).", + ) + + @field_validator("plant_code") + @classmethod + def _validate_plant_code(cls, value: str) -> str: + if not _DOT_PLANT_RE.match(value): + msg = f"DOT plant code {value!r} must be 1-2 alphanumeric uppercase chars." + raise ValueError(msg) + return value + + @field_validator("size_code") + @classmethod + def _validate_size_code(cls, value: str) -> str: + if not _DOT_SIZE_RE.match(value): + msg = f"DOT size code {value!r} must be 1-4 alphanumeric uppercase chars." + raise ValueError(msg) + return value + + +class TyreSize(BaseModel): + """ETRTO tyre size designation. + + Wire shape: + ``{"sectionWidth": 205, "aspectRatio": 55, "rimDiameter": 16, + "loadIndex": 91, "speedRating": "V"}`` + """ + + model_config = ConfigDict(populate_by_name=True, str_strip_whitespace=True) + + section_width: int = Field( + ..., + alias="sectionWidth", + ge=125, + le=445, + description="Section width in millimetres (ETRTO standard range 125-445).", + ) + aspect_ratio: int = Field( + ..., + alias="aspectRatio", + ge=20, + le=95, + description="Aspect ratio as a percentage (20-95).", + ) + rim_diameter: int = Field( + ..., + alias="rimDiameter", + ge=10, + le=24, + description="Rim diameter in inches (10-24 covers the passenger / light-truck range).", + ) + load_index: int = Field( + ..., + alias="loadIndex", + ge=0, + le=200, + description="ETRTO load index (0-200 covers the passenger / light-truck range).", + ) + speed_rating: str = Field( + ..., + alias="speedRating", + description="ETRTO speed-rating letter (L, M, N, P, Q, R, S, T, U, H, V, W, Y, Z).", + ) + + @field_validator("speed_rating") + @classmethod + def _validate_speed_rating(cls, value: str) -> str: + upper = value.upper() + if upper not in _ETRTO_SPEED_RATINGS: + msg = ( + f"Speed rating {value!r} is not in the ETRTO standard set " + f"({sorted(_ETRTO_SPEED_RATINGS)})." 
+ ) + raise ValueError(msg) + return upper + + +class Birth(BaseModel): + """GDSO Birth v0.9 — manufacturer's at-mould tyre declaration. + + Wire shape (minimal): + ``{"tyreUuid": "...", "manufacturer": {...}, + "manufacturedAt": "...", "dotMarking": {...}, + "size": {...}}`` + + Cardinality: + + - ``tyre_uuid``: required (1) — globally unique tyre identifier. + - ``manufacturer``: required (1) — the actor that produced the tyre. + - ``manufactured_at``: required (1) — ISO 8601 timestamp. + - ``dot_marking``: required (1) — DOT 49 CFR 574 marking. + - ``size``: required (1) — ETRTO size + load + speed rating. + """ + + model_config = ConfigDict(populate_by_name=True, str_strip_whitespace=True) + + tyre_uuid: UUID = Field( + ..., + alias="tyreUuid", + description="UUID v4 identifying the physical tyre across its lifecycle.", + ) + manufacturer: TyreActor = Field(..., description="The manufacturer (a :class:`TyreActor`).") + manufactured_at: datetime = Field( + ..., + alias="manufacturedAt", + description="ISO 8601 timestamp of manufacture (timezone-aware required).", + ) + dot_marking: DOTMarking = Field(..., alias="dotMarking", description="DOT 49 CFR 574 marking.") + size: TyreSize = Field(..., description="ETRTO size + load + speed rating.") + + @field_validator("manufactured_at") + @classmethod + def _require_tz(cls, value: datetime) -> datetime: + if value.tzinfo is None: + msg = "Birth.manufacturedAt must be timezone-aware (typically UTC)." + raise ValueError(msg) + return value diff --git a/plugins/tyres/src/dppvalidator_tyres/models/collection.py b/plugins/tyres/src/dppvalidator_tyres/models/collection.py new file mode 100644 index 0000000..bc41f07 --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/models/collection.py @@ -0,0 +1,78 @@ +"""GDSO Collection v0.1 — at-end-of-first-life collection declaration. + +The Collection declaration records the moment a tyre is removed +from a vehicle and enters the recovered-materials stream.
The +collector identifies itself, the upstream Birth (by ``tyre_uuid``), +and the disposition routing (e.g. ``retread``, ``recycle``, +``landfill``). + +Status: Pre-1.0 — GDSO Collection v0.1 is in draft; field names +may shift before 1.0. +""" + +from __future__ import annotations + +from datetime import datetime +from enum import Enum +from uuid import UUID + +from pydantic import BaseModel, ConfigDict, Field, field_validator + +from dppvalidator_tyres.models.actor import TyreActor + + +class CollectionDisposition(str, Enum): + """Where the tyre is routed after collection.""" + + RETREAD = "retread" + RECYCLE = "recycle" + REUSE = "reuse" + ENERGY_RECOVERY = "energy_recovery" + LANDFILL = "landfill" + UNKNOWN = "unknown" + + +class Collection(BaseModel): + """GDSO Collection v0.1 — collector's at-end-of-first-life declaration. + + Wire shape: + ``{"tyreUuid": "...", "collector": {...}, + "collectedAt": "...", "disposition": "retread", + "facilityId": "https://example.com/facility/X"}`` + """ + + model_config = ConfigDict( + populate_by_name=True, str_strip_whitespace=True, use_enum_values=True + ) + + tyre_uuid: UUID = Field( + ..., + alias="tyreUuid", + description="UUID of the tyre being collected (matches Birth.tyreUuid).", + ) + collector: TyreActor = Field(..., description="The actor collecting the tyre.") + collected_at: datetime = Field( + ..., + alias="collectedAt", + description="ISO 8601 timestamp of collection (timezone-aware required).", + ) + disposition: CollectionDisposition = Field( + ..., + description=( + "Where the tyre is routed: retread / recycle / reuse / " + "energy_recovery / landfill / unknown." 
+ ), + ) + facility_id: str | None = Field( + default=None, + alias="facilityId", + description="Optional URI / DID of the collection facility.", + ) + + @field_validator("collected_at") + @classmethod + def _require_tz(cls, value: datetime) -> datetime: + if value.tzinfo is None: + msg = "Collection.collectedAt must be timezone-aware (typically UTC)." + raise ValueError(msg) + return value diff --git a/plugins/tyres/src/dppvalidator_tyres/models/history.py b/plugins/tyres/src/dppvalidator_tyres/models/history.py new file mode 100644 index 0000000..9b305ec --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/models/history.py @@ -0,0 +1,150 @@ +"""TyreLifecycleHistory v1 — aggregate of one tyre's full lifecycle. + +The history wraps one :class:`Birth` declaration plus a chronologically- +ordered list of subsequent events (Collection, Retread, Recycling). +Every event must reference the Birth's ``tyre_uuid``; the +:meth:`TyreLifecycleHistory._chain_consistent` validator enforces this. + +Status: Pre-1.0 — depends on Birth v0.9 / Recycling v0.1 which are +themselves moving.
+""" + +from __future__ import annotations + +from datetime import datetime +from typing import Annotated, Any, Literal + +from pydantic import BaseModel, ConfigDict, Discriminator, Field, Tag, model_validator + +from dppvalidator_tyres.models.birth import Birth +from dppvalidator_tyres.models.collection import Collection +from dppvalidator_tyres.models.recycling import Recycling +from dppvalidator_tyres.models.retread import Retread + + +class _CollectionEvent(BaseModel): + """Discriminated wrapper for a Collection event.""" + + model_config = ConfigDict(populate_by_name=True) + type: Literal["collection"] = "collection" + declaration: Collection + + +class _RetreadEvent(BaseModel): + """Discriminated wrapper for a Retread event.""" + + model_config = ConfigDict(populate_by_name=True) + type: Literal["retread"] = "retread" + declaration: Retread + + +class _RecyclingEvent(BaseModel): + """Discriminated wrapper for a Recycling event.""" + + model_config = ConfigDict(populate_by_name=True) + type: Literal["recycling"] = "recycling" + declaration: Recycling + + +def _event_discriminator(value: Any) -> str | None: + """Pick the event subclass based on the ``type`` field. + + Pydantic v2 calls this with either a parsed model instance or a + raw dict (depending on whether the input had already been + pre-validated). We accept :class:`Any` because the discriminator + runs ahead of model validation and the narrowing behaviour + differs between input shapes. + """ + if isinstance(value, dict): + type_value = value.get("type") + if isinstance(type_value, str): + return type_value + return None + return getattr(value, "type", None) + + +TyreLifecycleEvent = Annotated[ + Annotated[_CollectionEvent, Tag("collection")] + | Annotated[_RetreadEvent, Tag("retread")] + | Annotated[_RecyclingEvent, Tag("recycling")], + Discriminator(_event_discriminator), +] + + +class TyreLifecycleHistory(BaseModel): + """Aggregate of one tyre's lifecycle (Birth + chronological events). 
+ + Wire shape: + ``{"birth": {...}, + "events": [{"type": "collection", "declaration": {...}}, ...]}`` + + Cardinality: + + - ``birth``: required (1). + - ``events``: optional list, may be empty. + + Cross-event invariants: + + - Every event's ``tyre_uuid`` matches the birth's. + - Events are chronologically ordered (each event's timestamp is + ≥ the previous one). + - At most one Recycling event (the lifecycle's terminator). + - No events after a Recycling event. + """ + + model_config = ConfigDict(populate_by_name=True) + + birth: Birth = Field(..., description="The tyre's at-mould Birth declaration.") + events: list[TyreLifecycleEvent] = Field( + default_factory=list, + description="Chronologically-ordered subsequent events.", + ) + + @model_validator(mode="after") + def _chain_consistent(self) -> TyreLifecycleHistory: + birth_uuid = self.birth.tyre_uuid + last_ts: datetime | None = self.birth.manufactured_at + recycling_seen = False + for i, event in enumerate(self.events): + decl = event.declaration + if decl.tyre_uuid != birth_uuid: + msg = ( + f"TyreLifecycleHistory.events[{i}] references tyre_uuid=" + f"{decl.tyre_uuid!r} which does not match the Birth " + f"({birth_uuid!r})." + ) + raise ValueError(msg) + event_ts = _event_timestamp(decl) + if last_ts is not None and event_ts < last_ts: + msg = ( + f"TyreLifecycleHistory.events[{i}] timestamp " + f"{event_ts.isoformat()} predates the previous event / " + f"Birth ({last_ts.isoformat()}). Events must be in " + "chronological order." + ) + raise ValueError(msg) + last_ts = event_ts + if event.type == "recycling": + if recycling_seen: + msg = ( + "TyreLifecycleHistory: only one Recycling event " + "permitted (the lifecycle terminator)." + ) + raise ValueError(msg) + recycling_seen = True + elif recycling_seen: + msg = ( + f"TyreLifecycleHistory.events[{i}] follows a Recycling " + "event; nothing may happen after end-of-life." 
+ ) + raise ValueError(msg) + return self + + +def _event_timestamp(decl: Collection | Retread | Recycling) -> datetime: + """Return the canonical timestamp for an event declaration.""" + if isinstance(decl, Collection): + return decl.collected_at + if isinstance(decl, Retread): + return decl.retreaded_at + return decl.recycled_at diff --git a/plugins/tyres/src/dppvalidator_tyres/models/recycling.py b/plugins/tyres/src/dppvalidator_tyres/models/recycling.py new file mode 100644 index 0000000..755b221 --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/models/recycling.py @@ -0,0 +1,126 @@ +"""GDSO Recycling v0.1 — at-end-of-life recycling declaration. + +The Recycling declaration is the *terminal* event in a tyre's +lifecycle: the tyre is shredded, devulcanised, pyrolysed, or +otherwise broken down into recovered materials. The declaration +identifies the recycler, the method, and the recovered-mass +fraction routed to each downstream stream. + +Status: Pre-1.0 — GDSO Recycling v0.1 is in draft. +""" + +from __future__ import annotations + +from datetime import datetime +from decimal import Decimal +from enum import Enum +from uuid import UUID + +from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator + +from dppvalidator_tyres.models.actor import TyreActor + + +class RecyclingMethod(str, Enum): + """Recycling method per the GDSO Recycling v0.1 enumeration. + + The four canonical industry methods plus ``other`` for emerging + technologies (e.g. cryogenic shred, microwave devulcanisation): + + - ``mechanical`` — shred → granulate → crumb rubber. + - ``pyrolysis`` — thermal decomposition into oil, gas, char, steel. + - ``devulcanisation`` — chemical / thermomechanical de-cross-linking. + - ``energy_recovery`` — combustion in a cement kiln or power plant. 
+ """ + + MECHANICAL = "mechanical" + PYROLYSIS = "pyrolysis" + DEVULCANISATION = "devulcanisation" + ENERGY_RECOVERY = "energy_recovery" + OTHER = "other" + + +class RecoveryFraction(BaseModel): + """Recovery fraction for one downstream material stream. + + Wire shape: + ``{"materialName": "Steel wire", "massFractionRecovered": 0.12}`` + """ + + model_config = ConfigDict(populate_by_name=True, str_strip_whitespace=True) + + material_name: str = Field( + ..., + alias="materialName", + min_length=1, + description="Name of the recovered material stream.", + ) + mass_fraction_recovered: Decimal = Field( + ..., + alias="massFractionRecovered", + ge=Decimal("0"), + le=Decimal("1"), + description="Fraction (0..1) of the input tyre's mass routed to this stream.", + ) + + +class Recycling(BaseModel): + """GDSO Recycling v0.1 — recycler's at-end-of-life declaration. + + Wire shape: + ``{"tyreUuid": "...", "recycler": {...}, + "recycledAt": "...", "method": "mechanical", + "recoveryFractions": [...]}`` + """ + + model_config = ConfigDict( + populate_by_name=True, str_strip_whitespace=True, use_enum_values=True + ) + + tyre_uuid: UUID = Field( + ..., + alias="tyreUuid", + description="UUID of the tyre being recycled.", + ) + recycler: TyreActor = Field(..., description="The actor performing the recycling.") + recycled_at: datetime = Field( + ..., + alias="recycledAt", + description="ISO 8601 timestamp of the recycling event (timezone-aware required).", + ) + method: RecyclingMethod = Field( + ..., + description="Recycling method (mechanical / pyrolysis / devulcanisation / energy_recovery / other).", + ) + recovery_fractions: list[RecoveryFraction] | None = Field( + default=None, + alias="recoveryFractions", + description=( + "Optional per-stream recovery fractions. When present, the sum " + "of mass_fraction_recovered values must be ≤ 1.0." 
+ ), + ) + + @field_validator("recycled_at") + @classmethod + def _require_tz(cls, value: datetime) -> datetime: + if value.tzinfo is None: + msg = "Recycling.recycledAt must be timezone-aware (typically UTC)." + raise ValueError(msg) + return value + + @model_validator(mode="after") + def _recovery_fractions_sum_within_one(self) -> Recycling: + if not self.recovery_fractions: + return self + total = sum( + (f.mass_fraction_recovered for f in self.recovery_fractions), + start=Decimal("0"), + ) + if total > Decimal("1.0001"): + msg = ( + f"Recycling.recoveryFractions sum {total} exceeds 1.0 — " + "recovered mass cannot exceed input mass." + ) + raise ValueError(msg) + return self diff --git a/plugins/tyres/src/dppvalidator_tyres/models/retread.py b/plugins/tyres/src/dppvalidator_tyres/models/retread.py new file mode 100644 index 0000000..eaa9868 --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/models/retread.py @@ -0,0 +1,88 @@ +"""GDSO Retread v0.1 — tyre-renewal declaration. + +The Retread declaration records the renewal of a previously-used +tyre by a retreader. The declaration references the upstream Birth +(by ``tyre_uuid``) and may chain through prior Retread events for +multi-cycle retreaded casings. + +Status: Pre-1.0 — GDSO Retread v0.1 is in draft. +""" + +from __future__ import annotations + +from datetime import datetime +from enum import Enum +from uuid import UUID + +from pydantic import BaseModel, ConfigDict, Field, field_validator + +from dppvalidator_tyres.models.actor import TyreActor + + +class RetreadProcess(str, Enum): + """Retread process per ETRTO Recommendations. + + The two industry-standard processes are: + + - ``hot_cure`` (mould cure): old casing + uncured rubber tread, + vulcanised in a heated mould. + - ``cold_cure`` (pre-cured tread): old casing + already-vulcanised + tread strip, bonded with a thin uncured layer in an autoclave. 
+ """ + + HOT_CURE = "hot_cure" + COLD_CURE = "cold_cure" + OTHER = "other" + + +class Retread(BaseModel): + """GDSO Retread v0.1 — retreader's renewal declaration. + + Wire shape: + ``{"tyreUuid": "...", "retreader": {...}, + "retreadedAt": "...", "process": "cold_cure", + "previousBirthUuid": "..."}`` + """ + + model_config = ConfigDict( + populate_by_name=True, str_strip_whitespace=True, use_enum_values=True + ) + + tyre_uuid: UUID = Field( + ..., + alias="tyreUuid", + description="UUID of the renewed tyre. Stays stable across retread cycles.", + ) + retreader: TyreActor = Field(..., description="The actor performing the retread.") + retreaded_at: datetime = Field( + ..., + alias="retreadedAt", + description="ISO 8601 timestamp of the retread event (timezone-aware required).", + ) + process: RetreadProcess = Field( + ..., + description="Retread process — hot_cure / cold_cure / other.", + ) + previous_birth_uuid: UUID = Field( + ..., + alias="previousBirthUuid", + description=( + "UUID of the upstream Birth declaration. Required for chain " + "reconstruction; the validator's TYR006 rule pins this." + ), + ) + cycle_number: int | None = Field( + default=None, + alias="cycleNumber", + ge=1, + le=5, + description="Optional retread cycle (1 = first retread).", + ) + + @field_validator("retreaded_at") + @classmethod + def _require_tz(cls, value: datetime) -> datetime: + if value.tzinfo is None: + msg = "Retread.retreadedAt must be timezone-aware (typically UTC)." + raise ValueError(msg) + return value diff --git a/plugins/tyres/src/dppvalidator_tyres/validators/__init__.py b/plugins/tyres/src/dppvalidator_tyres/validators/__init__.py new file mode 100644 index 0000000..f82122f --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/validators/__init__.py @@ -0,0 +1,69 @@ +"""Tyres pilot validators — TYR-coded rules. + +Phase 7 task 7.6 of [docs/plans/CIRPASS_2_MIGRATION.md] in the +``dppvalidator`` core. 
Each rule implements the ``SemanticRule`` +protocol (``rule_id``, ``description``, ``severity``, +``check(passport)``) and is auto-registered via the entry-points +declared in ``plugins/tyres/pyproject.toml``. + +The plugin's wire-shape contract: the tyres lifecycle data lives +under ``credentialSubject.extensions.tyreLifecycleHistory`` (a +:class:`TyreLifecycleHistory` JSON serialisation). Rules walk this +extension; passports without it are silently skipped (not every +DPP is a tyre DPP). + +Rule codes (TYR001…TYR008): + +- ``TYR001`` (error): DOT marking present and well-formed. +- ``TYR002`` (error): Birth declaration carries ``tyreUuid`` and the + manufacturer actor (id + name). +- ``TYR003`` (warning): Load index in the standard ETRTO range + (60-130 covers passenger / light-truck). +- ``TYR004`` (warning): Speed rating is a recognised letter code. +- ``TYR005`` (warning): Section width / aspect ratio / rim + diameter look sane (passenger / light-truck range). +- ``TYR006`` (error): Retread declarations name the upstream + Birth UUID via ``previousBirthUuid``. +- ``TYR007`` (warning): Collection declarations identify the + collecting actor. +- ``TYR008`` (warning): Recycling declarations name the recycling + method (mechanical / pyrolysis / devulcanisation / energy_recovery). +""" + +from __future__ import annotations + +from dppvalidator_tyres.validators.rules import ( + BirthChainCompletenessRule, + CollectionActorRule, + DOTMarkingRule, + LoadIndexRule, + RecyclingMethodRule, + RetreadProvenanceRule, + SectionWidthRule, + SpeedRatingRule, +) + +# Public dispatch — useful for callers that want to construct a +# SemanticValidator with a hand-curated rule list.
+TYR_RULES = ( + DOTMarkingRule(), + BirthChainCompletenessRule(), + LoadIndexRule(), + SpeedRatingRule(), + SectionWidthRule(), + RetreadProvenanceRule(), + CollectionActorRule(), + RecyclingMethodRule(), +) + +__all__ = [ + "BirthChainCompletenessRule", + "CollectionActorRule", + "DOTMarkingRule", + "LoadIndexRule", + "RecyclingMethodRule", + "RetreadProvenanceRule", + "SectionWidthRule", + "SpeedRatingRule", + "TYR_RULES", +] diff --git a/plugins/tyres/src/dppvalidator_tyres/validators/rules.py b/plugins/tyres/src/dppvalidator_tyres/validators/rules.py new file mode 100644 index 0000000..bb78108 --- /dev/null +++ b/plugins/tyres/src/dppvalidator_tyres/validators/rules.py @@ -0,0 +1,423 @@ +"""TYR-coded rule implementations for the tyres pilot. + +Each rule walks the UNTP DPP envelope, looking for the +``credentialSubject.extensions.tyreLifecycleHistory`` extension +slot. Passports without that extension are silently skipped (not +every DPP is a tyre DPP — the rule should be a no-op for textile +or generic DPPs). + +The extension is read as raw dict / model attribute data — no +pyyaml-style schema enforcement. Type errors caused by malformed +extensions are caught defensively; the model layer is responsible +for catching shape problems before these rules run. +""" + +from __future__ import annotations + +from typing import Any, Literal + +# ETRTO load-index range. The full table runs 0-279, but realistic +# passenger / light-truck tyres land in the 60-130 band; values +# outside this range are *technically valid* but warrant a check. 
+_ETRTO_LOAD_INDEX_PASSENGER = range(60, 131) +_ETRTO_SPEED_RATINGS = frozenset( + ["L", "M", "N", "P", "Q", "R", "S", "T", "U", "H", "V", "W", "Y", "Z"] +) +_RECYCLING_METHODS = frozenset( + ["mechanical", "pyrolysis", "devulcanisation", "energy_recovery", "other"] +) + + +# ============================================================================= +# Helpers +# ============================================================================= + + +def _extract_history(passport: Any) -> dict[str, Any] | None: + """Pull the tyreLifecycleHistory dict from the passport's extension slot. + + Returns ``None`` when the extension isn't present (the passport + isn't a tyre DPP). Tolerant of both dict-shaped extensions + (Pydantic ``extra="allow"`` payload) and structured Pydantic + sub-models. Returns the raw dict so the rule logic stays + decoupled from the plugin's own model classes. + """ + cs = getattr(passport, "credential_subject", None) + if cs is None and isinstance(passport, dict): + cs = passport.get("credentialSubject") or passport.get("credential_subject") + if cs is None: + return None + + extensions = getattr(cs, "extensions", None) + if extensions is None and isinstance(cs, dict): + extensions = cs.get("extensions") + if extensions is None: + return None + + history = ( + extensions.get("tyreLifecycleHistory") + if isinstance(extensions, dict) + else getattr(extensions, "tyreLifecycleHistory", None) + ) + if isinstance(history, dict): + return history + if hasattr(history, "model_dump"): + return history.model_dump(by_alias=True, mode="json") + return None + + +def _birth(history: dict[str, Any]) -> dict[str, Any] | None: + """Return the Birth sub-dict from a history payload.""" + birth = history.get("birth") + if isinstance(birth, dict): + return birth + return None + + +def _events(history: dict[str, Any]) -> list[dict[str, Any]]: + """Return the events list (may be empty).""" + events = history.get("events") + if isinstance(events, list): + return [e for e 
in events if isinstance(e, dict)] + return [] + + +# ============================================================================= +# TYR001 — DOT marking +# ============================================================================= + + +class DOTMarkingRule: + """TYR001: Birth declaration carries a well-formed DOT marking.""" + + rule_id: str = "TYR001" + description: str = "Tyre Birth declaration must carry a DOT marking (US DOT 49 CFR 574)" + severity: Literal["error", "warning", "info"] = "error" + suggestion: str = ( + "Add a ``dotMarking`` object to the Birth declaration with " + "plantCode, sizeCode, weekOfYear (1-53), and year (4-digit)." + ) + docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR001" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + birth = _birth(history) + if birth is None: + return [] + path = "$.credentialSubject.extensions.tyreLifecycleHistory.birth.dotMarking" + dot = birth.get("dotMarking") + if not isinstance(dot, dict): + return [(path, "Tyre Birth declaration is missing dotMarking.")] + violations: list[tuple[str, str]] = [] + for required in ("plantCode", "sizeCode", "weekOfYear", "year"): + if dot.get(required) in (None, ""): + violations.append( + ( + f"{path}.{required}", + f"DOT marking is missing required field {required!r}.", + ) + ) + return violations + + +# ============================================================================= +# TYR002 — Birth chain completeness +# ============================================================================= + + +class BirthChainCompletenessRule: + """TYR002: Birth declaration carries the manufacturer actor + tyre UUID.""" + + rule_id: str = "TYR002" + description: str = "Tyre Birth declaration must carry tyreUuid and manufacturer actor" + severity: Literal["error", "warning", "info"] = "error" + suggestion: str = ( + "Populate ``tyreUuid`` (UUID v4) and ``manufacturer`` " + 
"(an actor object with id + name) on the Birth declaration." + ) + docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR002" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + birth = _birth(history) + if birth is None: + return [ + ( + "$.credentialSubject.extensions.tyreLifecycleHistory.birth", + "TyreLifecycleHistory has no Birth declaration.", + ) + ] + violations: list[tuple[str, str]] = [] + path = "$.credentialSubject.extensions.tyreLifecycleHistory.birth" + if not birth.get("tyreUuid"): + violations.append((f"{path}.tyreUuid", "Birth declaration is missing tyreUuid.")) + manuf = birth.get("manufacturer") + if not isinstance(manuf, dict): + violations.append( + ( + f"{path}.manufacturer", + "Birth declaration is missing the manufacturer actor.", + ) + ) + else: + if not manuf.get("id"): + violations.append( + (f"{path}.manufacturer.id", "Manufacturer actor is missing id (URI / DID).") + ) + if not manuf.get("name"): + violations.append( + (f"{path}.manufacturer.name", "Manufacturer actor is missing name.") + ) + return violations + + +# ============================================================================= +# TYR003 — Load index +# ============================================================================= + + +class LoadIndexRule: + """TYR003: Load index lies in the ETRTO passenger/light-truck band (60-130).""" + + rule_id: str = "TYR003" + description: str = "Load index should be in the ETRTO passenger / light-truck range (60-130)" + severity: Literal["error", "warning", "info"] = "warning" + suggestion: str = ( + "Verify the load index value against the ETRTO standard " + "manual; values outside 60-130 are unusual for passenger / " + "light-truck applications." 
+ ) + docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR003" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + birth = _birth(history) + if birth is None: + return [] + size = birth.get("size") + if not isinstance(size, dict): + return [] + load_index = size.get("loadIndex") + if not isinstance(load_index, int): + return [] + path = "$.credentialSubject.extensions.tyreLifecycleHistory.birth.size.loadIndex" + if load_index not in _ETRTO_LOAD_INDEX_PASSENGER: + return [ + ( + path, + f"Load index {load_index} is outside the ETRTO passenger / " + f"light-truck range (60-130).", + ) + ] + return [] + + +# ============================================================================= +# TYR004 — Speed rating +# ============================================================================= + + +class SpeedRatingRule: + """TYR004: Speed rating is a recognised ETRTO letter code.""" + + rule_id: str = "TYR004" + description: str = "Speed rating should be a recognised ETRTO letter code" + severity: Literal["error", "warning", "info"] = "warning" + suggestion: str = "Use one of L, M, N, P, Q, R, S, T, U, H, V, W, Y, Z." 
+ docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR004" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + birth = _birth(history) + if birth is None: + return [] + size = birth.get("size") + if not isinstance(size, dict): + return [] + speed = size.get("speedRating") + if not isinstance(speed, str): + return [] + path = "$.credentialSubject.extensions.tyreLifecycleHistory.birth.size.speedRating" + if speed.upper() not in _ETRTO_SPEED_RATINGS: + return [ + ( + path, + f"Speed rating {speed!r} is not in the ETRTO standard set " + f"({sorted(_ETRTO_SPEED_RATINGS)}).", + ) + ] + return [] + + +# ============================================================================= +# TYR005 — Section width / aspect ratio / rim diameter sanity +# ============================================================================= + + +class SectionWidthRule: + """TYR005: Tyre size dimensions look sane for passenger / light-truck.""" + + rule_id: str = "TYR005" + description: str = "Tyre dimensions should be in the passenger / light-truck range" + severity: Literal["error", "warning", "info"] = "warning" + suggestion: str = ( + "ETRTO ranges: section width 125-445 mm, aspect ratio 20-95, rim diameter 10-24 inches." 
+ ) + docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR005" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + birth = _birth(history) + if birth is None: + return [] + size = birth.get("size") + if not isinstance(size, dict): + return [] + violations: list[tuple[str, str]] = [] + path = "$.credentialSubject.extensions.tyreLifecycleHistory.birth.size" + sw = size.get("sectionWidth") + if isinstance(sw, int) and not (125 <= sw <= 445): + violations.append( + (f"{path}.sectionWidth", f"Section width {sw} mm is outside ETRTO 125-445.") + ) + ar = size.get("aspectRatio") + if isinstance(ar, int) and not (20 <= ar <= 95): + violations.append((f"{path}.aspectRatio", f"Aspect ratio {ar} is outside ETRTO 20-95.")) + rd = size.get("rimDiameter") + if isinstance(rd, int) and not (10 <= rd <= 24): + violations.append( + (f"{path}.rimDiameter", f"Rim diameter {rd} in is outside the 10-24 range.") + ) + return violations + + +# ============================================================================= +# TYR006 — Retread provenance +# ============================================================================= + + +class RetreadProvenanceRule: + """TYR006: Retread events name the upstream Birth via previousBirthUuid.""" + + rule_id: str = "TYR006" + description: str = "Retread declaration must reference the upstream Birth (previousBirthUuid)" + severity: Literal["error", "warning", "info"] = "error" + suggestion: str = ( + "Populate ``previousBirthUuid`` on every Retread event so the " + "casing chain can be reconstructed." 
+ ) + docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR006" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + violations: list[tuple[str, str]] = [] + for i, event in enumerate(_events(history)): + if event.get("type") != "retread": + continue + decl = event.get("declaration") + if not isinstance(decl, dict): + continue + if not decl.get("previousBirthUuid"): + violations.append( + ( + f"$.credentialSubject.extensions.tyreLifecycleHistory.events[{i}].declaration.previousBirthUuid", + "Retread declaration is missing previousBirthUuid.", + ) + ) + return violations + + +# ============================================================================= +# TYR007 — Collection actor +# ============================================================================= + + +class CollectionActorRule: + """TYR007: Collection events identify the collecting actor.""" + + rule_id: str = "TYR007" + description: str = "Collection declaration must identify the collecting actor" + severity: Literal["error", "warning", "info"] = "warning" + suggestion: str = "Populate ``collector.id`` and ``collector.name`` on every Collection event." 
+ docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR007" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + violations: list[tuple[str, str]] = [] + for i, event in enumerate(_events(history)): + if event.get("type") != "collection": + continue + decl = event.get("declaration") + if not isinstance(decl, dict): + continue + collector = decl.get("collector") + base = f"$.credentialSubject.extensions.tyreLifecycleHistory.events[{i}].declaration.collector" + if not isinstance(collector, dict): + violations.append((base, "Collection declaration is missing the collector actor.")) + continue + if not collector.get("id"): + violations.append((f"{base}.id", "Collection.collector is missing id.")) + if not collector.get("name"): + violations.append((f"{base}.name", "Collection.collector is missing name.")) + return violations + + +# ============================================================================= +# TYR008 — Recycling method +# ============================================================================= + + +class RecyclingMethodRule: + """TYR008: Recycling events declare the recycling method.""" + + rule_id: str = "TYR008" + description: str = ( + "Recycling declaration must name the method (mechanical / " + "pyrolysis / devulcanisation / energy_recovery / other)" + ) + severity: Literal["error", "warning", "info"] = "warning" + suggestion: str = ( + "Populate ``method`` on every Recycling event with a value " + "from the GDSO Recycling v0.1 enumeration." 
+ ) + docs_url: str = "https://artiso-ai.github.io/dppvalidator/errors/TYR008" + + def check(self, passport: Any) -> list[tuple[str, str]]: + history = _extract_history(passport) + if history is None: + return [] + violations: list[tuple[str, str]] = [] + for i, event in enumerate(_events(history)): + if event.get("type") != "recycling": + continue + decl = event.get("declaration") + if not isinstance(decl, dict): + continue + method = decl.get("method") + base = f"$.credentialSubject.extensions.tyreLifecycleHistory.events[{i}].declaration.method" + if not isinstance(method, str) or not method: + violations.append((base, "Recycling declaration is missing method.")) + continue + if method not in _RECYCLING_METHODS: + violations.append( + ( + base, + f"Recycling method {method!r} is not in the GDSO v0.1 " + f"closed set ({sorted(_RECYCLING_METHODS)}).", + ) + ) + return violations diff --git a/scripts/smoke_test.py b/scripts/smoke_test.py index cd2390d..91d0a39 100644 --- a/scripts/smoke_test.py +++ b/scripts/smoke_test.py @@ -407,13 +407,14 @@ def section_migrate(s: Smoke) -> None: json.dumps(warnings_doc)[:200], ) - # Refusal path: no --accept-warnings → exit 1, sidecar STILL written, - # main output NOT written. + # Refusal path: no --accept-warnings → EXIT_BLOCKING_WARNINGS (4) + # per docs/reference/cli/exit-codes.md (Phase 6 task 6.7); + # sidecar STILL written, main output NOT written. out2 = tmp / "blocked.json" r = s.cli(["migrate", str(v06), "-o", str(out2)]) s.assert_( - "migrate without --accept-warnings refuses (exit 1)", - r.returncode == 1, + "migrate without --accept-warnings refuses (exit 4)", + r.returncode == 4, r.stderr or r.stdout, ) s.assert_( diff --git a/src/dppvalidator/cli/_io.py b/src/dppvalidator/cli/_io.py new file mode 100644 index 0000000..30293ec --- /dev/null +++ b/src/dppvalidator/cli/_io.py @@ -0,0 +1,67 @@ +"""Shared CLI I/O helpers. + +Phase 8.6 polish (2026-05-09). 
Centralises input loading across the +``validate``, ``export``, and ``migrate`` commands — three near- +identical ``_load_input`` helpers with subtly different error +phrasings and (for ``export``) missing stdin support. + +Module-private — the underscore prefix marks it as internal to the +``cli`` package; subcommand modules import :func:`load_input` +directly. +""" + +from __future__ import annotations + +import json +import sys +from pathlib import Path +from typing import TYPE_CHECKING, Any + +from dppvalidator.logging import get_logger + +if TYPE_CHECKING: + from dppvalidator.cli.console import Console + +logger = get_logger(__name__) + + +def load_input(input_path: str, console: Console) -> dict[str, Any] | None: + """Load a JSON DPP payload from a file path or stdin. + + Args: + input_path: File path or the literal ``"-"`` to read stdin. + console: CLI console for human-readable error reporting. + + Returns: + The parsed JSON dict on success; ``None`` on any failure + (file not found, JSON parse error, encoding error, etc.). + Callers translate ``None`` into the ``EXIT_IO_ERROR`` (5) + exit code per the Phase 6 §6.7 contract. + """ + try: + if input_path == "-": + # Ensure UTF-8 encoding for stdin on every platform that + # supports the reconfigure() hook (Python 3.7+ on most + # OSes; PyPy / older 3.x may not). 
+ if hasattr(sys.stdin, "reconfigure"): + sys.stdin.reconfigure(encoding="utf-8") # type: ignore[union-attr] + content = sys.stdin.read() + else: + path = Path(input_path) + if not path.exists(): + logger.error("File not found: %s", input_path) + console.print_error(f"File not found: {input_path}") + return None + content = path.read_text(encoding="utf-8") + return json.loads(content) + except json.JSONDecodeError as exc: + logger.error("Invalid JSON: %s", exc) + console.print_error(f"Invalid JSON: {exc}") + return None + except Exception as exc: # noqa: BLE001 — boundary catch for CLI ergonomics + logger.exception("Unexpected error loading input") + console.print_error(str(exc)) + return None + + +__all__ = ["load_input"] diff --git a/src/dppvalidator/cli/commands/export.py b/src/dppvalidator/cli/commands/export.py index 04cbc54..eef0a44 100644 --- a/src/dppvalidator/cli/commands/export.py +++ b/src/dppvalidator/cli/commands/export.py @@ -3,10 +3,11 @@ from __future__ import annotations import argparse -import json +import sys from pathlib import Path from typing import TYPE_CHECKING, Any +from dppvalidator.cli._io import load_input as _load_input from dppvalidator.logging import get_logger from dppvalidator.schemas.registry import DEFAULT_SCHEMA_VERSION @@ -17,6 +18,9 @@ EXIT_VALID = 0 EXIT_ERROR = 2 +# Phase 6 task 6.7 — additional exit codes; see +# docs/reference/cli/exit-codes.md for the full table. +EXIT_IO_ERROR = 5 def add_parser(subparsers: Any) -> argparse.ArgumentParser: @@ -38,9 +42,25 @@ def add_parser(subparsers: Any) -> argparse.ArgumentParser: parser.add_argument( "-f", "--format", - choices=["jsonld", "json"], + choices=["jsonld", "json", "eudpp-jsonld", "cirpass-jsonld"], default="jsonld", - help="Output format (default: jsonld)", + help=( + "Output format. ``json`` and ``jsonld`` are pre-Phase-6 " + "and emit the UNTP shape. ``eudpp-jsonld`` re-keys the " + "UNTP payload onto canonical EUDPP v1.9.1 IRIs. 
" + "``cirpass-jsonld`` projects onto the CIRPASS reference-" + "structure v1.3.0 shape (Phase 5 forward shim runs " + "automatically when the input is a UNTP envelope)." + ), + ) + parser.add_argument( + "--default-language", + default="en", + help=( + "BCP-47 tag used by ``--format=cirpass-jsonld`` when " + "wrapping UNTP scalar names as CIRPASS LocalisedText " + "entries. Default: en." + ), ) parser.add_argument( "--schema-version", @@ -61,12 +81,17 @@ def add_parser(subparsers: Any) -> argparse.ArgumentParser: def run(args: argparse.Namespace, console: Console) -> int: """Execute export command.""" - from dppvalidator.exporters import JSONExporter, JSONLDExporter + from dppvalidator.exporters import ( + CIRPASSJsonLDExporter, + EUDPPJsonLDExporter, + JSONExporter, + JSONLDExporter, + ) from dppvalidator.validators import ValidationEngine data = _load_input(args.input, console) if data is None: - return EXIT_ERROR + return EXIT_IO_ERROR engine = ValidationEngine(schema_version=args.schema_version) result = engine.validate(data) @@ -88,39 +113,44 @@ def run(args: argparse.Namespace, console: Console) -> int: # Use that for the exporter so the output's @context URL matches the # payload's actual version, not the literal 'auto' sentinel. resolved_version = result.schema_version or args.schema_version + default_language = getattr(args, "default_language", "en") if args.format == "jsonld": - exporter = JSONLDExporter(version=resolved_version) - output = exporter.export(result.passport, indent=indent) + jsonld_exporter = JSONLDExporter(version=resolved_version) + output = jsonld_exporter.export(result.passport, indent=indent) + elif args.format == "eudpp-jsonld": + eudpp_exporter = EUDPPJsonLDExporter(schema_version=resolved_version) + output = eudpp_exporter.export(result.passport, indent=indent) + elif args.format == "cirpass-jsonld": + # The CIRPASS exporter accepts both UNTP envelopes (forward- + # shimmed) and native CIRPASS passports. 
Either way the + # output is a CIRPASS reference-structure v1.3.0 message + # with the EUDPP v1.9.1 context attached. + cirpass_exporter = CIRPASSJsonLDExporter() + output = cirpass_exporter.export( + result.passport, + indent=indent, + default_language=default_language, + ) + # Surface mapping warnings on stderr so the user knows what + # was lossy / synthesised on the projection. Stderr (rather + # than the Console object, which targets stdout) keeps the + # JSON output on stdout pipe-clean for ``... | jq`` consumers. + for w in cirpass_exporter.last_mapping_warnings: + print( + f" [{w.code}] ({w.severity.value}) {w.path}: {w.message}", + file=sys.stderr, + ) else: - exporter = JSONExporter() - output = exporter.export(result.passport, indent=indent) + json_exporter = JSONExporter() + output = json_exporter.export(result.passport, indent=indent) if args.output: Path(args.output).write_text(output, encoding="utf-8") console.print_success(f"Exported to: {args.output}") else: - console.print(output) + # Use plain print for stdout to keep pipes clean (Rich would + # inject ANSI escapes that break JSON consumers). 
+ print(output) return EXIT_VALID - - -def _load_input(input_path: str, console: Console) -> dict[str, Any] | None: - """Load input data from file.""" - try: - path = Path(input_path) - if not path.exists(): - logger.error("File not found: %s", input_path) - console.print_error(f"File not found: {input_path}") - return None - content = path.read_text(encoding="utf-8") - return json.loads(content) - - except json.JSONDecodeError as e: - logger.error("Invalid JSON: %s", e) - console.print_error(f"Invalid JSON: {e}") - return None - except Exception as e: - logger.exception("Unexpected error loading input") - console.print_error(str(e)) - return None diff --git a/src/dppvalidator/cli/commands/migrate.py b/src/dppvalidator/cli/commands/migrate.py index 36f0f55..93bb261 100644 --- a/src/dppvalidator/cli/commands/migrate.py +++ b/src/dppvalidator/cli/commands/migrate.py @@ -17,11 +17,11 @@ import argparse import json -import sys from dataclasses import asdict from pathlib import Path from typing import TYPE_CHECKING, Any +from dppvalidator.cli._io import load_input as _load_input from dppvalidator.logging import get_logger if TYPE_CHECKING: @@ -32,6 +32,16 @@ EXIT_OK = 0 EXIT_BLOCKED = 1 EXIT_ERROR = 2 +# Phase 6 task 6.7 — additional exit codes; see +# docs/reference/cli/exit-codes.md for the full table. +EXIT_FAMILY_MISMATCH = 3 +EXIT_BLOCKING_WARNINGS = 4 +EXIT_IO_ERROR = 5 + +# Migration target tokens (Phase 6 task 6.5). Stable across releases. +_MIGRATE_TARGET_UNTP_07 = "untp-0.7" +_MIGRATE_TARGET_CIRPASS_13 = "cirpass-1.3" +MIGRATE_TARGETS: tuple[str, ...] = (_MIGRATE_TARGET_UNTP_07, _MIGRATE_TARGET_CIRPASS_13) def add_parser(subparsers: Any) -> argparse.ArgumentParser: @@ -76,28 +86,79 @@ def add_parser(subparsers: Any) -> argparse.ArgumentParser: default="0.6.x", help=( "Source UNTP version family (default: 0.6.x). Pass an explicit " - "X.Y.Z value to pin a specific source version." + "X.Y.Z value to pin a specific source version. 
For " + "``--to=cirpass-1.3``, ``--from`` accepts ``0.7.0`` and " + "is otherwise unused (the CIRPASS forward shim only reads " + "UNTP 0.7.0 input)." + ), + ) + parser.add_argument( + "--to", + dest="target", + choices=list(MIGRATE_TARGETS), + default=_MIGRATE_TARGET_UNTP_07, + help=( + "Migration target. ``untp-0.7`` (default) runs the v0.6 → v0.7 " + "intra-family upgrade shim (back-compat with the pre-Phase-6 " + "behaviour). ``cirpass-1.3`` runs the Phase 5 cross-family " + "forward shim onto the CIRPASS reference structure v1.3.0." + ), + ) + parser.add_argument( + "--default-language", + default="en", + help=( + "BCP-47 tag passed to the CIRPASS forward shim when " + "wrapping UNTP scalar names as LocalisedText entries. " + "Used only by ``--to=cirpass-1.3`` (default: en)." ), ) return parser def run(args: argparse.Namespace, console: Console) -> int: - """Execute the migrate command.""" - from dppvalidator.compat.upgrade_0_6_to_0_7 import ( - UpgradeSeverity, - upgrade, - ) + """Execute the migrate command. + Phase 6 task 6.5 of [docs/plans/CIRPASS_2_MIGRATION.md] generalised + this command from the single 0.6 → 0.7 path to a family-aware + dispatch on ``--to``. The 0.6 → 0.7 path is the default for + back-compat with the pre-Phase-6 behaviour. + """ data = _load_input(args.input, console) if data is None: + return EXIT_IO_ERROR + + target = getattr(args, "target", _MIGRATE_TARGET_UNTP_07) + try: + if target == _MIGRATE_TARGET_UNTP_07: + return _run_untp_upgrade(args, data, console) + if target == _MIGRATE_TARGET_CIRPASS_13: + return _run_cirpass_forward(args, data, console) + # argparse should have rejected anything else; defensive + # fallback only. 
+ console.print_error(f"Unknown migration target: {target!r}") + return EXIT_ERROR + except Exception as exc: # pragma: no cover — defensive + logger.exception("Migrate dispatch crashed") + console.print_error(f"Migrate failed: {exc}") return EXIT_ERROR + +def _run_untp_upgrade(args: argparse.Namespace, data: dict[str, Any], console: Console) -> int: + """0.6 → 0.7 intra-family upgrade (pre-Phase-6 behaviour).""" + from dppvalidator.compat.upgrade_0_6_to_0_7 import ( + UpgradeSeverity, + upgrade, + ) + if not args.source_version.startswith("0.6"): console.print_error( - f"No upgrade shim registered for source version {args.source_version!r}.", + f"No upgrade shim registered for source version " + f"{args.source_version!r}. ``--to=untp-0.7`` only reads " + f"v0.6.x input; for cross-family migration use " + f"``--to=cirpass-1.3``.", ) - return EXIT_ERROR + return EXIT_FAMILY_MISMATCH try: upgraded, warnings = upgrade(data) @@ -107,43 +168,124 @@ def run(args: argparse.Namespace, console: Console) -> int: return EXIT_ERROR blocking = [w for w in warnings if w.severity != UpgradeSeverity.INFO] + return _finalise_migration( + args=args, + payload=upgraded, + warnings=warnings, + blocking=blocking, + target_label="UNTP 0.7.0", + sidecar_meta={ + "schema_version_from": "0.6.x", + "schema_version_to": _resolve_v07_target_version(), + }, + console=console, + ) + + +def _run_cirpass_forward(args: argparse.Namespace, data: dict[str, Any], console: Console) -> int: + """UNTP 0.7.0 → CIRPASS reference structure v1.3.0 forward shim + (Phase 6 task 6.5). 
+ """ + from dppvalidator.compat import MappingSeverity, to_cirpass_1_3 + + default_language = getattr(args, "default_language", "en") + try: + cirpass_dict, mapping_warnings = to_cirpass_1_3(data, default_language=default_language) + except Exception as exc: + logger.exception("CIRPASS forward shim crashed") + console.print_error(f"Mapping failed: {exc}") + return EXIT_ERROR + + blocking = [w for w in mapping_warnings if w.severity != MappingSeverity.INFO] + # ``schema_version_from`` defaults to the current UNTP default + # (sourced from the registry, not a literal) when --from is left + # at its catch-all "0.6.x" value. Pulling the version literals + # from the registry keeps the no-version-literals guard happy. + from dppvalidator.compat import active_version + from dppvalidator.schemas.registry import DEFAULT_VERSIONS, SchemaFamily + + untp_default = active_version(SchemaFamily.UNTP) + source_version = ( + args.source_version if args.source_version not in (None, "0.6.x") else untp_default + ) + return _finalise_migration( + args=args, + payload=cirpass_dict, + warnings=mapping_warnings, + blocking=blocking, + target_label=(f"CIRPASS reference structure v{DEFAULT_VERSIONS[SchemaFamily.CIRPASS]}"), + sidecar_meta={ + "schema_version_from": source_version, + "schema_version_to": DEFAULT_VERSIONS[SchemaFamily.CIRPASS], + "family_from": "untp", + "family_to": "cirpass", + }, + console=console, + ) + +def _finalise_migration( + *, + args: argparse.Namespace, + payload: dict[str, Any], + warnings: list[Any], + blocking: list[Any], + target_label: str, + sidecar_meta: dict[str, Any], + console: Console, +) -> int: + """Shared post-migration handling: sidecar + accept-warnings gate + write. + + Both the UNTP-upgrade and CIRPASS-forward paths converge here so + sidecar / blocking-warning semantics are identical. 
+ """ output_path = _resolve_output_path(args, console) if output_path is None and args.in_place: return EXIT_ERROR - # Always write a sidecar warnings file when *any* blocking-grade - # warning fired, regardless of whether the main write goes through. sidecar_path: Path | None = None if blocking and output_path is not None: sidecar_path = output_path.with_suffix(output_path.suffix + ".warnings.json") - _write_warnings_sidecar(sidecar_path, warnings) + _write_warnings_sidecar(sidecar_path, warnings, sidecar_meta) console.print_warning( f"{len(warnings)} warning(s) recorded in {sidecar_path}", ) if blocking and not args.accept_warnings: console.print_error( - f"Upgrade emitted {len(blocking)} blocking warning(s); refusing to " - "write. Re-run with --accept-warnings to override, or fix the " - "issues listed in the sidecar warnings file.", + f"Migration to {target_label} emitted {len(blocking)} " + "blocking warning(s); refusing to write. Re-run with " + "--accept-warnings to override, or fix the issues listed " + "in the sidecar warnings file.", ) for w in warnings: console.print(f" [{w.code}] ({w.severity.value}) {w.path}: {w.message}") - return EXIT_BLOCKED + return EXIT_BLOCKING_WARNINGS - _write_output(upgraded, output_path, console) + _write_output(payload, output_path, console) if warnings: - console.print(f"Upgraded with {len(warnings)} warning(s).") + console.print(f"Migrated to {target_label} with {len(warnings)} warning(s).") for w in warnings: console.print(f" [{w.code}] ({w.severity.value}) {w.path}: {w.message}") else: - console.print_success("Upgraded with no warnings.") + console.print_success(f"Migrated to {target_label} with no warnings.") return EXIT_OK +def _resolve_v07_target_version() -> str: + """Resolve the highest 0.7.x version registered in SCHEMA_REGISTRY.""" + from dppvalidator.schemas.registry import SCHEMA_REGISTRY + + target_candidates = [ + v for v in SCHEMA_REGISTRY if v.split(".")[0] == "0" and v.split(".")[1] == "7" + ] + if 
target_candidates: + return max(target_candidates, key=lambda v: tuple(int(x) for x in v.split("."))) + return "0.7.x" + + def _resolve_output_path(args: argparse.Namespace, console: Console) -> Path | None: """Return the resolved output path or ``None`` when stdout is the target.""" if args.in_place and args.output: @@ -159,29 +301,6 @@ def _resolve_output_path(args: argparse.Namespace, console: Console) -> Path | N return None -def _load_input(input_path: str, console: Console) -> dict[str, Any] | None: - """Load JSON from a file path or stdin.""" - try: - if input_path == "-": - if hasattr(sys.stdin, "reconfigure"): - sys.stdin.reconfigure(encoding="utf-8") # type: ignore[union-attr] - content = sys.stdin.read() - else: - path = Path(input_path) - if not path.exists(): - console.print_error(f"File not found: {input_path}") - return None - content = path.read_text(encoding="utf-8") - return json.loads(content) - except json.JSONDecodeError as exc: - console.print_error(f"Invalid JSON: {exc}") - return None - except Exception as exc: - logger.exception("Unexpected error loading input") - console.print_error(str(exc)) - return None - - def _write_output(payload: dict[str, Any], path: Path | None, console: Console) -> None: """Write the upgraded JSON to ``path`` (or stdout if ``None``).""" serialised = json.dumps(payload, indent=2, ensure_ascii=False, default=str) @@ -193,21 +312,36 @@ def _write_output(payload: dict[str, Any], path: Path | None, console: Console) console.print(f"Wrote {path}") -def _write_warnings_sidecar(path: Path, warnings: list[Any]) -> None: - """Persist the full warning list as JSON next to the upgraded payload.""" - from dppvalidator.schemas.registry import SCHEMA_REGISTRY +def _write_warnings_sidecar( + path: Path, + warnings: list[Any], + meta: dict[str, Any] | None = None, +) -> None: + """Persist the full warning list as JSON next to the upgraded payload. 
- target_candidates = [ - v for v in SCHEMA_REGISTRY if v.split(".")[0] == "0" and v.split(".")[1] == "7" - ] - target_version = ( - max(target_candidates, key=lambda v: tuple(int(x) for x in v.split("."))) - if target_candidates - else "0.7.x" - ) - payload = { + Accepts both :class:`UpgradeWarning` (UNTP 0.6 → 0.7) and + :class:`MappingWarning` (cross-family) entries; both expose the + same dataclass shape, but MappingWarning's ``details`` field is a + tuple of ``(key, value)`` pairs that needs flattening for JSON + serialisation. + """ + payload: dict[str, Any] = { "schema_version_from": "0.6.x", - "schema_version_to": target_version, - "warnings": [{**asdict(w), "severity": w.severity.value} for w in warnings], + "schema_version_to": _resolve_v07_target_version(), } + if meta: + payload.update(meta) + payload["warnings"] = [_warning_to_dict(w) for w in warnings] path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8") + + +def _warning_to_dict(warning: Any) -> dict[str, Any]: + """Convert UpgradeWarning / MappingWarning to a JSON-serialisable dict.""" + raw = asdict(warning) + raw["severity"] = warning.severity.value + # MappingWarning.details is a ``tuple[tuple[str, str], ...]``; turn + # it into a dict for ergonomic JSON output. + details = raw.get("details") + if isinstance(details, (list, tuple)): + raw["details"] = dict(details) + return raw diff --git a/src/dppvalidator/cli/commands/schema.py b/src/dppvalidator/cli/commands/schema.py index a075332..1126132 100644 --- a/src/dppvalidator/cli/commands/schema.py +++ b/src/dppvalidator/cli/commands/schema.py @@ -76,47 +76,79 @@ def run(args: argparse.Namespace, console: Console) -> int: def _list_schemas(console: Console) -> int: - """List available schema versions registered in ``SCHEMA_REGISTRY``. + """List available schema versions registered in ``SCHEMA_REGISTRY_BY_FAMILY``. - Source of truth is ``SCHEMA_REGISTRY``; ``CONTEXTS`` is consulted for the - JSON-LD context URLs of each version. 
Both are kept in lock-step — see - docs/plans/UNTP_0.7.0_MIGRATION.md §Phase 1. + Phase 6 task 6.6 of [docs/plans/CIRPASS_2_MIGRATION.md] extended + this with a family column and a per-family default marker so + consumers can see UNTP 0.6 / 0.7 and CIRPASS 1.3 in one view. """ from dppvalidator.exporters.contexts import CONTEXTS - from dppvalidator.schemas.registry import SCHEMA_REGISTRY, SchemaRegistry - - default_version = SchemaRegistry().default_version + from dppvalidator.schemas.registry import ( + DEFAULT_VERSIONS, + SCHEMA_REGISTRY_BY_FAMILY, + SchemaFamily, + ) - table = console.create_table(title="Available UNTP DPP Schema Versions") + table = console.create_table(title="Available DPP Schema Versions") + table.add_column("Family") table.add_column("Version") table.add_column("Default", justify="center") table.add_column("Bundled", justify="center") table.add_column("Contexts") - for version in sorted(SCHEMA_REGISTRY): - schema = SCHEMA_REGISTRY[version] - is_default = "✓" if version == default_version else "" - # ``sha256 is not None`` is the proxy for "we ship this file in-tree"; - # versions without a hash are registered but rely on a custom path - # being supplied at validate-time. + # Sort: family alphabetically, then version ascending. Stable so + # consumers / golden-snapshot tests can pin against the table. 
+ sorted_keys = sorted( + SCHEMA_REGISTRY_BY_FAMILY, + key=lambda fv: ( + fv[0].value, + tuple(int(x) if x.isdigit() else 0 for x in fv[1].split(".")), + ), + ) + + for family, version in sorted_keys: + schema = SCHEMA_REGISTRY_BY_FAMILY[(family, version)] + family_label = family.value + is_default = "✓" if DEFAULT_VERSIONS.get(family) == version else "" is_bundled = "✓" if schema.sha256 is not None else "" - ctx = CONTEXTS.get(version) - contexts = ", ".join(ctx.contexts) if ctx else "(no @context registered)" - table.add_row(version, is_default, is_bundled, contexts) + # Each row's @context list comes from two sources: + # - UNTP rows use the legacy CONTEXTS table (key = bare version). + # - CIRPASS rows use the registry row's ``context_urls`` tuple + # directly (no legacy CONTEXTS entry exists for CIRPASS). + if family is SchemaFamily.UNTP: + ctx = CONTEXTS.get(version) + contexts = ", ".join(ctx.contexts) if ctx else "(no @context registered)" + else: + contexts = ( + ", ".join(schema.context_urls) + if schema.context_urls + else "(no @context registered)" + ) + table.add_row(family_label, version, is_default, is_bundled, contexts) console.print_table(table) return EXIT_VALID def _download_schema(version: str, output_dir: str | None, console: Console) -> int: - """Download a schema by version using the URL recorded in ``SCHEMA_REGISTRY``.""" + """Download a schema by version using the URL recorded in the registry. + + Uses the tuple-keyed :data:`SCHEMA_REGISTRY_BY_FAMILY` (Phase 2 + source of truth) — the bare-string ``SCHEMA_REGISTRY`` view is + deprecated since 0.5.0 (Phase 9 task 9.4). 
+ """ from pathlib import Path - from dppvalidator.schemas.registry import SCHEMA_REGISTRY + from dppvalidator.schemas.registry import SCHEMA_REGISTRY_BY_FAMILY, SchemaFamily - if version not in SCHEMA_REGISTRY: + untp_versions = { + v: schema + for (family, v), schema in SCHEMA_REGISTRY_BY_FAMILY.items() + if family is SchemaFamily.UNTP + } + if version not in untp_versions: console.print_error(f"Unknown version: {version}") - console.print(f"Available: {', '.join(sorted(SCHEMA_REGISTRY))}") + console.print(f"Available: {', '.join(sorted(untp_versions))}") return EXIT_ERROR try: @@ -126,7 +158,7 @@ def _download_schema(version: str, output_dir: str | None, console: Console) -> console.print("Install with: pip install 'dppvalidator[http]'") return EXIT_ERROR - schema_url = SCHEMA_REGISTRY[version].url + schema_url = untp_versions[version].url try: logger.info("Downloading schema %s from %s", version, schema_url) @@ -149,16 +181,24 @@ def _download_schema(version: str, output_dir: str | None, console: Console) -> def _show_info(version: str, console: Console) -> int: - """Show schema information for ``version`` from the registries.""" + """Show schema information for ``version`` from the registries. + + Uses the tuple-keyed :data:`SCHEMA_REGISTRY_BY_FAMILY` source of + truth — the bare-string ``SCHEMA_REGISTRY`` view is deprecated + since 0.5.0 (Phase 9 task 9.4). 
+ """ from dppvalidator.exporters.contexts import CONTEXTS - from dppvalidator.schemas.registry import SCHEMA_REGISTRY + from dppvalidator.schemas.registry import SCHEMA_REGISTRY_BY_FAMILY, SchemaFamily - if version not in SCHEMA_REGISTRY: + untp_versions = { + v: s for (family, v), s in SCHEMA_REGISTRY_BY_FAMILY.items() if family is SchemaFamily.UNTP + } + if version not in untp_versions: console.print_error(f"Unknown version: {version}") - console.print(f"Available: {', '.join(sorted(SCHEMA_REGISTRY))}") + console.print(f"Available: {', '.join(sorted(untp_versions))}") return EXIT_ERROR - schema = SCHEMA_REGISTRY[version] + schema = untp_versions[version] ctx = CONTEXTS.get(version) type_arr = ctx.default_type if ctx else () sha = schema.sha256 or "(not bundled — fetched on demand)" diff --git a/src/dppvalidator/cli/commands/validate.py b/src/dppvalidator/cli/commands/validate.py index 51f3670..066bf11 100644 --- a/src/dppvalidator/cli/commands/validate.py +++ b/src/dppvalidator/cli/commands/validate.py @@ -5,10 +5,10 @@ import argparse import glob import json -import sys from pathlib import Path from typing import TYPE_CHECKING, Any +from dppvalidator.cli._io import load_input as _load_input from dppvalidator.logging import get_logger from dppvalidator.schemas.registry import DEFAULT_SCHEMA_VERSION @@ -20,6 +20,10 @@ EXIT_VALID = 0 EXIT_INVALID = 1 EXIT_ERROR = 2 +# Phase 6 task 6.7 — additional exit codes; see +# docs/reference/cli/exit-codes.md for the full table. +EXIT_FAMILY_MISMATCH = 3 +EXIT_IO_ERROR = 5 def add_parser(subparsers: Any) -> argparse.ArgumentParser: @@ -79,6 +83,31 @@ def add_parser(subparsers: Any) -> argparse.ArgumentParser: "warnings are reported alongside validation issues." ), ) + parser.add_argument( + "--target", + choices=["auto", "untp", "cirpass"], + default="auto", + help=( + "Schema family to validate against. ``auto`` (default) detects " + "the family from the payload's @context / shape signature. 
" + "Explicit ``untp`` or ``cirpass`` overrides detection — when the " + "override contradicts the payload, the command exits with code 3 " + "and a DET001 diagnostic instead of silently re-routing." + ), + ) + parser.add_argument( + "--profile", + choices=["textile-v1", "textile-v2"], + default=None, + help=( + "Optional pilot profile (Phase 7 of CIRPASS_2_MIGRATION). " + "``textile-v1`` runs the legacy textile rule pack (TXT001…TXT005, " + "all info/warning); ``textile-v2`` runs the MVP Textile DPP v2 " + "(2025-12-04) pack with stricter rules (TXT001…TXT007). When " + "omitted, no textile-specific profile is loaded — the default " + "rule set still applies." + ), + ) return parser @@ -89,19 +118,23 @@ def run(args: argparse.Namespace, console: Console) -> int: # Resolve input patterns to file paths files = _resolve_inputs(args.input, console) if not files: - return EXIT_ERROR + return EXIT_IO_ERROR engine = ValidationEngine( schema_version=args.schema_version, strict_mode=args.strict, + profile=getattr(args, "profile", None), ) upgrade_from = getattr(args, "upgrade_from", None) if upgrade_from is not None: _verify_upgrade_path(upgrade_from, args.schema_version, console) + target = getattr(args, "target", "auto") + all_valid = True has_load_error = False + has_family_mismatch = False results: list[tuple[str, Any]] = [] for file_path in files: @@ -115,6 +148,18 @@ def run(args: argparse.Namespace, console: Console) -> int: if upgrade_warnings: _print_upgrade_warnings(upgrade_warnings, file_path, console) + # Phase 6 task 6.3: when the user pinned --target explicitly, + # check it against the detected family before validating. A + # contradiction emits DET001 + EXIT_FAMILY_MISMATCH; agreement + # falls through to ordinary validation. 
+ if target != "auto": + mismatch = _check_target_family(data, target, file_path, console) + if mismatch is not None: + has_family_mismatch = True + results.append((file_path, mismatch)) + all_valid = False + continue + result = engine.validate( data, fail_fast=args.fail_fast, @@ -124,9 +169,9 @@ def run(args: argparse.Namespace, console: Console) -> int: if not result.valid: all_valid = False - # If no files were successfully loaded, return error + # If no files were successfully loaded, return IO error if not results and has_load_error: - return EXIT_ERROR + return EXIT_IO_ERROR # Output results if len(results) == 1: @@ -134,9 +179,12 @@ def run(args: argparse.Namespace, console: Console) -> int: elif results: _output_batch_results(results, args.format, console) - # Return error if any file failed to load, invalid if validation failed + # Phase 6: family mismatch is a distinct exit code so wrappers + # can route DET001 differently from regular validation errors. + if has_family_mismatch: + return EXIT_FAMILY_MISMATCH if has_load_error: - return EXIT_ERROR + return EXIT_IO_ERROR return EXIT_VALID if all_valid else EXIT_INVALID @@ -173,6 +221,61 @@ def _resolve_inputs(inputs: list[str], console: Console) -> list[str]: return files +def _check_target_family( + data: dict[str, Any], + target: str, + file_path: str, + console: Console, +) -> Any | None: + """Validate that ``--target`` agrees with the payload's detected family. + + Phase 6 task 6.3 of [docs/plans/CIRPASS_2_MIGRATION.md]. Returns + ``None`` when there's no contradiction; returns a synthetic + :class:`ValidationResult` carrying ``DET001`` when the user- + supplied target overrides what the engine would have picked. 
+ """ + from dppvalidator.schemas.registry import SchemaFamily + from dppvalidator.validators.detection import ( + DET_CODE_FAMILY_MISMATCH, + detect_schema_family, + ) + from dppvalidator.validators.results import ValidationError, ValidationResult + + detected = detect_schema_family(data) + target_family = SchemaFamily.UNTP if target == "untp" else SchemaFamily.CIRPASS + if detected is None: + # No detection signal — caller's --target wins silently. + return None + if detected is target_family: + # Detection agrees with --target. Continue. + return None + # Mismatch — fail fast with DET001. + error = ValidationError( + path="$", + message=( + f"--target={target!r} contradicts the payload's detected " + f"family ({detected.value!r}). Re-run with --target=auto " + f"or --target={detected.value!r} (or fix the payload)." + ), + code=DET_CODE_FAMILY_MISMATCH, + layer="engine", + severity="error", + context={ + "detected_family": detected.value, + "configured_target": target, + }, + ) + console.print_error( + f"DET001: {file_path} — payload looks like {detected.value!r} " + f"but --target={target!r} pins the other family." + ) + return ValidationResult( + valid=False, + errors=[error], + schema_version=target_family.value, + ) + + def _verify_upgrade_path(source: str, target: str, console: Console) -> None: """Confirm we have a registered shim for ``source → target``. 
@@ -217,34 +320,6 @@ def _print_upgrade_warnings(warnings: list[Any], input_path: str, console: Conso console.print(f" [{w.code}] ({w.severity.value}) {w.path}: {w.message}") -def _load_input(input_path: str, console: Console) -> dict[str, Any] | None: - """Load input data from file or stdin.""" - try: - if input_path == "-": - # Ensure UTF-8 encoding for stdin on all platforms (if supported) - if hasattr(sys.stdin, "reconfigure"): - sys.stdin.reconfigure(encoding="utf-8") # type: ignore[union-attr] - content = sys.stdin.read() - else: - path = Path(input_path) - if not path.exists(): - logger.error("File not found: %s", input_path) - console.print_error(f"File not found: {input_path}") - return None - content = path.read_text(encoding="utf-8") - - return json.loads(content) - - except json.JSONDecodeError as e: - logger.error("Invalid JSON: %s", e) - console.print_error(f"Invalid JSON: {e}") - return None - except Exception as e: - logger.exception("Unexpected error loading input") - console.print_error(str(e)) - return None - - def _output_result(result: Any, fmt: str, input_path: str, console: Console) -> None: """Output validation result in specified format.""" if fmt == "json": diff --git a/src/dppvalidator/cli/main.py b/src/dppvalidator/cli/main.py index ceaf136..05eb49a 100644 --- a/src/dppvalidator/cli/main.py +++ b/src/dppvalidator/cli/main.py @@ -20,9 +20,40 @@ from dppvalidator.cli.console import Console from dppvalidator.logging import configure_logging +# ============================================================================= +# CLI exit codes (Phase 6 task 6.7 of docs/plans/CIRPASS_2_MIGRATION.md) +# ============================================================================= +# +# The exit-code table is documented at docs/reference/cli/exit-codes.md. +# Codes 0-2 are pre-Phase-6 and unchanged for back-compat. Codes 3-5 +# are new in Phase 6 and surfaced through ``--target`` / ``--strict``. 
+# +# Cardinal contract: every CLI command may exit with one of these +# codes; values are stable across releases. Plugins / wrappers may +# pattern-match on the integer. + EXIT_VALID = 0 +"""Validation passed without errors.""" + EXIT_INVALID = 1 +"""Validation produced one or more errors.""" + EXIT_ERROR = 2 +"""IO / parse / unhandled engine failure.""" + +EXIT_FAMILY_MISMATCH = 3 +"""``--target`` (or ``--to``) explicitly contradicts the payload's +detected family. Surfaces ``DET001``.""" + +EXIT_BLOCKING_WARNINGS = 4 +"""``--strict`` was set and the run produced one or more +warning-level diagnostics (upgrade ``UPG`` warnings or mapping +``MAP`` warnings) that blocked the operation.""" + +EXIT_IO_ERROR = 5 +"""IO / file-not-found / encoding errors that the wrapper can +distinguish from logical validation failures.""" + # Command handler type: (args, console) -> exit_code CommandHandler = Callable[[argparse.Namespace, Console], int] diff --git a/src/dppvalidator/compat/__init__.py b/src/dppvalidator/compat/__init__.py index 1687992..fb9a379 100644 --- a/src/dppvalidator/compat/__init__.py +++ b/src/dppvalidator/compat/__init__.py @@ -25,6 +25,20 @@ from __future__ import annotations +from typing import TYPE_CHECKING + +from dppvalidator.compat._mapping_codes import ( + MAP_CODE_LOSSY, + MAP_CODE_REQUIRED_FIELD_MISSING, + MAP_CODE_SYNTHESISED, + MAP_CODE_TEMPORAL_COLLAPSE, + MAP_CODE_UNMAPPED, + MAP_CODES, + MappingSeverity, + MappingWarning, +) +from dppvalidator.compat.cirpass_1_3_to_untp_0_7 import to_untp_0_7 +from dppvalidator.compat.untp_0_7_to_cirpass_1_3 import to_cirpass_1_3 from dppvalidator.compat.upgrade_0_6_to_0_7 import ( UPG_CODE_LOSSY, UPG_CODE_REQUIRED_FIELD_MISSING, @@ -35,35 +49,69 @@ upgrade, ) +if TYPE_CHECKING: + from dppvalidator.schemas.registry import SchemaFamily + -def active_version() -> str: - """Return the UNTP DPP version this build of dppvalidator targets. 
+def active_version(family: SchemaFamily | None = None) -> str: + """Return the active default version for a schema family. - This is the value of :data:`DEFAULT_SCHEMA_VERSION` from the schema - registry, surfaced as a function so callers don't have to import the - registry directly. Use this whenever you need a "current default" - version literal in feature code — the no-version-literals guard - test (``tests/unit/test_no_version_literals.py``) refuses to let you + This is the value of :data:`DEFAULT_VERSIONS[family]` from the + schema registry, surfaced as a function so callers don't have to + import the registry directly. Use this whenever you need a + "current default" version literal in feature code — the + no-version-literals guard test + (``tests/unit/test_no_version_literals.py``) refuses to let you hardcode the string. + + Phase 2 of the CIRPASS-2 migration extended this with the + ``family`` keyword. Pre-Phase-2 callers (``active_version()`` with + no argument) keep getting the UNTP default, preserving the + historical behaviour. + + Args: + family: Schema family. ``None`` (default) is treated as + :data:`SchemaFamily.UNTP` — same as the pre-Phase-2 API. + + Returns: + Default version string for the requested family. """ - from dppvalidator.schemas.registry import DEFAULT_SCHEMA_VERSION + from dppvalidator.schemas.registry import DEFAULT_VERSIONS + from dppvalidator.schemas.registry import SchemaFamily as _SF + + return DEFAULT_VERSIONS[family if family is not None else _SF.UNTP] - return DEFAULT_SCHEMA_VERSION +def is_version(version: str, family: SchemaFamily | None = None) -> bool: + """Return ``True`` if ``version`` matches the active default version. -def is_version(version: str) -> bool: - """Return ``True`` if ``version`` matches the active default version.""" - return version == active_version() + Phase 2 added the ``family`` keyword (defaults to UNTP for + back-compat). 
+ """ + return version == active_version(family) __all__ = [ + # UNTP 0.6 → 0.7 upgrade "UPG_CODE_LOSSY", "UPG_CODE_REQUIRED_FIELD_MISSING", "UPG_CODE_SYNTHESISED", "UPG_CODE_UNMAPPED_COUNTRY", "UpgradeSeverity", "UpgradeWarning", + "upgrade", + # UNTP ↔ CIRPASS mapping (Phase 5) + "MAP_CODES", + "MAP_CODE_LOSSY", + "MAP_CODE_REQUIRED_FIELD_MISSING", + "MAP_CODE_SYNTHESISED", + "MAP_CODE_TEMPORAL_COLLAPSE", + "MAP_CODE_UNMAPPED", + "MappingSeverity", + "MappingWarning", + "to_cirpass_1_3", + "to_untp_0_7", + # Active-version helpers "active_version", "is_version", - "upgrade", ] diff --git a/src/dppvalidator/compat/_identifier_schemes.py b/src/dppvalidator/compat/_identifier_schemes.py new file mode 100644 index 0000000..197abbd --- /dev/null +++ b/src/dppvalidator/compat/_identifier_schemes.py @@ -0,0 +1,255 @@ +"""Static lookup table between UNTP IdentifierScheme and CIRPASS scheme codes. + +Phase 5 task 5.5 of [docs/plans/CIRPASS_2_MIGRATION.md] (resolves G18). +UNTP 0.7.0 carries the issuing register as a structured object — +``IdentifierScheme(id: URI, name: str)`` — whereas the CIRPASS v1.3.0 +message stores the same information as a flat ``Identifier.scheme`` +URI plus an optional ``schemeName`` short-form. The two are +information-equivalent but the wire-shapes differ. + +This module is the single point where the translation lives. The +forward shim (``untp_0_7_to_cirpass_1_3.py``) projects an +``IdentifierScheme`` onto a ``(scheme_uri, scheme_name)`` tuple; the +reverse shim (``cirpass_1_3_to_untp_0_7.py``) lifts a CIRPASS +``Identifier`` back into an ``IdentifierScheme``. When a value is +not in the lookup table, callers emit a ``MAP003`` (unmapped) +warning and pass the raw value through. 
+ +The bundled table covers the commonly-seen scheme URIs for product / +party / facility identifiers in EU DPP scope: + +- GS1 GTIN (product identifiers) +- GS1 Digital Link (alternative product URI form) +- ISO/IEC 15459 (serial-part numbering) +- GLEIF LEI (legal-entity identifiers) +- EU EORI (economic-operator registration) +- EUID (EU business register) +- DUNS (commercial supplier numbering) +- WCO HS / TARIC / CPV (commodity classification) + +The table is *additive* — new entries are added as pilots surface +new schemes; removing a scheme is a breaking change because callers +pattern-match on the canonical scheme name. +""" + +from __future__ import annotations + +from dataclasses import dataclass + + +@dataclass(frozen=True, slots=True) +class IdentifierSchemeMapping: + """Two-axis lookup row for a single identifier scheme.""" + + scheme_uri: str + """Canonical URI of the issuing register (the value UNTP carries + in ``IdentifierScheme.id`` and CIRPASS in ``Identifier.scheme``).""" + + scheme_name: str + """Short human-readable label (the value UNTP carries in + ``IdentifierScheme.name`` and CIRPASS in ``Identifier.schemeName``).""" + + aliases: tuple[str, ...] = () + """Optional alternative URIs that resolve to the same + canonical scheme. Lookup is canonicalised on the first match.""" + + +# ============================================================================= +# Bundled mappings +# ============================================================================= +# +# Rows are grouped by category; iteration follows declaration order. +# Adding a new scheme: append a row, leave the existing rows untouched. + +_BUNDLED: tuple[IdentifierSchemeMapping, ...] 
= ( + # Commercial / supply-chain + IdentifierSchemeMapping( + scheme_uri="https://www.dnb.com/duns-number.html", + scheme_name="DUNS", + aliases=("https://duns.com/", "https://www.dnb.com/"), + ), + IdentifierSchemeMapping( + scheme_uri="https://ec.europa.eu/taxation_customs/dds2/eos/eori_home.jsp", + scheme_name="EORI", + aliases=("https://ec.europa.eu/eori/",), + ), + IdentifierSchemeMapping( + scheme_uri="https://e-justice.europa.eu/topics/registers-business-insolvency-land/business-registers-search-company-eu_en", + scheme_name="EUID", + aliases=("https://e-justice.europa.eu/euid/",), + ), + # Product / commodity classification + IdentifierSchemeMapping( + scheme_uri="https://gs1.org/voc/", + scheme_name="GS1 GTIN", + aliases=("https://www.gs1.org/voc/",), + ), + IdentifierSchemeMapping( + scheme_uri="https://id.gs1.org/01/", + scheme_name="GS1 Digital Link", + aliases=("https://gs1.org/dl/",), + ), + # Party / legal entity + IdentifierSchemeMapping( + scheme_uri="https://www.gleif.org/lei/", + scheme_name="GLEIF LEI", + aliases=("https://www.gleif.org/", "https://gleif.org/lei/"), + ), + # Item-level serial numbering + IdentifierSchemeMapping( + scheme_uri="https://www.iso.org/standard/82220.html", + scheme_name="ISO/IEC 15459", + aliases=("https://www.iso.org/standard/iec-15459",), + ), + # Commodity classification (HS / TARIC / CPV) + IdentifierSchemeMapping( + scheme_uri=( + "https://www.wcoomd.org/en/topics/nomenclature/instrument-and-tools/" + "hs-nomenclature-2022-edition.aspx" + ), + scheme_name="WCO HS", + aliases=("https://www.wcoomd.org/", "https://hs.wcoomd.org/"), + ), + IdentifierSchemeMapping( + scheme_uri="https://ec.europa.eu/taxation_customs/dds2/taric/", + scheme_name="EU TARIC", + aliases=("https://ec.europa.eu/taric/",), + ), + IdentifierSchemeMapping( + scheme_uri="https://simap.ted.europa.eu/cpv", + scheme_name="EU CPV", + ), +) + + +# ============================================================================= +# Index — canonical URI 
/ canonical name lookup +# ============================================================================= +# +# Both indices share the same value type (an ``IdentifierSchemeMapping``). +# Aliases populate the URI index only, so callers don't need to +# canonicalise URIs ahead of time. + + +_BY_URI: dict[str, IdentifierSchemeMapping] = {} +_BY_NAME: dict[str, IdentifierSchemeMapping] = {} + +for _row in _BUNDLED: + _BY_URI[_row.scheme_uri] = _row + for _alias in _row.aliases: + _BY_URI.setdefault(_alias, _row) + _BY_NAME[_row.scheme_name] = _row + + +# ============================================================================= +# Public API +# ============================================================================= + + +def lookup_by_uri(uri: str) -> IdentifierSchemeMapping | None: + """Return the mapping row for a scheme URI, or ``None`` if unmapped. + + Both canonical URIs and known aliases resolve. The canonical URI + is the one stored on the resulting :class:`IdentifierSchemeMapping` + — call sites that want a stable round-trip should use + ``mapping.scheme_uri`` rather than the input. + """ + return _BY_URI.get(uri) + + +def lookup_by_name(name: str) -> IdentifierSchemeMapping | None: + """Return the mapping row for a scheme short-name, or ``None``.""" + return _BY_NAME.get(name) + + +def all_mappings() -> tuple[IdentifierSchemeMapping, ...]: + """Return every bundled mapping row in canonical order. + + Used by tests to assert the table covers every name a CIRPASS + pilot fixture references; downstream consumers may iterate this + tuple when building schema-aware UIs. + """ + return _BUNDLED + + +def to_cirpass( + untp_scheme_id: str | None, + untp_scheme_name: str | None, +) -> tuple[str, str | None, IdentifierSchemeMapping | None]: + """Project a UNTP IdentifierScheme onto a CIRPASS (scheme, schemeName) pair. + + Returns ``(scheme_uri, scheme_name, mapping)`` where: + + - ``scheme_uri`` is the URI to put on + :class:`dppvalidator.models.cirpass.v1_3.Identifier.scheme`. 
+ - ``scheme_name`` is the optional short-form to put on + :class:`Identifier.schemeName` (preserves the UNTP scheme name + verbatim when no canonical row matches). + - ``mapping`` is the :class:`IdentifierSchemeMapping` row when one + was found, else ``None`` — call sites use this to decide whether + to emit a ``MAP003`` (unmapped) warning. + + Empty / missing inputs are tolerated; the caller is expected to + surface ``MAP004`` separately when the field was required. + """ + if untp_scheme_id: + mapping = _BY_URI.get(untp_scheme_id) + if mapping is not None: + # Canonicalise: prefer the row's canonical URI so aliases + # converge to the same shape on the CIRPASS side. + return mapping.scheme_uri, mapping.scheme_name, mapping + # Fallback — pass through the raw URI + name. Caller emits MAP003. + return (untp_scheme_id or ""), untp_scheme_name, None + + +def to_untp( + cirpass_scheme: str, + cirpass_scheme_name: str | None, +) -> tuple[str, str, IdentifierSchemeMapping | None]: + """Lift a CIRPASS Identifier back onto a UNTP (scheme_id, scheme_name) pair. + + Returns ``(scheme_id, scheme_name, mapping)`` where the third + element is ``None`` for unmapped schemes (callers emit ``MAP003``). + The UNTP side requires both ``id`` and ``name`` to be non-empty; + when the CIRPASS payload omits ``schemeName``, this helper falls + back to the canonical name from the lookup table, or to the raw + URI's last path segment as a final defensive fallback. + """ + mapping = _BY_URI.get(cirpass_scheme) + if mapping is not None: + # Prefer the explicit CIRPASS schemeName when present (caller + # may have a more-specific label than the canonical row); + # otherwise fall back to the canonical row's name. + return mapping.scheme_uri, cirpass_scheme_name or mapping.scheme_name, mapping + if cirpass_scheme_name: + # Try resolving by name as a fallback. 
+ mapping = _BY_NAME.get(cirpass_scheme_name) + if mapping is not None: + return cirpass_scheme, cirpass_scheme_name, mapping + # Synthesise a short-form name from the URI when nothing matches. + name = cirpass_scheme_name or _scheme_name_from_uri(cirpass_scheme) + return cirpass_scheme, name, None + + +def _scheme_name_from_uri(uri: str) -> str: + """Best-effort short-form synthesis from a URI. + + Used when the CIRPASS payload omits ``schemeName`` and the URI + isn't in the lookup table. Strips trailing slashes / fragments + and returns the last path segment, falling back to the host. + """ + cleaned = uri.rstrip("/").rstrip("#") + if "://" in cleaned: + cleaned = cleaned.split("://", 1)[1] + last = cleaned.rsplit("/", 1)[1] or cleaned if "/" in cleaned else cleaned + return last or "unknown-scheme" + + +__all__ = [ + "IdentifierSchemeMapping", + "all_mappings", + "lookup_by_name", + "lookup_by_uri", + "to_cirpass", + "to_untp", +] diff --git a/src/dppvalidator/compat/_mapping_codes.py b/src/dppvalidator/compat/_mapping_codes.py new file mode 100644 index 0000000..772ee98 --- /dev/null +++ b/src/dppvalidator/compat/_mapping_codes.py @@ -0,0 +1,198 @@ +"""Mapping warning codes + dataclass for UNTP ↔ CIRPASS shims. + +Phase 5 task 5.1 of [docs/plans/CIRPASS_2_MIGRATION.md]. The ``MAP`` +prefix distinguishes these from intra-family upgrade codes +(``UPG``) and validator codes (``CR/SUB/LCS/ACT/REL/SEM/VOC/CQ``). + +Five codes are reserved (per the plan §Phase 5 table): + +- ``MAP001`` — Lossy: target shape drops information that the source + carried. +- ``MAP002`` — Synthesised: a required target field had no donor on + the source side; the shim invented a value (typically a default + identifier scheme, role enum, or language tag). +- ``MAP003`` — Unmapped: no rule applied to the field; the source + value passed through unchanged. Surfaced so consumers can decide + whether to keep extension-style passthrough. 
+- ``MAP004`` — Required-field-missing: the source cannot supply a + field that the target requires. The output will fail target-side + validation until the caller fills it in. +- ``MAP005`` — Temporal collapse: source temporal semantics + collapsed into a less-expressive target shape (e.g. UNTP's + ``validFrom`` + ``validUntil`` collapsed into a single + ``EffectivePeriod``). + +The :class:`MappingWarning` dataclass mirrors :class:`UpgradeWarning` +from the UNTP 0.6 → 0.7 shim — same field layout, same severity +ladder, same semantics. The two are kept as distinct types so call sites +can pattern-match without conflating intra-family vs cross-family +transformation events. +""" + +from __future__ import annotations + +from dataclasses import dataclass +from enum import Enum + +# ============================================================================= +# Warning codes +# ============================================================================= + +MAP_CODE_LOSSY = "MAP001" +MAP_CODE_SYNTHESISED = "MAP002" +MAP_CODE_UNMAPPED = "MAP003" +MAP_CODE_REQUIRED_FIELD_MISSING = "MAP004" +MAP_CODE_TEMPORAL_COLLAPSE = "MAP005" + + +# Tuple of all MAP codes in canonical order. Tests pin against this +# (every code must have a reproducible fixture) per the plan exit +# criteria. Keep ordered ascending so ``tuple(sorted(MAP_CODES)) == MAP_CODES``. +MAP_CODES: tuple[str, ...] = ( + MAP_CODE_LOSSY, + MAP_CODE_SYNTHESISED, + MAP_CODE_UNMAPPED, + MAP_CODE_REQUIRED_FIELD_MISSING, + MAP_CODE_TEMPORAL_COLLAPSE, +) + + +# ============================================================================= +# Severity +# ============================================================================= + + +class MappingSeverity(str, Enum): + """Severity ladder for :class:`MappingWarning` (mirrors UpgradeSeverity). + + - ``info`` — transformation is benign and reversible (e.g. a + bare-string country code wrapped as a Country object). 
+ - ``warning`` — transformation may need manual review (lossy + drop, synthesised value, role unmapped). Most ``MAP00X`` codes + land here. + - ``error`` — the result will not validate against the target + shape until the caller fixes the input. Reserved for + ``MAP004`` (required-field-missing). + """ + + INFO = "info" + WARNING = "warning" + ERROR = "error" + + +# Default severity per code. Call sites can override on a per-instance +# basis when context warrants (e.g. an unmapped role on a required +# field is an error, an unmapped role on an optional field is a +# warning). +DEFAULT_SEVERITY_BY_CODE: dict[str, MappingSeverity] = { + MAP_CODE_LOSSY: MappingSeverity.WARNING, + MAP_CODE_SYNTHESISED: MappingSeverity.WARNING, + MAP_CODE_UNMAPPED: MappingSeverity.INFO, + MAP_CODE_REQUIRED_FIELD_MISSING: MappingSeverity.ERROR, + MAP_CODE_TEMPORAL_COLLAPSE: MappingSeverity.WARNING, +} + + +# ============================================================================= +# Dataclass +# ============================================================================= + + +@dataclass(frozen=True, slots=True) +class MappingWarning: + """A single transformation event surfaced by a UNTP ↔ CIRPASS shim. + + Attributes mirror :class:`dppvalidator.compat.UpgradeWarning` so + consumers can write a single dispatch over both warning kinds. + + Attributes: + code: One of the ``MAP00X`` codes (see module-level constants). + Stable across releases; consumers may pattern-match on it. + path: JSONPath-like locator (e.g. ``$.credentialSubject.name`` + for the source side, or ``$.product.productName`` for the + target). The shim populates the locator on the side the + transformation *originated* — the side the user is + usually trying to debug. + message: Human-readable explanation. Should be actionable + when read in isolation (include both source and target + field names). + severity: One of :class:`MappingSeverity`. + details: Optional free-form context dictionary. 
Used by tests + for stable assertions when the message text isn't + self-describing (e.g. unmapped roles surface the original + role string as ``dict(details)["original_role"]``). + """ + + code: str + path: str + message: str + severity: MappingSeverity = MappingSeverity.WARNING + details: tuple[tuple[str, str], ...] = () + + @classmethod + def lossy(cls, path: str, message: str, **details: str) -> MappingWarning: + """Convenience factory for ``MAP001`` (lossy).""" + return cls( + code=MAP_CODE_LOSSY, + path=path, + message=message, + severity=DEFAULT_SEVERITY_BY_CODE[MAP_CODE_LOSSY], + details=tuple(sorted(details.items())), + ) + + @classmethod + def synthesised(cls, path: str, message: str, **details: str) -> MappingWarning: + """Convenience factory for ``MAP002`` (synthesised).""" + return cls( + code=MAP_CODE_SYNTHESISED, + path=path, + message=message, + severity=DEFAULT_SEVERITY_BY_CODE[MAP_CODE_SYNTHESISED], + details=tuple(sorted(details.items())), + ) + + @classmethod + def unmapped(cls, path: str, message: str, **details: str) -> MappingWarning: + """Convenience factory for ``MAP003`` (unmapped).""" + return cls( + code=MAP_CODE_UNMAPPED, + path=path, + message=message, + severity=DEFAULT_SEVERITY_BY_CODE[MAP_CODE_UNMAPPED], + details=tuple(sorted(details.items())), + ) + + @classmethod + def required_missing(cls, path: str, message: str, **details: str) -> MappingWarning: + """Convenience factory for ``MAP004`` (required-field-missing).""" + return cls( + code=MAP_CODE_REQUIRED_FIELD_MISSING, + path=path, + message=message, + severity=DEFAULT_SEVERITY_BY_CODE[MAP_CODE_REQUIRED_FIELD_MISSING], + details=tuple(sorted(details.items())), + ) + + @classmethod + def temporal_collapse(cls, path: str, message: str, **details: str) -> MappingWarning: + """Convenience factory for ``MAP005`` (temporal collapse).""" + return cls( + code=MAP_CODE_TEMPORAL_COLLAPSE, + path=path, + message=message, + severity=DEFAULT_SEVERITY_BY_CODE[MAP_CODE_TEMPORAL_COLLAPSE], + 
details=tuple(sorted(details.items())), + ) + + +__all__ = [ + "DEFAULT_SEVERITY_BY_CODE", + "MAP_CODES", + "MAP_CODE_LOSSY", + "MAP_CODE_REQUIRED_FIELD_MISSING", + "MAP_CODE_SYNTHESISED", + "MAP_CODE_TEMPORAL_COLLAPSE", + "MAP_CODE_UNMAPPED", + "MappingSeverity", + "MappingWarning", +] diff --git a/src/dppvalidator/compat/_shared.py b/src/dppvalidator/compat/_shared.py new file mode 100644 index 0000000..6027534 --- /dev/null +++ b/src/dppvalidator/compat/_shared.py @@ -0,0 +1,81 @@ +"""Shared helpers for the UNTP ↔ CIRPASS compat shims. + +Phase 8.5 polish (2026-05-09). Lifted from the two shim modules +(``untp_0_7_to_cirpass_1_3.py`` and ``cirpass_1_3_to_untp_0_7.py``) +where the helpers had drifted into near-duplicates. Centralising +them keeps the two shims symmetric and makes future maintenance +(e.g. tightening the ISO 8601 parser) a single-site edit. + +Module-private — the underscore prefix on the filename makes it a +private member of the ``compat`` package; consumers should reach +for the public surface (``to_cirpass_1_3``, ``to_untp_0_7``, +``MappingWarning``, etc.) rather than these helpers directly. +""" + +from __future__ import annotations + +from datetime import datetime +from typing import Any + + +def normalise_iso8601(value: Any) -> str: + """Normalise a datetime-shaped scalar to an ISO 8601 string. + + Accepts: + + - :class:`datetime.datetime` instances → ``.isoformat()``. + - Pre-formatted strings → returned unchanged. + - Anything else → ``str(value)``. + + The function does *not* validate the timestamp shape — the + model layer is responsible for catching malformed inputs at + validate time, and surfacing them as ``MDL050``-coded errors. + Both shims previously rolled their own (slightly divergent) + versions of this helper; lifting it here keeps the + forward / reverse shim symmetric. 
+ """ + if isinstance(value, datetime): + return value.isoformat() + if isinstance(value, str): + return value + return str(value) + + +def pick_localised(items: Any, default_language: str) -> tuple[str | None, list[str]]: + """Pick a single string from a list of LocalisedText dicts. + + Returns ``(picked_value, dropped_languages)``: + + - ``picked_value`` is the ``value`` field of the entry whose + ``language`` matches ``default_language`` exactly. When no + entry matches, falls back to the first list entry. ``None`` + when the list is empty / not a list. + - ``dropped_languages`` lists every *other* language present so + callers can emit one ``MAP001`` per dropped entry. + + Used by the reverse shim (CIRPASS LocalisedText[] → UNTP scalar + string) — the forward shim wraps UNTP scalars in a single-entry + list directly without needing this helper. + """ + if not isinstance(items, list) or not items: + return None, [] + picked: dict[str, Any] | None = None + for item in items: + if isinstance(item, dict) and item.get("language") == default_language: + picked = item + break + if picked is None: + for item in items: + if isinstance(item, dict): + picked = item + break + picked_value = picked.get("value") if isinstance(picked, dict) else None + dropped_languages = [ + item.get("language") or "" + for item in items + if isinstance(item, dict) and item is not picked + ] + return (str(picked_value) if picked_value is not None else None), dropped_languages + + +__all__ = ["normalise_iso8601", "pick_localised"] diff --git a/src/dppvalidator/compat/_untp_cirpass_map.py b/src/dppvalidator/compat/_untp_cirpass_map.py new file mode 100644 index 0000000..f8245c1 --- /dev/null +++ b/src/dppvalidator/compat/_untp_cirpass_map.py @@ -0,0 +1,338 @@ +"""Declarative step table for UNTP 0.7.0 ↔ CIRPASS 1.3.0 transformations. + +Phase 5 task 5.2 of [docs/plans/CIRPASS_2_MIGRATION.md]. 
The two +shims (forward / reverse) read from this table so the *what's-mapped- +to-what* contract lives in one place. Each step is a self-contained +record: + +- ``forward_step`` / ``reverse_step`` are descriptive labels — + they show up in MAP-warning messages so consumers can grep for + the step that emitted them. +- ``untp_path`` and ``cirpass_path`` are JSONPath-style locators + for the source and target fields. Used in warning ``path`` + attributes. +- ``lossless`` is True when the step has a clean reversible + identity over the documented lossless subset; False when the + step is one-way (e.g. UNTP scorecards → CIRPASS LCA results + loses scorecard scoring). Round-trip property tests filter + on ``lossless=True``. +- ``codes`` records the ``MAP00X`` codes the step *may* emit. Tests + pin against this so accidental code drift fails CI. + +The table is the source of truth for the lossless-subset reference +doc at ``docs/concepts/untp-cirpass-mapping.md``; the doc is +*derived* from this list (Phase 5 task 5.9). +""" + +from __future__ import annotations + +from dataclasses import dataclass + +from dppvalidator.compat._mapping_codes import ( + MAP_CODE_LOSSY, + MAP_CODE_REQUIRED_FIELD_MISSING, + MAP_CODE_SYNTHESISED, + MAP_CODE_TEMPORAL_COLLAPSE, + MAP_CODE_UNMAPPED, +) + + +@dataclass(frozen=True, slots=True) +class MappingStep: + """One row in the declarative UNTP ↔ CIRPASS map.""" + + step_id: str + """Stable identifier (``M01``, ``M02``, …). Stable across releases + so MAP-warnings can reference the step that emitted them.""" + + description: str + """Short human-readable summary of the transformation.""" + + untp_path: str + """JSONPath of the source/target field on the UNTP side.""" + + cirpass_path: str + """JSONPath of the source/target field on the CIRPASS side.""" + + lossless: bool + """Whether the round-trip preserves the field bit-for-bit (over + the documented lossless subset).""" + + codes: tuple[str, ...] 
+ """MAP-codes that this step may emit.""" + + +# ============================================================================= +# Step table (Phase 5) +# ============================================================================= +# +# Order matters for MAP-warning ordering inside both shims: each +# shim iterates the table top-to-bottom so warnings appear in +# transformation order. + +MAPPING_STEPS: tuple[MappingStep, ...] = ( + # ------------------------------------------------------------------- + # Identifiers — DPP / Product / Party / Facility (G18) + # ------------------------------------------------------------------- + MappingStep( + step_id="M01", + description="DPP credential identifier ↔ DPP identifier object", + untp_path="$.id", + cirpass_path="$.dppIdentifier.value", + lossless=True, + codes=(MAP_CODE_SYNTHESISED, MAP_CODE_REQUIRED_FIELD_MISSING), + ), + MappingStep( + step_id="M02", + description="Product credential subject ↔ root product", + untp_path="$.credentialSubject", + cirpass_path="$.product", + lossless=True, + codes=(MAP_CODE_REQUIRED_FIELD_MISSING,), + ), + MappingStep( + step_id="M03", + description=( + "Product identifier scheme: UNTP IdentifierScheme(id, name) " + "↔ CIRPASS Identifier(scheme, schemeName)" + ), + untp_path="$.credentialSubject.idScheme", + cirpass_path="$.product.productIdentifier.scheme", + lossless=True, + codes=(MAP_CODE_UNMAPPED,), + ), + # ------------------------------------------------------------------- + # Names / labels (G16 — i18n) + # ------------------------------------------------------------------- + MappingStep( + step_id="M04", + description=( + "DPP envelope name (UNTP scalar string) ↔ a synthesised " + "single-language LocalisedText is *not* part of the CIRPASS " + "tree-view; UNTP `name` is informational only and is " + "passed through to the reverse shim if present" + ), + untp_path="$.name", + cirpass_path="(not mapped — see notes)", + lossless=False, + codes=(MAP_CODE_LOSSY,), + ), + 
+ MappingStep( + step_id="M05", + description=( + "Product name: UNTP scalar string ↔ CIRPASS LocalisedText[]; " + "forward synthesises a single-entry list with the caller-" + "supplied default language. Reverse keeps the default-" + "language entry (else the first); dropped languages emit MAP001." + ), + untp_path="$.credentialSubject.name", + cirpass_path="$.product.productName", + lossless=False, + codes=(MAP_CODE_LOSSY, MAP_CODE_SYNTHESISED), + ), + MappingStep( + step_id="M06", + description=( + "Product description: UNTP scalar string ↔ CIRPASS " + "LocalisedText[]; same lossy rule as M05 in reverse." + ), + untp_path="$.credentialSubject.description", + cirpass_path="$.product.description", + lossless=False, + codes=(MAP_CODE_LOSSY, MAP_CODE_SYNTHESISED), + ), + # ------------------------------------------------------------------- + # Temporal (G17) + # ------------------------------------------------------------------- + MappingStep( + step_id="M07", + description=( + "Issuance: UNTP `validFrom` (envelope) ↔ CIRPASS `issuedAt.timestamp`. " + "Forward copies validFrom into issuedAt (the DPP was issued at the " + "moment it became valid). Reverse re-emits the same datetime." + ), + untp_path="$.validFrom", + cirpass_path="$.issuedAt.timestamp", + lossless=True, + codes=(MAP_CODE_REQUIRED_FIELD_MISSING,), + ), + MappingStep( + step_id="M08", + description=( + "Effective period: UNTP (validFrom, validUntil) ↔ " + "CIRPASS EffectivePeriod(start, end). Lossless when both " + "endpoints are present; MAP005 emitted on the forward side " + "when validUntil is absent and the caller still wants an " + "EffectivePeriod attached."
+ ), + untp_path="$.validFrom + $.validUntil", + cirpass_path="$.effectivePeriod", + lossless=True, + codes=(MAP_CODE_TEMPORAL_COLLAPSE,), + ), + # ------------------------------------------------------------------- + # Commodity classification + # ------------------------------------------------------------------- + MappingStep( + step_id="M09", + description=( + "Product category list: UNTP Classification[] ↔ CIRPASS " + "ClassificationCode[]. Both carry (code, scheme) pairs; " + "name-i18n flattens via M05's lossy rule." + ), + untp_path="$.credentialSubject.productCategory", + cirpass_path="$.product.commodityCode", + lossless=False, + codes=(MAP_CODE_LOSSY, MAP_CODE_UNMAPPED), + ), + # ------------------------------------------------------------------- + # Materials + # ------------------------------------------------------------------- + MappingStep( + step_id="M10", + description=( + "Material composition: UNTP Material[] ↔ CIRPASS " + "Composition.materials[]. mass_fraction / origin_country / " + "is_recycled all carry one-to-one. Material names flatten " + "via M05's lossy rule (UNTP carries a scalar name; CIRPASS " + "carries a localised list)." + ), + untp_path="$.credentialSubject.materialProvenance", + cirpass_path="$.composition.materials", + lossless=False, + codes=(MAP_CODE_LOSSY, MAP_CODE_SYNTHESISED), + ), + # ------------------------------------------------------------------- + # Actors / parties + # ------------------------------------------------------------------- + MappingStep( + step_id="M11", + description=( + "Related parties: UNTP PartyRole[] ↔ CIRPASS " + "ActorRole[]. Role enums map through " + "EUDPPRoleClass; unmapped UNTP roles synthesise " + "EUDPPRoleClass.ECONOMIC_OPERATOR with a MAP002 warning." 
+ ), + untp_path="$.credentialSubject.relatedParty", + cirpass_path="$.relatedActors", + lossless=False, + codes=(MAP_CODE_LOSSY, MAP_CODE_SYNTHESISED, MAP_CODE_UNMAPPED), + ), + # ------------------------------------------------------------------- + # Issuer + # ------------------------------------------------------------------- + MappingStep( + step_id="M12", + description=( + "Issuer (UNTP envelope) ↔ CIRPASS related-actor with " + "ManufacturerRole. Forward synthesises a related-actor " + "from the issuer when no manufacturer party is present " + "(emits MAP002). Reverse extracts the manufacturer " + "actor and re-emits it as the issuer." + ), + untp_path="$.issuer", + cirpass_path="$.relatedActors[?(@.role=='eudpp:ManufacturerRole')]", + lossless=False, + codes=(MAP_CODE_SYNTHESISED, MAP_CODE_LOSSY), + ), + # ------------------------------------------------------------------- + # Performance / scorecards / LCA + # ------------------------------------------------------------------- + MappingStep( + step_id="M13", + description=( + "Performance claims: UNTP performanceClaim[] (scoped by " + "conformityTopic) ↔ CIRPASS LifeCycleAssessment results " + "for ESPR Annex impact categories. The full v0.7 claim " + "shape (assessor, evidence, claimedBenchmark) does not " + "round-trip — forward emits MAP001 for dropped fields." + ), + untp_path="$.credentialSubject.performanceClaim", + cirpass_path="$.lca", + lossless=False, + codes=(MAP_CODE_LOSSY, MAP_CODE_UNMAPPED), + ), + # ------------------------------------------------------------------- + # Substances of concern (UNTP has no first-class equivalent) + # ------------------------------------------------------------------- + MappingStep( + step_id="M14", + description=( + "CIRPASS substancesOfConcern has no UNTP equivalent in " + "the v0.7 base schema (the textile pilot extends with " + "substances; the base schema doesn't). Forward → CIRPASS " + "leaves the field empty; reverse drops the array with " + "MAP001." 
+ ), + untp_path="(no source field)", + cirpass_path="$.substancesOfConcern", + lossless=False, + codes=(MAP_CODE_LOSSY,), + ), + # ------------------------------------------------------------------- + # Cross-module connector relations (CIRPASS-only) + # ------------------------------------------------------------------- + MappingStep( + step_id="M15", + description=( + "CIRPASS connectorRelations are a CIRPASS-only construct " + "(no UNTP equivalent). Reverse drops with MAP001; forward " + "leaves empty." + ), + untp_path="(no source field)", + cirpass_path="$.connectorRelations", + lossless=False, + codes=(MAP_CODE_LOSSY,), + ), +) + + +# ============================================================================= +# Public helpers +# ============================================================================= + + +def step(step_id: str) -> MappingStep: + """Return the :class:`MappingStep` row with the given ID. + + Raises ``KeyError`` if no row matches — the shims pattern-match + on stable IDs, so a missing one is a programming error. + """ + for row in MAPPING_STEPS: + if row.step_id == step_id: + return row + msg = f"Unknown mapping step ID: {step_id!r}" + raise KeyError(msg) + + +def lossless_step_ids() -> tuple[str, ...]: + """Return IDs of every step on the documented lossless subset.""" + return tuple(s.step_id for s in MAPPING_STEPS if s.lossless) + + +def codes_for_step(step_id: str) -> tuple[str, ...]: + """Return the MAP codes that ``step_id`` may emit.""" + return step(step_id).codes + + +def all_codes_in_use() -> set[str]: + """Set of every MAP code referenced by at least one step. + + Used by ``test_mapping_codes_have_step_coverage`` to assert + every code in :data:`MAP_CODES` is exercised by at least one + declared transformation. 
+ """ + codes: set[str] = set() + for row in MAPPING_STEPS: + codes.update(row.codes) + return codes + + +__all__ = [ + "MAPPING_STEPS", + "MappingStep", + "all_codes_in_use", + "codes_for_step", + "lossless_step_ids", + "step", +] diff --git a/src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py b/src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py new file mode 100644 index 0000000..acefef2 --- /dev/null +++ b/src/dppvalidator/compat/cirpass_1_3_to_untp_0_7.py @@ -0,0 +1,739 @@ +"""Compatibility shim: rewrite CIRPASS v1.3.0 payloads into UNTP DPP 0.7.0 shape. + +Phase 5 task 5.4 of [docs/plans/CIRPASS_2_MIGRATION.md]. Reverse +direction of :mod:`untp_0_7_to_cirpass_1_3`. Together the two shims +form the round-trip identity over the documented lossless subset +(see ``docs/concepts/untp-cirpass-mapping.md`` for the field-by- +field table). + +Lossy transformations on the reverse side: + +- CIRPASS LocalisedText[] → UNTP scalar string. The reverse picks + the ``default_language`` entry (else the first); the others are + dropped with one ``MAP001`` per language. +- CIRPASS substancesOfConcern / connectorRelations / lca have no + v0.7 base equivalent. The reverse drops them with ``MAP001`` per + array. +- Multi-actor relationships beyond the manufacturer / single-issuer + pair flatten into the UNTP relatedParty[] list, with role enum + remapped through :data:`_EUDPP_TO_UNTP_ROLE`.
+""" + +from __future__ import annotations + +from copy import deepcopy +from typing import Any + +from dppvalidator.compat._identifier_schemes import to_untp as _id_scheme_to_untp +from dppvalidator.compat._mapping_codes import MappingWarning +from dppvalidator.compat._shared import ( + normalise_iso8601 as _normalise_iso8601, +) +from dppvalidator.compat._shared import ( + pick_localised as _pick_localised, +) +from dppvalidator.logging import get_logger +from dppvalidator.vocabularies.eudpp_actors import EUDPPRoleClass + +logger = get_logger(__name__) + + +# ============================================================================= +# Constants +# ============================================================================= + + +# UNTP context URLs and fixed envelope shape. We pin the v0.7.0 +# context and W3C VC v2 context here. +_UNTP_CONTEXT_URLS = ( + "https://www.w3.org/ns/credentials/v2", + "https://vocabulary.uncefact.org/untp/0.7.0/context/", +) + +# Default UNTP envelope name. CIRPASS does not carry an envelope-level +# name (the CIRPASS message is the credential subject directly). +# Reverse shim uses the product's first localised name as the envelope +# name, falling back to a generic placeholder. +_DEFAULT_ENVELOPE_NAME = "Digital Product Passport" + +# EUDPP role IRI → UNTP PartyRoleEnum value. The forward shim used a +# many-to-many mapping (e.g. ``recycler`` and ``remanufacturer`` both +# project to ``RecyclerRole`` super-category in some cases); reversing +# is therefore lossy. We pick the most-specific UNTP role that the +# v0.7.0 PartyRoleEnum exposes. 
+_EUDPP_TO_UNTP_ROLE: dict[str, str] = { + EUDPPRoleClass.MANUFACTURER.value: "manufacturer", + EUDPPRoleClass.IMPORTER.value: "importer", + EUDPPRoleClass.DISTRIBUTOR.value: "distributor", + EUDPPRoleClass.DEALER.value: "retailer", + EUDPPRoleClass.FULFILMENT_PROVIDER.value: "logisticsProvider", + EUDPPRoleClass.AUTHORISED_REP.value: "operator", + EUDPPRoleClass.RECYCLER.value: "recycler", + EUDPPRoleClass.REFURBISHER.value: "remanufacturer", + EUDPPRoleClass.REMANUFACTURER.value: "remanufacturer", + EUDPPRoleClass.CIRCULAR_ECONOMY_ROLE.value: "recycler", + EUDPPRoleClass.AUTHORITY.value: "regulator", + EUDPPRoleClass.MARKET_SURVEILLANCE.value: "regulator", + EUDPPRoleClass.CUSTOMS.value: "regulator", + EUDPPRoleClass.CUSTOMER.value: "owner", + EUDPPRoleClass.CONSUMER.value: "owner", + EUDPPRoleClass.END_USER.value: "owner", + EUDPPRoleClass.INDEPENDENT_OPERATOR.value: "serviceProvider", + EUDPPRoleClass.PROFESSIONAL_REPAIRER.value: "serviceProvider", + EUDPPRoleClass.DPP_SERVICE_PROVIDER.value: "serviceProvider", + EUDPPRoleClass.CONFORMITY_BODY.value: "certifier", + EUDPPRoleClass.CONFORMITY_ASSESSMENT_ROLE.value: "certifier", + EUDPPRoleClass.NOTIFIED_BODY.value: "certifier", + EUDPPRoleClass.CREDENTIAL_AGENCY.value: "serviceProvider", + EUDPPRoleClass.ISSUING_AGENCY.value: "serviceProvider", + EUDPPRoleClass.ECONOMIC_OPERATOR.value: "manufacturer", + EUDPPRoleClass.ROLE.value: "manufacturer", +} + + +# ============================================================================= +# Public entry point +# ============================================================================= + + +def to_untp_0_7( + data: dict[str, Any], + *, + default_language: str = "en", + issuer_did: str = "did:web:example.com", + issuer_name: str = "Unknown Issuer", + untp_id_granularity: str = "model", + country_lookup: dict[str, str] | None = None, + identifier_scheme_lookup: dict[str, tuple[str, str]] | None = None, +) -> tuple[dict[str, Any], list[MappingWarning]]: + """Project a 
CIRPASS reference structure v1.3.0 onto UNTP DPP 0.7.0. + + Args: + data: The CIRPASS payload as a parsed JSON ``dict``. Input is + deep-copied; the caller's object is never mutated. + default_language: Preferred BCP-47 tag when picking the + single-language UNTP scalar from a CIRPASS multilingual + list. Falls back to the first list entry if no match. + issuer_did: DID / URI to use as the synthesised UNTP envelope + issuer ``id`` when CIRPASS carries no manufacturer-role + actor. Defaults to ``did:web:example.com`` (deliberately + obviously-fake). + issuer_name: Human-readable issuer name fallback. + untp_id_granularity: ``"item"`` / ``"batch"`` / ``"model"``. + CIRPASS doesn't carry granularity; the caller picks one. + Defaults to ``"model"`` (the safest choice — doesn't + require itemNumber / batchNumber). + country_lookup: Optional ISO-3166 alpha-2 → country-name map. + Used to populate ``Country.countryName`` from a bare + ``Material.originCountry`` code. + identifier_scheme_lookup: Optional override for the bundled + scheme map. Keys are CIRPASS ``Identifier.scheme`` URIs; + values are ``(untp_id_scheme_id, untp_id_scheme_name)``. + + Returns: + Tuple of ``(untp_dict, warnings)``. The dict is ready to be + validated against the UNTP 0.7.0 envelope. 
+ """ + if not isinstance(data, dict): + msg = f"to_untp_0_7() requires a dict, got {type(data).__name__!r}" + raise TypeError(msg) + + source = deepcopy(data) + warnings: list[MappingWarning] = [] + country_lookup = country_lookup or {} + identifier_scheme_lookup = identifier_scheme_lookup or {} + + untp: dict[str, Any] = { + "@context": list(_UNTP_CONTEXT_URLS), + "type": ["DigitalProductPassport", "VerifiableCredential"], + } + + # M01 — DPP identifier + _step_dpp_id(source, untp, warnings) + + # M02 + M03 + M05 + M06 + M09 — credentialSubject + _step_credential_subject( + source, + untp, + warnings, + default_language=default_language, + identifier_scheme_lookup=identifier_scheme_lookup, + untp_id_granularity=untp_id_granularity, + country_lookup=country_lookup, + ) + + # M07 + M08 — temporal envelope + _step_temporal(source, untp, warnings) + + # M11 + M12 — actors → relatedParty[] + envelope.issuer + _step_actors( + source, + untp, + warnings, + default_language=default_language, + issuer_did=issuer_did, + issuer_name=issuer_name, + ) + + # M13 + M14 + M15 — drop fields with no UNTP equivalent. + _step_drop_unmappable(source, untp, warnings) + + # Synthesise the envelope ``name`` field (UNTP requires it) from + # the credentialSubject.name. 
+ if "name" not in untp: + untp_subject = untp.get("credentialSubject") or {} + untp["name"] = ( + untp_subject.get("name") if isinstance(untp_subject, dict) else None + ) or _DEFAULT_ENVELOPE_NAME + + return untp, warnings + + +# ============================================================================= +# Step implementations +# ============================================================================= + + +def _step_dpp_id( + source: dict[str, Any], + untp: dict[str, Any], + warnings: list[MappingWarning], +) -> None: + """M01 — CIRPASS dppIdentifier.value → UNTP envelope.id.""" + dpp = source.get("dppIdentifier") + if not isinstance(dpp, dict): + warnings.append( + MappingWarning.required_missing( + path="$.id", + message=("CIRPASS dppIdentifier missing; UNTP envelope.id left as an empty placeholder."), + step="M01", + ) + ) + untp["id"] = "" + return + untp["id"] = dpp.get("value") or "" + + +def _step_credential_subject( + source: dict[str, Any], + untp: dict[str, Any], + warnings: list[MappingWarning], + *, + default_language: str, + identifier_scheme_lookup: dict[str, tuple[str, str]], + untp_id_granularity: str, + country_lookup: dict[str, str], +) -> None: + """M02 + M03 + M05 + M06 + M09 + M10 — Product credentialSubject.""" + product = source.get("product") + if not isinstance(product, dict): + warnings.append( + MappingWarning.required_missing( + path="$.credentialSubject", + message=( + "CIRPASS payload is missing ``product``; UNTP requires " + "credentialSubject. Synthesising an empty subject."
+ ), + step="M02", + ) + ) + untp["credentialSubject"] = {} + return + + subject: dict[str, Any] = {"type": ["Product"]} + + # M03 — productIdentifier → id + idScheme + pid = product.get("productIdentifier") or {} + pid_value = pid.get("value") if isinstance(pid, dict) else None + pid_scheme = pid.get("scheme") if isinstance(pid, dict) else None + pid_scheme_name = pid.get("schemeName") if isinstance(pid, dict) else None + if pid_scheme and pid_scheme in identifier_scheme_lookup: + scheme_id, scheme_name = identifier_scheme_lookup[pid_scheme] + mapping = None + else: + scheme_id, scheme_name, mapping = _id_scheme_to_untp(pid_scheme or "", pid_scheme_name) + subject["id"] = pid_value or "" + subject["idScheme"] = {"id": scheme_id, "name": scheme_name} + if pid_scheme and mapping is None: + warnings.append( + MappingWarning.unmapped( + path="$.product.productIdentifier.scheme", + message=( + f"CIRPASS scheme {pid_scheme!r} is not in the bundled " + "lookup; passing through verbatim." + ), + step="M03", + scheme=pid_scheme, + ) + ) + + # M05 — productName[] → scalar name + name_list = product.get("productName") + name, dropped_languages = _pick_localised(name_list, default_language) + subject["name"] = name or "" + for lang in dropped_languages: + warnings.append( + MappingWarning.lossy( + path="$.credentialSubject.name", + message=( + f"Dropped CIRPASS productName entry with language=" + f"{lang!r} when projecting onto UNTP scalar name " + "(UNTP 0.7.0 has no envelope-level multilingual " + "structure for product name)." 
+ ), + step="M05", + language=lang, + ) + ) + + # M06 — description[] → scalar + desc_list = product.get("description") + desc, dropped_desc_langs = _pick_localised(desc_list, default_language) + if desc: + subject["description"] = desc + for lang in dropped_desc_langs: + warnings.append( + MappingWarning.lossy( + path="$.credentialSubject.description", + message=( + f"Dropped CIRPASS description entry with language=" + f"{lang!r} when projecting onto UNTP scalar string." + ), + step="M06", + language=lang, + ) + ) + + # M09 — commodityCode[] → productCategory[] + commodity = product.get("commodityCode") + if isinstance(commodity, list) and commodity: + categories: list[dict[str, Any]] = [] + for i, cc in enumerate(commodity): + if not isinstance(cc, dict): + continue + scheme_uri = cc.get("scheme") or "" + cls_scheme_id, cls_scheme_name, _ = _id_scheme_to_untp(scheme_uri, None) + label_list = cc.get("name") + label, dropped_label_langs = _pick_localised(label_list, default_language) + if not label: + # UNTP Classification.name is required; synthesise a + # placeholder from the code. + label = cc.get("code") or "(unknown)" + warnings.append( + MappingWarning.synthesised( + path=f"$.credentialSubject.productCategory[{i}].name", + message=( + "CIRPASS commodityCode entry has no name[]; " + "synthesised UNTP Classification.name from " + "the code." + ), + step="M09", + ) + ) + for lang in dropped_label_langs: + warnings.append( + MappingWarning.lossy( + path=f"$.credentialSubject.productCategory[{i}].name", + message=( + f"Dropped commodityCode name entry with " + f"language={lang!r} during UNTP projection." + ), + step="M09", + language=lang, + ) + ) + categories.append( + { + "schemeId": scheme_uri, + "schemeName": cls_scheme_name, + "code": cc.get("code") or "", + "name": label, + } + ) + if categories: + subject["productCategory"] = categories + + # UNTP requires productCategory ≥ 1; synthesise a placeholder when + # CIRPASS carries no commodityCode. 
Unspecified is the safest + # fallback — downstream consumers can detect the placeholder via + # the ``unspecified`` code value. + if "productCategory" not in subject: + subject["productCategory"] = [ + { + "schemeId": "https://w3id.org/eudpp#CommodityCode", + "schemeName": "EUDPP CommodityCode", + "code": "unspecified", + "name": "Unspecified", + } + ] + warnings.append( + MappingWarning.synthesised( + path="$.credentialSubject.productCategory", + message=( + "CIRPASS payload has no commodityCode; UNTP requires " + "productCategory ≥ 1. Synthesised an unspecified " + "placeholder Classification." + ), + step="M09", + ) + ) + + # M02 fields the UNTP schema requires but CIRPASS doesn't carry + # (idGranularity, producedAtFacility, countryOfProduction). The + # caller can override countryOfProduction via the manufacturer's + # actor address; we synthesise sensible defaults. + subject["idGranularity"] = untp_id_granularity + if "producedAtFacility" not in subject: + subject["producedAtFacility"] = { + "id": "https://example.com/facility/unspecified", + "type": ["Facility"], + "name": "Unspecified Facility", + } + warnings.append( + MappingWarning.synthesised( + path="$.credentialSubject.producedAtFacility", + message=( + "UNTP requires producedAtFacility but CIRPASS does " + "not carry an equivalent root-level field; " + "synthesised a placeholder facility." + ), + step="M02", + ) + ) + if "countryOfProduction" not in subject: + # Best-effort: the first material's originCountry.
+ composition = source.get("composition") or {} + materials = composition.get("materials") if isinstance(composition, dict) else None + first_country: str | None = None + if isinstance(materials, list): + for m in materials: + if isinstance(m, dict) and m.get("originCountry"): + first_country = m["originCountry"] + break + if first_country: + country_obj: dict[str, Any] = {"countryCode": first_country} + if first_country in country_lookup: + country_obj["countryName"] = country_lookup[first_country] + subject["countryOfProduction"] = country_obj + else: + subject["countryOfProduction"] = {"countryCode": "ZZ"} + warnings.append( + MappingWarning.synthesised( + path="$.credentialSubject.countryOfProduction", + message=( + "UNTP requires countryOfProduction; CIRPASS " + "carries no root-level country. Synthesised " + "ISO ``ZZ`` (unknown) — the caller should " + "supply a real country." + ), + step="M02", + ) + ) + + # M10 — composition.materials[] → materialProvenance[] + composition = source.get("composition") or {} + materials = composition.get("materials") if isinstance(composition, dict) else None + if isinstance(materials, list) and materials: + provenance = [] + for i, m in enumerate(materials): + if not isinstance(m, dict): + continue + mname_list = m.get("materialName") + mname, dropped_mname_langs = _pick_localised(mname_list, default_language) + for lang in dropped_mname_langs: + warnings.append( + MappingWarning.lossy( + path=f"$.credentialSubject.materialProvenance[{i}].name", + message=( + f"Dropped material name entry with language=" + f"{lang!r} during UNTP projection." 
+ ), + step="M10", + language=lang, + ) + ) + country_code = m.get("originCountry") + country_obj: dict[str, Any] | None = None + if country_code: + country_obj = {"countryCode": country_code} + if country_code in country_lookup: + country_obj["countryName"] = country_lookup[country_code] + material_type_code = m.get("materialType") + material_type_obj: dict[str, Any] | None = None + if material_type_code: + # CIRPASS bare ISO 2076 code → UNTP Classification. + # The scheme URI is synthesised; consumers can swap in + # a more-specific scheme via Phase 6's CLI override. + material_type_obj = { + "schemeId": "https://w3id.org/eudpp#MaterialType", + "schemeName": "ISO 2076 / EUDPP MaterialType", + "code": material_type_code, + "name": material_type_code, + } + mass_fraction = m.get("massFraction") + entry: dict[str, Any] = {"name": mname or ""} + if country_obj is not None: + entry["originCountry"] = country_obj + else: + # UNTP Material requires originCountry; synthesise. + entry["originCountry"] = {"countryCode": "ZZ"} + warnings.append( + MappingWarning.synthesised( + path=(f"$.credentialSubject.materialProvenance[{i}].originCountry"), + message=( + "CIRPASS material has no originCountry; " + "synthesised ISO ``ZZ`` (unknown)." + ), + step="M10", + ) + ) + if material_type_obj is not None: + entry["materialType"] = material_type_obj + else: + # UNTP requires materialType; synthesise a minimal + # placeholder Classification. + entry["materialType"] = { + "schemeId": "https://w3id.org/eudpp#MaterialType", + "schemeName": "ISO 2076 / EUDPP MaterialType", + "code": "unspecified", + "name": "Unspecified", + } + warnings.append( + MappingWarning.synthesised( + path=(f"$.credentialSubject.materialProvenance[{i}].materialType"), + message=( + "CIRPASS material has no materialType; synthesised an UNTP placeholder." 
+ ), + step="M10", + ) + ) + if mass_fraction is not None: + entry["massFraction"] = float(mass_fraction) + else: + # UNTP requires massFraction; synthesise zero (the + # validator will mark it as incomplete via SEM001). + entry["massFraction"] = 0.0 + warnings.append( + MappingWarning.synthesised( + path=(f"$.credentialSubject.materialProvenance[{i}].massFraction"), + message=( + "CIRPASS material has no massFraction; UNTP " + "requires it. Synthesised 0.0 (caller must " + "supply real values)." + ), + step="M10", + ) + ) + if m.get("isRecycled"): + # UNTP carries recycledMassFraction (a number); CIRPASS + # carries a boolean. The semantic mapping is "100% of + # this material's mass is recycled" — emit massFraction. + entry["recycledMassFraction"] = float(mass_fraction or 1.0) + provenance.append(entry) + if provenance: + subject["materialProvenance"] = provenance + + untp["credentialSubject"] = subject + + +def _step_temporal( + source: dict[str, Any], + untp: dict[str, Any], + warnings: list[MappingWarning], +) -> None: + """M07 + M08 — CIRPASS temporal → UNTP envelope.validFrom / validUntil.""" + issued_at = source.get("issuedAt") or {} + issued_ts = issued_at.get("timestamp") if isinstance(issued_at, dict) else None + period = source.get("effectivePeriod") or {} + start = period.get("start") if isinstance(period, dict) else None + end = period.get("end") if isinstance(period, dict) else None + valid_from = start or issued_ts + if not valid_from: + warnings.append( + MappingWarning.required_missing( + path="$.validFrom", + message=( + "CIRPASS issuedAt.timestamp and effectivePeriod.start " + "are both missing; UNTP requires validFrom. Output " + "validFrom is empty." 
+ ), + step="M07", + ) + ) + valid_from = "" + untp["validFrom"] = _normalise_iso8601(valid_from) if valid_from else "" + if end: + untp["validUntil"] = _normalise_iso8601(end) + + +def _step_actors( + source: dict[str, Any], + untp: dict[str, Any], + warnings: list[MappingWarning], + *, + default_language: str, + issuer_did: str, + issuer_name: str, +) -> None: + """M11 + M12 — CIRPASS relatedActors[] → UNTP relatedParty[] + issuer.""" + actors = source.get("relatedActors") + if not isinstance(actors, list): + actors = [] + related_party: list[dict[str, Any]] = [] + issuer: dict[str, Any] | None = None + + for i, ar in enumerate(actors): + if not isinstance(ar, dict): + continue + actor = ar.get("actor") + role = ar.get("role") + if not isinstance(actor, dict): + continue + party = _project_party(actor, default_language) + untp_role = _EUDPP_TO_UNTP_ROLE.get(role or "") + if untp_role is None: + untp_role = "manufacturer" + warnings.append( + MappingWarning.synthesised( + path=f"$.credentialSubject.relatedParty[{i}].role", + message=( + f"CIRPASS role {role!r} has no canonical UNTP " + "PartyRole; falling back to ``manufacturer``." + ), + step="M11", + original_role=str(role), + ) + ) + # First manufacturer role becomes the envelope issuer + # (matches the M12 forward-shim invariant: forward synthesises + # a manufacturer from the issuer when relatedParty has none). + if untp_role == "manufacturer" and issuer is None: + issuer = { + "id": party["id"] or issuer_did, + "name": party["name"] or issuer_name, + "type": ["CredentialIssuer"], + } + related_party.append({"role": untp_role, "party": party}) + + subject = untp.get("credentialSubject") + if isinstance(subject, dict) and related_party: + subject["relatedParty"] = related_party + + if issuer is None: + # Synthesise an issuer from the first actor entry (or a fully + # placeholder when there are no actors at all). 
+ if related_party: + first = related_party[0]["party"] + issuer = { + "id": first.get("id") or issuer_did, + "name": first.get("name") or issuer_name, + "type": ["CredentialIssuer"], + } + warnings.append( + MappingWarning.synthesised( + path="$.issuer", + message=( + "No CIRPASS actor with ManufacturerRole; UNTP " + "envelope.issuer synthesised from the first " + "related-actor entry." + ), + step="M12", + ) + ) + else: + issuer = { + "id": issuer_did, + "name": issuer_name, + "type": ["CredentialIssuer"], + } + warnings.append( + MappingWarning.synthesised( + path="$.issuer", + message=( + "CIRPASS payload carried no related actors; UNTP " + f"envelope.issuer synthesised from caller-" + f"supplied issuer_did={issuer_did!r} / " + f"issuer_name={issuer_name!r}." + ), + step="M12", + ) + ) + untp["issuer"] = issuer + + +def _project_party(actor: dict[str, Any], default_language: str) -> dict[str, Any]: + """Lift a CIRPASS Actor onto a UNTP Party.""" + identifier = actor.get("actorIdentifier") or {} + actor_id = identifier.get("value") if isinstance(identifier, dict) else None + scheme_uri = identifier.get("scheme") if isinstance(identifier, dict) else None + scheme_name = identifier.get("schemeName") if isinstance(identifier, dict) else None + untp_scheme_id, untp_scheme_name, _ = _id_scheme_to_untp(scheme_uri or "", scheme_name) + name_list = actor.get("actorName") + name, _ = _pick_localised(name_list, default_language) + party: dict[str, Any] = { + "id": actor_id or "", + "name": name or "", + "type": ["Party"], + "idScheme": {"id": untp_scheme_id, "name": untp_scheme_name}, + } + return party + + +def _step_drop_unmappable( + source: dict[str, Any], + untp: dict[str, Any], # noqa: ARG001 — symmetry with the other step helpers + warnings: list[MappingWarning], +) -> None: + """M13 + M14 + M15 — drop CIRPASS-only fields with MAP001.""" + if source.get("substancesOfConcern"): + warnings.append( + MappingWarning.lossy( + 
path="$.credentialSubject.materialProvenance", + message=( + f"CIRPASS substancesOfConcern (count=" + f"{len(source['substancesOfConcern'])}) has no UNTP " + "0.7.0 base equivalent; dropped during reverse mapping." + ), + step="M14", + ) + ) + if source.get("connectorRelations"): + warnings.append( + MappingWarning.lossy( + path="(no target field)", + message=( + f"CIRPASS connectorRelations (count=" + f"{len(source['connectorRelations'])}) is a CIRPASS-" + "only construct; dropped during reverse mapping." + ), + step="M15", + ) + ) + if source.get("lca"): + results = source["lca"].get("results") if isinstance(source["lca"], dict) else None + count = len(results) if isinstance(results, list) else 0 + warnings.append( + MappingWarning.lossy( + path="$.credentialSubject.performanceClaim", + message=( + f"CIRPASS LifeCycleAssessment (results=" + f"{count}) is dropped during reverse mapping. Phase 7 " + "pilot lifts cover the supported subset." + ), + step="M13", + ) + ) + + +# ============================================================================= +# Helpers +# ============================================================================= +# +# Cross-shim helpers (``_pick_localised``, ``_normalise_iso8601``) live at +# :mod:`dppvalidator.compat._shared` so the forward / reverse shims stay +# symmetric. + + +__all__ = [ + "to_untp_0_7", +] diff --git a/src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py b/src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py new file mode 100644 index 0000000..28a5dde --- /dev/null +++ b/src/dppvalidator/compat/untp_0_7_to_cirpass_1_3.py @@ -0,0 +1,645 @@ +"""Compatibility shim: rewrite UNTP DPP 0.7.0 payloads into CIRPASS v1.3.0 shape. + +Phase 5 task 5.3 of [docs/plans/CIRPASS_2_MIGRATION.md]. Mirrors +:mod:`dppvalidator.compat.upgrade_0_6_to_0_7` style — pure functions, +deep-copy input, deterministic order, structured warning codes — but +the warning-code prefix is ``MAP`` (cross-family) rather than ``UPG`` +(intra-family). 
+ +Design principles: + +- **Pure transformation, no validation.** Given a ``dict``, the shim + never raises on malformed content (a non-``dict`` argument raises + ``TypeError``); it makes a best-effort projection and lets the + caller validate the result against the CIRPASS model. Anything that + *can't* be projected faithfully is a :class:`MappingWarning`. +- **Structural over semantic.** The shim rewires field names and + shapes but never invents content. UNTP scalar names cannot be + multilingualised without an explicit ``default_language=``; missing + required-in-CIRPASS fields produce ``MAP004`` warnings, not + synthesised values. +- **Side-effect-free.** The input ``dict`` is deep-copied; callers + can map-then-keep-original without surprises. +- **Deterministic.** Two runs against the same input produce + byte-identical output and the same warning set in the same order. + +The transformation is split into ~15 steps documented in the +:mod:`_untp_cirpass_map` declarative table; this module is the +operational glue that walks the table. +""" + +from __future__ import annotations + +from copy import deepcopy +from decimal import Decimal +from typing import Any + +from dppvalidator.compat._identifier_schemes import to_cirpass as _id_scheme_to_cirpass +from dppvalidator.compat._mapping_codes import MappingWarning +from dppvalidator.compat._shared import normalise_iso8601 as _normalise_iso8601 +from dppvalidator.logging import get_logger +from dppvalidator.vocabularies.eudpp_actors import EUDPPRoleClass + +logger = get_logger(__name__) + + +# ============================================================================= +# Constants +# ============================================================================= + + +# Default scheme URI used when the source UNTP payload omits +# ``credentialSubject.idScheme``. The CIRPASS Identifier requires a +# non-empty http(s) URI on ``scheme``; we pick the GS1 voc URI as the +# safe default for product identifiers (the synthesised value is +# surfaced via MAP002). 
+_DEFAULT_PRODUCT_SCHEME_URI = "https://gs1.org/voc/" +_DEFAULT_PRODUCT_SCHEME_NAME = "GS1 GTIN" + +# Default scheme URI for synthesised actor identifiers (e.g. when +# extracting an issuer with no idScheme). +_DEFAULT_ACTOR_SCHEME_URI = "https://www.gleif.org/lei/" +_DEFAULT_ACTOR_SCHEME_NAME = "GLEIF LEI" + +# UNTP PartyRole.role enum → EUDPP role class. The mapping covers +# the v0.7.0 PartyRoleEnum members (see +# ``dppvalidator.models.v0_7.identifiers.PartyRoleEnum``). Unmapped +# UNTP roles fall back to ``EconomicOperatorRole`` and emit MAP002. +_UNTP_TO_EUDPP_ROLE: dict[str, str] = { + "manufacturer": EUDPPRoleClass.MANUFACTURER.value, + "producer": EUDPPRoleClass.MANUFACTURER.value, + "remanufacturer": EUDPPRoleClass.REMANUFACTURER.value, + "recycler": EUDPPRoleClass.RECYCLER.value, + "importer": EUDPPRoleClass.IMPORTER.value, + "distributor": EUDPPRoleClass.DISTRIBUTOR.value, + "retailer": EUDPPRoleClass.DEALER.value, + "exporter": EUDPPRoleClass.ECONOMIC_OPERATOR.value, + "owner": EUDPPRoleClass.ECONOMIC_OPERATOR.value, + "operator": EUDPPRoleClass.ECONOMIC_OPERATOR.value, + "processor": EUDPPRoleClass.MANUFACTURER.value, + "serviceProvider": EUDPPRoleClass.DPP_SERVICE_PROVIDER.value, + "inspector": EUDPPRoleClass.CONFORMITY_BODY.value, + "certifier": EUDPPRoleClass.CONFORMITY_BODY.value, + "logisticsProvider": EUDPPRoleClass.FULFILMENT_PROVIDER.value, + "carrier": EUDPPRoleClass.FULFILMENT_PROVIDER.value, + "consignor": EUDPPRoleClass.ECONOMIC_OPERATOR.value, + "consignee": EUDPPRoleClass.ECONOMIC_OPERATOR.value, + "brandOwner": EUDPPRoleClass.MANUFACTURER.value, + "regulator": EUDPPRoleClass.AUTHORITY.value, +} + + +# ============================================================================= +# Public entry point +# ============================================================================= + + +def to_cirpass_1_3( + data: dict[str, Any], + *, + default_language: str = "en", + country_lookup: dict[str, str] | None = None, # noqa: ARG001 — reserved 
API kwarg + identifier_scheme_lookup: dict[str, tuple[str, str]] | None = None, +) -> tuple[dict[str, Any], list[MappingWarning]]: + """Project a UNTP DPP 0.7.0 payload onto CIRPASS reference structure v1.3.0. + + Args: + data: The UNTP 0.7.0 payload as a parsed JSON ``dict``. Input + is deep-copied; the caller's object is never mutated. + default_language: BCP-47 language tag used when projecting + UNTP scalar strings (``credentialSubject.name``, etc.) + onto CIRPASS LocalisedText[]. Defaults to ``"en"``; + callers should override per-DPP for non-English pilots. + country_lookup: Optional ISO-3166 alpha-2 → country-name map. + Reserved for future extension (the current shim does not + consume it; signature kept for parity with the reverse + shim and the migration plan API contract). + identifier_scheme_lookup: Optional override for the bundled + scheme map. Keys are UNTP IdentifierScheme.id URIs; + values are ``(cirpass_scheme_uri, cirpass_scheme_name)`` + tuples. Overrides take precedence over the bundled + :mod:`_identifier_schemes` lookup. + + Returns: + Tuple of ``(cirpass_dict, warnings)``. The dict is ready to + be validated against the CIRPASS reference-structure model. + Warnings are ordered by step and document position. 
+ """ + if not isinstance(data, dict): + msg = f"to_cirpass_1_3() requires a dict, got {type(data).__name__!r}" + raise TypeError(msg) + + source = deepcopy(data) + warnings: list[MappingWarning] = [] + + cirpass: dict[str, Any] = {} + + # M01 — DPP identifier + _step_dpp_identifier(source, cirpass, warnings) + + # M02 + M03 + M05 + M06 + M09 — product (subject) + _step_product( + source, + cirpass, + warnings, + default_language=default_language, + identifier_scheme_lookup=identifier_scheme_lookup or {}, + ) + + # M07 — issuedAt + _step_issued_at(source, cirpass, warnings) + + # M08 — effective period + _step_effective_period(source, cirpass, warnings) + + # M10 — composition / materials + _step_composition(source, cirpass, warnings, default_language=default_language) + + # M11 + M12 — actors + _step_actors( + source, + cirpass, + warnings, + default_language=default_language, + ) + + # M13 — performance claims → LCA + _step_lca(source, cirpass, warnings) + + return cirpass, warnings + + +# ============================================================================= +# Step implementations +# ============================================================================= + + +def _step_dpp_identifier( + source: dict[str, Any], + cirpass: dict[str, Any], + warnings: list[MappingWarning], +) -> None: + """M01 — UNTP envelope ``id`` → CIRPASS ``dppIdentifier``.""" + dpp_id = source.get("id") + if not dpp_id: + warnings.append( + MappingWarning.required_missing( + path="$.id", + message=( + "UNTP envelope is missing ``id``; CIRPASS requires " + "``dppIdentifier.value``. The output's dppIdentifier " + "will be a placeholder until a real id is supplied." + ), + step="M01", + ) + ) + cirpass["dppIdentifier"] = { + "value": "", + "scheme": "https://example.com/dpp-register/", + } + return + cirpass["dppIdentifier"] = { + "value": str(dpp_id), + # The DPP-register scheme is not standardised; UNTP stores it + # in the issuer envelope, not as a per-credential scheme. 
We + # synthesise a generic register URI here. + "scheme": "https://example.com/dpp-register/", + "schemeName": "DPP register", + } + warnings.append( + MappingWarning.synthesised( + path="$.dppIdentifier.scheme", + message=( + "DPP-register scheme synthesised — UNTP carries the " + "credential id but no register URI; using a placeholder " + "register URI." + ), + step="M01", + ) + ) + + +def _step_product( + source: dict[str, Any], + cirpass: dict[str, Any], + warnings: list[MappingWarning], + *, + default_language: str, + identifier_scheme_lookup: dict[str, tuple[str, str]], +) -> None: + """M02 + M03 + M05 + M06 + M09 — Product subject.""" + subject = source.get("credentialSubject") + if not isinstance(subject, dict): + warnings.append( + MappingWarning.required_missing( + path="$.credentialSubject", + message=( + "UNTP envelope is missing ``credentialSubject``; " + "CIRPASS requires ``product``. The output's product " + "will be a placeholder until a real subject is supplied." + ), + step="M02", + ) + ) + cirpass["product"] = { + "productIdentifier": { + "value": "", + "scheme": _DEFAULT_PRODUCT_SCHEME_URI, + }, + "productName": [], + } + return + + product: dict[str, Any] = {} + + # productIdentifier + product_id = subject.get("id") or "" + untp_scheme = subject.get("idScheme") or {} + untp_scheme_id = untp_scheme.get("id") if isinstance(untp_scheme, dict) else None + untp_scheme_name = untp_scheme.get("name") if isinstance(untp_scheme, dict) else None + used_override = False + if untp_scheme_id and untp_scheme_id in identifier_scheme_lookup: + cirpass_scheme, cirpass_scheme_name = identifier_scheme_lookup[untp_scheme_id] + mapping = None # override path — caller-supplied row + used_override = True + else: + cirpass_scheme, cirpass_scheme_name, mapping = _id_scheme_to_cirpass( + untp_scheme_id, untp_scheme_name + ) + if not cirpass_scheme: + cirpass_scheme = _DEFAULT_PRODUCT_SCHEME_URI + cirpass_scheme_name = _DEFAULT_PRODUCT_SCHEME_NAME + warnings.append( + 
MappingWarning.synthesised( + path="$.product.productIdentifier.scheme", + message=( + "UNTP credentialSubject.idScheme is empty; synthesising " + f"the GS1 GTIN voc URI ({_DEFAULT_PRODUCT_SCHEME_URI}) " + "for the CIRPASS productIdentifier." + ), + step="M03", + ) + ) + elif mapping is None and untp_scheme_id and not used_override: + warnings.append( + MappingWarning.unmapped( + path="$.credentialSubject.idScheme.id", + message=( + f"UNTP idScheme.id={untp_scheme_id!r} is not in the " + "bundled scheme lookup; passing through verbatim." + ), + step="M03", + scheme=untp_scheme_id, + ) + ) + product_identifier: dict[str, Any] = { + "value": product_id, + "scheme": cirpass_scheme, + } + if cirpass_scheme_name: + product_identifier["schemeName"] = cirpass_scheme_name + product["productIdentifier"] = product_identifier + + # productName (M05) — UNTP scalar → CIRPASS LocalisedText[] + name = subject.get("name") + if name: + product["productName"] = [{"value": str(name), "language": default_language}] + warnings.append( + MappingWarning.synthesised( + path="$.product.productName[0].language", + message=( + "UNTP credentialSubject.name is a scalar string; " + f"synthesised a single CIRPASS LocalisedText entry " + f"with language={default_language!r}." 
+ ), + step="M05", + language=default_language, + ) + ) + else: + product["productName"] = [] + warnings.append( + MappingWarning.required_missing( + path="$.product.productName", + message=("UNTP credentialSubject.name is empty; CIRPASS requires productName ≥ 1."), + step="M05", + ) + ) + + # description (M06) + description = subject.get("description") + if description: + product["description"] = [{"value": str(description), "language": default_language}] + + # commodityCode (M09) — Classification[] → ClassificationCode[] + classifications = subject.get("productCategory") + if isinstance(classifications, list) and classifications: + commodity: list[dict[str, Any]] = [] + for i, cls in enumerate(classifications): + if not isinstance(cls, dict): + continue + scheme = cls.get("schemeId") or "" + code = cls.get("code") or "" + label = cls.get("name") + entry: dict[str, Any] = {"code": code, "scheme": scheme} + if label: + entry["name"] = [{"value": str(label), "language": default_language}] + warnings.append( + MappingWarning.synthesised( + path=f"$.product.commodityCode[{i}].name[0].language", + message=( + "UNTP Classification.name is a scalar string; " + f"projected onto a single LocalisedText entry " + f"with language={default_language!r}." + ), + step="M09", + ) + ) + commodity.append(entry) + if commodity: + product["commodityCode"] = commodity + + cirpass["product"] = product + + +def _step_issued_at( + source: dict[str, Any], + cirpass: dict[str, Any], + warnings: list[MappingWarning], +) -> None: + """M07 — UNTP envelope ``validFrom`` → CIRPASS ``issuedAt``.""" + valid_from = source.get("validFrom") + if valid_from: + cirpass["issuedAt"] = {"timestamp": _normalise_iso8601(valid_from)} + return + warnings.append( + MappingWarning.required_missing( + path="$.validFrom", + message=( + "UNTP envelope is missing ``validFrom``; CIRPASS requires " + "``issuedAt.timestamp``. 
Synthesising the current UTC " + "timestamp would be misleading; the output's issuedAt " + "is a placeholder." + ), + step="M07", + ) + ) + cirpass["issuedAt"] = {"timestamp": "1970-01-01T00:00:00+00:00"} + + +def _step_effective_period( + source: dict[str, Any], + cirpass: dict[str, Any], + warnings: list[MappingWarning], +) -> None: + """M08 — UNTP (validFrom, validUntil) → CIRPASS effectivePeriod.""" + valid_from = source.get("validFrom") + valid_until = source.get("validUntil") + if not valid_from: + return + period: dict[str, Any] = {"start": _normalise_iso8601(valid_from)} + if valid_until: + period["end"] = _normalise_iso8601(valid_until) + else: + warnings.append( + MappingWarning.temporal_collapse( + path="$.effectivePeriod.end", + message=( + "UNTP validUntil is absent; CIRPASS effectivePeriod.end " + "left empty (open-ended). The forward shim does not " + "synthesise an end date." + ), + step="M08", + ) + ) + cirpass["effectivePeriod"] = period + + +def _step_composition( + source: dict[str, Any], + cirpass: dict[str, Any], + warnings: list[MappingWarning], + *, + default_language: str, +) -> None: + """M10 — UNTP materialProvenance[] → CIRPASS composition.materials[].""" + subject = source.get("credentialSubject") + if not isinstance(subject, dict): + return + materials = subject.get("materialProvenance") + if not isinstance(materials, list) or not materials: + return + out: list[dict[str, Any]] = [] + for i, m in enumerate(materials): + if not isinstance(m, dict): + continue + material_name = m.get("name") + country = m.get("originCountry") + country_code = country.get("countryCode") if isinstance(country, dict) else None + material_type = m.get("materialType") + # UNTP carries materialType as a Classification object; CIRPASS + # carries a bare ISO-2076 code. We extract the ``code`` field. 
+ type_code: str | None = None + if isinstance(material_type, dict): + type_code = material_type.get("code") + mass_fraction = m.get("massFraction") + is_recycled = bool(m.get("recycledMassFraction")) + entry: dict[str, Any] = {} + if material_name: + entry["materialName"] = [{"value": str(material_name), "language": default_language}] + warnings.append( + MappingWarning.synthesised( + path=f"$.composition.materials[{i}].materialName[0].language", + message=( + "UNTP Material.name is a scalar string; projected " + f"onto a single CIRPASS LocalisedText entry " + f"with language={default_language!r}." + ), + step="M10", + ) + ) + else: + entry["materialName"] = [] + warnings.append( + MappingWarning.required_missing( + path=f"$.composition.materials[{i}].materialName", + message="UNTP Material.name is empty; CIRPASS requires materialName ≥ 1.", + step="M10", + ) + ) + if type_code: + entry["materialType"] = type_code + if country_code: + entry["originCountry"] = country_code + if mass_fraction is not None: + entry["massFraction"] = _decimal_str(mass_fraction) + if is_recycled: + entry["isRecycled"] = True + out.append(entry) + if out: + cirpass["composition"] = {"materials": out} + + +def _step_actors( + source: dict[str, Any], + cirpass: dict[str, Any], + warnings: list[MappingWarning], + *, + default_language: str, +) -> None: + """M11 + M12 — UNTP relatedParty[] + issuer → CIRPASS relatedActors[].""" + actors: list[dict[str, Any]] = [] + seen_manufacturer = False + + # M11 — relatedParty[] + subject = source.get("credentialSubject") or {} + related_party = subject.get("relatedParty") if isinstance(subject, dict) else None + if isinstance(related_party, list): + for i, pr in enumerate(related_party): + if not isinstance(pr, dict): + continue + party = pr.get("party") or {} + untp_role = pr.get("role") + if not isinstance(party, dict): + continue + actor = _project_actor(party, default_language) + eudpp_role = _UNTP_TO_EUDPP_ROLE.get(untp_role or "") + if eudpp_role 
is None and untp_role: + eudpp_role = EUDPPRoleClass.ECONOMIC_OPERATOR.value + warnings.append( + MappingWarning.synthesised( + path=f"$.relatedActors[{i}].role", + message=( + f"UNTP role {untp_role!r} has no canonical " + "EUDPP super-role; falling back to " + "``EconomicOperatorRole``." + ), + step="M11", + original_role=str(untp_role), + ) + ) + elif eudpp_role is None: + eudpp_role = EUDPPRoleClass.ECONOMIC_OPERATOR.value + warnings.append( + MappingWarning.synthesised( + path=f"$.relatedActors[{i}].role", + message=( + "UNTP relatedParty entry has no role; " + "synthesising ``EconomicOperatorRole``." + ), + step="M11", + ) + ) + if eudpp_role == EUDPPRoleClass.MANUFACTURER.value: + seen_manufacturer = True + actors.append({"actor": actor, "role": eudpp_role}) + + # M12 — issuer fallback. CIRPASS requires *some* manufacturer-shaped + # actor for ESPR Annex III(g); when relatedParty[] doesn't include + # one, we lift the UNTP issuer into a synthesised manufacturer + # entry. This is a one-way projection and surfaces MAP002. + issuer = source.get("issuer") + if not seen_manufacturer and isinstance(issuer, dict) and issuer.get("id"): + actor = _project_actor(issuer, default_language) + actors.append({"actor": actor, "role": EUDPPRoleClass.MANUFACTURER.value}) + warnings.append( + MappingWarning.synthesised( + path=f"$.relatedActors[{len(actors) - 1}]", + message=( + "No relatedParty with manufacturer role; lifted UNTP " + "envelope.issuer into a synthesised " + "ManufacturerRole actor." 
+ ), + step="M12", + ) + ) + if actors: + cirpass["relatedActors"] = actors + + +def _project_actor(party: dict[str, Any], default_language: str) -> dict[str, Any]: + """Lift a UNTP Party / CredentialIssuer onto a CIRPASS Actor.""" + actor_id = party.get("id") or "" + name = party.get("name") or "" + untp_scheme = party.get("idScheme") or {} + untp_scheme_id = untp_scheme.get("id") if isinstance(untp_scheme, dict) else None + untp_scheme_name = untp_scheme.get("name") if isinstance(untp_scheme, dict) else None + cirpass_scheme, cirpass_scheme_name, _ = _id_scheme_to_cirpass(untp_scheme_id, untp_scheme_name) + if not cirpass_scheme: + cirpass_scheme = _DEFAULT_ACTOR_SCHEME_URI + cirpass_scheme_name = _DEFAULT_ACTOR_SCHEME_NAME + identifier: dict[str, Any] = {"value": actor_id, "scheme": cirpass_scheme} + if cirpass_scheme_name: + identifier["schemeName"] = cirpass_scheme_name + return { + "actorIdentifier": identifier, + "actorName": [{"value": str(name), "language": default_language}], + } + + +def _step_lca( + source: dict[str, Any], + cirpass: dict[str, Any], # noqa: ARG001 — symmetry with other step helpers + warnings: list[MappingWarning], +) -> None: + """M13 — UNTP performanceClaim[] → CIRPASS lca.results[]. + + The current shim is a stub: no claim is lifted yet. Every + ``performanceClaim`` entry, whatever its topic (emissions, + durability, traceability, etc.), drops with a single MAP001; the + CIRPASS LCA module models PEF impact categories, not the full + UNTP claim ontology. Phase 7 (Tyres / Textile pilot) extends with + pilot-specific lifts. + """ + subject = source.get("credentialSubject") or {} + if not isinstance(subject, dict): + return + claims = subject.get("performanceClaim") + if not isinstance(claims, list) or not claims: + return + # Stub — drop with a single MAP001 noting the lossiness. Phase 7 + # replaces this with pilot-aware lifting. 
We don't generate a + # synthetic ``lca`` block because that would silently invent + # impact-result data the source never carried. + warnings.append( + MappingWarning.lossy( + path="$.lca", + message=( + f"UNTP carried {len(claims)} performanceClaim entries; " + "the v1.3.0 shim drops them without projecting onto " + "CIRPASS LifeCycleAssessment results (Phase 7 pilot " + "lifts cover the supported subset). Reverse round-trip " + "will not restore performance claims." + ), + step="M13", + ) + ) + + +# ============================================================================= +# Helpers +# ============================================================================= +# +# Cross-shim helpers (``_normalise_iso8601``) live at +# :mod:`dppvalidator.compat._shared` so the forward / reverse shims +# stay symmetric. The remaining helper below is forward-only. + + +def _decimal_str(value: Any) -> str: + """Stringify a Decimal-bound numeric value for CIRPASS wire shape. + + UNTP carries massFraction as ``float``; CIRPASS carries it as + Decimal. The mapping shim emits a string so JSON Decimal + round-tripping doesn't lose precision (Pydantic's ``Decimal`` + field accepts string input). 
+ """ + if isinstance(value, Decimal): + return str(value) + if isinstance(value, (int, float)): + return str(value) + return str(value) + + +__all__ = [ + "to_cirpass_1_3", +] diff --git a/src/dppvalidator/exporters/__init__.py b/src/dppvalidator/exporters/__init__.py index 7e8ce2d..7358adb 100644 --- a/src/dppvalidator/exporters/__init__.py +++ b/src/dppvalidator/exporters/__init__.py @@ -1,5 +1,12 @@ """Exporter utilities for Digital Product Passports.""" +from typing import Any + +from dppvalidator.exporters.cirpass_jsonld import ( + CIRPASSJsonLDExporter, + export_cirpass_jsonld, + export_cirpass_jsonld_dict, +) from dppvalidator.exporters.contexts import ( CONTEXTS, DEFAULT_VERSION, @@ -7,7 +14,7 @@ ContextManager, ) from dppvalidator.exporters.eudpp_jsonld import ( - EUDPP_CONTEXT_URL, + EUDPP_CANONICAL_CONTEXT_URL, EUDPPJsonLDExporter, EUDPPTermMapper, export_eudpp_jsonld, @@ -20,6 +27,27 @@ from dppvalidator.exporters.jsonld import JSONLDExporter, export_jsonld from dppvalidator.exporters.protocols import Exporter + +def __getattr__(name: str) -> Any: + """PEP 562 hook so the deprecated ``EUDPP_CONTEXT_URL`` re-export + triggers the same :class:`DeprecationWarning` as the underlying + module-level attribute. + + Without this, importing the package eagerly resolves the constant + at module-load time and the warning fires at import rather than + use site. With it, the warning fires only when consumers actually + reach for the deprecated name. + """ + if name == "EUDPP_CONTEXT_URL": + # Defer to the underlying module's __getattr__, which emits + # the deprecation warning with the right ``stacklevel``. 
+ from dppvalidator.exporters import eudpp_jsonld + + return eudpp_jsonld.EUDPP_CONTEXT_URL + msg = f"module {__name__!r} has no attribute {name!r}" + raise AttributeError(msg) + + __all__ = [ "Exporter", "JSONLDExporter", @@ -30,13 +58,18 @@ "ContextDefinition", "CONTEXTS", "DEFAULT_VERSION", - # EU DPP Export (Phase 9) + # EU DPP Export "EUDPPJsonLDExporter", "EUDPPTermMapper", - "EUDPP_CONTEXT_URL", + "EUDPP_CANONICAL_CONTEXT_URL", + "EUDPP_CONTEXT_URL", # deprecated alias resolved via __getattr__ "export_eudpp_jsonld", "export_eudpp_jsonld_dict", "get_eudpp_jsonld_context", "get_term_mapping_summary", "validate_eudpp_export", + # CIRPASS-LD Export (Phase 6) + "CIRPASSJsonLDExporter", + "export_cirpass_jsonld", + "export_cirpass_jsonld_dict", ] diff --git a/src/dppvalidator/exporters/cirpass_jsonld.py b/src/dppvalidator/exporters/cirpass_jsonld.py new file mode 100644 index 0000000..56ee04f --- /dev/null +++ b/src/dppvalidator/exporters/cirpass_jsonld.py @@ -0,0 +1,341 @@ +"""CIRPASS-LD exporter for the v1.3.0 reference-structure shape. + +Phase 6 task 6.1 of [docs/plans/CIRPASS_2_MIGRATION.md]. Emits a +CIRPASS reference-structure v1.3.0 payload as JSON-LD with the +canonical EUDPP v1.9.1 ``@context`` attached. The exporter accepts +*either*: + +- A :class:`dppvalidator.models.cirpass.v1_3.ReferencePassport` + instance (the native shape), or +- A :class:`dppvalidator.models.passport.DigitalProductPassport` + (UNTP envelope), which it routes through the Phase 5 forward + shim (``compat.to_cirpass_1_3``) before serialising. + +The two-input shape lets callers point ``dppvalidator export +--format cirpass-jsonld`` at a UNTP fixture without an explicit +mapping step, while still allowing native-CIRPASS callers to +serialise their own message tree directly. 
+ +Output shape +============ + +The JSON-LD output has: + +- ``@context``: ``[W3C VC v2, https://w3id.org/eudpp#]`` plus an + inline namespace prefix table so consumers without the W3ID + resolver can still expand compact IRIs. +- ``type``: ``["DigitalProductPassport", "eudpp:DPP"]`` — both the + CIRPASS message-level type and the EUDPP class IRI. +- The serialised CIRPASS reference-structure tree (camelCase + aliases per :meth:`UNTPBaseModel.model_dump(by_alias=True)`). +""" + +from __future__ import annotations + +import json +from copy import deepcopy +from pathlib import Path +from typing import TYPE_CHECKING, Any + +from dppvalidator.logging import get_logger +from dppvalidator.vocabularies.ontology import EUDPPNamespace, get_eudpp_context + +if TYPE_CHECKING: + from dppvalidator.compat import MappingWarning + from dppvalidator.models.cirpass.v1_3 import ReferencePassport + from dppvalidator.models.passport import DigitalProductPassport + +logger = get_logger(__name__) + + +# ============================================================================= +# Constants +# ============================================================================= + + +# CIRPASS reference structure v1.3.0 default ``type`` array. The +# first entry is the CIRPASS message-level token (``DigitalProductPassport``); +# the second is the EUDPP ontology class IRI in compact form +# (``eudpp:DPP``). Consumers that resolve the @context get the full +# IRIs; those that don't see the message-level token directly. +_CIRPASS_TYPE_ARRAY = ("DigitalProductPassport", "eudpp:DPP") + + +# ============================================================================= +# Public exporter class +# ============================================================================= + + +class CIRPASSJsonLDExporter: + """Export a CIRPASS reference-structure passport to JSON-LD. 
+ + Accepts either a native CIRPASS :class:`ReferencePassport` or a + UNTP :class:`DigitalProductPassport`; in the latter case the + Phase 5 forward shim does the projection. Mapping warnings (when + a UNTP source is supplied) are surfaced via + :attr:`last_mapping_warnings` after each export call so callers + can decide whether to accept the result. + """ + + def __init__( + self, + *, + version: str | None = None, + include_untp_namespace: bool = False, + ) -> None: + """Initialize the exporter. + + Args: + version: CIRPASS reference-structure version. Defaults to + the registry's :data:`DEFAULT_VERSIONS[SchemaFamily.CIRPASS]` + when ``None`` — keeps the no-version-literals guard + happy and tracks future spec bumps automatically. + include_untp_namespace: When ``True``, the emitted + ``@context`` array includes an extra entry binding + the ``untp:`` prefix to the UNTP 0.7.0 vocabulary. + Useful for round-trip exports that may be re-read by + UNTP-aware tooling. + """ + from dppvalidator.compat import active_version + from dppvalidator.schemas.registry import SchemaFamily + + self.version = version or active_version(SchemaFamily.CIRPASS) + self._include_untp_namespace = include_untp_namespace + self._last_mapping_warnings: list[MappingWarning] = [] + + @property + def last_mapping_warnings(self) -> list[MappingWarning]: + """Mapping warnings emitted by the most recent export call. + + Empty when the input was a native CIRPASS passport (no + forward-shim run) or when the UNTP forward shim emitted + nothing. + """ + return list(self._last_mapping_warnings) + + def export( + self, + passport: ReferencePassport | DigitalProductPassport | dict[str, Any], + *, + indent: int | None = 2, + include_type: bool = True, + default_language: str = "en", + ) -> str: + """Export a passport to a JSON-LD string. 
+ + Args: + passport: Either a native :class:`ReferencePassport`, a + UNTP :class:`DigitalProductPassport`, or a parsed + JSON dict (one of the two shapes — auto-detected via + ``dppIdentifier`` presence). + indent: JSON indent for the output; ``None`` for compact. + include_type: When ``True``, the output includes the + CIRPASS / EUDPP type array. + default_language: BCP-47 tag used by the UNTP forward + shim to wrap UNTP scalar names as CIRPASS LocalisedText. + + Returns: + JSON-LD formatted string. + """ + document = self.export_dict( + passport, + include_type=include_type, + default_language=default_language, + ) + return json.dumps(document, indent=indent, ensure_ascii=False, default=str) + + def export_dict( + self, + passport: ReferencePassport | DigitalProductPassport | dict[str, Any], + *, + include_type: bool = True, + default_language: str = "en", + ) -> dict[str, Any]: + """Export a passport to a JSON-LD dict. + + Same as :meth:`export` but returns the dictionary directly + for callers that want to post-process the output (e.g. + attach a proof, sign, or wrap in a transport envelope). 
+ """ + cirpass_dict = self._normalise_input(passport, default_language=default_language) + return self._wrap_with_context(cirpass_dict, include_type=include_type) + + def export_to_file( + self, + passport: ReferencePassport | DigitalProductPassport | dict[str, Any], + path: Path | str, + *, + indent: int | None = 2, + default_language: str = "en", + ) -> None: + """Export a passport to a JSON-LD file on disk.""" + content = self.export(passport, indent=indent, default_language=default_language) + Path(path).write_text(content, encoding="utf-8") + logger.info("Wrote CIRPASS-LD to %s", path) + + # ------------------------------------------------------------------ + # Internals + # ------------------------------------------------------------------ + + def _normalise_input( + self, + passport: Any, + *, + default_language: str, + ) -> dict[str, Any]: + """Resolve the input to a CIRPASS-shaped dict. + + Routes UNTP envelopes through the Phase 5 forward shim; + native CIRPASS passports / dicts are dumped/copied directly. + Pydantic-model detection is duck-typed (``model_dump``) so + both v0.6 and v0.7 ``DigitalProductPassport`` classes work + transparently — we route on shape, not on Python class + identity. + """ + # Late imports keep the cold-start budget intact: the + # ``dppvalidator.exporters`` package may be loaded by callers + # who never touch CIRPASS, so we delay the CIRPASS / compat + # imports until this method actually runs. + from dppvalidator.compat import to_cirpass_1_3 + from dppvalidator.models.cirpass.v1_3 import ReferencePassport + + # Native CIRPASS Pydantic passport — happy path. + if isinstance(passport, ReferencePassport): + self._last_mapping_warnings = [] + return passport.model_dump(by_alias=True, exclude_none=True, mode="json") + + # Pydantic model with a model_dump method — UNTP envelope (v0.6 + # or v0.7) or any structurally-compatible model. 
Both + # ``DigitalProductPassport`` classes carry the wire shape we + # need, so a duck-typed ``hasattr`` check is the right move + # here instead of pinning to a specific class identity. + if hasattr(passport, "model_dump"): + untp_dict = passport.model_dump(by_alias=True, exclude_none=True, mode="json") + cirpass_dict, warnings = to_cirpass_1_3(untp_dict, default_language=default_language) + self._last_mapping_warnings = warnings + return cirpass_dict + + # Already-parsed dict — auto-detect shape and pass through or + # forward-shim accordingly. + if isinstance(passport, dict): + if _looks_like_cirpass(passport): + self._last_mapping_warnings = [] + return deepcopy(passport) + cirpass_dict, warnings = to_cirpass_1_3(passport, default_language=default_language) + self._last_mapping_warnings = warnings + return cirpass_dict + + # Anything else — defensive fail-fast. The exporter is a + # boundary; surfacing the type mismatch here is much cheaper + # to debug than letting it propagate into JSON serialisation. + msg = ( + f"CIRPASSJsonLDExporter.export() expected a ReferencePassport, " + f"a Pydantic UNTP DigitalProductPassport, or a dict; got " + f"{type(passport).__name__}." + ) + raise TypeError(msg) + + def _wrap_with_context( + self, + document: dict[str, Any], + *, + include_type: bool, + ) -> dict[str, Any]: + """Attach the CIRPASS / EUDPP @context + type to ``document``.""" + result: dict[str, Any] = {} + result["@context"] = list(_build_context_array(self._include_untp_namespace)) + if include_type and "type" not in document: + result["type"] = list(_CIRPASS_TYPE_ARRAY) + # Preserve insertion order: @context, type, then the rest. 
+ result.update(document) + return result + + +# ============================================================================= +# Helpers +# ============================================================================= + + +def _build_context_array(include_untp_namespace: bool) -> tuple[Any, ...]: + """Build the CIRPASS @context array. + + Always includes: + + - The W3C VC v2 context (every DPP-shaped credential's first entry). + - An inline ``{"eudpp": "https://w3id.org/eudpp#"}`` prefix table + with the canonical EUDPP namespace + the additional CIRPASS + reference-structure terms (``DigitalProductPassport``, + ``Product``, ``Identifier``, etc.). + + Optionally appends a UNTP namespace binding for round-trip + consumers that need to re-resolve UNTP-shaped labels. + """ + # Pull the canonical EUDPP context (already populated with the + # eudpp / lca / soc / actor / con prefixes by Phase 1 work). + context: list[Any] = [ + EUDPPNamespace.VC2.value, + get_eudpp_context(), + ] + if include_untp_namespace: + context.append({"untp": EUDPPNamespace.UNTP_DPP.value}) + return tuple(context) + + +def _looks_like_cirpass(data: dict[str, Any]) -> bool: + """Cheap structural heuristic: is ``data`` already in CIRPASS shape? + + The reference structure v1.3.0 always carries a ``dppIdentifier`` + object at the root and a ``product`` object — UNTP carries + ``credentialSubject`` instead. A ``dppIdentifier`` object is checked + first; ``product`` without ``credentialSubject`` is the fallback. 
+ """ + if "dppIdentifier" in data and isinstance(data.get("dppIdentifier"), dict): + return True + return ( + "product" in data + and isinstance(data.get("product"), dict) + and "credentialSubject" not in data + ) + + +# ============================================================================= +# Convenience functions +# ============================================================================= + + +def export_cirpass_jsonld( + passport: ReferencePassport | DigitalProductPassport | dict[str, Any], + *, + indent: int | None = 2, + include_untp_namespace: bool = False, + default_language: str = "en", +) -> str: + """Convenience function: export a passport to a CIRPASS JSON-LD string. + + Mirrors :func:`dppvalidator.exporters.export_jsonld` for the UNTP + JSON-LD shape. + """ + exporter = CIRPASSJsonLDExporter( + include_untp_namespace=include_untp_namespace, + ) + return exporter.export(passport, indent=indent, default_language=default_language) + + +def export_cirpass_jsonld_dict( + passport: ReferencePassport | DigitalProductPassport | dict[str, Any], + *, + include_untp_namespace: bool = False, + default_language: str = "en", +) -> dict[str, Any]: + """Same as :func:`export_cirpass_jsonld` but returns the dict directly.""" + exporter = CIRPASSJsonLDExporter( + include_untp_namespace=include_untp_namespace, + ) + return exporter.export_dict(passport, default_language=default_language) + + +__all__ = [ + "CIRPASSJsonLDExporter", + "export_cirpass_jsonld", + "export_cirpass_jsonld_dict", +] diff --git a/src/dppvalidator/exporters/eudpp_jsonld.py b/src/dppvalidator/exporters/eudpp_jsonld.py index 98d8dae..5058400 100644 --- a/src/dppvalidator/exporters/eudpp_jsonld.py +++ b/src/dppvalidator/exporters/eudpp_jsonld.py @@ -1,16 +1,18 @@ """EU DPP-aligned JSON-LD export for Digital Product Passports. -Provides optional EU DPP-aligned JSON-LD output that transforms UNTP models -to the EU DPP Core Ontology vocabulary. 
The UNTP models remain unchanged; -this export layer performs vocabulary mapping at export time. +Provides optional EU DPP-aligned JSON-LD output that transforms UNTP +models to the EU DPP Core Ontology vocabulary. The UNTP models remain +unchanged; this export layer performs vocabulary mapping at export time. -Source: EU DPP Core Ontology v1.7.1 -Namespace: http://dpp.taltech.ee/EUDPP# +Source: EU DPP Core Ontology v1.9.1 (Phase 1 CIRPASS-2 target). Emits +predicate IRIs under the canonical ``https://w3id.org/eudpp#`` W3ID +prefix (rebased from per-publisher hosts in Phase 1 — see ADR 0002). """ from __future__ import annotations import json +import warnings from copy import deepcopy from pathlib import Path from typing import TYPE_CHECKING, Any @@ -35,7 +37,44 @@ # ============================================================================= -EUDPP_CONTEXT_URL = "https://dpp.vocabulary-hub.eu/context/v1" +# Canonical v1.9.1 namespace IRI for EU DPP terms — the W3ID-resolved +# fragment namespace declared by every bundled v1.9.x TTL. New code +# should reference this constant directly; the legacy +# :data:`EUDPP_CONTEXT_URL` remains as a deprecation-warned alias +# until Phase 10 of the migration plan. +EUDPP_CANONICAL_CONTEXT_URL = EUDPPNamespace.EUDPP.value # https://w3id.org/eudpp# + +# Legacy DPP Vocabulary Hub context URL. Phase 6 of the CIRPASS-2 +# migration deprecates this in favour of the canonical W3ID URL +# above. Access goes through the module-level :func:`__getattr__` +# below so callers (including ``from … import EUDPP_CONTEXT_URL``) +# get a :class:`DeprecationWarning`. Removed in Phase 10. +_EUDPP_CONTEXT_URL_LEGACY = "https://dpp.vocabulary-hub.eu/context/v1" + + +def __getattr__(name: str) -> Any: + """PEP 562 module-level attribute hook for deprecated names. + + Phase 6 task 6.2 of [docs/plans/CIRPASS_2_MIGRATION.md]. 
The + legacy ``EUDPP_CONTEXT_URL`` constant resolves through here to a + deprecation-warned value; consumers should switch to + :data:`EUDPP_CANONICAL_CONTEXT_URL` (the canonical v1.9.1 W3ID + fragment namespace). + """ + if name == "EUDPP_CONTEXT_URL": + warnings.warn( + "exporters.eudpp_jsonld.EUDPP_CONTEXT_URL is deprecated; " + "use EUDPP_CANONICAL_CONTEXT_URL " + f"({EUDPP_CANONICAL_CONTEXT_URL!r}) — the canonical v1.9.1 " + "W3ID fragment namespace. The legacy hub URL " + f"({_EUDPP_CONTEXT_URL_LEGACY!r}) is kept resolvable " + "through Phase 10 for back-compat.", + DeprecationWarning, + stacklevel=2, + ) + return _EUDPP_CONTEXT_URL_LEGACY + msg = f"module {__name__!r} has no attribute {name!r}" + raise AttributeError(msg) def get_eudpp_jsonld_context() -> list[Any]: diff --git a/src/dppvalidator/models/__init__.py b/src/dppvalidator/models/__init__.py index 91335c3..a4f77c9 100644 --- a/src/dppvalidator/models/__init__.py +++ b/src/dppvalidator/models/__init__.py @@ -1,4 +1,17 @@ -"""Pydantic models for UNTP Digital Product Passport entities.""" +"""Pydantic models for UNTP Digital Product Passport entities. + +Phase 3 task 3.14 of [docs/plans/CIRPASS_2_MIGRATION.md]: CIRPASS +reference-structure models live under +:mod:`dppvalidator.models.cirpass.v1_3` and are *not* re-exported here +— the cold-start budget (cross-cutting workstream X3) requires that +``import dppvalidator`` not pull the CIRPASS surface eagerly. Plugin +authors and consumers needing CIRPASS classes import them explicitly: + + from dppvalidator.models.cirpass.v1_3 import ReferencePassport + +The :func:`tests/unit/test_cold_start_import.py` guard pins the +non-eager-load contract. 
+""" from dppvalidator.models.base import UNTPBaseModel, UNTPStrictModel from dppvalidator.models.claims import Claim, Criterion, Metric, Regulation, Standard diff --git a/src/dppvalidator/models/cirpass/__init__.py b/src/dppvalidator/models/cirpass/__init__.py new file mode 100644 index 0000000..fe399ec --- /dev/null +++ b/src/dppvalidator/models/cirpass/__init__.py @@ -0,0 +1,16 @@ +"""CIRPASS reference-structure Pydantic models. + +Phase 3 of [docs/plans/CIRPASS_2_MIGRATION.md] introduced this package +for the CIRPASS DPP reference structure (a hierarchical message model +distinct from UNTP's Verifiable-Credential envelope). The single +versioned namespace ``v1_3`` mirrors the UNTP convention +(``models/v0_6/``, ``models/v0_7/``). + +Importing this package is intentionally lazy: ``import dppvalidator`` +must not pull the CIRPASS surface eagerly (per the migration plan's +cross-cutting workstream X3 — cold-start budget). Use: + + from dppvalidator.models.cirpass.v1_3 import ReferencePassport + +…and the package loads on first reference. +""" diff --git a/src/dppvalidator/models/cirpass/v1_3/__init__.py b/src/dppvalidator/models/cirpass/v1_3/__init__.py new file mode 100644 index 0000000..cade949 --- /dev/null +++ b/src/dppvalidator/models/cirpass/v1_3/__init__.py @@ -0,0 +1,88 @@ +"""CIRPASS DPP reference structure v1.3.0 — Pydantic models. + +Phase 3 of [docs/plans/CIRPASS_2_MIGRATION.md] introduced this module. +The package is the source of truth for the message-level shape; the +JSON Schema bundled at +``src/dppvalidator/schemas/data/cirpass-reference-1.3.0.json`` is +*derived* from these models via +``tools/codegen/cirpass/derive_schema.py`` (Phase 3 task 3.1) and +SHA-pinned in ``schemas/data/MANIFEST.json``. + +Importing the package eagerly imports every submodule. ``import +dppvalidator`` does *not* eagerly import this package — see +``models/__init__.py`` for the lazy-import wiring. 
+ +Public surface: + +- :class:`ReferencePassport` (root) +- :class:`Product`, :class:`Identifier`, :class:`ClassificationCode` +- :class:`Actor`, :class:`Facility`, :class:`ActorRole`, + :class:`ActorRoleAssignment` +- :class:`Material`, :class:`Composition` +- :class:`SubstanceOfConcern`, :class:`Concentration`, + :class:`HazardClassification` +- :class:`LifeCycleAssessment`, :class:`ImpactResult`, + :class:`ImpactCategoryReference` +- :class:`ConnectorRelation`, :class:`RelationType` +- :class:`LocalisedText` (i18n) +- :class:`EffectivePeriod`, :class:`IssuedAt` (temporal) +""" + +from __future__ import annotations + +from dppvalidator.models.cirpass.v1_3.actor import ( + Actor, + ActorRole, + ActorRoleAssignment, + Facility, +) +from dppvalidator.models.cirpass.v1_3.connector import ( + ConnectorRelation, + RelationType, +) +from dppvalidator.models.cirpass.v1_3.i18n import LocalisedText, is_valid_bcp47 +from dppvalidator.models.cirpass.v1_3.lca import ( + ImpactCategoryReference, + ImpactResult, + LifeCycleAssessment, +) +from dppvalidator.models.cirpass.v1_3.material import Composition, Material +from dppvalidator.models.cirpass.v1_3.passport import ReferencePassport +from dppvalidator.models.cirpass.v1_3.product import ( + ClassificationCode, + Identifier, + Product, + looks_like_gtin, +) +from dppvalidator.models.cirpass.v1_3.substances import ( + Concentration, + HazardClassification, + SubstanceOfConcern, +) +from dppvalidator.models.cirpass.v1_3.temporal import EffectivePeriod, IssuedAt + +__all__ = [ + "Actor", + "ActorRole", + "ActorRoleAssignment", + "ClassificationCode", + "Composition", + "Concentration", + "ConnectorRelation", + "EffectivePeriod", + "Facility", + "HazardClassification", + "Identifier", + "ImpactCategoryReference", + "ImpactResult", + "IssuedAt", + "LifeCycleAssessment", + "LocalisedText", + "Material", + "Product", + "ReferencePassport", + "RelationType", + "SubstanceOfConcern", + "is_valid_bcp47", + "looks_like_gtin", +] diff 
--git a/src/dppvalidator/models/cirpass/v1_3/actor.py b/src/dppvalidator/models/cirpass/v1_3/actor.py new file mode 100644 index 0000000..62eb1bd --- /dev/null +++ b/src/dppvalidator/models/cirpass/v1_3/actor.py @@ -0,0 +1,168 @@ +"""Actor and role types for CIRPASS DPP reference structure v1.3.0. + +Phase 3 task 3.6 of [docs/plans/CIRPASS_2_MIGRATION.md]. Models the +``Actor`` / ``Role`` axes from EUDPP ACTOR v1.9.1 plus the new +``ActorRoleAssignment`` first-class assignment relationship. + +Mapping highlights: + +- ESPR Annex distinguishes ~24 specific economic-operator role types + (Manufacturer, Importer, Distributor, …); EUDPP v1.9.1 consolidates + these into 6 super-role classes. :class:`ActorRole` accepts + finer-grained ESPR roles AND the v1.9.1 super-categories — the + Phase 5 mapping shim handles the granularity gap. +- :class:`Facility` is now in ACTOR (was P_DPP in v1.7.1, removed in + v1.9.1 per the P_DPP changelog). +""" + +from __future__ import annotations + +from typing import ClassVar + +from pydantic import Field + +from dppvalidator.models.base import UNTPBaseModel +from dppvalidator.models.cirpass.v1_3.i18n import LocalisedText +from dppvalidator.models.cirpass.v1_3.product import Identifier +from dppvalidator.vocabularies.eudpp_actors import EUDPPRoleClass + + +class Actor(UNTPBaseModel): + """An economic operator, regulator, or other party. + + Maps to ``eudpp:Actor`` (ACTOR v1.9.1). Carries the actor's + primary identifier, a multilingual name, and optional contact / + address fields. 
+ + Wire shape (minimal): + ``{"actorIdentifier": {...}, "actorName": [...]}`` + """ + + _jsonld_type: ClassVar[list[str]] = ["Actor"] + + actor_identifier: Identifier = Field( + ..., + alias="actorIdentifier", + description="Primary actor identifier (LEI, EUID, EORI, …).", + ) + actor_name: list[LocalisedText] = Field( + ..., + alias="actorName", + min_length=1, + description="Actor's legal / trade name(s), one per language.", + ) + registered_trade_name: list[LocalisedText] | None = Field( + default=None, + alias="registeredTradeName", + description="Registered trade-name(s) where distinct from the legal name.", + ) + registered_trademark: list[LocalisedText] | None = Field( + default=None, + alias="registeredTrademark", + description="Registered trademark(s) associated with the actor.", + ) + + +class Facility(UNTPBaseModel): + """A physical site where production / processing occurs. + + Maps to ``eudpp:Facility`` (ACTOR v1.9.1 — relocated from P_DPP in + the v1.9.1 spec rewrite). The CIRPASS message references a Facility + via the actor's ``usesFacility`` relation; the facility itself + carries an identifier scheme (so e.g. a permit ID and an ECEFACT + ID for the same facility don't collide). + """ + + _jsonld_type: ClassVar[list[str]] = ["Facility"] + + facility_identifier: Identifier = Field( + ..., + alias="facilityIdentifier", + description="Primary facility identifier with scheme.", + ) + facility_name: list[LocalisedText] = Field( + ..., + alias="facilityName", + min_length=1, + description="Facility name(s), one per language.", + ) + + +class ActorRole(UNTPBaseModel): + """An actor playing a specific role on this DPP. + + Wire shape: + ``{"actor": {...}, "role": "eudpp:ManufacturerRole"}`` + + The ``role`` field accepts the compact ``eudpp:`` IRI of any role + class — both the v1.9.1 super-categories + (``EconomicOperatorRole``, ``CircularEconomyRole``, etc.) and the + finer-grained ESPR-derived roles + (``ManufacturerRole``, ``RecyclerRole``, etc.) 
that + :class:`EUDPPRoleClass` exposes for back-compat. + """ + + _jsonld_type: ClassVar[list[str]] = ["ActorRole"] + + actor: Actor = Field(..., description="The actor playing the role.") + role: str = Field( + ..., + description=( + "Compact ``eudpp:`` IRI of the role class. See " + ":class:`dppvalidator.vocabularies.eudpp_actors.EUDPPRoleClass` " + "for the canonical set." + ), + ) + + @property + def role_enum(self) -> EUDPPRoleClass | None: + """Return the :class:`EUDPPRoleClass` member if ``role`` matches one. + + Returns ``None`` for roles outside the registered set + (downstream taxonomies that extend the EUDPP role hierarchy + with their own classes are still legal — :class:`ActorRole` + carries the IRI string, not an enforced enum). + """ + try: + return EUDPPRoleClass(self.role) + except ValueError: + return None + + +class ActorRoleAssignment(UNTPBaseModel): + """First-class actor-plays-role-in-context relationship (ACTOR v1.9.1). + + +1.9.1: ACTOR v1.9.1 introduced ``eudpp:ActorRoleAssignment`` as a + first-class entity (rather than a binary edge) so that role + assignments can carry their own metadata — temporal scoping, + authority that conferred the role, supporting documentation, etc. + + Wire shape: + ``{"actor": , "role": "eudpp:...", + "validFrom": "2026-01-01T00:00:00Z", + "validTo": "2031-12-31T23:59:59Z"}`` + """ + + _jsonld_type: ClassVar[list[str]] = ["ActorRoleAssignment"] + + actor: Actor = Field(..., description="The actor receiving the role.") + role: str = Field( + ..., + description="Compact ``eudpp:`` IRI of the role class assigned.", + ) + valid_from: str | None = Field( + default=None, + alias="validFrom", + description=( + "ISO 8601 UTC datetime from which the assignment is " + "effective (``eudpp:assignmentValidFrom``)." + ), + ) + valid_to: str | None = Field( + default=None, + alias="validTo", + description=( + "ISO 8601 UTC datetime after which the assignment expires " + "(``eudpp:assignmentValidTo``). Absent ⇒ open-ended." 
+ ), + ) diff --git a/src/dppvalidator/models/cirpass/v1_3/connector.py b/src/dppvalidator/models/cirpass/v1_3/connector.py new file mode 100644 index 0000000..7eacc6b --- /dev/null +++ b/src/dppvalidator/models/cirpass/v1_3/connector.py @@ -0,0 +1,130 @@ +"""Cross-module relations for CIRPASS DPP reference structure v1.3.0. + +Phase 3 task 3.10 of [docs/plans/CIRPASS_2_MIGRATION.md] (resolves G9). +Models the ``ConnectorRelation`` class — a typed relation that ties +entities across the EUDPP modules (P_DPP / SOC / ACTOR / LCA) per +the CON v1.9.1 module's design. + +EUDPP IRI bindings: + +- ``ConnectorRelation`` is a CIRPASS message-level wrapper for the + CON v1.9.1 cross-module relation predicates (``isConnectedTo``, + ``inContextOfActivity``, ``inContextOfDPP``, etc.). +- :class:`RelationType` enumerates the known v1.9.1 relation + predicates from + :class:`dppvalidator.vocabularies.eudpp_relations.EUDPPObjectProperty` + with the ``CON``-module subset prioritised. +""" + +from __future__ import annotations + +from enum import Enum +from typing import ClassVar + +from pydantic import Field, field_validator + +from dppvalidator.models.base import UNTPBaseModel + + +class RelationType(str, Enum): + """Known cross-module relation predicates (CIRPASS-2 CON v1.9.1). + + Each member is a compact ``eudpp:`` IRI for a CON-module object + property. Downstream consumers that emit :class:`ConnectorRelation` + payloads with relation predicates *outside* this enum (e.g. + pilot-specific extensions) are still valid — :class:`ConnectorRelation` + carries the IRI as a string, not as an enforced enum. 
+ """ + + # CON-native relations (added to v1.9.1 connector module) + IS_CONNECTED_TO = "eudpp:isConnectedTo" + IN_CONTEXT_OF_ACTIVITY = "eudpp:inContextOfActivity" + IN_CONTEXT_OF_DPP = "eudpp:inContextOfDPP" + IN_CONTEXT_OF_PRODUCT = "eudpp:inContextOfProduct" + REPRESENTS_MANUFACTURER_FOR_PRODUCT = "eudpp:representsManufacturerForProduct" + + # Migrated to CON in v1.9.1 (were in P_DPP v1.7.1) + HAS_ISSUER = "eudpp:hasIssuer" + HAS_MANUFACTURER = "eudpp:hasManufacturer" + HAS_ECONOMIC_OPERATOR = "eudpp:hasEconomicOperator" + HAS_BACK_UP_COPY_HOST = "eudpp:hasBackUpCopyHost" + CONTAINS_SUBSTANCE_OF_CONCERN = "eudpp:containsSubstanceOfConcern" + + +class ConnectorRelation(UNTPBaseModel): + """A typed cross-module relation between two entities. + + Wire shape: + ``{"relation": "eudpp:hasManufacturer", + "subject": "https://example.com/dpp/123", + "object": "https://example.com/actor/abc"}`` + + The ``relation`` field is a compact ``eudpp:`` IRI from + :class:`RelationType` (or any other ontology-defined object + property). ``subject`` and ``object`` are URIs identifying the + related entities. + + Cardinality: + + - ``relation``: required (1). + - ``subject``: required (1). + - ``object``: required (1). + - ``valid_from`` / ``valid_to``: optional ISO 8601 datetimes — + enable temporally-scoped relations (e.g. an actor that played + a role only during a manufacturing batch). + """ + + _jsonld_type: ClassVar[list[str]] = ["ConnectorRelation"] + + relation: str = Field( + ..., + description=( + "Compact ``eudpp:`` IRI of the relation predicate. See " + ":class:`RelationType` for the canonical set; downstream " + "taxonomies may use other IRIs." 
+ ), + ) + subject: str = Field( + ..., + min_length=1, + description="URI of the subject entity (the relation's source).", + ) + object: str = Field( + ..., + min_length=1, + description="URI of the object entity (the relation's target).", + ) + valid_from: str | None = Field( + default=None, + alias="validFrom", + description="Optional ISO 8601 datetime: relation effective from.", + ) + valid_to: str | None = Field( + default=None, + alias="validTo", + description="Optional ISO 8601 datetime: relation expires at.", + ) + + @field_validator("subject", "object") + @classmethod + def _looks_like_uri(cls, value: str) -> str: + # We don't enforce strict URI shape (downstream consumers may + # legitimately use compact ``eudpp:Foo`` form); we only reject + # empty or whitespace-only values. + if not value.strip(): + msg = "ConnectorRelation subject/object must not be empty." + raise ValueError(msg) + return value + + @property + def relation_type(self) -> RelationType | None: + """Return the :class:`RelationType` member if ``relation`` matches one. + + Returns ``None`` for relation IRIs outside the canonical set + (downstream extensions are valid — :class:`ConnectorRelation` + carries the IRI string, not an enforced enum). + """ + try: + return RelationType(self.relation) + except ValueError: + return None diff --git a/src/dppvalidator/models/cirpass/v1_3/i18n.py b/src/dppvalidator/models/cirpass/v1_3/i18n.py new file mode 100644 index 0000000..3851d05 --- /dev/null +++ b/src/dppvalidator/models/cirpass/v1_3/i18n.py @@ -0,0 +1,88 @@ +"""Multilingual labels for CIRPASS DPP reference structure v1.3.0. + +Phase 3 task 3.11 of [docs/plans/CIRPASS_2_MIGRATION.md] (resolves G16). +ESPR Annex requires DPP labels in the official languages of every +member state where the product is sold; the CIRPASS message expresses +this through fielded multilingual values rather than UNTP's default- +language strings. :class:`LocalisedText` is the wrapper. 
+""" + +from __future__ import annotations + +import re +from typing import ClassVar + +from pydantic import Field, field_validator + +from dppvalidator.models.base import UNTPBaseModel + +# BCP-47 language tag — pragmatic regex that matches the real-world +# subset (RFC 5646's full grammar accepts more shapes; this covers +# everything ESPR actually emits). +# +# primary subtag 2-3 ASCII letters en, de, fra, zh +# region subtag optional 2-letter or 3-digit territory en-US, en-029 +# script subtag optional 4-letter ISO-15924 zh-Hant +# variant subtag optional 5-8 alphanum en-GB-oxendict +_BCP47_RE = re.compile( + r"^(?P[a-z]{2,3})" + r"(?:-(?P