feat(chord): validate translation constraints on create/update by SanjeevLakhwani · Pull Request #705 · bento-platform/katsu

SanjeevLakhwani · 2026-06-08T09:30:57Z

Summary

Redmine ticket 2830

Reject translations where primary_contact or any stakeholders entry has different roles than the canonical (English) dataset
Reject translations where any list field (keywords, stakeholders, taxa, links, publications, logos, counts, resources, participant_criteria, domain, funding_sources) grows beyond the canonical size
Validation runs in DatasetTranslationSerializer.to_internal_value() after Pydantic validation, so it covers both the create and update paths. It looks up the canonical dataset by translation.identifier, so it works regardless of how the serializer is instantiated
Add 11 new tests covering rejection and acceptance cases for both rules on both POST and PUT
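
As a rough illustration of the two rules in the summary, here is a minimal sketch on plain dicts. It is not the PR's implementation: the function names, the error strings, and the assumption that each contact or stakeholder entry carries its roles in a "role" list are all hypothetical. Stakeholders are compared pairwise by position.

```python
# List fields named in the PR summary.
LIST_FIELDS = (
    "keywords", "stakeholders", "taxa", "links", "publications", "logos",
    "counts", "resources", "participant_criteria", "domain", "funding_sources",
)


def roles_of(entry):
    """Return an entry's roles as a set (assumes a 'role' list key; hypothetical shape)."""
    if not entry:
        return frozenset()
    return frozenset(entry.get("role") or [])


def check_roles_and_growth(canonical, translation):
    """Collect error strings for role mismatches and list fields that grew."""
    errors = []

    # Rule: primary_contact roles must equal the canonical roles.
    if roles_of(canonical.get("primary_contact")) != roles_of(translation.get("primary_contact")):
        errors.append("primary_contact: roles differ from canonical")

    # Rule: each stakeholder keeps the roles of its canonical counterpart.
    canonical_stakeholders = canonical.get("stakeholders") or []
    translated_stakeholders = translation.get("stakeholders") or []
    for i, (c, t) in enumerate(zip(canonical_stakeholders, translated_stakeholders)):
        if roles_of(c) != roles_of(t):
            errors.append(f"stakeholders[{i}]: roles differ from canonical")

    # Rule: no list field may have more items than the canonical one.
    for field in LIST_FIELDS:
        c_len = len(canonical.get(field) or [])
        t_len = len(translation.get(field) or [])
        if t_len > c_len:
            errors.append(f"{field}: {t_len} items, canonical has {c_len}")

    return errors
```

Returning a list of messages rather than raising on the first failure is a design choice of this sketch; it lets a caller report every violation in a single response.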


Previously only list fields were checked. Non-list optional fields (e.g. spatial_coverage, program_name) could be silently nulled out in translations. Now any field present in canonical must also be present in the translation; list fields must additionally match length.
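
The presence rule in this note can be sketched as follows, again on plain dicts. The function name and messages are hypothetical, and treating None as "absent" is an assumption of the sketch rather than something the PR states.

```python
def check_presence(canonical, translation):
    """Every field set in canonical must be set in the translation;
    list fields must additionally have the same length."""
    errors = []
    for field, c_val in canonical.items():
        if c_val is None:
            continue  # nothing to preserve for this field
        t_val = translation.get(field)
        if t_val is None:
            errors.append(f"{field}: present in canonical but missing from translation")
        elif isinstance(c_val, list):
            if not isinstance(t_val, list) or len(t_val) != len(c_val):
                errors.append(f"{field}: list length must match canonical")
    return errors
```

With this shape, a translation that nulls out spatial_coverage or drops a keyword is reported, while a field that is unset in the canonical dataset is ignored.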

…ions Translations cannot include a discovery configuration (rejected before Pydantic parsing, since extra="allow" would silently accept it). Non-translatable fields (version, release_date, last_modified, study_status, study_context) must exactly match the canonical dataset value; providing a different value or adding one where canonical has none is rejected.
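
A minimal sketch of the pre-parse check described in this note, run on the raw request dict before any model parsing. The names are hypothetical, and later commits in this PR adjust how discovery is handled, so this reflects only the behaviour stated here.

```python
# Non-translatable fields listed in the commit note.
NON_TRANSLATABLE = (
    "version", "release_date", "last_modified", "study_status", "study_context",
)


def precheck_raw_translation(raw, canonical):
    """Inspect the raw payload before parsing, since a permissive model
    (extra="allow") would accept an unexpected 'discovery' key silently."""
    errors = []
    if "discovery" in raw:
        errors.append("discovery: not allowed in a translation")
    for field in NON_TRANSLATABLE:
        # Only explicitly provided values are compared; a value added where
        # canonical has none compares unequal to None and is rejected too.
        if field in raw and raw[field] != canonical.get(field):
            errors.append(f"{field}: must match the canonical value")
    return errors
```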


davidlougheed · 2026-06-15T16:01:10Z

+                               "discovery", "pcgl_dac_id"})
+
+
+def _check_translation_constraints(translation: ProjectScopedDatasetModel):


This is done in serialization, but I feel it should be done in the model's save method instead, to prevent invalid translations from ever being created, not just via the API (e.g., in a hypothetical CLI). This is my general preference for MVC apps: keep validation closer to the data. However, if there's a reason this is not as good, I'm fine with being convinced otherwise.

SanjeevLakhwani · Jun 15, 2026

The model just takes a JSON field, so I don't see how much closer to the model we can get. We previously discussed the effort for implementing this and were asked to keep it low, so it is known that tech debt is being added. My idea for the future is to separate the main model from the translation: everything gets serialized into a separate translation model, which avoids storing duplicates of data that shouldn't be duplicated and allows validation against the canonical dataset.


codecov · 2026-06-16T06:02:09Z

Codecov Report

❌ Patch coverage is 89.06250% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.51%. Comparing base (de23adf) to head (df31022).

Files with missing lines	Patch %	Lines
chord_metadata_service/chord/serializers.py	88.52%	5 Missing and 2 partials ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #705      +/-   ##
===========================================
- Coverage    95.59%   95.51%   -0.08%     
===========================================
  Files          138      138              
  Lines         5805     5867      +62     
  Branches       552      568      +16     
===========================================
+ Hits          5549     5604      +55     
- Misses         205      210       +5     
- Partials        51       53       +2


SanjeevLakhwani added 3 commits June 8, 2026 05:08

feat(chord): validate translation constraints on create/update

6f0a0bf

Reject translations where primary_contact or stakeholder roles differ from the canonical dataset, or where any list field gains items.

style: fix E501 line too long in test_api_translations

a3f9391

feat(chord): infer identifier and project on translation from URL

8e9e57c

Clients no longer need to supply identifier or project in the translation request body; the serializer injects them from the dataset looked up via the URL identifier or the existing instance on update.
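
The injection step could look roughly like the sketch below. The helper name and the dict shapes are hypothetical, and overriding any client-supplied values is an assumption; the commit message only says the values come from the dataset found via the URL identifier (or the existing instance on update).

```python
def inject_scope(request_body, dataset):
    """Return a copy of the request body with identifier and project taken
    from the canonical dataset rather than from the client."""
    body = dict(request_body)  # copy so the caller's dict is not mutated
    body["identifier"] = dataset["identifier"]
    body["project"] = dataset["project"]
    return body
```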

SanjeevLakhwani requested a review from davidlougheed June 8, 2026 10:27

SanjeevLakhwani mentioned this pull request Jun 8, 2026

feat(datasets): French translation upload and provenance export bento-platform/bento_web#555

Open

SanjeevLakhwani added 7 commits June 9, 2026 11:15

feat(chord): expose available translations in dataset serializer resp…

396f224

…onse

feat(chord): allow omitting immutable fields in translations

a00bfac

Immutable fields (version, release_date, last_modified, study_status, study_context) are now exempt from Rule 2 (removal check). Rule 3 only fires when the field is explicitly provided and differs from canonical.
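
To illustrate the exemption, here is a hedged sketch of a removal check (Rule 2) that skips the immutable fields. The constant and function names are hypothetical stand-ins, not the PR's code, and None is assumed to mean "absent".

```python
# Fields exempt from the removal check, per the commit message.
IMMUTABLE_FIELDS = frozenset(
    {"version", "release_date", "last_modified", "study_status", "study_context"}
)


def removed_fields(canonical, translation, exempt=IMMUTABLE_FIELDS):
    """Return the sorted names of fields set in canonical but missing from
    the translation, ignoring exempt fields."""
    return sorted(
        field
        for field, value in canonical.items()
        if value is not None
        and field not in exempt
        and translation.get(field) is None
    )
```

Passing exempt=frozenset() shows the earlier behaviour, in which an omitted version would also count as removed.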

feat(chord): extend translation constraints to discovery and dac_id

8de4a1f

Add discovery and dac_id to _IMMUTABLE_FIELDS so Rule 2 does not require them to be present in a translation. Rule 3 still rejects them if explicitly provided with a value that differs from canonical.

fix(chord): update dac_id field name to pcgl_dac_id in immutable cons…

d7cb3a0

…traints

docs(chord): add docstring to _roles_for helper

402fe68

davidlougheed reviewed Jun 15, 2026


fix(chord): discovery translation was forced to be not included, immu…

72576ba

…table check not required.

SanjeevLakhwani requested a review from davidlougheed June 15, 2026 17:26

Merge remote-tracking branch 'origin/develop' into feature/dataset-tr…

12f4f1b

…anslation-validation

SanjeevLakhwani changed the base branch from refactor/unify-datasets-v2-endpoint to develop June 15, 2026 20:06

chore: trigger CI

75e3880

Revert "chore: trigger CI"

df31022

This reverts commit 75e3880.


SanjeevLakhwani wants to merge 14 commits into develop from feature/dataset-translation-validation


2 participants
