feat(chord): validate translation constraints on create/update#705
feat(chord): validate translation constraints on create/update#705SanjeevLakhwani wants to merge 14 commits into
Conversation
Reject translations where primary_contact or stakeholder roles differ from the canonical dataset, or where any list field gains items.
Clients no longer need to supply identifier or project in the translation request body; the serializer injects them from the dataset looked up via the URL identifier or the existing instance on update.
Previously only list fields were checked. Non-list optional fields (e.g. spatial_coverage, program_name) could be silently nulled out in translations. Now any field present in canonical must also be present in the translation; list fields must additionally match length.
…ions Translations cannot include a discovery configuration (rejected before Pydantic parsing, since extra="allow" would silently accept it). Non-translatable fields (version, release_date, last_modified, study_status, study_context) must exactly match the canonical dataset value; providing a different value or adding one where canonical has none is rejected.
Immutable fields (version, release_date, last_modified, study_status, study_context) are now exempt from Rule 2 (removal check). Rule 3 only fires when the field is explicitly provided and differs from canonical.
Add discovery and dac_id to _IMMUTABLE_FIELDS so Rule 2 does not require them to be present in a translation. Rule 3 still rejects them if explicitly provided with a value that differs from canonical.
| "discovery", "pcgl_dac_id"}) | ||
|
|
||
|
|
||
| def _check_translation_constraints(translation: ProjectScopedDatasetModel): |
There was a problem hiding this comment.
this is done in serialization, but I feel like it should be done in the actual model save method instead to prevent invalid translations from ever being created, not just in the API (e.g., in a hypothetical CLI.) this is my general preference for MVC apps, to have validation closer to the data. however, if there's a reason this is not as good, i'm fine with being convinced otherwise.
There was a problem hiding this comment.
The model is just taking a JSON field, I don't see how close to the model we can get with that. Given that in the past we discussed about the effort to be done in implementing this and was asked to be low. It is known that a tech debt is being added. My idea for the future for this is to separate main model from translation. Everything gets serialized into a different translation model, which allows to not store duplicates of data that shouldn't be and have validation against canonical.
…table check not required.
…anslation-validation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #705 +/- ##
===========================================
- Coverage 95.59% 95.51% -0.08%
===========================================
Files 138 138
Lines 5805 5867 +62
Branches 552 568 +16
===========================================
+ Hits 5549 5604 +55
- Misses 205 210 +5
- Partials 51 53 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
This reverts commit 75e3880.
Summary
Redmine ticket 2830
primary_contactor anystakeholdersentry has differentrolesthan the canonical (English) datasetkeywords,stakeholders,taxa,links,publications,logos,counts,resources,participant_criteria,domain,funding_sources) grows beyond the canonical sizeDatasetTranslationSerializer.to_internal_value()after Pydantic validation, covering both create and update paths; looks up the canonical dataset bytranslation.identifierso it works regardless of how the serializer is instantiated