Normalize diacritics in JSON schema #1029
|
I thought these renaming and name-clash problems were solved in a general fashion thanks to @AltGr's https://github.com/CatalaLang/catala/blob/master/compiler/shared_ast/renaming.mli But you're right, if each backend has its own renaming rules they won't agree on a unique name that can be re-exported in JSON Schema. This is getting more complicated than planned, but I feel it's one of those cases where we have to go the extra mile to make sure everything works out perfectly... I suggest we discuss this next week and leave this PR as is in the meantime? Thanks Vincent! |
|
The idea when printing values to the user was to use the original Catala name, which was consistent, readable and avoided clashes. At this point it's not a design flaw but a bug in some backends that don't use the correct printing function; that part should not be hard to fix, everything is in place. But since we are making other changes... However, at that point I noticed that the JSON standard allows arbitrary strings as keys, so I thought it was a good idea to leverage this and use the same original Catala source idents. This is where #1017 and this PR become important, as some user-level tools have much stricter restrictions on the JSON they accept. We discussed this with @vincent-botbol yesterday and could see a few ways to solve this reliably, but they all have drawbacks:
|
Indeed, we continued the discussion this morning and the simplest course of action would be to add the normalized form to each backend runtime's types as well. Each type would thus contain the original Catala name and its normalized version, which went through the same renaming process for each backend. This way we would maintain consistency. The (not yet existing) deserialization in the backends is another matter though; this deserves a proper discussion, yes. Let's put it on hold for now. |
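To make the proposal above a bit more concrete, here is a rough sketch of a runtime type carrying both names. It is written in Python for brevity and is not taken from any actual backend runtime; `FieldName`, `to_json` and `describe` are hypothetical names, and the normalized value is assumed to come from whatever renaming pass the backend applies.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldName:
    """Hypothetical runtime record holding both forms of a field name."""
    original: str    # name as written in the Catala source
    normalized: str  # name after the (assumed) shared renaming pass


def to_json(fields: dict[FieldName, object]) -> str:
    """Serialize a record, using the normalized names as JSON keys."""
    return json.dumps({name.normalized: value for name, value in fields.items()})


def describe(name: FieldName) -> str:
    """User-facing messages keep the original Catala name."""
    return f"field '{name.original}' (JSON key '{name.normalized}')"
```

With this shape, the JSON exchanged with external tools only uses normalized keys, while anything printed to the user can still show the original identifier.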
Fixes #1017
Currently, this fails if two fields/cases would be normalized to the same string. Should we rename those to fresh identifiers instead? This would complicate the logic quite a lot.
Also, while working on this, I noticed that the JSON values generated by the different backends are not consistent with this new schema. Incidentally, they were not consistent with the previous schema either (even for the outputs). Making them consistent needs a bit of future work.
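For illustration, the clash case can be sketched as follows. This is a minimal Python approximation, not the code in this PR: it assumes normalization means stripping combining marks after NFD decomposition (so letters without a canonical decomposition, such as `ø` or `œ`, are left untouched), and `normalize` and `normalize_fields` are hypothetical names.

```python
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Strip combining marks after NFD decomposition (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def normalize_fields(fields: list[str]) -> dict[str, str]:
    """Map each original name to its normalized form; fail on clashes."""
    by_normalized: dict[str, list[str]] = defaultdict(list)
    for field in fields:
        by_normalized[normalize(field)].append(field)
    # Two distinct source names mapping to the same string is an error here.
    clashes = {norm: names for norm, names in by_normalized.items() if len(names) > 1}
    if clashes:
        raise ValueError(f"normalization clash: {clashes}")
    return {names[0]: norm for norm, names in by_normalized.items()}
```

For example, `normalize_fields(["cote", "côté"])` raises, since both names normalize to `cote`. Renaming to fresh identifiers instead would mean replacing the `raise` with a suffixing scheme that all backends agree on.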