Add Italian (it) language support #1032
Adds --language=it support with Italian keyword translations throughout the compiler. Files with extension .catala_it are automatically detected.
Key keyword choices follow Italian legal vocabulary:
- campo di applicazione (scope), regola (rule), definizione (definition)
- eccezione (exception), etichetta (label), conseguenza (consequence)
- sotto condizione (under condition), vale (equals/defined as), è (is)
- soddisfatto (fulfilled), dichiarazione (declaration), contesto (context)
Number formatting follows Italian conventions: comma as decimal separator, period as thousands separator, euro (€) as currency suffix.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
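
The number-formatting convention described in this commit message can be illustrated with a small sketch. This is not the compiler's code (the printer lives in the OCaml file compiler/shared_ast/print.ml); the function name `format_money_it`, the two-decimal rounding, and the space before the € sign are assumptions made for illustration.

```python
from decimal import Decimal


def format_money_it(amount: Decimal) -> str:
    """Render an amount with Italian separators and a euro suffix.

    Hypothetical helper: the name, the two-decimal rounding and the
    space before the euro sign are assumptions, not taken from the PR.
    """
    # Format with English-style separators first, e.g. "1,234,567.89".
    english = f"{amount:,.2f}"
    # Swap the two separators: "," becomes "." and "." becomes ",".
    italian = english.translate(str.maketrans({",": ".", ".": ","}))
    return f"{italian} €"
```

For example, `format_money_it(Decimal("1234567.89"))` yields a string with periods as thousands separators and a comma before the cents, followed by the euro suffix.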
Adds Italian (it) versions of all 9 stdlib modules: Integer_it, Decimal_it, Duration_it, List_it, Money_it, Date_it, MonthYear_it, Period_it, and Stdlib_it. Italian keyword conflicts (anno=YEAR, a=TO, contiene=CONTAINS) are handled by renaming the affected identifiers. Also enables Italian in has_localised_stdlib and adds .catala_it to the module file extension lookup in parser_driver, which was preventing cross-module imports within the Italian stdlib from resolving. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
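
The renaming strategy for keyword conflicts can be sketched as follows. This is only an illustration under stated assumptions: the reserved set is limited to the three keywords named in this PR, the rename table contains only the two renames the PR spells out, and `safe_identifier` is a hypothetical helper, not something in the Catala code base.

```python
# Only the three conflicting keywords named in the PR; the real Italian
# keyword set is larger.
RESERVED_IT = frozenset({"anno", "a", "contiene"})

# Renames mentioned in the PR; anything else is left unresolved here.
RENAMES = {"anno": "n_anno", "contiene": "data_nel_periodo"}


def safe_identifier(name: str) -> str:
    """Return a usable identifier, renaming it if it is a reserved keyword."""
    if name not in RESERVED_IT:
        # Only whole tokens conflict: "da_anno_mese_giorno" is fine even
        # though it contains "anno".
        return name
    if name in RENAMES:
        return RENAMES[name]
    raise ValueError(f"'{name}' is a reserved Italian keyword with no rename")
```

The conflict applies to whole identifiers only, which is why a compound name such as `da_anno_mese_giorno` stays valid while a bare `anno` argument has to become `n_anno`.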
Pull request overview
Adds Italian (it) as a first-class locale variant in the Catala toolchain, including surface syntax, localised stdlib, and regression tests, so Italian .catala_it(.md) sources can be parsed/typechecked/executed and rendered in literate outputs.
Changes:
- Added an Italian surface lexer and wired it into the parser driver/build (`Lexer_it`, `.catala_it` extension support).
- Introduced an Italian-localised standard library (`Stdlib_it` plus `date`/`decimal`/`duration`/`integer`/`list`/`money`/`monthyear`/`period` modules).
- Added Italian equivalents of several language-specific regression tests (money literal parsing, date rounding mode behavior, UTF-8 exception labels/conditions, type-token disambiguation) and updated user-facing printers/literate strings for `it`.
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/typing/good/token_tipi.catala_it | New Italian regression test for type-token lexing/disambiguation. |
| tests/money/good/literal_parsing_it.catala_it | New Italian test for money literal parsing/printing with € and , decimals. |
| tests/exception/good/condizioni_utf8.catala_it | New Italian test exercising UTF-8 conditions/labels with the Exceptions command output. |
| tests/date/good/rounding_option_it.catala_it | New Italian “good” test for date rounding mode behavior. |
| tests/date/bad/rounding_option_it.catala_it | New Italian “bad” test ensuring ambiguous date computations error without rounding mode. |
| stdlib/stdlib_it.catala_it | New Italian stdlib root module wiring the Italian submodules. |
| stdlib/date_it.catala_it | Italian Date module (localized API + bindings to Date_internal). |
| stdlib/decimal_it.catala_it | Italian Decimal helper module. |
| stdlib/duration_it.catala_it | Italian Duration helper module. |
| stdlib/integer_it.catala_it | Italian Integer helper module. |
| stdlib/list_it.catala_it | Italian List helper module. |
| stdlib/money_it.catala_it | Italian Money helper module (Euro formatting-oriented naming). |
| stdlib/monthyear_it.catala_it | Italian MonthYear helper module (uses n_anno to avoid keyword conflicts). |
| stdlib/period_it.catala_it | Italian Period helper module (renames contiene to data_nel_periodo). |
| compiler/surface/lexer_it.mli | Declares the Italian localised lexer interface. |
| compiler/surface/lexer_it.cppo.ml | Defines Italian keyword set + lexer macros (including literate directives). |
| compiler/surface/dune | Builds and preprocesses the Italian lexer as part of the surface library. |
| compiler/surface/parser_driver.ml | Enables Italian parsing/lexing and .catala_it(.md) extension resolution. |
| compiler/shared_ast/print.ml | Adds Italian number separators, boolean literals, money suffix, duration units, and injection “content” keyword. |
| compiler/plugins/explain.ml | Makes explain plugin language selection exhaustive by handling It (falling back to English strings). |
| compiler/literate/literate_common.ml | Adds Italian literate document strings + language extension mapping. |
| compiler/literate/latex.ml | Supports Italian LaTeX language selection and metadata label. |
| compiler/literate/html.ml | Adds Italian label for the table of contents. |
| compiler/catala_utils/global.mli | Extends backend_lang with It. |
| compiler/catala_utils/global.ml | Extends backend_lang with It and enables localised stdlib for Italian. |
| compiler/catala_utils/cli.ml | Adds it language code and .catala_it(.md) extension inference. |
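
The extension inference described for cli.ml and parser_driver.ml can be pictured with a short sketch. It is not the actual OCaml implementation: `infer_language`, the regular expression, and the set of language codes are assumptions chosen to mirror the `.catala_it(.md)` pattern mentioned in the table.

```python
import re
from typing import Optional

# Language codes assumed for this sketch; "it" is the one added by this PR.
KNOWN_LANGUAGES = frozenset({"en", "fr", "pl", "it"})

# Matches ".catala_<code>" optionally followed by ".md" at the end of the name.
_CATALA_EXT = re.compile(r"\.catala_([a-z]+)(?:\.md)?$")


def infer_language(filename: str) -> Optional[str]:
    """Guess the Catala language code from a file name, or return None."""
    match = _CATALA_EXT.search(filename)
    if match is None:
        return None
    code = match.group(1)
    return code if code in KNOWN_LANGUAGES else None
```

With this shape, both `stdlib/date_it.catala_it` and a literate `tutorial.catala_it.md` resolve to `it`, while unrelated files resolve to nothing.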
Use anni/mesi/giorni for plural values, matching the approach used for English (years/months/days) and French (ans/jours). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
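
A minimal sketch of plural-aware duration printing, assuming the printer picks a singular form for a count of one and the plural otherwise. The plural forms come from the commit message; the singular forms (anno, mese, giorno), the comma separator, the rendering of an empty duration, and the name `format_duration_it` are assumptions for illustration only.

```python
# (singular, plural) pairs; only the plurals are named in the commit message.
UNITS_IT = {
    "year": ("anno", "anni"),
    "month": ("mese", "mesi"),
    "day": ("giorno", "giorni"),
}


def format_duration_it(years: int, months: int, days: int) -> str:
    """Render a duration in Italian, skipping components equal to zero."""
    parts = []
    for key, count in (("year", years), ("month", months), ("day", days)):
        if count == 0:
            continue
        singular, plural = UNITS_IT[key]
        unit = singular if abs(count) == 1 else plural
        parts.append(f"{count} {unit}")
    # Assumption: an empty duration is shown as zero days.
    return ", ".join(parts) if parts else "0 giorni"
```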
The previous commits introduced match expressions that exceed ocamlformat's line-length limit. Run the formatter to expand them to multi-line form as required by make check-promoted in CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Italian identifier 'individuo.reddito' is longer than the French 'individu.revenu', pushing the exception tree condition line to 81 columns. Update the expected output to match the wrapped form that CI produces at the default 80-column width. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
denismerigoux
left a comment
Thanks very much @andreabedini for this PR! It looks quite complete and comes with tests, which is very nice. Would you be OK to commit as a maintainer of this Italian translation of Catala? We would want to ping you in the future anytime we change something that requires a new Italian translation. This includes changes to the standard library which are bound to happen quite often.
@vincent-botbol what's needed on top of this PR to add a new language supported by the VSCode plugin? We may want to add these extra steps to https://github.com/CatalaLang/catala/blob/master/CONTRIBUTING.md#contribution-example-internationalization-of-the-catala-syntax
Note that I have ongoing changes (#1031) that will conflict with this 😬
@denismerigoux sure, I can do that. @AltGr no rush, I can rebase when you are done. Nevertheless, I feel I need to sit in front of a longer piece of source code written in Italian; the translation might still be improved.
Hi all,
I am a big fan of this project. My father has spent his whole career in the public administration, and Catala has gifted us beautiful afternoons spent discussing the project and the ideas behind it.
I allowed myself to work on an Italian translation. Admittedly it has been mostly developed with Claude Code, but hey, AI is good at translations! No harm done if you don't accept AI contributions, but in case something technical is wrong, let me know and I'll jump in to fix it.
Summary
- Italian lexer (`lexer_it.cppo.ml`) with full keyword set including literate formatting support
- Italian stdlib modules (`stdlib_it`, `date_it`, `decimal_it`, `duration_it`, `integer_it`, `list_it`, `money_it`, `monthyear_it`, `period_it`)
- Enables `has_localised_stdlib` and fixes `.catala_it` file extension lookup in module resolution (`parser_driver.ml`)
Italian keyword conflicts handled: `anno` (YEAR), `a` (TO), `contiene` (CONTAINS) cannot be used as identifiers, so affected stdlib names were adapted (e.g. `da_anno_mese_giorno` uses `n_anno`, `money_it` uses `montante`, `period_it` renames `contiene` → `data_nel_periodo`).
Test plan
- `clerk test tests/`: all 673 tests pass
- `clerk test tests/stdlib/`: all 21 stdlib tests pass
- New Italian tests: `literal_parsing_it`, `rounding_option_it` (good/bad), `condizioni_utf8`, `token_tipi`
🤖 Generated with Claude Code