Add Italian (it) language support #1032

Open
andreabedini wants to merge 6 commits into CatalaLang:master from andreabedini:add-italian-syntax

Conversation

andreabedini commented May 19, 2026

Hi all,

I am a big fan of this project. My father has spent his whole career in public administration, and Catala has gifted us beautiful afternoons spent discussing the project and the ideas behind it.

I allowed myself to work on an Italian translation. Admittedly it has been mostly developed with Claude Code but hey, AI is good at translations! No harm done if you don't accept AI contributions, but if something technical is wrong, let me know and I'll jump in to fix it.

Summary

  • Adds Italian surface syntax (lexer_it.cppo.ml) with full keyword set including literate formatting support
  • Adds Italian test equivalents for French language-specific tests (money literals, date rounding, exception UTF-8, type token disambiguation)
  • Adds Italian localised standard library (stdlib_it, date_it, decimal_it, duration_it, integer_it, list_it, money_it, monthyear_it, period_it)
  • Enables Italian in has_localised_stdlib and fixes .catala_it file extension lookup in module resolution (parser_driver.ml)

Italian keyword conflicts handled: anno (YEAR), a (TO), and contiene (CONTAINS) cannot be used as identifiers, so the affected stdlib names were adapted (e.g. da_anno_mese_giorno uses n_anno, money_it uses montante, period_it renames contiene to data_nel_periodo).
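
The conflict rule above can be sketched as a simple membership check. This is only an illustration under stated assumptions: `reserved_it` and `conflicts` are hypothetical names, not part of the Catala lexer, and the set holds only the three keywords named in this description.

```ocaml
(* Hypothetical sketch: none of these names come from the Catala code base. *)
module StringSet = Set.Make (String)

(* Only the three conflicting Italian keywords listed in the PR description. *)
let reserved_it : StringSet.t = StringSet.of_list [ "anno"; "a"; "contiene" ]

(* Return the candidate identifiers that collide with a reserved keyword. *)
let conflicts (idents : string list) : string list =
  List.filter (fun id -> StringSet.mem id reserved_it) idents

let () =
  conflicts [ "anno"; "n_anno"; "montante"; "contiene"; "data_nel_periodo" ]
  |> List.iter (Printf.printf "conflict: %s\n")
```

A check of this shape would flag anno and contiene while accepting the renamed forms n_anno, montante and data_nel_periodo.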

Test plan

  • clerk test tests/ — all 673 tests pass
  • clerk test tests/stdlib/ — all 21 stdlib tests pass
  • Italian-specific tests: literal_parsing_it, rounding_option_it (good/bad), condizioni_utf8, token_tipi

🤖 Generated with Claude Code

andreabedini and others added 3 commits May 15, 2026 12:23
Adds --language=it support with Italian keyword translations throughout
the compiler. Files with extension .catala_it are automatically detected.

Key keyword choices follow Italian legal vocabulary:
- campo di applicazione (scope), regola (rule), definizione (definition)
- eccezione (exception), etichetta (label), conseguenza (consequence)
- sotto condizione (under condition), vale (equals/defined as), è (is)
- soddisfatto (fulfilled), dichiarazione (declaration), contesto (context)

Number formatting follows Italian conventions: comma as decimal separator,
period as thousands separator, euro (€) as currency suffix.
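
As a rough illustration of the convention described above, a formatter for an amount held in cents might look like the sketch below. It is not the code added to print.ml: `group_thousands` and `format_money_it` are hypothetical names, and the space before the euro sign is an assumption.

```ocaml
(* Hypothetical sketch of Italian money formatting; not the print.ml code. *)

(* Insert '.' between groups of three digits, counting from the right. *)
let group_thousands (digits : string) : string =
  let n = String.length digits in
  let buf = Buffer.create (n + (n / 3)) in
  String.iteri
    (fun i c ->
      if i > 0 && (n - i) mod 3 = 0 then Buffer.add_char buf '.';
      Buffer.add_char buf c)
    digits;
  Buffer.contents buf

(* Format an amount in cents: comma as decimal separator, euro as suffix.
   The space before the euro sign is an assumption of this sketch. *)
let format_money_it (cents : int) : string =
  let sign = if cents < 0 then "-" else "" in
  let abs_cents = abs cents in
  Printf.sprintf "%s%s,%02d €" sign
    (group_thousands (string_of_int (abs_cents / 100)))
    (abs_cents mod 100)

let () = print_endline (format_money_it 123456789)
```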

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Italian (it) versions of all 9 stdlib modules: Integer_it,
Decimal_it, Duration_it, List_it, Money_it, Date_it, MonthYear_it,
Period_it, and Stdlib_it. Italian keyword conflicts (anno=YEAR, a=TO,
contiene=CONTAINS) are handled by renaming the affected identifiers.

Also enables Italian in has_localised_stdlib and adds .catala_it to the
module file extension lookup in parser_driver, which was preventing
cross-module imports within the Italian stdlib from resolving.
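
The extension lookup can be pictured with the minimal sketch below. The `lang` type and `lang_of_file` are hypothetical stand-ins for illustration; the real type is backend_lang in global.ml and the actual resolution logic in parser_driver.ml may differ.

```ocaml
(* Hypothetical stand-in for backend_lang; only three variants are shown. *)
type lang = En | Fr | It

(* Infer the language from a file name, accepting an optional trailing ".md"
   for literate sources such as "foo.catala_it.md". *)
let lang_of_file (path : string) : lang option =
  let base =
    if Filename.check_suffix path ".md" then Filename.chop_suffix path ".md"
    else path
  in
  match Filename.extension base with
  | ".catala_en" -> Some En
  | ".catala_fr" -> Some Fr
  | ".catala_it" -> Some It
  | _ -> None

let () =
  match lang_of_file "stdlib/date_it.catala_it" with
  | Some It -> print_endline "Italian source detected"
  | _ -> print_endline "not an Italian source"
```

If a case such as ".catala_it" is missing from a lookup of this kind, a module written in Italian cannot be found by its importer, which is consistent with the import failure described above.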

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Adds Italian (it) as a first-class locale variant in the Catala toolchain, including surface syntax, localised stdlib, and regression tests, so Italian .catala_it(.md) sources can be parsed/typechecked/executed and rendered in literate outputs.

Changes:

  • Added an Italian surface lexer and wired it into the parser driver/build (Lexer_it, .catala_it extension support).
  • Introduced an Italian-localised standard library (Stdlib_it plus date/decimal/duration/integer/list/money/monthyear/period modules).
  • Added Italian equivalents of several language-specific regression tests (money literal parsing, date rounding mode behavior, UTF-8 exception labels/conditions, type-token disambiguation) and updated user-facing printers/literate strings for it.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/typing/good/token_tipi.catala_it | New Italian regression test for type-token lexing/disambiguation. |
| tests/money/good/literal_parsing_it.catala_it | New Italian test for money literal parsing/printing with and , decimals. |
| tests/exception/good/condizioni_utf8.catala_it | New Italian test exercising UTF-8 conditions/labels with the Exceptions command output. |
| tests/date/good/rounding_option_it.catala_it | New Italian “good” test for date rounding mode behavior. |
| tests/date/bad/rounding_option_it.catala_it | New Italian “bad” test ensuring ambiguous date computations error without rounding mode. |
| stdlib/stdlib_it.catala_it | New Italian stdlib root module wiring the Italian submodules. |
| stdlib/date_it.catala_it | Italian Date module (localized API + bindings to Date_internal). |
| stdlib/decimal_it.catala_it | Italian Decimal helper module. |
| stdlib/duration_it.catala_it | Italian Duration helper module. |
| stdlib/integer_it.catala_it | Italian Integer helper module. |
| stdlib/list_it.catala_it | Italian List helper module. |
| stdlib/money_it.catala_it | Italian Money helper module (Euro formatting-oriented naming). |
| stdlib/monthyear_it.catala_it | Italian MonthYear helper module (uses n_anno to avoid keyword conflicts). |
| stdlib/period_it.catala_it | Italian Period helper module (renames contiene to data_nel_periodo). |
| compiler/surface/lexer_it.mli | Declares the Italian localised lexer interface. |
| compiler/surface/lexer_it.cppo.ml | Defines Italian keyword set + lexer macros (including literate directives). |
| compiler/surface/dune | Builds and preprocesses the Italian lexer as part of the surface library. |
| compiler/surface/parser_driver.ml | Enables Italian parsing/lexing and .catala_it(.md) extension resolution. |
| compiler/shared_ast/print.ml | Adds Italian number separators, boolean literals, money suffix, duration units, and injection “content” keyword. |
| compiler/plugins/explain.ml | Makes explain plugin language selection exhaustive by handling It (falling back to English strings). |
| compiler/literate/literate_common.ml | Adds Italian literate document strings + language extension mapping. |
| compiler/literate/latex.ml | Supports Italian LaTeX language selection and metadata label. |
| compiler/literate/html.ml | Adds Italian label for the table of contents. |
| compiler/catala_utils/global.mli | Extends backend_lang with It. |
| compiler/catala_utils/global.ml | Extends backend_lang with It and enables localised stdlib for Italian. |
| compiler/catala_utils/cli.ml | Adds it language code and .catala_it(.md) extension inference. |
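
Several of the files above only need a new It case in an existing match on the language. The sketch below shows the general pattern, including a fallback to English strings like the one described for explain.ml. The `lang` type, the function name and the strings themselves are all hypothetical and are not taken from the compiler.

```ocaml
(* Hypothetical stand-in for backend_lang extended with It. *)
type lang = En | Fr | It

(* Exhaustive match: omitting It would trigger OCaml's non-exhaustive
   pattern-matching warning. Here It shares the English string as a
   fallback. The strings are invented for this sketch. *)
let explanation_heading (l : lang) : string =
  match l with
  | Fr -> "Explication"
  | En | It -> "Explanation"

let () = print_endline (explanation_heading It)
```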

Comment thread compiler/shared_ast/print.ml
andreabedini and others added 3 commits May 19, 2026 12:13
Use anni/mesi/giorni for plural values, matching the approach used for
English (years/months/days) and French (ans/jours).
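
A minimal sketch of the singular/plural choice, assuming the singular form is used only when the absolute value is 1. `with_unit`, `anni`, `mesi` and `giorni` are hypothetical helpers, and the exact output format of print.ml is not reproduced here.

```ocaml
(* Hypothetical helper: pick the singular form only when |n| = 1. *)
let with_unit (n : int) ~(singular : string) ~(plural : string) : string =
  Printf.sprintf "%d %s" n (if abs n = 1 then singular else plural)

let anni (n : int) : string = with_unit n ~singular:"anno" ~plural:"anni"
let mesi (n : int) : string = with_unit n ~singular:"mese" ~plural:"mesi"
let giorni (n : int) : string = with_unit n ~singular:"giorno" ~plural:"giorni"

let () =
  Printf.printf "%s, %s, %s\n" (anni 1) (mesi 3) (giorni 0)
```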

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous commits introduced match expressions that exceed ocamlformat's
line-length limit. Run the formatter to expand them to multi-line form as
required by make check-promoted in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Italian identifier 'individuo.reddito' is longer than the French
'individu.revenu', pushing the exception tree condition line to 81 columns.
Update the expected output to match the wrapped form that CI produces at
the default 80-column width.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@denismerigoux denismerigoux left a comment

Contributor

Thanks very much @andreabedini for this PR! It looks quite complete and comes with tests, which is very nice. Would you be OK with committing to maintain this Italian translation of Catala? We would want to ping you in the future any time we change something that requires a new Italian translation. This includes changes to the standard library, which are bound to happen quite often.

@vincent-botbol what's needed on top of this PR for the new language to be supported by the VSCode plugin? We may want to add these extra steps to https://github.com/CatalaLang/catala/blob/master/CONTRIBUTING.md#contribution-example-internationalization-of-the-catala-syntax

AltGr commented May 26, 2026

Contributor

Note that I have ongoing changes (#1031) that will conflict with this 😬
While the contribution is very nice, I am a bit wary of taking on the burden of maintaining generated code. We do have ideas to make the addition of languages much simpler though.

@andreabedini

Author

@denismerigoux sure, I can do that. @AltGr no rush, I can rebase when you are done.

Nevertheless, I feel I need to sit down with a longer piece of source code written in Italian; the translation might still be improved.
