feat: Include rule name in parse error messages #38
`report_no_viable_alternative` and `report_input_mismatch` in the
default error strategy now append `in rule '<name>'` to their
messages. Previously these messages provided no context about which
grammar rule was being parsed when the error occurred, making it
hard to diagnose parse failures in grammars with many alternatives.
The rule name is extracted using the same pattern already used by
`report_failed_predicate`:
    recognizer.get_rule_names()
        [recognizer.get_parser_rule_context().get_rule_index()]
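For readers unfamiliar with the lookup, here is a minimal sketch of the pattern in Python. The `Recognizer` and `RuleContext` classes are hypothetical stand-ins that only mimic the three accessors named above. They are not the runtime's real types, and the base message text is a placeholder rather than the strategy's actual wording.

```python
# Hypothetical stand-ins for illustration only; not the runtime's real types.
class RuleContext:
    def __init__(self, rule_index):
        self._rule_index = rule_index

    def get_rule_index(self):
        return self._rule_index


class Recognizer:
    def __init__(self, rule_names, ctx):
        self._rule_names = rule_names
        self._ctx = ctx

    def get_rule_names(self):
        return self._rule_names

    def get_parser_rule_context(self):
        return self._ctx


def current_rule_name(recognizer):
    # Index the rule-name table with the rule index of the active context.
    index = recognizer.get_parser_rule_context().get_rule_index()
    return recognizer.get_rule_names()[index]


def with_rule_suffix(message, recognizer):
    # Append the suffix described in this PR to an existing message.
    return f"{message} in rule '{current_rule_name(recognizer)}'"


# Assumed grammar with rules "s" and "e"; the parser is currently inside "e".
recognizer = Recognizer(["s", "e"], RuleContext(1))
print(with_rule_suffix("mismatched input ')'", recognizer))
# prints: mismatched input ')' in rule 'e'
```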
This also improves on the Java ANTLR4 runtime, which omits rule
names from these same two error paths.
Tested with the Labels grammar by feeding `)` as input, which
triggers `report_input_mismatch` in rule `e` (where `e` expects
INT, ID, or `(`). The test captures error messages via a custom
`ErrorListener` and verifies they contain `in rule '`.
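The shape of that test can be sketched roughly as follows, in Python. `CollectingErrorListener` and the `report_input_mismatch` function here are hypothetical stand-ins, not the runtime's listener trait or strategy method, and the message wording before the suffix is assumed.

```python
class CollectingErrorListener:
    """Hypothetical listener that records every message it receives."""

    def __init__(self):
        self.messages = []

    def syntax_error(self, line, column, msg):
        self.messages.append(msg)


def report_input_mismatch(listener, rule_name, offending_text):
    # Stand-in for the strategy's reporting path; the real one derives
    # rule_name from the recognizer rather than taking it as an argument.
    msg = f"mismatched input '{offending_text}' in rule '{rule_name}'"
    listener.syntax_error(1, 0, msg)


listener = CollectingErrorListener()
report_input_mismatch(listener, "e", ")")

# The check is deliberately loose: it looks only for the suffix prefix,
# so it does not depend on the exact wording of the base message.
assert any("in rule '" in m for m in listener.messages)
```

Matching on the `in rule '` substring rather than the full message keeps such a test stable if the base wording changes later.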
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This one makes me somewhat uncomfortable. What's the context here? Are you debugging your grammar? Maybe this could be additional information when the
My thinking here was twofold. First, for consistency: there are other places where we surface which rule lexing/parsing fails in, so it felt like we should extend that to these cases too. Second, surfacing the rule gives authors strictly more information about why their input won't lex/parse. Sure, ideally it's obvious what their error is, but I could also imagine authors going back to the grammar for "this feels like it should be supported" (possibly to contribute fixes to the grammar). That said, I wouldn't be opposed to putting these "in rule" extensions behind a debug flag.