[2.x] OPENNLP-1835: Tolerate unsupported XML parser security options #1066

Merged
mawiesne merged 1 commit into apache:opennlp-2.x from RankoR-GOS:fix-android-xml-properties
Jun 5, 2026

Conversation

@RankoR RankoR commented Jun 2, 2026

There's no existing ticket for this issue.

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn clean install at the root opennlp folder?
  • Have you written or updated unit tests to verify your changes?

What changed

OpenNLP 2.5.9 added stricter XML parser hardening in XmlUtil, including JAXP external-access properties, implementation-specific parser features, and XInclude configuration.

Some XML parser providers, including Android's, reject these optional settings even though they can still create a usable secure parser. This caused XmlUtil.createDocumentBuilder() to fail during OpenNLP model initialization on Android.

We faced this issue in SpeechServices in GrapheneOS: GrapheneOS/SpeechServices#18.

This PR keeps the hardening behavior where supported, but applies provider-specific XML security options defensively:

  • unsupported DocumentBuilderFactory attributes are logged and ignored
  • unsupported parser features are logged and ignored
  • unsupported XInclude configuration is logged and ignored
  • actual parser construction failures still remain fatal
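The defensive pattern described in the list above might look roughly like the following. This is a minimal sketch, not the actual XmlUtil code: the class name `TolerantXmlFactory`, the helper names, the choice of `java.util.logging`, and the specific features and attributes are all assumptions made for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Illustrative sketch only; names are hypothetical and not taken from the XmlUtil source.
class TolerantXmlFactory {

  private static final Logger LOG = Logger.getLogger(TolerantXmlFactory.class.getName());

  static DocumentBuilder createDocumentBuilder(DocumentBuilderFactory factory) {
    // Optional hardening: every setting is attempted, and a rejection is logged and ignored.
    // The concrete settings below are assumed examples of each category.
    trySetFeature(factory, XMLConstants.FEATURE_SECURE_PROCESSING, true);
    trySetFeature(factory, "http://apache.org/xml/features/disallow-doctype-decl", true);
    trySetAttribute(factory, XMLConstants.ACCESS_EXTERNAL_DTD, "");
    trySetAttribute(factory, XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
    try {
      factory.setXIncludeAware(false);
    } catch (UnsupportedOperationException e) {
      LOG.log(Level.WARNING, "XInclude configuration not supported, ignoring", e);
    }

    // Parser construction itself stays fatal: a failure here is rethrown.
    try {
      return factory.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException("Could not create XML parser", e);
    }
  }

  private static void trySetAttribute(DocumentBuilderFactory factory, String name, Object value) {
    try {
      factory.setAttribute(name, value);
    } catch (IllegalArgumentException e) {
      LOG.log(Level.WARNING, "Attribute not supported, ignoring: " + name, e);
    }
  }

  private static void trySetFeature(DocumentBuilderFactory factory, String name, boolean value) {
    try {
      factory.setFeature(name, value);
    } catch (ParserConfigurationException e) {
      LOG.log(Level.WARNING, "Feature not supported, ignoring: " + name, e);
    }
  }

  public static void main(String[] args) {
    DocumentBuilder builder = createDocumentBuilder(DocumentBuilderFactory.newInstance());
    System.out.println("Created parser: " + builder.getClass().getName());
  }
}
```

The caught exception types follow the documented JAXP contract for `setAttribute`, `setFeature`, and `setXIncludeAware`; a provider that signals rejection differently would need additional handling.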

A focused regression test was added using a custom DocumentBuilderFactory that rejects these optional settings.
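A factory of the kind such a test relies on could be sketched as below. This is an assumption about the shape of the test double, not the code added in this PR: the class name `RejectingDocumentBuilderFactory` is hypothetical, and delegating parser creation to the platform default factory is one possible design.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical test double: rejects every optional setting but can still build a parser.
class RejectingDocumentBuilderFactory extends DocumentBuilderFactory {

  // Parser creation is delegated to the platform default factory.
  private final DocumentBuilderFactory delegate = DocumentBuilderFactory.newInstance();

  @Override
  public DocumentBuilder newDocumentBuilder() throws ParserConfigurationException {
    return delegate.newDocumentBuilder();
  }

  @Override
  public void setAttribute(String name, Object value) {
    throw new IllegalArgumentException("Unsupported attribute: " + name);
  }

  @Override
  public Object getAttribute(String name) {
    throw new IllegalArgumentException("Unsupported attribute: " + name);
  }

  @Override
  public void setFeature(String name, boolean value) throws ParserConfigurationException {
    throw new ParserConfigurationException("Unsupported feature: " + name);
  }

  @Override
  public boolean getFeature(String name) throws ParserConfigurationException {
    throw new ParserConfigurationException("Unsupported feature: " + name);
  }

  @Override
  public void setXIncludeAware(boolean state) {
    throw new UnsupportedOperationException("XInclude is not supported");
  }
}
```

A regression test would pass an instance of this factory to the code under test and assert that a usable `DocumentBuilder` is still returned.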

Verification

  • ./mvnw -pl opennlp-tools -Dtest=XmlUtilTest test
  • ./mvnw -pl opennlp-tools test

Also manually verified with the SpeechServices app on a Pixel device.

@rzo1 rzo1 changed the title from "Tolerate unsupported XML parser security options" to "OPENNLP-1835 - Tolerate unsupported XML parser security options" Jun 3, 2026
@rzo1 rzo1 requested review from mawiesne and rzo1 June 3, 2026 05:53
@rzo1 rzo1 added the java Pull requests that update Java code label Jun 3, 2026

@rzo1 rzo1 left a comment

Thanks for the PR. In general, it looks good to me. Left some comments.

We will need to port this to the 3.x line as well.

Comment thread opennlp-tools/src/main/java/opennlp/tools/util/XmlUtil.java Outdated
Comment thread opennlp-tools/src/main/java/opennlp/tools/util/XmlUtil.java Outdated
Comment thread opennlp-tools/src/main/java/opennlp/tools/util/XmlUtil.java Outdated
Comment thread opennlp-tools/src/main/java/opennlp/tools/util/XmlUtil.java Outdated
@RankoR RankoR force-pushed the fix-android-xml-properties branch from 99571fe to 43ffe08 Compare June 3, 2026 08:30
@RankoR RankoR force-pushed the fix-android-xml-properties branch from 43ffe08 to 1fff0ab Compare June 3, 2026 08:31
@mawiesne mawiesne changed the title from "OPENNLP-1835 - Tolerate unsupported XML parser security options" to "[2.x] OPENNLP-1835: Tolerate unsupported XML parser security options" Jun 3, 2026
@jzonthemtn

@RankoR Thanks for the PR!

@mawiesne mawiesne left a comment

Thx @RankoR for your initial contribution on the 2.x branch!

@mawiesne mawiesne merged commit 3a37041 into apache:opennlp-2.x Jun 5, 2026
12 checks passed
RankoR commented Jun 5, 2026

@mawiesne no problem! Do you have any estimates on the release date of this fix?

mawiesne commented Jun 5, 2026

> @mawiesne no problem! Do you have any estimates on the release date of this fix?

@RankoR 2.5.10 and 3.0.0-M4 could be out late June / early July - stay tuned. Hint: join OpenNLP dev mailing list
