Skip to content

feat(model-protobuf): re-platform converter onto common-protobuf, drop protobuf-java#1912

Open
jfallows wants to merge 3 commits into
developfrom
claude/dazzling-meitner-l91wrj
Open

feat(model-protobuf): re-platform converter onto common-protobuf, drop protobuf-java#1912
jfallows wants to merge 3 commits into
developfrom
claude/dazzling-meitner-l91wrj

Conversation

@jfallows

Copy link
Copy Markdown
Contributor

Motivation

The converter layer of #1857: re-platform the model-protobuf converter off protobuf-java/protobuf-java-util and ANTLR onto the common-protobuf library (merged in #1906 + #1910, which provides common-json). This realizes the issue's goal of dropping the protobuf/jackson dependencies from model-protobuf.

Changes

  • ProtobufModelHandler caches a common-protobuf ProtobufSchema per schemaId, compiled from the catalog's .proto source via Protobuf.schema(text). The schema-registry message-index framing (zigzag varint + single-0x00 optimization) is preserved verbatim; a record's index path resolves via ProtobufSchema.messageIndexes / messageByIndexes. Padding is recomputed against ProtobufSchema/ProtobufMessage.
  • Read converter validates via ProtobufSchema.validate; view: json renders through the streaming ProtobufJson generator with FIELD_NAMES=PROTO + INCLUDE_DEFAULTS=true, reproducing the prior JsonFormat.preservingProtoFieldNames().includingDefaultValueFields(). The $.field extract feature is preserved with byte-identical value representations.
  • Write converter encodes JSON→wire via the ProtobufJson parser pipeline (view: json) or validates wire and prepends the index prefix.
  • Dependency removal: dropped protobuf-java/protobuf-java-util, the shade plugin, and ANTLR from the build; deleted the ANTLR grammars, ProtobufParser, Proto2/3Listener, and DescriptorTree (the .proto compiler now lives in common-protobuf). module-info requires common.protobuf; NOTICE regenerated. Zero com.google.protobuf / protobuf-java / ANTLR references remain in source.
  • Added ProtobufModelConfiguration placeholder.

Behavioral notes (two tracked follow-ups)

Surfaced during review; both require a small common-protobuf change and will land as a follow-up PR that this branch then consumes:

  1. Unknown JSON fields on the write path. The proto3 JSON mapping says parsers should reject unknown fields by default (opt-in to ignore) — which the old converter did. common-protobuf's ProtobufJson parser currently ignores them. Follow-up: add a reject-unknown option to ProtobufJson.parser and enable it here (restoring spec-default + legacy strictness; the shouldWriteInvalidProtobufEventFormatJson test is temporarily on malformed JSON instead of an unknown field).
  2. Validation-failure event message is currently generic (Invalid Protobuf event.); the descriptive protobuf-java text isn't available from the streaming pipeline. Follow-up: surface a rejection reason from the parser/validator.

Tests

  • Unit + EventIT parity preserved. Length/index-prefix (+1/+3, read strips), shouldExtract, and the view: json output assertions (e.g. {"content":"OK","date_time":"01012024"}) are unchanged.
  • Deleted ProtobufParserTest (the parser moved to common-protobuf, covered there by ProtobufSourceCompilerTest).
  • Added ProtobufModelEventFormatterTest.

./mvnw clean install -pl runtime/model-protobuf -DskipITs and ./mvnw clean verify -pl runtime/model-protobuf (EventIT) both pass.

🤖 Generated with Claude Code

https://claude.ai/code/session_01MfzS5MmFH2NQBefvoB3o7d


Generated by Claude Code

…p protobuf-java

Re-platform the model-protobuf converter off protobuf-java/protobuf-java-util
and ANTLR onto the common-protobuf library (which provides common-json):

- ProtobufModelHandler caches a common-protobuf ProtobufSchema per schemaId,
  compiled from catalog .proto source via Protobuf.schema(text). The
  schema-registry message-index framing (zigzag varint + single-0x00) is
  preserved verbatim; the record's index path resolves via
  ProtobufSchema.messageIndexes / messageByIndexes.
- Read converter validates via ProtobufSchema.validate; view:json renders
  through the streaming ProtobufJson generator with proto field names and
  default inclusion (reproducing the prior JsonFormat configuration). The
  $.field extract feature is preserved with identical value representations.
- Write converter encodes JSON -> wire via the ProtobufJson parser pipeline
  (view:json) or validates wire and prepends the index prefix.
- Remove protobuf-java/protobuf-java-util, the shade plugin, and ANTLR from
  the build; delete the ANTLR grammars, parser, listeners, and DescriptorTree
  (the .proto compiler now lives in common-protobuf). module-info requires
  common.protobuf. NOTICE regenerated.
- Add ProtobufModelConfiguration placeholder.

Known follow-ups (tracked, pending a common-protobuf change): restore
spec-default rejection of unknown JSON fields on the write path, and surface
descriptive validation-failure messages from the pipeline.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MfzS5MmFH2NQBefvoB3o7d
claude added 2 commits June 17, 2026 02:56
…lidation reasons

Consumes the common-protobuf strictness/reason APIs (#1913):

- Write (view: json): enable ProtobufJson.REJECT_UNKNOWN_FIELDS so unknown JSON
  fields are rejected again (restoring the pre-replatform proto3 JSON behavior),
  and emit pipeline.reason() as the validation-failure detail instead of the
  generic "Invalid Protobuf event".
- Read: validate via a reusable ProtobufSchema.validatorPipeline (added here) and
  emit its reason() on failure; fall back to the generic message when a reject
  carries no reason (e.g. a structural validator reject).
- ProtobufSchema.validatorPipeline(name) exposes the reusable parser->validator->
  discard pipeline that validate() already built per-call, so callers can read
  reason(); validate() now delegates to it.

Tests:
- ProtobufModelTest.shouldWriteInvalidProtobufEventFormatJson reverted to the
  unknown-field case (well-formed JSON with a field absent from the schema),
  asserting rejection.
- ProtobufModelTest.shouldReadInvalidProtobufEvent captures the emitted event and
  asserts the descriptive message "...truncated field: need 8 bytes."

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MfzS5MmFH2NQBefvoB3o7d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants