Skip to content

Add option to include fields that are equal to the default value in the JSON encoding#1526

Closed
mrabiciu wants to merge 1 commit into
apple:mainfrom
mrabiciu:simple-default-traversal
Closed

Add option to include fields that are equal to the default value in the JSON encoding#1526
mrabiciu wants to merge 1 commit into
apple:mainfrom
mrabiciu:simple-default-traversal

Conversation

@mrabiciu

Copy link
Copy Markdown

This adds the ability to include fields that are equal to the default value in the json encoding.

This is done by adding a traversalOptions var to Visitor as well as adding a includeDefaultValues to JSONEncodingOptions which controls the JSONEncodingVisitor's traversalOptions

I've modified the generation of traverse to check traversalOptions before calling visit on each individual field.

@mrabiciu

mrabiciu commented Dec 13, 2023

Copy link
Copy Markdown
Author

This commit has all of the functional changes 59a6212

@thomasvl

Copy link
Copy Markdown
Collaborator

Do the other languages end up encoding the empty arrays when visiting default values? i.e. - does it make sense to visit on those or just skip the visit calls still? We probably want to make sure our JSON support ends up doing the same things the other upstream languages do.

In any case, since this is public api, we should spell out in the comments what the expected behavior is for all cases:

  • repeated fields
  • map fields
  • extension fields (all types)
  • proto2 syntax files (since non repeated field types have presence all have presence)
  • proto3 syntax files where the field isn't optional (expect the visit)
  • proto3 syntax files where a field is optional (likely should be the same as proto2 syntax files)

And then use a custom listener in the test to ensure all these behaviors for all the different message types.

@thomasvl

Copy link
Copy Markdown
Collaborator

For the comments spelling out the behaviors, I guess we should use the terms from the upstream pages:

@thomasvl thomasvl requested a review from tbkka December 13, 2023 20:39
@mrabiciu

Copy link
Copy Markdown
Author

Do the other languages end up encoding the empty arrays when visiting default values?

Yes I'm using the python package for comparison and if you set including_default_value_fields=True it includes empty lists and empty maps.

https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html

including_default_value_fields – If True, singular primitive fields, repeated fields, and map fields will always be serialized. If False, only serialize non-empty fields. Singular message fields and oneof fields are not affected by this option.

@mrabiciu

Copy link
Copy Markdown
Author

And then use a custom listener in the test to ensure all these behaviors for all the different message types

I'm not sure what you mean by that. Is an encoding test that asserts the encoded json is as expected not enough?

@mrabiciu mrabiciu force-pushed the simple-default-traversal branch from 7319183 to 6d44684 Compare December 14, 2023 00:15
@thomasvl

Copy link
Copy Markdown
Collaborator

And then use a custom listener in the test to ensure all these behaviors for all the different message types

I'm not sure what you mean by that. Is an encoding test that asserts the encoded json is as expected not enough?

Haven't looked in detail, it might be. I wasn't sure if json had any specific logic of it's own vs. ensure all the methods expected did/didn't get called.

@thomasvl

Copy link
Copy Markdown
Collaborator

I finally got a chance to start looking at this, and I'm going to suggest we step back to hopefully won't be a bike shed –

As drafted, the PR calls the JSONEncodingOptions addition includeDefaultValues. And Visitor gets TraversalOptions which has visitDefaultValues.

The looking at the C++ api, their JSON is always_print_primitive_fields

The Java api has: alwaysOutputDefaultValueFields & includingDefaultValueFields

I don't think we want to eventually try to do the level of the api that Java has (the Set support would really complicate the visitor pattern).

So on the naming side, my 2¢ would be:

I think default gets a little confusing around the behavior with repeated fields and map<> fields. So I think I sorta like the C++ naming because it is about primitive fields, there still is a minor stretch in that a repeated or map field is primitive (it's an array or a map), but it a single message field clearly isn't a primitive field. So…

  • Visitor gets TraversalOptions with alwaysVisitPrimitiveFields.
  • JSONEncodingOptions gets alwaysPrintPrimitiveFields

The JSON value lines up with the two other names we already have that start alwaysPrint…, and the naming gnerally lines up with the C++ api (which is what I think we normally look to).

Thoughts?

@tbkka what makes sense to you?

@tbkka

tbkka commented Dec 14, 2023

Copy link
Copy Markdown
Collaborator

I like Thomas' naming ideas here.

Comment thread Tests/SwiftProtobufTests/Test_JSON.swift Outdated

@thomasvl thomasvl left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the test case added is in the regenerate commit, so I almost missed that addition.

}
}

var usesDefaultValueFlagForTraversal: Bool {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, we don't want to have to check syntax any more (working towards Editions). Would just looking at say hasFieldPresence and if the field is repeated/map vs. singular work here?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a difference in behavior between proto2 and proto3 though. At least in the python lib which I was using for reference. In proto3 unset optional primitives are omitted from json, and in proto2 unset optional primitives are included. I'm not sure if theres a better way to do this.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Descriptor support should already be tracking most of that: https://github.com/apple/swift-protobuf/blob/main/Sources/SwiftProtobufPluginLibrary/Descriptor.swift#L638-L642 What's the python code you're looking at for reference?

@mrabiciu mrabiciu Dec 15, 2023

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look at python code, just observed how the python lib behaves when serializing a message with every kind of field type.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an example of what I mean

Proto3 behavior

syntax = "proto3";

message Proto3Message {
    int32 singularInt32 = 1;
    repeated int32 repeatedInt32 = 2;
    optional int32 optionalInt32 = 3;
}
print(MessageToJson(Proto3Message(), including_default_value_fields=True))
{
  "singularInt32": 0,
  "repeatedInt32": []
}

Proto2 behavior

message Proto2Message {
    required int32 singularInt32 = 1;
    repeated int32 repeatedInt32 = 2;
    optional int32 optionalInt32 = 3;
}
print(MessageToJson(Proto2Message(), including_default_value_fields=True))
{
  "singularInt32": 0,
  "repeatedInt32": [],
  "optionalInt32": 0
}

hasPresence will be true for optionalInt32 for both proto2 and proto3 but we want to print it in proto2 and omit it in proto3

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good I can work on that. Thank you!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a docs rollout is happening:

https://protobuf.dev/programming-guides/proto2/#json-options

Always emit fields without presence: Fields that don’t support presence and that have their default value are omitted by default in JSON output (for example, an implicit presence integer with a 0 value, implicit presence string fields that are empty strings, and empty repeated and map fields). An implementation may provide an option to override this behavior and output fields with their default values.

As of v25.x, the C++, Java, and Python implementations are nonconformant, as this flag affects proto2 optional fields but not proto3 optional fields. A fix is planned for a future release.

https://protobuf.dev/programming-guides/proto3/#json-options

Always emit fields without presence: Fields that don’t support presence and that have their default value are omitted by default in JSON output (for example, an implicit presence integer with a 0 value, implicit presence string fields that are empty strings, and empty repeated and map fields). An implementation may provide an option to override this behavior and output fields with their default values.

@mrabiciu mrabiciu Feb 11, 2024

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by those docs for proto2

(for example, an implicit presence integer with a 0 value, implicit presence string fields that are empty strings, and empty repeated and map fields). An implementation may provide an option to override this behavior and output fields with their default values.

Don't integer and string fields always have presence under proto2? Under what scenario would we actually emit these in proto2?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the logic to just use hasPresence for proto2 the same way as in proto3. But I think those docs could use some clarification.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please feel free to open an issue upstream with the feedback. Some times it helps to hear directly from devs.

Comment thread Sources/protoc-gen-swift/MessageFieldGenerator.swift Outdated
Comment thread Sources/protoc-gen-swift/MessageFieldGenerator.swift Outdated
Comment thread Sources/protoc-gen-swift/MessageFieldGenerator.swift Outdated
Comment thread Sources/SwiftProtobuf/Visitor.swift Outdated
Comment thread Protos/SwiftProtobufTests/test_messages_proto3.proto
Comment thread Tests/SwiftProtobufTests/Test_JSON.swift Outdated
Comment thread Sources/SwiftProtobuf/Visitor.swift Outdated
Comment thread Sources/SwiftProtobuf/Visitor.swift Outdated
Comment thread Sources/SwiftProtobuf/Message.swift Outdated
Comment thread Sources/SwiftProtobuf/PopulatorConstructable.swift Outdated
Comment thread Sources/protoc-gen-swift/MessageFieldGenerator.swift Outdated
Comment thread Sources/SwiftProtobuf/JSONEncodingOptions.swift Outdated
@thomasvl

Copy link
Copy Markdown
Collaborator

@mrabiciu looks like you might need to sync also to resolve some conflicts.

@thomasvl

Copy link
Copy Markdown
Collaborator

@tbkka can you take a look, I think given the new docs from the protobuf team, we're likely pretty good to go on this.

@mrabiciu mrabiciu force-pushed the simple-default-traversal branch from 8b366c4 to 49c7175 Compare February 21, 2024 20:55
@mrabiciu

Copy link
Copy Markdown
Author

Rebased on main and squashed all my commits

@mrabiciu mrabiciu force-pushed the simple-default-traversal branch from 49c7175 to ba9708f Compare February 22, 2024 23:46
@ChristopherDrum

ChristopherDrum commented Dec 26, 2024

Copy link
Copy Markdown

@thomasvl It's been 10 months now. Was there a blocker on this? It seemed at the time like it was ready to merge in. Has this been supplanted by another PR or methodology? Any guidance from the team on how we should deal with this situation in the meantime?

Edit: Found the answer in another thread from another thread. Have to wait for v2 apparently.
#1504 (comment)

@thomasvl

Copy link
Copy Markdown
Collaborator

New apis/additions to the library generally get signoff from someone at Apple since they own the repo.

Currently they want to ensure any changes/additions to the repo are non breaking as a major version bump could cause issues for downstream packages that also depend on this repo.

@FranzBusch FranzBusch left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a sensible change to me. This is only additive in adding new encoding options and it adds new test cases to check that this option is working correctly. The change is non breaking since we provide a default implementation of the new protocol requirement.

@gjcairo

gjcairo commented Jan 2, 2025

Copy link
Copy Markdown
Contributor

@tbkka left a comment here in 2023 for another change that also only adds new APIs, but he mentioned it's breaking because it's breaking between the runtime and the generated code (kind of similarly to this PR, new API was added to the Message protocol, which means the generated code has to be re-generated to build).

I agree there are only API additions with default implementations on this PR, but if you have mismatching versions for the generated code and SwiftProtobuf (specifically, newer generated code running with an older swift-protobuf), things will break because the generated code now references a new field in the Visitor.

FWIW, I don't think these count as breaking changes (so I agree with @FranzBusch approving this): if someone went to the trouble of updating the generated code with a newer version of the plugin, why would they keep the old version of the runtime (or vice versa)? As long as users are updating both things simultaneously they should be safe, and I don't see why this wouldn't happen, to be honest.

@thomasvl In the README we suggest that you should try to keep the versions of SwiftProtobuf and protoc-gen-swift matching, but have we always been clear about what we officially support here?

@thomasvl

thomasvl commented Jan 2, 2025

Copy link
Copy Markdown
Collaborator

@thomasvl In the README we suggest that you should try to keep the versions of SwiftProtobuf and protoc-gen-swift matching, but have we always been clear about what we officially support here?

I don't think we have. Since we've never come up with a better way to do the version mismatch checks compared to what the C based languages can do, it's sorta a space we've always struggled with.

@thomasvl

thomasvl commented Jan 2, 2025

Copy link
Copy Markdown
Collaborator

I guess it isn't a "break", but the potential catch I can see is if someone depends on protos compiled into another package, they would need those also updated for this option to work since the support is captured in the generated code.

The only other thing I can think of that has been a concern on additions is code size increase, and this does increase the generated code size (which I believe is what pushed some of the investigation into table based generation to try and factor things like this out).

/// - repeated fields: Will always be visited, even if they are empty.
/// - map fields: Will always be visited, even if they are empty.
/// - singular message and group fields: Will be visited only if they are set.
/// - singular primitive fields: Will always be visited under proto2. Under proto3 unset optional primitive fields will not be visited.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc string likely needs an update since everything is about Editions now upstream.

@mrabiciu mrabiciu closed this May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants