refactor: distinguish between init and attribute types in testing state classes by tonyandrewmeyer · Pull Request #2331 · canonical/operator

tonyandrewmeyer · 2026-02-17T10:55:41Z

When designing the Scenario 7 API we introduced kw-only args, and originally had custom __init__ for each state class to support that. We decided to change that because it felt busy and a lot of maintenance.

However, we currently have an unfortunate mismatch between some of the types accepted to create an instance of a state class and the type the corresponding attribute will be. For example, the init might accept any Mapping but we know the attribute will always be a dict. It would be nice to provide that information to users.

Now that we are using Python 3.10+, some classes don't have this issue and can continue to use the dataclass-generated __init__. However, many would be better with an explicit __init__, and I am not convinced it's too much work to maintain.

We opened the door to this in #2274 by adjusting CheckInfo. This PR applies the same improvement to the rest of the state classes.
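
To make the idea concrete, here is a minimal sketch of the pattern, not the actual code in state.py. The class name ExampleCredential and its fields are hypothetical; the point is only that the hand-written keyword-only __init__ can accept a wider type than the attribute annotation advertises.

```python
import dataclasses
from collections.abc import Iterable, Sequence


@dataclasses.dataclass(frozen=True)
class ExampleCredential:  # hypothetical stand-in, not a real Scenario class
    # The field annotations describe what users get when reading attributes.
    auth_type: str
    redacted: Sequence[str]

    # dataclasses keeps an __init__ that the class defines itself, so this
    # signature describes what users may pass when creating an instance.
    def __init__(self, *, auth_type: str, redacted: Iterable[str] = ()) -> None:
        # Frozen dataclasses block normal assignment, so go via object.
        object.__setattr__(self, 'auth_type', auth_type)
        # Build a new list so the caller's object is not aliased.
        object.__setattr__(self, 'redacted', list(redacted))


source = ('a', 'b', 'c')  # any iterable of str is accepted
cred = ExampleCredential(auth_type='foo', redacted=source)
```

Here the argument is a tuple but the stored attribute is always a new list, which is the kind of init/attribute mismatch the description refers to.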

A couple of small behaviour changes fall out of this:

- The *_addresses, *_subnets, redacted, ca_certificates, and Secret.remote_grants arguments now reject a bare str (or bytes) up front. Previously this would have silently iterated the string character-by-character, leaving you with ['s', 'o', 'm', ...] in the attribute.
- Passing app_status=None or unit_status=None to State now coerces to UnknownStatus(), where previously it would have raised TypeError. Explicit None is now treated the same as omitting the argument.

Fixes #2152

james-garner-canonical

I'm a big fan of making this change, thanks for taking care of this. I have a number of suggestions around typing and defaults, which I've made on individual lines, though they typically apply to more lines across the PR -- but I figured I'd keep the comments fewer than they'd otherwise be ...

Do you think this change warrants some additional unit tests, or are you happy that the existing tests would catch any errors in this PR? If the latter, please mention the relevant test suites.

tonyandrewmeyer · 2026-02-17T23:57:06Z

> Do you think this change warrants some additional unit tests, or are you happy that the existing tests would catch any errors in this PR? If the latter, please mention the relevant test suites.

In terms of not breaking the __init__ signature, I feel like the tests from here downwards should cover that (right number of positional/keyword-only arguments in particular), and they should also cover the behaviour of __init__ in terms of forcing immutability (same file, the tests following on from the previous ones). So I feel comfortable that any regressions in this PR would be caught.

In terms of tests for the changes, I'm not super keen on having tests like:

c = CloudCredential(auth_type="foo", redacted=['a', 'b', 'c'])
assert isinstance(c.redacted, list)

I know we have some tests where we expect pyright to find issues, but I'm not sure it's the right move to add something like that for this either.

Do you have any suggestions in terms of tests?

james-garner-canonical · 2026-02-18T00:08:38Z

> > Do you think this change warrants some additional unit tests, or are you happy that the existing tests would catch any errors in this PR? If the latter, please mention the relevant test suites.
>
> In terms of not breaking the __init__ signature, I feel like the tests from here downwards should cover that (right number of positional/keyword-only arguments in particular), and they should also cover the behaviour of __init__ in terms of forcing immutability (same file, the tests following on from the previous ones). So I feel comfortable that any regressions in this PR would be caught.

Nice!

> In terms of tests for the changes, I'm not super keen on having tests like:
>
> c = CloudCredential(auth_type="foo", redacted=['a', 'b', 'c'])
> assert isinstance(c.redacted, list)
>
> I know we have some tests where we expect pyright to find issues, but I'm not sure it's the right move to add something like that for this either.
>
> Do you have any suggestions in terms of tests?

I wouldn't mind seeing tests a bit like the one you're not keen on, explicitly encoding (from a user perspective) the type conversion and copying behaviour that we're implementing.

redacted = ['a', 'b', 'c']
...
c = CloudCredential(auth_type="foo", redacted=redacted, ...)
assert isinstance(c.redacted, list)  # or tuple if we go that way
assert c.redacted == redacted
assert c.redacted is not redacted
...

The existing tests probably do cover a lot of this, but a lot of them are hard to follow at a glance due to the parametrization and abstraction.

dimaqq

I kinda like this.
If it works, it's fair to merge :)

Happy to leave the details to James.

tonyandrewmeyer · 2026-02-19T23:26:49Z

@james-garner-canonical brought up the excellent point that these are frozen dataclasses that we want people to treat as immutable. So giving their type checker information that they have a list (rather than an immutable Sequence) or a dict (rather than an immutable Mapping) leads them to where we don't want to go.
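
A small sketch of that point, under the assumption that the attribute stays annotated as Sequence[str] and is stored as a tuple (the class name is hypothetical, and the storage type is a choice made for this example only):

```python
import dataclasses
from collections.abc import Iterable, Sequence


@dataclasses.dataclass(frozen=True)
class ExampleCredential:  # hypothetical stand-in, not a real Scenario class
    # Annotated as Sequence, so type checkers see no mutating methods.
    redacted: Sequence[str]

    def __init__(self, *, redacted: Iterable[str] = ()) -> None:
        # Storing a tuple is an assumption of this sketch.
        object.__setattr__(self, 'redacted', tuple(redacted))


cred = ExampleCredential(redacted=['a', 'b'])

# A type checker should flag the next line, because Sequence defines no
# append(); with a list[str] annotation it would be accepted silently.
# cred.redacted.append('c')
```

The annotation is what the type checker relies on, so keeping it immutable lets the checker warn about in-place mutation of state regardless of what concrete type is stored at runtime.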

So rejecting this instead.

tonyandrewmeyer · 2026-03-26T21:49:32Z

@james-garner-canonical I've adjusted per the discussion we had earlier in the week, and this should be good for reviewing again now, thanks!

james-garner-canonical

I really like the direction here, and I definitely think it's worth making these changes to decouple the __init__ argument typing from the attribute typing.

I've flagged a number of items that I think require some further thought before merging.

james-garner-canonical · 2026-03-26T23:56:12Z

+        object.__setattr__(self, 'content', dict(content))
+        object.__setattr__(self, '_data_type_name', _data_type_name)
+        _deepcopy_mutable_fields(self)


Suggested change

-        object.__setattr__(self, 'content', dict(content))
-        object.__setattr__(self, '_data_type_name', _data_type_name)
-        _deepcopy_mutable_fields(self)
+        object.__setattr__(self, 'content', copy.deepcopy(content))
+        object.__setattr__(self, '_data_type_name', _data_type_name)

james-garner-canonical · 2026-03-26T23:57:02Z

-    relations: Iterable[RelationBase] = dataclasses.field(default_factory=frozenset)
+    relations: frozenset[RelationBase]


Should we use Collection instead for all these attributes?

Co-authored-by: James Garner <james.garner@canonical.com>

dimaqq · 2026-04-09T03:16:28Z

I'll wait for Tony and James to hash it out.

dimaqq

WDYT about setting up a working group (maybe James and Tony only) to figure out the minutiae?

dimaqq · 2026-04-09T23:49:26Z


 AnyJson = str | bool | dict[str, 'AnyJson'] | int | float | list['AnyJson']
-RawSecretRevisionContents = RawDataBagContents = dict[str, str]
+RawSecretRevisionContents = RawDataBagContents = Mapping[str, str]


This one I'm not convinced about, because it's an "in-out" field:

# user passes data in, Mapping is great here
rel = Relation(..., remote_app_data=this_is_now_a_mapping)
state = State(relations={rel})

# user inspects modified data, dict is better here
state = ctx.run(..., state)
rel = state.get_relation(rel)
# what we expect charmers to do, Mapping is helpful
assert rel.remote_units_data[unit]["foo"] == '"bar"'
# what charmers may also do, I think that this PR is a breaking change wrt. typing
rel.remote_units_data[unit].setdefault("foo", '"bar"')

The intention of having this field typed as a Mapping is good, IMO.
The practical implications though, I'm not sure about.

james-garner-canonical

Looks good, apologies for the delays with this one. Thanks for all the work and super-tox-ing.

I've commented in one place about Iterable/Sequence[str] typing permitting bare str arguments. This is largely a pre-existing issue, except that inlining _deepcopy_mutable_fields leads to a behaviour change where a bare str now gets list called on it. Either way it would have been a technical error that should always have resulted in a runtime error at some point (though would it always have?). But if it's not a runtime error then we are introducing a behaviour change that could show up in state.

I'd be on board with adding an explicit check and error for bare str if that doesn't break existing tests.

james-garner-canonical · 2026-05-19T04:03:34Z

+        identity_endpoint: str | None = None,
+        storage_endpoint: str | None = None,
+        credential: CloudCredential | None = None,
+        ca_certificates: Iterable[str] = (),


This also applied before the PR, when the argument was typed as Sequence[str] -- ca_certificates='some string' is legal at type checking time.

Previously, _deepcopy_mutable_fields would have left us with ca_certificates = 'some string'. Now we'd get ca_certificates = ['s', 'o', 'm', ...].

I assume that both of these are bad and may result in an error at some later point, but maybe they can silently pass through a test in some cases? Presumably this would ideally be an error straight away. Alternatively, it could be a legal shorthand for ca_certificates = ['some string'].

Good catch. Added a _list_of_str helper that raises StateValidationError when a bare str is passed where an iterable of strings is expected, and applied it to redacted, ca_certificates, ingress_addresses, egress_subnets, and the values in Secret.remote_grants. Existing tests still pass and there's a new test_bare_str_rejected parametrised test covering each site.

Add a _list_of_str helper that raises StateValidationError when a bare str is passed where an iterable of strings is expected, and apply it to redacted, ca_certificates, ingress_addresses, egress_subnets, and the values in Secret.remote_grants.

- _list_of_str now also rejects bytes (which iterate as ints).
- State.config is now copied so the caller's dict isn't aliased.
- Comment Secret._tracked_revision/_latest_revision: class attrs, not fields, deliberately excluded from __eq__.
- Comment RelationBase: still uses dataclass-generated __init__, asymmetry with the rest of the module is deliberate.
- Comment Container's mixed deepcopy/shallow-copy choices.
- Comment why Exec._change_id, Container._base_plan, and StoredState._data_type_name remain in __init__ despite being private.

tonyandrewmeyer · 2026-06-09T07:51:32Z

This has two ticks, but one is Dima's so it doesn't pass any more. @tromai would you mind having a look?

james-garner-canonical · 2026-06-12T05:28:34Z

New concern that occurred to me today: adding a custom __init__ (which doesn't call __post_init__) breaks anyone who subclassed one of these classes and defined their own __post_init__ (it would no longer be called).

I guess we should (re-)check with Hyrum? I suppose checking for new failures in unit tests would be sufficient, but it also feels like something it should be possible to analyse statically ... perhaps a feature for another time.
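
A small sketch of the concern, using hypothetical classes: the dataclass-generated __init__ calls self.__post_init__() when the class defines one, so a subclass override runs, while a hand-written __init__ that never calls it silently skips the override.

```python
import dataclasses

calls: list[str] = []  # records which __post_init__ overrides actually ran


@dataclasses.dataclass(frozen=True)
class GeneratedInit:  # hypothetical: relies on the generated __init__
    name: str

    def __post_init__(self) -> None:
        pass


@dataclasses.dataclass(frozen=True)
class CustomInit:  # hypothetical: hand-written __init__, no __post_init__ call
    name: str

    def __init__(self, *, name: str) -> None:
        object.__setattr__(self, 'name', name)

    def __post_init__(self) -> None:
        pass


class GeneratedChild(GeneratedInit):
    def __post_init__(self) -> None:
        calls.append('generated')


class CustomChild(CustomInit):
    def __post_init__(self) -> None:
        calls.append('custom')


GeneratedChild(name='a')
CustomChild(name='b')
```

After running this, only the override on GeneratedChild has been recorded, which is the regression a downstream subclass would see.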

benhoyt · 2026-06-12T23:21:19Z

We can definitely re-check with Hyrum. However, I guess we have to step back and decide what we're promising when exposing dataclasses as part of the public API (and document that). It seems to me people shouldn't need to subclass testing state classes, and maybe we should rule that out.

tromai

Thank you for waiting. The current changes look good to me.

It took me some time to understand that a dataclass's attribute type definitions can differ from the types in its __init__ signature. These classes need additional logic in __init__ to "pre-process" the parameters.

The changes make sense to me now that I've wrapped my head around it.

refactor: use __init__ to distinguish between types possible for creating and types attributes will be

b24d4a4

tonyandrewmeyer requested review from dimaqq and james-garner-canonical February 17, 2026 10:55

Restore accidentally removed deepcopies.

3658da3

james-garner-canonical reviewed Feb 17, 2026

dimaqq approved these changes Feb 18, 2026

Comment thread testing/src/scenario/state.py Outdated

Comment thread testing/src/scenario/state.py Outdated

Address review comments.

4b1c247

tonyandrewmeyer requested a review from james-garner-canonical February 18, 2026 09:12

tonyandrewmeyer closed this Feb 19, 2026

tonyandrewmeyer deleted the scenario-state-init branch February 19, 2026 23:26

tonyandrewmeyer restored the scenario-state-init branch March 21, 2026 08:20

tonyandrewmeyer reopened this Mar 21, 2026

tonyandrewmeyer marked this pull request as draft March 21, 2026 08:20

tonyandrewmeyer added 2 commits March 27, 2026 10:15

Don't change attribute types to non-frozen ones, because we want to have type checkers alert to mutating the state.

2c7c2be

Copy the good bits of canonical#2335.

03c8381

tonyandrewmeyer mentioned this pull request Mar 26, 2026

fix: use immutable types for state component attributes to reflect that they are immutable #2335

Closed

Merge remote-tracking branch 'origin/main' into scenario-state-init

e79a1f9

tonyandrewmeyer marked this pull request as ready for review March 26, 2026 21:48

james-garner-canonical requested changes Mar 27, 2026

tonyandrewmeyer and others added 3 commits March 31, 2026 13:07

Apply suggestion from @tonyandrewmeyer

cb4c585

Apply suggestion from @james-garner-canonical

dd1dd96

Co-authored-by: James Garner <james.garner@canonical.com>

Apply suggestion from @tonyandrewmeyer

6dd5697

tonyandrewmeyer commented Mar 31, 2026

Comment thread testing/src/scenario/state.py Outdated

Apply suggestion from @tonyandrewmeyer

eb82470

dimaqq reviewed Apr 9, 2026

Comment thread testing/src/scenario/state.py

dimaqq reviewed Apr 9, 2026

Address review comments.

d4ee1f7

tonyandrewmeyer requested a review from james-garner-canonical April 14, 2026 03:50

dimaqq approved these changes Apr 15, 2026

james-garner-canonical approved these changes May 19, 2026

tonyandrewmeyer added 4 commits June 7, 2026 21:25

Merge branch 'main' into scenario-state-init

5ee87da

Merge branch 'main' into scenario-state-init

6e343f9

tonyandrewmeyer requested a review from tromai June 9, 2026 07:51

tromai approved these changes Jun 13, 2026
