fix: take Pebble defaults into consideration when consistency checking Checks#2567
fix: take Pebble defaults into consideration when consistency checking Checks#2567tonyandrewmeyer wants to merge 10 commits into
Conversation
… the data Pebble copies from the plan is using defsults in the layer.
…ensure we use the default values when necessary.
Co-authored-by: Dave Wilding <tech@dpw.me>
| # an absent value on the check info as also matching the default. | ||
| unset_defaults = { | ||
| 'level': ({None, pebble.CheckLevel.UNSET}, pebble.CheckLevel.UNSET), | ||
| 'startup': ({None, pebble.CheckStartup.UNSET}, pebble.CheckStartup.ENABLED), |
There was a problem hiding this comment.
Pebble seems to technically default to UNSET -- in Check.__init__: self.startup = CheckStartup(dct.get('startup', '')). Though apparently they're functionally equivalent ...
class CheckStartup(enum.Enum):
"""Enum of check startup options."""
UNSET = '' # Note that this is treated as ENABLED.
ENABLED = 'enabled'
DISABLED = 'disabled'Interestingly for this change though, if we use UNSET as the default, then our _normalise logic becomes is not None, and the need for the helper and the set of 'unset' values seems to go away.
There was a problem hiding this comment.
For reference:
Pebble:
- Combined plan defaults: Threshold → 3 (in plan.go). Level and Startup are not defaulted in the combined plan, they stay as UnsetLevel/CheckStartupUnknown ("", in the Go way of doing things).
- /v1/checks CheckInfo response (checkstate/manager.go): Startup is normalised server-side: Unknown → Enabled. Level passes through unchanged. Threshold passes through (so always 3 from the combined plan).
(I wish this worked differently, so startup ended up as enabled in the combined plan, even though it would be unset in the layers so combining works. Then the plan would reflect better what happens, and it'd be more consistent with threshold. But it can't be changed now.)
ops.pebble:
- Check: startup = CheckStartup(dct.get('startup', '')) → UNSET. threshold = dct.get('threshold') → None when not in layer. level → UNSET.
- CheckInfo.from_dict: startup = CheckStartup(d.get('startup', 'enabled')). Defaults to ENABLED if the key is missing (so it can never be UNSET from a real Pebble response, since Pebble sends enabled/disabled; and even if the key were missing, ops defaults to ENABLED). level → UNSET if missing. threshold is a required key.
scenario:
- CheckInfo has different defaults than real Pebble:
- level default: None (not CheckLevel.UNSET)
- startup default: CheckStartup.ENABLED (not UNSET)
- threshold default: 3 (int, never None)
(I think I did a poor job when I originally added this. It would probably make more sense to mirror Pebble more closely. But also not really changeable now.)
Harness:
- passes check.level/check.startup/check.threshold (None→3) straight through to pebble.CheckInfo.
There was a problem hiding this comment.
I'm good unrolling the loop as per another thread. Putting the logic there probably makes it clear enough?
|
|
||
| for attr, (unset, default) in unset_defaults.items(): | ||
| check_value = _normalise(getattr(check, attr), unset, default) | ||
| plan_value = _normalise(getattr(plan_check, attr), unset, default) |
There was a problem hiding this comment.
Do we need to explicitly normalise the plan value, or is this actually a no-op (or even counterproductive) since State.containers.plan ultimately is populated by calling the real pebble.Plan?
I'm a little unsure here because of the private-looking Container._base_plan attribute -- if users never pass this, then our defaults all come from the real ops.pebble.Plan directly. If they do pass it, then that's funky, but maybe whatever they pass in should be the ground truth for comparison rather than attempting to normalise it.
There was a problem hiding this comment.
The thing is that Container.plan does build a real pebble.Plan from _base_plan, but _render_checks then writes the layer's pebble.Check objects straight into plan.checks[name], bypassing Pebble's combine/defaulting logic for checks. So plan.checks[name].threshold stays None, and .startup/.level stay UNSET, whenever the layer didn't set them. That's what motivated the threshold default. So the plan-side normalisation is needed.
Possibly the way Container.plan works could be changed, the various render methods are mostly taken from pebble at the moment, but I'm not sure I want to do that big a change.
…arisons. This drops the _normalise helper and the unset/defaults mapping in favour of three direct comparisons, one per attribute, so type checking covers each branch and the per-attribute normalisation rules are obvious at the call site. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous commit only normalised the plan side, but scenario users can pass CheckInfo.startup=UNSET (or None) and CheckInfo.threshold=None too, so the check side needs the same treatment to avoid spurious mismatches when both sides are effectively unset. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Like test_stop_checks, this test calls Container.stop_checks, which logs a security event via _log_security_event and emits a RuntimeWarning when logging is not configured for Harness in these tests. Wrap the stop_checks call in pytest.warns(RuntimeWarning) so the warning does not surface as a test failure under -W error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…deterministic. ops.log._get_juju_log_and_app_id is functools.cached, so the RuntimeWarning emitted when no JujuLogHandler is set up only fires the first time it is called in a Python process. Under pytest-xdist, two harness tests in the same worker that both used pytest.warns(RuntimeWarning) around Container.stop_checks could race: whichever test ran first got the warning, and the second failed with "DID NOT WARN". Mirror the reset_security_logging fixture from testing/tests/test_e2e/test_pebble.py, applied to both test_stop_checks and test_stop_then_start, so the cache is cleared before and after each test and the warning fires deterministically. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| """ | ||
| _get_juju_log_and_app_id.cache_clear() | ||
| yield | ||
| _get_juju_log_and_app_id.cache_clear() |
There was a problem hiding this comment.
Clearing again afterwards seems like it might just hide other tests forgetting to request this fixture when they really should.
james-garner-canonical
left a comment
There was a problem hiding this comment.
Looks good, thanks!
| # The scenario CheckInfo uses None for an absent level, while the | ||
| # plan uses CheckLevel.UNSET, so those two also need to compare | ||
| # equal. | ||
| check_level = check.level if check.level is not None else pebble.CheckLevel.UNSET |
There was a problem hiding this comment.
Thanks, I like the unrolled form. WDYT about localising comments per attribute?
| # The scenario CheckInfo uses None for an absent level, while the | |
| # plan uses CheckLevel.UNSET, so those two also need to compare | |
| # equal. | |
| check_level = check.level if check.level is not None else pebble.CheckLevel.UNSET | |
| # Scenario allows None for unset, Pebble defaults to UNSET. | |
| check_level = check.level if check.level is not None else pebble.CheckLevel.UNSET |
| f'container {container.name!r} has a check {check.name!r} with a ' | ||
| f"different 'level' ({check.level}) than the plan ({plan_check.level}).", | ||
| ) | ||
| check_startup = ( |
There was a problem hiding this comment.
| check_startup = ( | |
| # Scenario defaults to None, Pebble defaults to UNSET and collapses to ENABLED. | |
| check_startup = ( |
| f'container {container.name!r} has a check {check.name!r} with a ' | ||
| f"different 'startup' ({check.startup}) than the plan ({plan_check.startup}).", | ||
| ) | ||
| check_threshold = 3 if check.threshold is None else check.threshold |
There was a problem hiding this comment.
| check_threshold = 3 if check.threshold is None else check.threshold | |
| # Scenario and Pebble allow None, collapsing to 3. | |
| check_threshold = 3 if check.threshold is None else check.threshold |
When Pebble returns information about a check, it copies the level, startup, and threshold into the response from the computed plan. If none of the layers have provided values for these, the plan has defaults (unset, enabled, and 3, respectively).
When we check a
CheckInfofor consistency in Scenario (meaning that the values should be ones Pebble could sensibly return, given the rest of the state), we need to take that default value into account, as Pebble would. We don't actually model checks going up and down currently, so it's really only the consistency that we are concerned with.This PR fixes the consistency check so that it will fall back to the Pebble defaults when needed. It repeats the defaults, because the existing copy is buried in the mock Pebble client in Harness, and the source of truth is in the Pebble code.
Fixes #2566