fix: correct CRN status reporting and refresh robustness#4
Conversation
format_response reported the wrong source for several fields: - config_from_crn checked the CachedResponse object, which always exists, so it was always True. Check config.data instead. - The usage debug fields (debug_usage_from_crn_at, usage_from_crn_error) read from the config response, so a failed system fetch was invisible. Read them from the system response. fetch_node_list_and_node_data used "assert node_list", which raised inside the background task (and is stripped under python -O) when the upstream fetch returned nothing. Return early and keep the cached data instead. Background refresh tasks now log their exceptions via a done callback rather than failing silently. crn_infos was a mutable class attribute shared across every DataCache instance, leaking node state between instances (and between tests). Make it a per-instance attribute. Add a test that drives format_response with a failed config and a successful system fetch, which pins the config/system field separation and the config_from_crn check. Remove dead commented-out debug code.
foxpatch-aleph
left a comment
There was a problem hiding this comment.
This PR fixes three distinct bugs in the CRN data pipeline: (1) crn_infos was a mutable class-level default dict shared across all DataCache instances — moved to instance initialization. (2) config_from_crn checked whether the CachedResponse object existed (always True after init) rather than whether it held data — now checks config.data is not None. (3) Usage debug fields were reading from the config CachedResponse instead of the system CachedResponse — corrected both debug_usage_from_crn_at and usage_from_crn_error. The PR also replaces a brittle assert node_list with an early return that preserves cached data on failure, and adds a done callback to log background task exceptions. A focused regression test pins the config/system field separation. All changes are correct, well-motivated, and cleanly scoped.
format_response reported the wrong source for several fields:
fetch_node_list_and_node_data used "assert node_list", which raised inside the background task (and is stripped under python -O) when the upstream fetch returned nothing. Return early and keep the cached data instead. Background refresh tasks now log their exceptions via a done callback rather than failing silently.
crn_infos was a mutable class attribute shared across every DataCache instance, leaking node state between instances (and between tests). Make it a per-instance attribute.
Add a test that drives format_response with a failed config and a successful system fetch, which pins the config/system field separation and the config_from_crn check. Remove dead commented-out debug code.