Implement device-diff endpoints (closes #61 backend portion)#66
Merged
Anthony Sligar (sligara7) merged 2 commits intoJun 13, 2026
Merged
Conversation
Replaces the two 501 stubs from NSLS2#62 with working implementations: - GET /api/devices/diff_against_profile - POST /api/devices/sync_from_profile The UI flow described in issue NSLS2#61 needs two operations: a non-destructive "what would change?" preview, and an apply step the operator confirms. Both live on the HTTP server today as endpoints the operator's UI hits; no terminal access required, matching the deployment story from the IOS demo (NSLS2/ansible#4476). Implementation: * manager/config_service.py - New DeviceDiff dataclass and compute_diff() — pure function that compares the worker-introspected device payload dict (the same snapshot used by _bootstrap_with_retry / sync_devices_on_env_open) against the registry's instantiation specs. "Modified" detection uses structural equality on the spec dicts and reports the set of top-level keys that changed, matching issue NSLS2#61's "structural equality over the metadata+spec payloads already sent to config_service". - New apply_diff() — concurrent upserts then concurrent deletes (so a rename lands as add-then-remove rather than briefly removing the only copy), with the same return_exceptions / aggregate-and-raise pattern _bootstrap_with_retry uses for the no-silent-fallback policy. Strategies: "all", "additions_only", "selected". - Existing _bootstrap_with_retry and sync_devices_on_env_open are untouched. * manager/manager.py - Two new dispatcher entries (config_service_diff, config_service_sync) and matching handler methods. Both gate on the existing _config_service_settings.enabled and _environment_exists flags, returning success=false with a message for the HTTP layer to translate. The sync handler runs under the same async lock the env-open sync uses (_get_config_service_sync_alock) so a concurrent env-open + manual sync can't race on the registry. * http/routers/profile_collection.py - Live endpoints in place of the 501 stubs, wired through SR.RM.send_request("config_service_diff" | "config_service_sync"). Manager-side success=false is translated to HTTP 409 (with the manager's message as detail) — matches the convention already used by the pull/reload endpoints in this file. Pydantic request and response models added so the OpenAPI export documents the shape. * tests/manager/test_config_service.py - 15 new tests covering compute_diff (empty / all-added / all-removed / modified-with-changed-fields / noop / spec-less skip / sort order) and apply_diff (each strategy, "selected" without devices, invalid strategy, per-device failure aggregation, empty-diff noop, plus a guard that the APPLY_STRATEGIES constant stays in sync with the documented set). * shared-schema/queueserver_service.openapi.json - Regenerated via scripts/export_openapi.py. The path count is unchanged (the stub paths already existed); the new bodies bring request schemas, response schemas, and proper status code maps. Verification: tests/manager/test_config_service.py and tests/http/test_openapi_drift.py both pass locally. Forward compatibility: per NSLS2#61 the eventual RBAC scope for the sync endpoint should be "write:registry"; today it uses "write:manager:control" to match what the reload/pull endpoints already use. Swapping the scope is a one-line follow-up once the registry scope is introduced. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR implements the remaining backend pieces of the “profile collection reload + device registry diff/sync” UI flow by replacing the prior 501 stubs with working device-diff and device-sync endpoints, backed by new diff/apply helpers in the config-service integration.
Changes:
- Add
compute_diff()/apply_diff()helpers (plusDeviceDiffmodel + strategy handling) to diff worker-introspected devices against the config-service registry and apply the result. - Add manager-side ZMQ handlers and HTTP routes for
GET /api/devices/diff_against_profileandPOST /api/devices/sync_from_profile, including request/response models. - Extend unit tests for diff/apply behavior and regenerate the committed OpenAPI schema.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
backend/queueserver_service/queueserver_service/manager/config_service.py |
Adds pure diff computation and async “apply diff” writer with strategy selection and aggregated failure reporting. |
backend/queueserver_service/queueserver_service/manager/manager.py |
Registers new manager command handlers for diff/sync (but currently introduces a critical handler-definition regression; see comments). |
backend/queueserver_service/queueserver_service/http/routers/profile_collection.py |
Replaces device endpoints’ 501 stubs with live implementations and adds Pydantic request/response models. |
backend/queueserver_service/tests/manager/test_config_service.py |
Adds focused unit tests for compute_diff / apply_diff behavior and strategies. |
shared-schema/queueserver_service.openapi.json |
Regenerates OpenAPI to reflect the now-live endpoints and their schemas/status codes. |
Comments suppressed due to low confidence (1)
backend/queueserver_service/queueserver_service/manager/manager.py:3670
- The existing
script_uploadcommand handler appears to have been accidentally inlined into_config_service_sync_handler: there is noasync def _script_upload_handler(...)definition in this file anymore (but_builtin_command_handlersstill references it). This will break thescript_uploadZMQ method at runtime (missing attribute) and also leaves a large unreachable block of unrelated code inside_config_service_sync_handler.
"""
Upload script to RE worker environment. If ``update_lists==True`` (default), then lists
of existing and available plans and devices are updated after the execution of the script.
If ``update_re==False`` (default), the Run Engine (``RE``) and Data Broker (``db``) objects
are not updated in RE worker namespace even if they are defined (or redefined) in the uploaded script.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The previous commit inserted the two new config-service handlers immediately before _script_upload_handler and accidentally consumed the 'async def _script_upload_handler(self, request):' line, leaving the body dangling at class scope. Import didn't fail because the class-level NameError on 'request' was swallowed by the body's own try/except Exception, and ast.parse / hasattr checks happened to pass because the def line absence wasn't enforced at module load. CI for PR NSLS2#66 caught it via 22 ip_kernel_func tests failing with "Handler for the command 'script_upload' is not implemented". Restoring the def line; no logic change to the handler body. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces the two 501 stubs landed in #62 with working implementations of the device-diff endpoints from issue #61:
GET /api/devices/diff_against_profile— non-destructive preview of what would change if the configuration-service registry were synced to match the running RE Worker's introspected device list.POST /api/devices/sync_from_profile— apply that diff per the requested strategy (all,additions_only, orselected).These are the missing pieces of the UI-driven "reload profile collection" flow: once an operator pulls and reloads the profile (#62's
pull+reloadendpoints), they need a UI affordance to inspect and optionally apply device changes to the registry. With this PR the whole flow is reachable from the frontend without any terminal access.What's here
manager/config_service.py— new pure helpercompute_diff()andapply_diff()async function, plus aDeviceDiffdataclass and anAPPLY_STRATEGIESconstant._bootstrap_with_retryandsync_devices_on_env_openfrom #65 are left untouched; the new functions sit beside them.compute_diffwalks the worker's device-data payload (the same snapshot the env-open sync consumes) against the registry's instantiation specs. "Modified" detection uses structural equality on the spec dicts and reports the set of top-level keys that changed.apply_diffruns upserts and deletes in two passes (upserts first, so a rename lands as add-then-remove rather than briefly removing the only copy). Per-bucket gathers usereturn_exceptions=Trueso one slow/failing device doesn't stall the rest; aggregated failures raise aConfigServiceErrorchaining the first underlying exception — same pattern_bootstrap_with_retryuses for the no-silent-fallback policy.manager/manager.py— two new dispatcher entries (config_service_diff,config_service_sync) and matching handler methods. Both check_config_service_settings.enabledand_environment_exists; on either gate they returnsuccess=falsewith a message for the HTTP layer to translate to 409. The sync handler runs under the same async lock the env-open sync uses (_get_config_service_sync_alock) so a concurrent env-open + manual sync can't race on the registry.http/routers/profile_collection.py— live handlers replace the 501 stubs and dispatch viaSR.RM.send_request(method=..., params=...). Pydantic request and response models added so the OpenAPI export documents the shape. Manager-sidesuccess=falseis mapped to HTTP 409 with the manager's message as detail (same convention used bypull/reload).tests/manager/test_config_service.py— 15 new tests:compute_diff: empty / all-added / all-removed / modified-with-changed-fields / noop / spec-less skip / sort order.apply_diff: each strategy (incl.selectedfiltering names that aren't in the diff),selectedwithout devices, invalid strategy, per-device failure aggregation, empty-diff noop, plus a guard thatAPPLY_STRATEGIESstays in sync with the documented set.shared-schema/queueserver_service.openapi.json— regenerated viascripts/export_openapi.py. Path count is unchanged (the stub paths already existed); the new bodies bring request schemas, response schemas, and proper status code maps.Verification
Notes for reviewers
write:registryas the eventual RBAC scope for the sync endpoint; this PR keepswrite:manager:controlto match whatreload/pullalready use. Swapping the scope is a one-line follow-up once the registry scope is introduced — no architectural change needed.before/afterspec dicts for each modified entry so the UI can render a per-field diff without a second round-trip. That's consistent with the issue's "show the operator what's about to change" requirement.GETcan become stale before the operator confirmsPOST. The sync handler recomputes the diff internally before applying — a stale GET cannot produce a sync that mutates devices the registry actually still matches — and it returnsdiff_afterso the UI can confirm convergence.