Implement device-diff endpoints (closes #61 backend portion) #66

Merged
Anthony Sligar (sligara7) merged 2 commits into NSLS2:main from sligara7:feat/profile-collection-device-diff on Jun 13, 2026
@sligara7 (Collaborator)

Summary

Replaces the two 501 stubs landed in #62 with working implementations of the device-diff endpoints from issue #61:

  • GET /api/devices/diff_against_profile — non-destructive preview of what would change if the configuration-service registry were synced to match the running RE Worker's introspected device list.
  • POST /api/devices/sync_from_profile — apply that diff per the requested strategy (all, additions_only, or selected).

These are the missing pieces of the UI-driven "reload profile collection" flow: once an operator pulls and reloads the profile (#62's pull + reload endpoints), they need a UI affordance to inspect and optionally apply device changes to the registry. With this PR the whole flow is reachable from the frontend without any terminal access.

What's here

manager/config_service.py: a new pure helper, compute_diff(), and a new async function, apply_diff(), plus a DeviceDiff dataclass and an APPLY_STRATEGIES constant. _bootstrap_with_retry and sync_devices_on_env_open from #65 are left untouched; the new functions sit beside them.

  • compute_diff walks the worker's device-data payload (the same snapshot the env-open sync consumes) against the registry's instantiation specs. "Modified" detection uses structural equality on the spec dicts and reports the set of top-level keys that changed.
  • apply_diff runs upserts and deletes in two passes (upserts first, so a rename lands as add-then-remove rather than briefly removing the only copy). Per-bucket gathers use return_exceptions=True so one slow/failing device doesn't stall the rest; aggregated failures raise a ConfigServiceError chaining the first underlying exception — same pattern _bootstrap_with_retry uses for the no-silent-fallback policy.
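
The two-pass behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in manager/config_service.py: the DeviceDiff field layout, the client's upsert/delete methods, the FakeClient class, the ValueError cases, and the return value are all assumptions made for the example. The PR also does not say whether additions_only skips modified entries or whether deletes still run after a failed upsert; the sketch assumes it skips them and runs both passes before raising.

```python
import asyncio
from dataclasses import dataclass, field

APPLY_STRATEGIES = ("all", "additions_only", "selected")


class ConfigServiceError(RuntimeError):
    """Stand-in for the error type named in the PR."""


@dataclass
class DeviceDiff:
    # Hypothetical field layout; the PR does not spell it out.
    added: dict = field(default_factory=dict)     # name -> spec to create
    removed: list = field(default_factory=list)   # names to delete
    modified: dict = field(default_factory=dict)  # name -> spec to overwrite


async def _gather_failures(coros):
    """Run one bucket concurrently; return the exceptions it produced."""
    results = await asyncio.gather(*coros, return_exceptions=True)
    return [r for r in results if isinstance(r, Exception)]


async def apply_diff(client, diff, strategy="all", devices=None):
    if strategy not in APPLY_STRATEGIES:
        raise ValueError(f"unknown strategy {strategy!r}")
    if strategy == "selected" and not devices:
        raise ValueError("strategy 'selected' requires a non-empty device list")

    def wanted(name):
        # Requested names that are not part of the diff are never matched.
        return strategy != "selected" or name in devices

    upserts = {n: s for n, s in diff.added.items() if wanted(n)}
    deletes = []
    if strategy != "additions_only":  # assumption: additions_only skips these
        upserts.update({n: s for n, s in diff.modified.items() if wanted(n)})
        deletes = [n for n in diff.removed if wanted(n)]

    # Pass 1 writes, pass 2 deletes, so a rename lands as add-then-remove.
    failures = await _gather_failures([client.upsert(n, s) for n, s in upserts.items()])
    failures += await _gather_failures([client.delete(n) for n in deletes])
    if failures:
        msg = f"{len(failures)} device operation(s) failed"
        raise ConfigServiceError(msg) from failures[0]  # chain the first cause
    return {"upserted": sorted(upserts), "deleted": sorted(deletes)}


class FakeClient:
    """In-memory registry client used only to exercise the sketch."""

    def __init__(self, fail_on=()):
        self.calls, self.fail_on = [], set(fail_on)

    async def upsert(self, name, spec):
        if name in self.fail_on:
            raise RuntimeError(f"upsert failed for {name}")
        self.calls.append(("upsert", name))

    async def delete(self, name):
        self.calls.append(("delete", name))
```

Calling `asyncio.run(apply_diff(FakeClient(), diff, "all"))` with a populated diff exercises both passes; constructing `FakeClient(fail_on={"some_device"})` shows the remaining operations still running before the aggregated error is raised.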

manager/manager.py: two new dispatcher entries (config_service_diff, config_service_sync) and matching handler methods. Both check _config_service_settings.enabled and _environment_exists; if either check fails, they return success=false with a message that the HTTP layer translates to 409. The sync handler runs under the same async lock the env-open sync uses (_get_config_service_sync_alock), so a concurrent env-open and manual sync can't race on the registry.
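
A rough sketch of the gate-then-lock shape described above, under stated assumptions: ManagerSketch, its attribute names, the "msg" response key, and the message strings are hypothetical stand-ins, and the body under the lock is a placeholder for the real compute-and-apply step.

```python
import asyncio


class ManagerSketch:
    """Hypothetical stand-in for the manager; only gating and locking are shown."""

    def __init__(self, enabled, environment_exists):
        self._enabled = enabled
        self._environment_exists = environment_exists
        self._sync_alock = None
        self.active = 0       # syncs currently inside the locked section
        self.max_active = 0   # highest concurrency ever observed

    def _get_sync_alock(self):
        # Created lazily, mirroring the getter-style name used in the PR.
        if self._sync_alock is None:
            self._sync_alock = asyncio.Lock()
        return self._sync_alock

    def _gate(self):
        """Return a failure response if a precondition is unmet, else None."""
        if not self._enabled:
            return {"success": False, "msg": "config service integration is disabled"}
        if not self._environment_exists:
            return {"success": False, "msg": "RE Worker environment does not exist"}
        return None

    async def config_service_sync_handler(self, request):
        rejected = self._gate()
        if rejected is not None:
            return rejected
        async with self._get_sync_alock():
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            await asyncio.sleep(0)  # placeholder for compute-and-apply
            self.active -= 1
        return {"success": True, "msg": ""}
```

Because both the env-open sync and the manual sync would take the same lock, two overlapping calls serialize rather than interleave their registry writes.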

http/routers/profile_collection.py — live handlers replace the 501 stubs and dispatch via SR.RM.send_request(method=..., params=...). Pydantic request and response models added so the OpenAPI export documents the shape. Manager-side success=false is mapped to HTTP 409 with the manager's message as detail (same convention used by pull / reload).
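
The success=false to 409 convention can be illustrated without a web framework. In this sketch HTTPError is a hypothetical stand-in for the framework's HTTP exception, and the "msg" key and the choice to strip the envelope fields from the payload are assumptions rather than details confirmed by the PR.

```python
class HTTPError(Exception):
    """Hypothetical stand-in for the web framework's HTTP exception type."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def translate_manager_response(response):
    """Map a manager reply to an HTTP payload, or raise 409 on success=false."""
    if not response.get("success", False):
        # The manager's own message becomes the HTTP error detail.
        raise HTTPError(status_code=409, detail=response.get("msg", ""))
    # Drop the envelope fields; everything else is the endpoint's payload.
    return {k: v for k, v in response.items() if k not in ("success", "msg")}
```

A route handler would call this on whatever the manager request returns, so the gating decisions stay on the manager side and the HTTP layer only translates them.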

tests/manager/test_config_service.py — 15 new tests:

  • compute_diff: empty / all-added / all-removed / modified-with-changed-fields / noop / spec-less skip / sort order.
  • apply_diff: each strategy (incl. selected filtering names that aren't in the diff), selected without devices, invalid strategy, per-device failure aggregation, empty-diff noop, plus a guard that APPLY_STRATEGIES stays in sync with the documented set.
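
To make the compute_diff cases above concrete, here is a minimal stand-in for the diff function. It assumes both inputs are name-to-spec mappings, that a worker spec of None marks a spec-less device that is ignored entirely, and that modified entries carry name/before/after/changed_fields keys; the real payload shape and field names may differ.

```python
from dataclasses import dataclass, field

_MISSING = object()  # distinguishes "key absent" from "key present with None"


@dataclass
class DeviceDiff:
    # Hypothetical field layout for illustration.
    added: list = field(default_factory=list)
    removed: list = field(default_factory=list)
    modified: list = field(default_factory=list)


def compute_diff(worker_specs, registry_specs):
    """Compare name -> spec mappings; output is sorted by device name."""
    diff = DeviceDiff()
    for name in sorted(set(worker_specs) | set(registry_specs)):
        in_worker, in_registry = name in worker_specs, name in registry_specs
        if in_worker and worker_specs[name] is None:
            continue  # spec-less device: nothing to compare or write
        if not in_registry:
            diff.added.append(name)
        elif not in_worker:
            diff.removed.append(name)
        else:
            before, after = registry_specs[name], worker_specs[name]
            if before != after:  # structural equality on the spec dicts
                changed = sorted(
                    k for k in set(before) | set(after)
                    if before.get(k, _MISSING) != after.get(k, _MISSING)
                )
                diff.modified.append(
                    {"name": name, "before": before, "after": after, "changed_fields": changed}
                )
    return diff


worker = {"det1": {"cls": "Det", "prefix": "XF:2"}, "motor1": {"cls": "Motor"}, "tmp": None}
registry = {"det1": {"cls": "Det", "prefix": "XF:1"}, "old1": {"cls": "Motor"}}
diff = compute_diff(worker, registry)
```

Here `diff` reports motor1 as added, old1 as removed, and det1 as modified with prefix as the only changed top-level key, while the spec-less tmp entry is skipped.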

shared-schema/queueserver_service.openapi.json — regenerated via scripts/export_openapi.py. Path count is unchanged (the stub paths already existed); the new bodies bring request schemas, response schemas, and proper status code maps.

Verification

$ pytest tests/manager/test_config_service.py -q
88 passed in 0.25s

$ pytest tests/http/test_openapi_drift.py -q
1 passed

Notes for reviewers

  • Issue #61 (Profile-collection reload + device-registry diff as a UI flow, backend endpoints) mentions write:registry as the eventual RBAC scope for the sync endpoint; this PR keeps write:manager:control to match what reload / pull already use. Swapping the scope is a one-line follow-up once the registry scope is introduced; no architectural change is needed.
  • The diff response carries the full before/after spec dicts for each modified entry so the UI can render a per-field diff without a second round-trip. That's consistent with the issue's "show the operator what's about to change" requirement.
  • The two endpoints share the manager-side snapshot but make independent ZMQ round trips, so the diff returned by GET can go stale in the short window before the operator confirms the POST. The sync handler therefore recomputes the diff internally before applying it, so a stale GET cannot cause the sync to modify devices whose registry entries already match the worker, and it returns diff_after so the UI can confirm convergence.
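
The recompute-before-apply point in the last note can be sketched with plain dictionaries standing in for the worker snapshot and the registry. The helper names and the "applied" response key are hypothetical; only diff_after is named in the PR.

```python
def name_diff(worker, registry):
    """Names that differ between the worker snapshot and the registry."""
    return {
        "added": sorted(worker.keys() - registry.keys()),
        "removed": sorted(registry.keys() - worker.keys()),
        "modified": sorted(
            n for n in worker.keys() & registry.keys() if worker[n] != registry[n]
        ),
    }


def sync(worker, registry):
    """Apply a freshly computed diff; an earlier preview is never trusted."""
    fresh = name_diff(worker, registry)
    for name in fresh["added"] + fresh["modified"]:
        registry[name] = dict(worker[name])
    for name in fresh["removed"]:
        del registry[name]
    # diff_after should be empty when the registry has converged.
    return {"applied": fresh, "diff_after": name_diff(worker, registry)}


worker = {"det1": {"prefix": "XF:2"}, "motor1": {"cls": "Motor"}}
registry = {"det1": {"prefix": "XF:1"}}
preview = name_diff(worker, registry)      # what the GET would have shown
registry["motor1"] = {"cls": "Motor"}      # registry changes before the POST
result = sync(worker, registry)
```

In this run the preview still lists motor1 as an addition, but the sync only touches det1, because the recomputed diff sees that motor1 already matches.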

Replaces the two 501 stubs from NSLS2#62 with working implementations:

- GET  /api/devices/diff_against_profile
- POST /api/devices/sync_from_profile

The UI flow described in issue NSLS2#61 needs two operations: a non-destructive
"what would change?" preview, and an apply step the operator confirms.
Both live on the HTTP server today as endpoints the operator's UI hits;
no terminal access required, matching the deployment story from the IOS
demo (NSLS2/ansible#4476).

Implementation:

* manager/config_service.py
  - New DeviceDiff dataclass and compute_diff() — pure function that
    compares the worker-introspected device payload dict (the same
    snapshot used by _bootstrap_with_retry / sync_devices_on_env_open)
    against the registry's instantiation specs. "Modified" detection
    uses structural equality on the spec dicts and reports the set of
    top-level keys that changed, matching issue NSLS2#61's "structural
    equality over the metadata+spec payloads already sent to
    config_service".
  - New apply_diff() — concurrent upserts then concurrent deletes
    (so a rename lands as add-then-remove rather than briefly
    removing the only copy), with the same return_exceptions /
    aggregate-and-raise pattern _bootstrap_with_retry uses for the
    no-silent-fallback policy. Strategies: "all", "additions_only",
    "selected".
  - Existing _bootstrap_with_retry and sync_devices_on_env_open are
    untouched.

* manager/manager.py
  - Two new dispatcher entries (config_service_diff, config_service_sync)
    and matching handler methods. Both gate on the existing
    _config_service_settings.enabled and _environment_exists flags,
    returning success=false with a message for the HTTP layer to
    translate. The sync handler runs under the same async lock the
    env-open sync uses (_get_config_service_sync_alock) so a concurrent
    env-open + manual sync can't race on the registry.

* http/routers/profile_collection.py
  - Live endpoints in place of the 501 stubs, wired through
    SR.RM.send_request("config_service_diff" | "config_service_sync").
    Manager-side success=false is translated to HTTP 409 (with the
    manager's message as detail) — matches the convention already used
    by the pull/reload endpoints in this file. Pydantic request and
    response models added so the OpenAPI export documents the shape.

* tests/manager/test_config_service.py
  - 15 new tests covering compute_diff (empty / all-added / all-removed /
    modified-with-changed-fields / noop / spec-less skip / sort order)
    and apply_diff (each strategy, "selected" without devices, invalid
    strategy, per-device failure aggregation, empty-diff noop, plus a
    guard that the APPLY_STRATEGIES constant stays in sync with the
    documented set).

* shared-schema/queueserver_service.openapi.json
  - Regenerated via scripts/export_openapi.py. The path count is
    unchanged (the stub paths already existed); the new bodies bring
    request schemas, response schemas, and proper status code maps.

Verification: tests/manager/test_config_service.py and
tests/http/test_openapi_drift.py both pass locally.

Forward compatibility: per NSLS2#61 the eventual RBAC scope for the sync
endpoint should be "write:registry"; today it uses
"write:manager:control" to match what the reload/pull endpoints already
use. Swapping the scope is a one-line follow-up once the registry
scope is introduced.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

This PR implements the remaining backend pieces of the “profile collection reload + device registry diff/sync” UI flow by replacing the prior 501 stubs with working device-diff and device-sync endpoints, backed by new diff/apply helpers in the config-service integration.

Changes:

  • Add compute_diff()/apply_diff() helpers (plus DeviceDiff model + strategy handling) to diff worker-introspected devices against the config-service registry and apply the result.
  • Add manager-side ZMQ handlers and HTTP routes for GET /api/devices/diff_against_profile and POST /api/devices/sync_from_profile, including request/response models.
  • Extend unit tests for diff/apply behavior and regenerate the committed OpenAPI schema.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| backend/queueserver_service/queueserver_service/manager/config_service.py | Adds pure diff computation and async “apply diff” writer with strategy selection and aggregated failure reporting. |
| backend/queueserver_service/queueserver_service/manager/manager.py | Registers new manager command handlers for diff/sync (but currently introduces a critical handler-definition regression; see comments). |
| backend/queueserver_service/queueserver_service/http/routers/profile_collection.py | Replaces device endpoints’ 501 stubs with live implementations and adds Pydantic request/response models. |
| backend/queueserver_service/tests/manager/test_config_service.py | Adds focused unit tests for compute_diff / apply_diff behavior and strategies. |
| shared-schema/queueserver_service.openapi.json | Regenerates OpenAPI to reflect the now-live endpoints and their schemas/status codes. |

Comments suppressed due to low confidence (1)

backend/queueserver_service/queueserver_service/manager/manager.py:3670

  • The existing script_upload command handler appears to have been accidentally inlined into _config_service_sync_handler: there is no async def _script_upload_handler(...) definition in this file anymore (but _builtin_command_handlers still references it). This will break the script_upload ZMQ method at runtime (missing attribute) and also leaves a large unreachable block of unrelated code inside _config_service_sync_handler.
        """
        Upload script to RE worker environment. If ``update_lists==True`` (default), then lists
        of existing and available plans and devices are updated after the execution of the script.
        If ``update_re==False`` (default), the Run Engine (``RE``) and Data Broker (``db``) objects
        are not updated in RE worker namespace even if they are defined (or redefined) in the uploaded script.

The previous commit inserted the two new config-service handlers
immediately before _script_upload_handler and accidentally consumed
the 'async def _script_upload_handler(self, request):' line, leaving
the body dangling at class scope. Import didn't fail because the
class-level NameError on 'request' was swallowed by the body's own
try/except Exception, and ast.parse / hasattr checks happened to pass
because the def line absence wasn't enforced at module load. CI for
PR NSLS2#66 caught it via 22 ip_kernel_func tests failing with
"Handler for the command 'script_upload' is not implemented".

Restoring the def line; no logic change to the handler body.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@sligara7 Anthony Sligar (sligara7) merged commit 48a9876 into NSLS2:main Jun 13, 2026
14 of 15 checks passed
@sligara7 Anthony Sligar (sligara7) deleted the feat/profile-collection-device-diff branch June 13, 2026 00:49