Feat: Health check adapter #4330
Pull request overview
Introduces an adapter health-check system that periodically evaluates (1) extension-service/backend availability and (2) adapter data-source connectivity, surfaces the aggregated status via new REST endpoints, and adds UI affordances (status light + details dialog + manual trigger) to inspect and trigger checks.
Changes:
- Added backend + extensions REST resources and shared model types for adapter health status reporting and triggering.
- Implemented a central health-check scheduler/manager in extensions-management and added data-source health checks for Kafka, MQTT, and OPC-UA adapters.
- Added Angular UI integration: polling health status, passing it into the status light, and a new details dialog with countdown + manual trigger.
Reviewed changes
Copilot reviewed 23 out of 24 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| ui/src/app/connect/services/adapter-health.service.ts | Angular HTTP wrapper for adapter-health list + trigger endpoints |
| ui/src/app/connect/model/adapter-health-status.model.ts | UI model/types for adapter health status + enum |
| ui/src/app/connect/components/existing-adapters/existing-adapters.component.ts | Adds periodic polling and stores adapter health statuses for the overview table |
| ui/src/app/connect/components/existing-adapters/existing-adapters.component.html | Wires health status into the status light component |
| ui/src/app/connect/components/existing-adapters/adapter-status-light/adapter-status-light.component.ts | Enhances status light to reflect health state and open health details dialog |
| ui/src/app/connect/components/existing-adapters/adapter-status-light/adapter-status-light.component.html | Makes status indicator clickable with tooltip and dynamic class binding |
| ui/src/app/connect/components/existing-adapters/adapter-status-light/adapter-status-light.component.scss | Adds hover/transition styling for clickable status light |
| ui/src/app/connect/components/existing-adapters/adapter-health-details-dialog/adapter-health-details-dialog.component.ts | New dialog to show backend/data-source health, countdown, and trigger action |
| ui/src/app/connect/components/existing-adapters/adapter-health-details-dialog/adapter-health-details-dialog.component.html | Dialog UI rendering of health indicators, details, and backoff info |
| ui/src/app/connect/components/existing-adapters/adapter-health-details-dialog/adapter-health-details-dialog.component.scss | Styling for dialog layout and health status presentation |
| ui/package-lock.json | Dependency lockfile updates (eslint packages and transitive bumps) |
| streampipes-rest/src/main/java/org/apache/streampipes/rest/impl/AdapterHealthResource.java | New backend aggregator endpoint (/api/v2) to query/trigger extension adapter-health endpoints |
| streampipes-rest-extensions/src/main/java/org/apache/streampipes/rest/extensions/monitoring/AdapterHealthResource.java | New extensions endpoint (/api/v1) exposing manager health statuses + trigger |
| streampipes-model/src/main/java/org/apache/streampipes/model/connect/adapter/HealthCheckStatus.java | New shared enum for health check statuses |
| streampipes-model/src/main/java/org/apache/streampipes/model/connect/adapter/AdapterHealthStatus.java | New shared model representing adapter health and scheduling metadata |
| streampipes-extensions/streampipes-connectors-opcua/src/main/java/org/apache/streampipes/extensions/connectors/opcua/adapter/OpcUaAdapter.java | Adds OPC-UA data-source health check with reconnect/self-heal attempt |
| streampipes-extensions/streampipes-connectors-mqtt/src/main/java/org/apache/streampipes/extensions/connectors/mqtt/shared/MqttHealthChecker.java | New MQTT broker/topic connectivity checker used by health checks |
| streampipes-extensions/streampipes-connectors-mqtt/src/main/java/org/apache/streampipes/extensions/connectors/mqtt/shared/MqttBase.java | Refactors MQTT client creation to allow reuse for health-checker vs runtime client |
| streampipes-extensions/streampipes-connectors-mqtt/src/main/java/org/apache/streampipes/extensions/connectors/mqtt/adapter/MqttProtocol.java | Implements IDataSourceHealthCheck for MQTT adapters |
| streampipes-extensions/streampipes-connectors-kafka/src/main/java/org/apache/streampipes/extensions/connectors/kafka/adapter/KafkaProtocol.java | Implements IDataSourceHealthCheck for Kafka adapters (broker + topic existence) |
| streampipes-extensions-management/src/main/java/org/apache/streampipes/extensions/management/monitoring/AdapterHealthCheckManager.java | New scheduler/manager maintaining adapter health statuses and exponential backoff |
| streampipes-extensions-management/src/main/java/org/apache/streampipes/extensions/management/init/RunningAdapterInstances.java | Registers/unregisters adapters with the health-check manager |
| streampipes-extensions-api/src/main/java/org/apache/streampipes/extensions/api/connect/IDataSourceHealthCheck.java | New SPI interface for adapters to expose data-source health checks |
| streampipes-extensions-api/src/main/java/org/apache/streampipes/extensions/api/connect/DataSourceHealthCheckResult.java | New result record for health checks (status/message/details/exception) |
Files not reviewed (1)
- ui/package-lock.json: Language not supported
```java
if (backendHealth != HealthCheckStatus.HEALTHY) {
  overallStatus = HealthCheckStatus.UNHEALTHY;
} else if (!dataSourceHealthSupported) {
  overallStatus = backendHealth;
} else {
  overallStatus = dataSourceHealth;
```
updateOverallStatus treats any non-HEALTHY backend status (including UNKNOWN) as UNHEALTHY, which makes the overall status misleading when the backend state is simply unknown. Consider explicitly propagating UNKNOWN (e.g., if backendHealth is UNKNOWN then overallStatus should be UNKNOWN) and only mark UNHEALTHY when backendHealth is UNHEALTHY.
```diff
- if (backendHealth != HealthCheckStatus.HEALTHY) {
-   overallStatus = HealthCheckStatus.UNHEALTHY;
- } else if (!dataSourceHealthSupported) {
-   overallStatus = backendHealth;
- } else {
-   overallStatus = dataSourceHealth;
+ if (backendHealth == HealthCheckStatus.UNKNOWN) {
+   // Backend health is unknown, so the overall status should also be unknown
+   overallStatus = HealthCheckStatus.UNKNOWN;
+ } else if (backendHealth == HealthCheckStatus.UNHEALTHY) {
+   // A clearly unhealthy backend makes the overall status unhealthy
+   overallStatus = HealthCheckStatus.UNHEALTHY;
+ } else {
+   // Backend is healthy here
+   if (!dataSourceHealthSupported) {
+     // No data source health information; overall follows backend health (HEALTHY)
+     overallStatus = backendHealth;
+   } else if (dataSourceHealth == HealthCheckStatus.UNKNOWN) {
+     // Data source health is unknown, so reflect that in the overall status
+     overallStatus = HealthCheckStatus.UNKNOWN;
+   } else {
+     // Use the concrete data source health status (HEALTHY or UNHEALTHY)
+     overallStatus = dataSourceHealth;
+   }
```
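The suggested branches can also be read as a small pure function, which makes the combinations easy to test in isolation. The following is only a sketch: the nested `HealthCheckStatus` enum is a local stand-in that assumes the real enum has just the three values referenced in this thread, and `combine` is a hypothetical helper, not a method of `AdapterHealthStatus`.

```java
final class OverallStatusSketch {

  // Local stand-in; assumes only the three values mentioned in the review.
  enum HealthCheckStatus { HEALTHY, UNHEALTHY, UNKNOWN }

  // Hypothetical helper that derives the overall status from its inputs.
  static HealthCheckStatus combine(HealthCheckStatus backendHealth,
                                   boolean dataSourceHealthSupported,
                                   HealthCheckStatus dataSourceHealth) {
    if (backendHealth != HealthCheckStatus.HEALTHY) {
      // Propagate the backend state as-is: UNKNOWN stays UNKNOWN,
      // UNHEALTHY stays UNHEALTHY.
      return backendHealth;
    }
    if (!dataSourceHealthSupported) {
      // No data-source check available; overall follows the (healthy) backend.
      return HealthCheckStatus.HEALTHY;
    }
    // Backend is healthy, so the data-source state decides (may be UNKNOWN).
    return dataSourceHealth;
  }
}
```

With this shape, an UNKNOWN backend yields UNKNOWN rather than UNHEALTHY, which is the behavior the comment asks for.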
```ts
    return adapter[column];
};

this.healthPoll$ = interval(5000).subscribe(() => {
```
The UI polls /adapter-health every 5s (interval(5000)), even though successful checks run every ~60s and failures back off. This creates unnecessary backend traffic and can become expensive with many users/adapters. Consider polling less frequently (e.g., 30–60s), polling only when the health dialog is open, or switching to a push/refresh-on-demand approach.
```diff
- this.healthPoll$ = interval(5000).subscribe(() => {
+ this.healthPoll$ = interval(60000).subscribe(() => {
```
```java
@RestController
@RequestMapping("/api/v2/adapter-health")
public class AdapterHealthResource extends AbstractAuthGuardedRestResource {
```
PR description lists the public REST endpoint as /api/v1/adapter-health, but the UI calls /api/v2/adapter-health (and this controller is mapped to /api/v2/adapter-health). If v1 is intended only for the extensions service and v2 for the backend aggregator, consider updating the documentation to avoid confusion for API consumers.
```java
if (!connectLatch.await(TIMEOUT_SECONDS, TimeUnit.SECONDS)) {
  return DataSourceHealthCheckResult.unhealthyWithException(
      "MQTT connection timed out",
      new java.util.concurrent.TimeoutException("Timed out waiting for broker response after " + TIMEOUT_SECONDS + "s")
  );
```
On connect timeout the method returns without disconnecting/closing the client. Consider disconnecting/closing the client before returning (or using a finally block) to avoid leaving background resources running after a timed-out connection attempt.
```java
if (connectError.get() != null) {
  return DataSourceHealthCheckResult.unhealthyWithException("MQTT connection failed", connectError.get());
}
```
On connect failure (connectError != null) the method returns without disconnecting/closing the client. Consider cleaning up the client before returning (or using a finally block) to avoid leaking resources after failed connection attempts.
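Both cleanup comments point at the same pattern: release the client on every exit path. A minimal sketch using try-with-resources is shown here; `ProbeClient` and `ProbeResult` are hypothetical stand-ins (not the MQTT library's client or `DataSourceHealthCheckResult`), and the 5-second timeout is an assumed value.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class MqttProbeSketch {

  // Hypothetical stand-in for the client created by the health checker.
  interface ProbeClient extends AutoCloseable {
    // Returns false if the broker did not answer within the timeout.
    boolean connect(long timeout, TimeUnit unit) throws Exception;

    @Override
    void close();
  }

  // Hypothetical result type, simplified for the example.
  record ProbeResult(boolean healthy, String message, Throwable cause) {}

  static final long TIMEOUT_SECONDS = 5; // assumed value

  static ProbeResult check(ProbeClient client) {
    // try-with-resources closes the client on success, timeout, and failure.
    try (client) {
      if (!client.connect(TIMEOUT_SECONDS, TimeUnit.SECONDS)) {
        return new ProbeResult(false, "MQTT connection timed out",
            new TimeoutException("Timed out waiting for broker response after "
                + TIMEOUT_SECONDS + "s"));
      }
      return new ProbeResult(true, "MQTT broker reachable", null);
    } catch (Exception e) {
      // close() has already run by the time this handler executes.
      return new ProbeResult(false, "MQTT connection failed", e);
    }
  }
}
```

An explicit `finally` block that disconnects and closes the client would work equally well if the real client type does not implement `AutoCloseable`.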
```java
      Thread.currentThread().interrupt();
    }
  }

  return List.of();
```
When an exception occurs while calling the extensions service, the method falls through to return List.of(). Returning an empty list discards the failure signal, and upstream code treats it as “no statuses” (and then reports adapters as backend HEALTHY via fallback). Consider propagating a distinct error result (or throwing) so the caller can mark backend health as UNHEALTHY/UNKNOWN for that endpoint.
```java
status.setBackendHealth(HealthCheckStatus.HEALTHY);
status.setBackendHealthMessage("Extension service is running");
```
createFallbackStatus hardcodes backendHealth=HEALTHY / "Extension service is running". This becomes inaccurate when health status fetching fails (e.g., extensions service down or request error). Consider distinguishing “no data yet” from “extensions service unreachable” and setting backend health to UNHEALTHY/UNKNOWN with an appropriate message in the latter case.
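One way to keep the failure signal is to return an outcome object instead of a bare list, so the fallback can tell "nothing reported" from "request failed". The sketch below uses hypothetical names throughout (`FetchOutcome`, `StatusFetcher`, `fallbackBackendHealth`), and choosing UNHEALTHY rather than UNKNOWN for the failure case is an assumption.

```java
import java.util.List;

final class FetchOutcomeSketch {

  enum Health { HEALTHY, UNHEALTHY, UNKNOWN }

  // Either the fetched entries or the error that prevented fetching them.
  record FetchOutcome(List<String> adapterIds, Exception error) {
    static FetchOutcome ok(List<String> ids) {
      return new FetchOutcome(List.copyOf(ids), null);
    }

    static FetchOutcome failure(Exception e) {
      return new FetchOutcome(List.of(), e);
    }

    boolean isFailure() {
      return error != null;
    }
  }

  // Hypothetical abstraction over the call to the extensions service.
  interface StatusFetcher {
    List<String> fetch() throws Exception;
  }

  static FetchOutcome fetch(StatusFetcher fetcher) {
    try {
      return FetchOutcome.ok(fetcher.fetch());
    } catch (Exception e) {
      // Keep the error instead of collapsing it into an empty list.
      return FetchOutcome.failure(e);
    }
  }

  // Backend health to use for an adapter the service did not report on.
  static Health fallbackBackendHealth(FetchOutcome outcome) {
    return outcome.isFailure() ? Health.UNHEALTHY : Health.HEALTHY;
  }
}
```

The caller can also copy `error().getMessage()` into the backend health message so the dialog shows why the extensions service could not be reached.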
```java
@org.springframework.web.bind.annotation.PostMapping(value = "/{adapterId}/trigger", produces = MediaType.APPLICATION_JSON_VALUE)
@PreAuthorize("this.hasReadAuthority()")
public ResponseEntity<Void> triggerAdapterHealthCheck(@org.springframework.web.bind.annotation.PathVariable String adapterId) {
  try {
    var adapter = adapterStorage.getElementById(adapterId);
```
triggerAdapterHealthCheck is only guarded by hasReadAuthority() but does not enforce per-adapter permissions (unlike other adapter endpoints). A user with general READ privilege could trigger checks for adapters they are not allowed to read; add an adapter-level permission check (and return 401/403 when unauthorized) before forwarding the trigger request.
```java
status.setAdapterName(description.getName());
status.setBackendHealth(HealthCheckStatus.HEALTHY);
status.setDataSourceHealthSupported(adapter instanceof IDataSourceHealthCheck);
status.setDataSourceHealth(status.isDataSourceHealthSupported() ? HealthCheckStatus.UNKNOWN : HealthCheckStatus.UNKNOWN);
```
This ternary is redundant: both branches set HealthCheckStatus.UNKNOWN. This can be simplified to a single assignment (and if desired, initialize an explanatory message when data source checks are unsupported).
```diff
- status.setDataSourceHealth(status.isDataSourceHealthSupported() ? HealthCheckStatus.UNKNOWN : HealthCheckStatus.UNKNOWN);
+ status.setDataSourceHealth(HealthCheckStatus.UNKNOWN);
```
```html
class="status-light-container"
(click)="openHealthDetails($event)"
```
The status light is a clickable <div> without keyboard interaction. For accessibility, use a <button> (preferred) or add role="button", tabindex="0", and key handlers for Enter/Space so the dialog can be opened via keyboard and announced correctly to assistive tech.
```diff
- class="status-light-container"
- (click)="openHealthDetails($event)"
+ class="status-light-container"
+ role="button"
+ tabindex="0"
+ (click)="openHealthDetails($event)"
+ (keyup.enter)="openHealthDetails($event)"
+ (keyup.space)="openHealthDetails($event)"
```
Fix formatting issue by adding a newline at the end of the file.
Adapter Health Check System
How to use
Overview
The Adapter Health Check system monitors the health of running adapters by periodically checking both the backend service status and the data source connectivity.
Supported Adapters
Health Check Intervals
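As a rough illustration of interval scheduling with exponential backoff, the sketch below doubles the delay after each consecutive failure. Only the roughly 60-second interval for successful checks comes from this PR's review thread; the doubling factor, the 15-minute cap, and the `nextDelay` helper are assumptions and may not match `AdapterHealthCheckManager`.

```java
import java.time.Duration;

final class BackoffSketch {

  // Interval after a successful check (about 60s per the review thread).
  static final Duration BASE = Duration.ofSeconds(60);

  // Assumed upper bound so failing adapters are still re-checked regularly.
  static final Duration MAX = Duration.ofMinutes(15);

  // Hypothetical helper: delay until the next check after N consecutive failures.
  static Duration nextDelay(int consecutiveFailures) {
    if (consecutiveFailures <= 0) {
      return BASE;
    }
    // Clamp the exponent so the shift cannot overflow.
    int exponent = Math.min(consecutiveFailures, 10);
    Duration delay = BASE.multipliedBy(1L << exponent);
    return delay.compareTo(MAX) > 0 ? MAX : delay;
  }
}
```

Under these assumptions the delays run 60s, 120s, 240s, 480s and then stay at the 15-minute cap; a manual trigger would bypass the delay entirely.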
Architecture
Core Files
- streampipes-extensions-api/.../IDataSourceHealthCheck.java
- streampipes-extensions-api/.../DataSourceHealthCheckResult.java
- streampipes-extensions-management/.../AdapterHealthCheckManager.java
- streampipes-rest-extensions/.../AdapterHealthResource.java
- streampipes-model/.../AdapterHealthStatus.java
- streampipes-model/.../HealthCheckStatus.java

UI Files
- ui/.../adapter-health.service.ts
- ui/.../adapter-status-light/
- ui/.../adapter-health/details-dialog/
- ui/.../adapter-health/status-section/
- ui/.../adapter-health/error-output/

Adding Support for New Adapters
Implement the IDataSourceHealthCheck interface in your adapter class:
AdapterHealthCheckManager automatically detects adapters implementing IDataSourceHealthCheck when they are registered.

REST API
- /api/v1/adapter-health
- /api/v1/adapter-health/{adapterId}
- /api/v1/adapter-health/{adapterId}/trigger

Self-Healing (OPC-UA)
I ran into the problem that once the adapter fails, it stays failed (possibly only with my OPC-UA simulator): even when the OPC-UA server comes back online, the adapter does not resubscribe automatically. To fix that, the health check now imitates a restart of the adapter.
The OPC-UA adapter includes auto-reconnection logic:
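A minimal sketch of what "imitating a restart" from a health check could look like, assuming a hypothetical `RestartableSource` view of the adapter; this is not the actual `OpcUaAdapter` code, and the real check may report more detail than the two states used here.

```java
final class SelfHealSketch {

  // Hypothetical view of an adapter that can be stopped and started again.
  interface RestartableSource {
    boolean isConnected();

    void stop() throws Exception;

    void start() throws Exception;
  }

  enum Health { HEALTHY, UNHEALTHY }

  // Runs one health check and attempts a restart if the connection is gone.
  static Health checkAndHeal(RestartableSource source) {
    if (source.isConnected()) {
      return Health.HEALTHY;
    }
    try {
      source.stop();  // drop the stale session and its subscriptions
      source.start(); // reconnect and subscribe again
    } catch (Exception e) {
      // Server is still unreachable; the next scheduled check retries.
      return Health.UNHEALTHY;
    }
    return source.isConnected() ? Health.HEALTHY : Health.UNHEALTHY;
  }
}
```

Because the attempt runs inside the regular check cycle, retries against a server that stays offline are spaced out by the same backoff as any other failing check.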