Problem
When debugging ProviderStatus workflows from the Temporal UI we lose two important pieces of information:
- Provider identity in activity I/O is opaque: processProviderStatus input is just the hex pubkey (66 chars), and on the unreachable/unhealthy path the result is the minimal upsert shape { statusResult: { id, status } }. No human-readable name/url anywhere, so figuring out which provider failed requires manually cross-referencing against listProviders output.
- Failure reason is collapsed to a parse error: apps/middleman-workflows/src/lib/provider/index.ts:30-63 calls await status.json() without checking response.ok or Content-Type. Any non-JSON body (4xx HTML page, plain-text 502, etc.) throws a SyntaxError that gets logged generically, then the activity returns Unreachable regardless of the real failure mode. For the unhealthy branch the response body is destructured into statusProps and silently dropped.
Evidence (mainnet, 2026-05-26)
| Provider | Status persisted | Logged error | Actual HTTP | Real body |
|---|---|---|---|---|
| Nodefleet | unreachable | SyntaxError: Unexpected non-whitespace character after JSON at position 4 | 404 | 404 page not found |
| vido.info | unreachable | SyntaxError: Unexpected token '<', "<!DOCTYPE "... | 401 | HTML 401 page |
| Poktpool | unreachable | SyntaxError: Unexpected token 'B', "Bad Gateway"... | 502 | Bad Gateway |
| WeaversNodes | unreachable | TypeError: fetch failed | timeout (>10s) | – |
| Qspider | unhealthy | (nothing logged) | 500 | {"error":"Invalid request"} |
All five problems look the same in the Temporal UI: statusResult: { id, status }. Operators can't tell "the provider is down" from "we are misconfigured against this provider" without crawling app logs.
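For reference, the SyntaxErrors in the table are simply what JSON.parse does with a non-JSON body. A minimal reproduction follows; parseOutcome is a hypothetical helper, not something in the codebase. It returns only the error class, because the message text varies between JS engine versions.

```typescript
// Hypothetical helper: reports how JSON.parse reacts to a raw response body.
function parseOutcome(body: string): string {
  try {
    JSON.parse(body);
    return "ok";
  } catch (err) {
    // Only the error class is returned; the message wording depends on the engine.
    return err instanceof Error ? err.name : "unknown";
  }
}

// Bodies copied from the evidence table.
const sampleBodies = ["404 page not found", "Bad Gateway", '{"error":"Invalid request"}'];
for (const body of sampleBodies) {
  console.log(JSON.stringify(body), "->", parseOutcome(body));
}
```

The Nodefleet case is the most misleading one: "404" is a valid JSON number, so the parser only fails on the trailing text at index 4, and the log never mentions that the response was an HTTP 404.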
Suggested change
1. Carry provider name/url through the activity boundary
- Pass { identity, name, url } as the activity input instead of just identity. The workflow already has these from listProviders.
- Always echo { name, identity, url } on the activity result, regardless of branch:
{
"name": "Nodefleet",
"identity": "03df…aab1",
"url": "https://igniter.nodefleet.org",
"statusResult": { "id": 9, "status": "unreachable" },
"failure": { "kind": "http_error", "httpStatus": 404, "bodySnippet": "404 page not found" }
}
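One way the proposed input and result could be typed, as a sketch only. ProviderRef, Failure, StatusResult, ProviderStatusResult, and withProvider are hypothetical names; only the field names are taken from the JSON example above, and the real types in the workflow package may differ.

```typescript
// Hypothetical activity input: what the workflow would pass instead of the bare pubkey.
interface ProviderRef {
  identity: string; // hex pubkey
  name: string;
  url: string;
}

// Hypothetical failure shapes, mirroring the "failure" field in the JSON example.
type Failure =
  | { kind: "http_error"; httpStatus: number; bodySnippet: string }
  | { kind: "network"; code?: string; message: string };

interface StatusResult {
  id: number;
  status: string;
}

interface ProviderStatusResult extends ProviderRef {
  statusResult: StatusResult;
  failure?: Failure;
}

// Echoes the provider fields on every result, whichever branch produced it.
function withProvider(
  provider: ProviderRef,
  statusResult: StatusResult,
  failure?: Failure,
): ProviderStatusResult {
  return {
    name: provider.name,
    identity: provider.identity,
    url: provider.url,
    statusResult,
    ...(failure ? { failure } : {}),
  };
}

const example = withProvider(
  { identity: "03df…aab1", name: "Nodefleet", url: "https://igniter.nodefleet.org" },
  { id: 9, status: "unreachable" },
  { kind: "http_error", httpStatus: 404, bodySnippet: "404 page not found" },
);
console.log(JSON.stringify(example, null, 2));
```

Routing every return path through a single helper like this makes it hard for one branch to forget the provider fields.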
2. Diagnose the response before parsing
In apps/middleman-workflows/src/lib/provider/index.ts status():
- Check response.ok first. If not OK, capture { httpStatus, statusText, bodySnippet: text.slice(0, 200), contentType } and return Unreachable with that context.
- Only call response.json() after confirming Content-Type is JSON-ish.
- For network errors (the current catch), capture error.cause?.code (DNS, ECONNRESET, UND_ERR_HEADERS_TIMEOUT, etc.) and surface it as failure: { kind: "network", code: "...", message: "..." }.
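The three checks above could look roughly like the following sketch. It is not the existing status() implementation: diagnoseResponse, describeNetworkError, checkStatus, and the "not_json" failure kind are assumptions made for illustration. The classification is kept in a pure function so it can be unit-tested without a network call.

```typescript
type Failure =
  | { kind: "http_error"; httpStatus: number; statusText: string; contentType: string | null; bodySnippet: string }
  | { kind: "not_json"; httpStatus: number; contentType: string | null; bodySnippet: string } // assumed extra kind
  | { kind: "network"; code?: string; message: string };

type Diagnosis = { ok: true; body: unknown } | { ok: false; failure: Failure };

// Classifies an already-read response before anything tries to parse it.
function diagnoseResponse(
  httpStatus: number,
  statusText: string,
  contentType: string | null,
  text: string,
): Diagnosis {
  const bodySnippet = text.slice(0, 200);
  // Same 200-299 range that Response.ok uses.
  if (httpStatus < 200 || httpStatus > 299) {
    return { ok: false, failure: { kind: "http_error", httpStatus, statusText, contentType, bodySnippet } };
  }
  // "JSON-ish": application/json, text/json, application/problem+json, ...
  const jsonish = contentType !== null && /[\/+]json\b/i.test(contentType);
  if (!jsonish) {
    return { ok: false, failure: { kind: "not_json", httpStatus, contentType, bodySnippet } };
  }
  try {
    return { ok: true, body: JSON.parse(text) as unknown };
  } catch {
    // The header claimed JSON but the body did not parse.
    return { ok: false, failure: { kind: "not_json", httpStatus, contentType, bodySnippet } };
  }
}

// Pulls error.cause.code (e.g. ECONNRESET) off a failed fetch, when present.
function describeNetworkError(err: unknown): Failure {
  const message = err instanceof Error ? err.message : String(err);
  let code: string | undefined;
  if (typeof err === "object" && err !== null) {
    const cause = (err as Record<string, unknown>).cause;
    if (typeof cause === "object" && cause !== null) {
      const raw = (cause as Record<string, unknown>).code;
      if (typeof raw === "string") code = raw;
    }
  }
  return { kind: "network", code, message };
}

// Thin async wrapper; only fetch and body-read errors reach the catch.
async function checkStatus(url: string): Promise<Diagnosis> {
  try {
    const response = await fetch(url);
    const text = await response.text();
    return diagnoseResponse(response.status, response.statusText, response.headers.get("content-type"), text);
  } catch (err) {
    return { ok: false, failure: describeNetworkError(err) };
  }
}
```

Reading the body with response.text() and parsing it afterwards means the snippet is still available when parsing fails, which response.json() alone would not allow.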
3. Preserve the unhealthy reason
On the !healthy branch, keep the provider's self-reported diagnostics (e.g. reason, failingChecks, whatever the response carries) and include them on the activity result and optionally on the providers.supplier_stats JSONB or a new last_status_reason column.
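A small sketch of what "keep the diagnostics" might mean in code. It assumes the status body carries a boolean healthy field alongside arbitrary other keys; splitStatusBody is a hypothetical helper and the actual body shape may differ.

```typescript
// Hypothetical helper: separates the health flag from everything else the
// provider reported, so the remainder can be attached to the activity result.
function splitStatusBody(body: Record<string, unknown>): {
  healthy: boolean;
  diagnostics: Record<string, unknown>;
} {
  const { healthy, ...diagnostics } = body;
  // Anything other than an explicit `true` is treated as not healthy.
  return { healthy: healthy === true, diagnostics };
}

// The Qspider body from the evidence table: no healthy flag, one error field.
const qspider = splitStatusBody({ error: "Invalid request" });
console.log(qspider.healthy, JSON.stringify(qspider.diagnostics));
```

The diagnostics object is deliberately untyped, since the issue leaves open which fields providers actually report.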
Why
This is a pure observability change. It does not change which providers are marked healthy, unreachable, or unhealthy; it only makes activity history self-describing in the Temporal UI and lets us tell provider-side outages apart from middleman-side misconfiguration without SSHing into the workflow pod.