Fix thv_reg_srv_servers_total/skills_total staleness and labels #784
Merged
Switch the two registry-count gauges from synchronous `Int64Gauge` to `Int64ObservableGauge` and source their values from the database at collection time. The previous implementation only recorded values inside the success branch of `coordinator.performRegistrySync`, so once `ShouldSync` returned up-to-date-no-policy the OTel Prometheus exporter stopped seeing fresh observations and dropped the series. The callback queries `source LEFT JOIN registry_entry` grouped by source name, which also closes the gap where `kubernetes` and `managed` source types never appeared in the metrics at all — they live in the `source` table even though they bypass the sync coordinator. Counting `registry_entry` rather than `entry_version` makes the gauge report distinct servers/skills per source, matching what the consumer UIs show. Rename the misleading `registry` attribute to `source` (the value was always a source name, never a registry name) and update the bundled Grafana dashboard and observability docs to match. Fixes #783
`revive`'s `context-as-argument` rule requires `context.Context`, when a function takes one, to be its first parameter.
Codecov Report

Coverage diff:

| Metric | main | #784 | +/- |
|---|---|---|---|
| Coverage | 61.38% | 61.55% | +0.16% |
| Files | 109 | 110 | +1 |
| Lines | 10637 | 10667 | +30 |
| Hits | 6530 | 6566 | +36 |
| Misses | 3527 | 3515 | -12 |
| Partials | 580 | 586 | +6 |
danbarr (Contributor) approved these changes on May 12, 2026 and left a comment:
Metrics return as expected now in the demo-sandbox dashboard, LGTM!
danbarr added a commit to StacklokLabs/toolhive-demo-sandbox that referenced this pull request on May 12, 2026:
- Update to align with stacklok/toolhive-registry-server#784
- Add panels for skills by source and sync operations

Signed-off-by: Dan Barr <6922515+danbarr@users.noreply.github.com>
danbarr added a commit to StacklokLabs/toolhive-demo-sandbox that referenced this pull request on May 12, 2026:
- Update to align with stacklok/toolhive-registry-server#784
- Add panels for skills by source and sync operations

Signed-off-by: Dan Barr <6922515+danbarr@users.noreply.github.com>
Co-authored-by: Dan Barr <6922515+danbarr@users.noreply.github.com>
github-actions Bot added a commit to stacklok/docs-website that referenced this pull request on May 12, 2026:
The 'registry' attribute on thv_reg_srv_servers_total and thv_reg_srv_skills_total was renamed to 'source' in toolhive-registry-server v1.4.3 (stacklok/toolhive-registry-server#784). Update the metrics reference table and add a note about the rename.
rdimitrov pushed a commit to stacklok/docs-website that referenced this pull request on May 12, 2026:
* Update stacklok/toolhive-registry-server to v1.4.3
  Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Update registry metrics labels for v1.4.3
  The 'registry' attribute on thv_reg_srv_servers_total and thv_reg_srv_skills_total was renamed to 'source' in toolhive-registry-server v1.4.3 (stacklok/toolhive-registry-server#784). Update the metrics reference table and add a note about the rename.
* Editorial polish for registry metrics note
  Use parallel "per source" phrasing in the table, switch the v1.4.3 admonition to :::warning since it documents a breaking change to user queries, and align terminology on "label" to match the table header.

Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Summary
- Switch `thv_reg_srv_servers_total` and `thv_reg_srv_skills_total` from synchronous `Int64Gauge` to `Int64ObservableGauge`, with a callback that reads counts from the database at every scrape. The series no longer go stale once syncs quiesce.
- Rename the `registry` attribute to `source`; the value was always a source name. Bundled Grafana dashboard and `docs/observability.md` updated to match.
- Count `registry_entry` (not `entry_version`), grouped over `source LEFT JOIN registry_entry`. This both matches the "latest only" semantics that consumer UIs show and includes `kubernetes`/`managed` source types that previously never appeared in the metrics because they bypass the sync coordinator.

Test plan
- `task gen && task lint-fix && task test`
- Test (`TestDatabaseFactory_CreateRegistryMetricsReader`) asserts: 1 entry with 2 versions ⇒ count 1, empty source ⇒ 0/0
- Queries that use the `registry` label will need to switch to `source`

Fixes #783
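The counting rule asserted in the test plan (one entry with two versions counts once, and a source with no entries still reports zero) can be illustrated in memory. This is a simplified sketch under stated assumptions: `entryVersion` and `countEntriesBySource` are hypothetical names, the rows are invented, and deduplicating versions here stands in for counting `registry_entry` rows rather than `entry_version` rows in the real schema.

```go
package main

import "fmt"

// entryVersion is a hypothetical row: one published version of a named
// entry that belongs to a source.
type entryVersion struct {
	Source, Entry, Version string
}

// countEntriesBySource mimics "source LEFT JOIN registry_entry" grouped by
// source name: every known source gets a row, and each entry is counted
// once no matter how many versions it has.
func countEntriesBySource(sources []string, versions []entryVersion) map[string]int {
	counts := make(map[string]int, len(sources))
	for _, s := range sources {
		counts[s] = 0 // the LEFT JOIN keeps sources that have no entries
	}
	seen := make(map[[2]string]bool)
	for _, v := range versions {
		if _, known := counts[v.Source]; !known {
			continue // rows for sources not in the source list are ignored
		}
		key := [2]string{v.Source, v.Entry}
		if !seen[key] {
			seen[key] = true
			counts[v.Source]++
		}
	}
	return counts
}

func main() {
	sources := []string{"upstream", "empty"}
	versions := []entryVersion{
		{"upstream", "fetch", "1.0.0"},
		{"upstream", "fetch", "1.1.0"},
	}
	counts := countEntriesBySource(sources, versions)
	fmt.Println(counts["upstream"], counts["empty"]) // prints "1 0"
}
```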