fix(viewer): compare-mode bridge KPIs render both arms in section + Notebook by thinkingfish · Pull Request #910 · iopsystems/rezolus

thinkingfish · 2026-05-11T07:37:34Z

Summary

Three related bugs along the compare-mode rendering path for category-bridge KPIs (e.g. `inference-library` TTFT/Prompt Token Rate, where the dashboard generates a vllm-side query and a sglang-side query for one chart). Hit at the same time when pinning a bridge chart into the Notebook.

1. Asymmetric label extraction for percentile (scatter) charts

Baseline read `spec.series_names` (string list, lossy on multi-dim metrics — only the first non-`name` label survives) then ran `canonicalQuantileLabel` over the string. Experiment read `item.metric` directly via `canonicalQuantileLabel(item)` which preferred `metric.quantile` and dropped extra dims. The two sides produced disjoint match keys → "compare: no shared labels between captures."

Fix: `composeScatterLabel(metric, options)` in `compare_math.js` — multi-dim aware, with an `excludeValues` set (the category bridge) that drops dims whose values are capture-identity names. `applyResultToPlot` now also stashes `spec.series_metrics` so the baseline path has the raw metric objects available. Both extracts forward `options` symmetrically. Falls back to `canonicalQuantileLabel` for malformed metrics.

2. Pin loses bridge metadata

Pinning a bridge chart kept only `spec.promql_query`. Re-rendering in the Notebook ran the baseline query on the experiment side too, returning empty.

Fix: entries gain two optional fields, `promql_query_experiment` and `category_members`, captured at pin time from the spec + Group's `sectionMetadata`. Round-tripped through localStorage persist/restore and JSON export/import, defaulting to null on older payloads. `chartLoaderMixin` restores `promql_query_experiment` onto the rebuilt spec; NotebookView passes `category_members` to `CompareChartWrapper` as `categoryMembers`.

3. Pinned charts node-inject the wrong host

NotebookView mounted `CompareChartWrapper` with `sectionRoute: '/notebook'`. `buildEffectiveQuery` skips node-label injection for `/service/*` routes (service KPIs are scoped by `source=...`, not node), but `/notebook` doesn't qualify, so the experiment-side fetch ended up with the baseline's hostname injected into the bridge query — zero matches against the experiment recording, server returned "Metric not found". Only the baseline trace rendered.

Fix: use `entry.section` (the chart's home route) instead of `/notebook`. Bridge KPIs now skip node injection symmetrically with their original section view. Non-bridge pinned charts also keep their original section context — slightly different from before but more consistent.

Plumbing

`renderServiceSection` forwards section-level metadata as `sectionMetadata: meta` (a new prop, not `metadata` which is reserved for per-group config). Group reads `attrs.sectionMetadata?.category_members`, threads it through chartBody → CompareChartWrapper → both extract* functions → composeScatterLabel.

Test plan

cargo build clean, viewer smoke 9/9
node tests 47/48 (the 1 failure is the pre-existing wasm comparison test)
15 new unit tests in compare_math.test.mjs covering composeScatterLabel: pure quantile, quantile + N extra dims (alpha-sorted), no-quantile fallback, malformed input, excludeValues option (drop matched, preserve non-matched, empty/missing → no-op), baseline=experiment symmetry assertion
Manual: `view vllm=vllm.parquet sglang=sglang.parquet --category inference-library` — TTFT and other bridge KPIs render both arms in the inference-library section
Manual: pin a bridge KPI from the section, navigate to Notebook — both arms render (was: only baseline; experiment query had `node=""` injected and returned empty)

Known limitations

The `composeScatterLabel` excludeValues filter is by VALUE (drops any dim whose value matches a category-member name). Footgun: if some unrelated label happens to carry a value like `vllm` or `sglang` (e.g. `binary=vllm`), it gets dropped too. In practice this is fine — the category-member names are distinct enough.
Multi-dim percentile compare for non-bridge sections (e.g. per-cgroup percentile) is no longer broken (was the original sub-bug 1) but isn't smoke-tested end-to-end — the unit tests cover the helper and the section/group plumbing pattern is straightforward.

🤖 Generated with Claude Code

Compare mode failed with "compare: no shared labels between captures" on percentile (scatter) charts whose series carry extra dims beyond `quantile`. Two distinct sub-bugs converged here: (1) Asymmetric label extraction: Baseline (extractBaselineCapture): read spec.series_names — strings precomputed by applyResultToPlot as the FIRST non-`__name__` metric value (order-dependent; on a multi-dim metric it picks "vllm" or "0.5" depending on iteration order). For scatter, fell back through canonicalQuantileLabel which only matches pure quantile strings — multi-dim strings like "0.5,vllm" produce null and the original value stands. Experiment (extractExperimentCapture): read item.metric directly via canonicalQuantileLabel(item), which prefers metric.quantile and returns just "p50" — extra dims dropped entirely. Result: baseline keys looked like {"vllm","sglang"} (or {"0.5","0.95"}) and experiment keys looked like {"p50","p95"}. (2) Capture-identity dims in match keys: In category-mode compare (e.g. inference-library, baseline=vllm vs experiment=sglang) the per-side queries produce series whose labels intentionally differ on `source=vllm` vs `source=sglang`. Even with symmetric extraction these dims must NOT contribute to the cross-capture match key — they distinguish the captures themselves, not subseries within each side. Fix: a single `composeScatterLabel(metric, options)` helper that canonicalizes the quantile dim to "pXX" and appends remaining dims (alpha-sorted by key, value-only) joined with " · ", with an `options.excludeValues` set that drops dims whose values match known capture-identity values (the category-bridge plumbing). Wiring: - applyResultToPlot (data.js) now also stashes plot.series_metrics (parallel to series_names) so the baseline path has the raw metric objects available — series_names alone is too lossy. - extractBaselineCapture / extractExperimentCapture both take an options arg and forward it to composeScatterLabel. - CompareChartWrapper reads attrs.categoryMembers, builds an excludeValues set, and passes it to both extracts. - Group reads attrs.sectionMetadata.category_members and forwards it as the categoryMembers attr on CompareChartWrapper. - renderServiceSection (service.js) forwards section metadata as a NEW `sectionMetadata` prop on Group — not `metadata`, which is reserved for per-group config like no_collapse. Pure-quantile metrics (no extra dims) round-trip identically to the previous behavior — composeScatterLabel({quantile:"0.5"}) returns "p50". Non-category sections pass no categoryMembers, so excludeValues stays empty and all dims are preserved (fixes the per-cgroup percentile compare case from sub-bug 1). Category sections pass member names, so source=vllm gets dropped (fixes the inference-library TTFT case from sub-bug 2). Tests: 15 new cases in compare_math.test.mjs cover pure quantile, quantile + N extra dims, no-quantile fallback, malformed input, the excludeValues option (dropping matched dims, preserving non-matched dims, empty/missing → no-op), and the explicit baseline=experiment symmetry assertion for both sub-bugs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When a category-bridge KPI (one with `spec.promql_query_experiment` set by the dashboard's category builder, e.g. inference-library TTFT) gets pinned, the entry kept only the baseline query. Re- rendering in compare mode then ran the baseline query on the experiment side too — different metric name, no result — and the user saw only the baseline trace ("Notebook only shows vllm on the chart"). Two new optional fields on each pinned entry: - `promql_query_experiment` — the per-side experiment query the category builder generates. Restored onto the rebuilt spec in chartLoaderMixin so CompareChartWrapper's existing `spec.promql_query_experiment || spec.promql_query` precedence picks the right query when fetching experiment data. - `category_members` — the member-name set forwarded as `categoryMembers` to CompareChartWrapper, so composeScatterLabel drops the capture-identity dim from the cross-capture match key even though the live Notebook section has no category metadata. Both fields are captured from the chart's section context at pin time (selectButton gains a `categoryMembers` field on its compareMeta arg, plumbed from Group's `attrs.sectionMetadata .category_members`). They round-trip through localStorage persist + restore and through buildPayload (JSON export) + entriesFromPayload (JSON load), defaulting to null on older payloads — pre-fix exports degrade to single-query compare (the original broken behavior, no worse than before the field existed). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

NotebookView mounted CompareChartWrapper with a hardcoded sectionRoute: '/notebook' for every pinned chart. buildEffectiveQuery gates topology-label injection on the sectionRoute prefix: service-section routes (/service/*) skip node injection because service KPIs are already scoped by source=... and don't carry a node label. Hardcoding /notebook bypassed that gate. The experiment-side fetch for a pinned bridge KPI (e.g., inference-library TTFT) ended up running: sglang_prompt_tokens_total{source="sglang", node="<baseline-host>"} against the experiment capture. The baseline's hostname doesn't exist in the experiment recording, so PromQL returned a "Metric not found" error and only the baseline trace rendered. Use entry.section (the section the chart was originally pinned from) to reconstruct the sectionRoute. Bridge KPIs pinned from /service/inference-library now route as /service/inference-library in compare mode, skipping node injection symmetrically with the original section view. Non-bridge pinned charts (CPU, network, etc.) keep their original sectionRoute too — slightly different behavior from before, but more consistent: a pinned chart now renders with the same topology-injection rules it had in its home section. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The previous commits over-explained — long docstrings restating what the code does, history-of-the-bug exposition, and example blocks that duplicated the test cases. Cut to one or two lines per comment site that captures the WHY a future reader can't recover from the code. No behavior change. compare_math tests + viewer smoke still green. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Pinned histogram charts in compare mode now render the same Full / Tail / Diff (scatter) or diff (heatmap) toggles the section view shows, by reusing chart_controls.compareToggle. The state already round-trips through notebookStore.chartToggles and persistNotebook — only the render was missing. Wraps the chart title in a chart-title-row so the toggle group sits on the right, matching the section-view chartHeader layout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The schema already carries anchors / chartToggles / compare per store, and Save as Report captures all of it in compare mode — but the load-side views ignored it and rendered single-Chart only. Both views now mirror NotebookView's compare-mode contract: - chartBody picks CompareChartWrapper when attrs.compareMode and spec.promql_query are present - chart-header renders compareToggle (Full / Tail / Diff) - per-store chartToggles + anchors threaded into the wrapper - bridge metadata (promql_query_experiment, category_members) on the entry restores the per-side query and the capture-identity exclusion set Toggle writes go to the right store: factored out a generic chartToggleSetter(store, persistFn) and bound three setters — notebook (persists), report (persists), loaded-selection (in-memory only, matches its store lifecycle). app.js mounts now pass compareMode, experimentQueryRange, and the alias pair through to both views. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Was hidden in compare mode under the (now-stale) assumption that selections are single-capture. Previous commit made LoadedSelectionView compare-mode capable — both arms render when an experiment is attached. Drop the !compareMode gate so the button is reachable from the compare-mode topnav.

thinkingfish and others added 7 commits May 10, 2026 23:09

thinkingfish marked this pull request as ready for review May 11, 2026 08:05

thinkingfish merged commit adadf55 into iopsystems:main May 11, 2026
21 checks passed

thinkingfish deleted the fix/compare-multidim-percentile-labels branch May 11, 2026 08:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(viewer): compare-mode bridge KPIs render both arms in section + Notebook#910

fix(viewer): compare-mode bridge KPIs render both arms in section + Notebook#910
thinkingfish merged 7 commits into
iopsystems:mainfrom
thinkingfish:fix/compare-multidim-percentile-labels

thinkingfish commented May 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

thinkingfish commented May 11, 2026

Summary

1. Asymmetric label extraction for percentile (scatter) charts

2. Pin loses bridge metadata

3. Pinned charts node-inject the wrong host

Plumbing

Test plan

Known limitations

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant