Skip to content

feat(query-insights): streaming UX for Stage 3 AI recommendations#711

Open
tnaum-ms wants to merge 110 commits into
mainfrom
dev/tnaum/stream-query-insights
Open

feat(query-insights): streaming UX for Stage 3 AI recommendations#711
tnaum-ms wants to merge 110 commits into
mainfrom
dev/tnaum/stream-query-insights

Conversation

@tnaum-ms

@tnaum-ms tnaum-ms commented May 29, 2026

Copy link
Copy Markdown
Collaborator

Problem

After PR #690 shipped AI-powered Query Insights, the dominant pain point was a ~15 second blank wait: the user clicked "Get AI Performance Insights", watched a spinner for ~15 s, and the entire result appeared in one drop. Time-to-first-token from the model was ~3 s; the LLM streamed for ~10–12 s, but we buffered everything before parsing and rendering once. We were strictly worse than what the model was already producing.

What changed

Phase 1 — Plumb the stream

CopilotService gains a streamMessage API that exposes the LLM response as an AsyncIterable<string> of fragments plus a completion promise (CopilotStreamHandle). The existing buffered sendMessage path survives for non-streaming callers; both share a single private streamToModel primitive to prevent drift.

A new collectionView.queryInsights.* sub-router carries the Stage 3 procedures (split out of the monolithic collectionViewRouter.ts). The streamStage3 procedure is a tRPC subscription — the first real consumer of the framework's subscription transport in src/. It is an async function* that iterates streamHandle.fragments and yields domain events directly, with automatic backpressure, linear error propagation, and one-path AbortSignal cancellation.

Phase 2 — Tolerant incremental parser

StreamingResponseParser is a char-by-char state machine that consumes the cumulative JSON buffer and emits domain events as soon as it can prove them complete:

  • summary / educational events with cumulative markdown on every \n boundary.
  • recommendationStarted{index} on each improvements[] element open, recommendation{index} on close (string-and-escape-aware brace counting).
  • finalize() runs a canonical JSON.parse over the full buffer for reconciliation and emits the terminal complete event.

Ships with 25 unit tests covering: byte-at-a-time feeding, escape boundaries split across fragments, multiple improvements, zero improvements, truncated stream, malformed JSON fallback, double-finalize guards, and out-of-order keys.

Phase 3 — Progressive rendering

The webview swaps from .query() to .subscribe() and routes each event into a per-stream QueryInsightsStreamingState — the sole source of truth for Stage 3 cards.

Option A layout: all three Stage 3 slots (Analysis → Recommendations → Educational) are pre-reserved in canonical order the moment the user clicks "Get AI Performance Insights". Each slot starts with a spinner placeholder and is filled in place as its event arrives — layout never reorders. Earlier behaviour let cards arrive in LLM emission order, landing in arbitrary positions and pushing content down.

CardStack replaces AnimatedCardList at the Stage 3 call site — a lighter-weight container for "cards only ever added, group disappears as a unit". Per-item <Collapse appear visible> (or <Fade> via per-item opt-out) for enter, outer <Fade> for group exit. AnimatedCardList is kept in the tree for scenarios that need per-item enter/exit.

Stage3AnalyzingCard — a slim one-line card (Spinner + "AI is analyzing…" + Cancel button) shown during loading. The full GetPerformanceInsightsCard collapses out entirely while the stream is active and animates back in on cancel or completion. A single race-free reducer commit handles all three terminal states (complete / error / cancelled), so isStage3Loading and card visibility flip atomically.

Phase 4 — Telemetry preservation

A new documentDB.queryInsights.stage3.completed event is flushed from the subscription's finally block. It carries the same 17 properties/measurements the old buffered procedure used to record, plus durationMs (wall-clock from request to flush) and aborted. The auto rpc event still fires for subscription-create-rate metrics. No data is lost.

Cleanup + dead code removal

After the streaming path was stable:

  • Removed the buffered getQueryInsightsStage3 procedure and its transformAIResponseForUI family (−392 LoC, no callers since WI-6).
  • Removed TipsCard / "DocumentDB Performance Tips" stalling card (−238 LoC, existed only to entertain during the 15 s wait).
  • Dropped the stage3Promise field (always null on the streaming path) and the streaming OR stage3Data dual-source fallback render.
  • Added inline "do not simplify" comments on load-bearing non-obvious patterns (pendingEnter rAF flip, requestKey staleness guard, telemetry delivery guarantees, etc.).

Model simplification — hard-target copilot-utility

PR #690 introduced a per-feature preferred-model chain (gpt-4.1gpt-4ocopilot-utility). Two things changed: GPT-4.1 retired 2026-06-01, and GitHub's utility-models guidance confirmed that only requests routed through the copilot-utility alias avoid consuming premium request units. Any direct family-name target (gpt-4.1, gpt-4o) from a third-party extension bills the user's premium budget.

The chain is replaced by a single selectUtilityModel() that calls vscode.lm.selectChatModels({ vendor: 'copilot', family: 'copilot-utility' }) with no fallback to a billed model. CopilotService.isAvailable() uses the same filter. The four per-feature family constants, getPreferredFamilies, selectBestModel, and the "preferred family was not used" warnings are all deleted.

modelName and modelVersion are now recorded on every shared telemetry event so per-feature events attribute the actual backing model (the alias resolves to an opaque backing model at runtime).

Telemetry mapping

Old key New key Notes
documentDB.rpc.query.collectionView.getQueryInsightsStage3 removed Stage 3 path is now exclusively streamStage3
…getQueryInsightsStage3 → properties.* / measurements.* (17 keys) documentDB.queryInsights.stage3.completed → same keys Identical population
(none) …stage3.completed → measurements.durationMs New: wall-clock duration
(none) …stage3.completed → properties.aborted New: terminal abort state

The auto rpc event documentDB.rpc.subscription.collectionView.queryInsights.streamStage3 still fires per subscription (useful for create-rate metrics) but carries ~0 duration and no custom measurements. Switch telemetry queries targeting Stage 3 completion to documentDB.queryInsights.stage3.completed.

Related

tnaum-ms added 30 commits May 29, 2026 09:52
…pper (WI-1)

Introduce a shared StreamingPlaceholder shimmer row (variants: standalone/inline,
optional icon/label/elapsed/chars meta, role=status/aria-live=polite) under
src/webviews/.../queryInsightsTab/components/streamingPlaceholder/, and a
client-side StreamingProgressStepper that drives a four-step checklist
('Analyzing plan' → 'Identifying issues' → 'Generating recommendations' →
'Finalizing') using the placeholder for the active step.

Wire the stepper into GetPerformanceInsightsCard's loading state in place of
the previous Spinner + 'AI is analyzing…' row; keep the Cancel button and the
existing Announcer. No backend change. Tips/error card behavior preserved.

Phase 0 / WI-1 of the Stage 3 progressive-streaming plan.
Add CopilotService.streamMessage(messages, options) that exposes the LLM
response as a pull-based AsyncIterable<string> of fragments alongside a
completion promise carrying the full response metadata (modelId, family,
displayName, durationMs, best-effort token usage).

Implementation factors the model iteration loop into a private streamToModel
primitive shared with the existing sendToModel, so the buffered and streamed
paths cannot drift apart. The producer runs in the background under a single
callWithTelemetryAndErrorHandling('vscode-documentdb.copilot.streamMessage')
wrapper and pushes each fragment into a private single-consumer FragmentChannel
(supports return() for early break). The AbortSignal -> CancellationTokenSource
bridge is preserved so aborting the iterator stops the model call.

Adds src/services/copilotService.test.ts covering: fragments in order +
usage/duration on completion, mid-stream abort rejects iteration and
completion with UserCancelledError, and missing model rejects with the
explanatory error.

WI-2 of the Stage 3 progressive-streaming plan.
…hestration (WI-3)

Add optimizeQueryStreaming(...) alongside optimizeQuery in
indexAdvisorCommands.ts. Extracted a shared prepareOptimizationRequest
helper (cluster info + explain + stats + prompt build + message
construction + pre-LLM telemetry) plus a finalizeOptimizationResponse
helper so the buffered and streaming variants cannot drift apart — only
the LLM-call line differs (CopilotService.sendMessage vs. streamMessage).

Add AIOptimizationStreamHandle and
QueryInsightsAIService.getOptimizationRecommendationsStreaming(...) that
delegates to optimizeQueryStreaming and resolves completion with the
fully-parsed AIOptimizationResponse (single JSON.parse at end — WI-7/WI-8
will introduce incremental parsing).

Buffered callers are unchanged: optimizeQuery and
getOptimizationRecommendations still produce identical results.

Adds QueryInsightsAIService.streaming.test.ts covering: end-to-end
fragment iteration + parsed completion (model identity carried through),
and malformed-JSON rejection on completion.

WI-3 of the Stage 3 progressive-streaming plan.
…(WI-4)

Create src/webviews/documentdb/collectionView/queryInsights/queryInsightsRouter.ts
and move Stage 1/2/3 + executeQueryInsightsAction procedures (and the
readQueryInsightsDebugFile helper) into it as a pure relocation — procedure
bodies are unchanged.

Mount the new sub-router under collectionView.queryInsights so the four
procedures are now reachable at collectionView.queryInsights.* in the tRPC
client (e.g. collectionView.queryInsights.getQueryInsightsStage3.query).
Per D12 / the package README convention, the streaming subscription
landing in WI-5 will live in a sibling queryInsightsEventsRouter.ts merged
into queryInsightsRouter — keeping 'things the webview calls' separate
from 'things the host pushes'.

Update all four webview call sites (QueryInsightsTab.tsx + CollectionView.tsx)
to use the new paths. Remove now-unused imports from collectionViewRouter.ts.

⚠️ Telemetry: rpc event paths for these procedures now include a
'queryInsights' segment (e.g. documentDB.rpc.query.collectionView.queryInsights.getQueryInsightsStage3 —
previously documentDB.rpc.query.collectionView.getQueryInsightsStage3).
Telemetry queries that hard-coded the old path must be updated.

WI-4 of the Stage 3 progressive-streaming plan.
Add QueryInsightsStreamEvent discriminated union in
types/queryInsightsStream.ts (coarse subset for WI-5: status + result;
WI-8 will extend with per-domain events).

Add queryInsightsEventsRouter.ts exporting a queryInsightsEventsRoutes
record. Procedures are spread into queryInsightsRouter so the
webview-visible paths stay flat (collectionView.queryInsights.streamStage3).
This keeps push-style procedures in a sibling file per D12 / the package
README convention, without depending on the (currently non-re-exported)
t.mergeRouters helper.

streamStage3 is an async function* that:
1. yields status: connecting,
2. builds queryContext + staticAnalysisSummary like the buffered procedure,
3. calls getOptimizationRecommendationsStreaming (Option A — no TypedEventSink),
4. iterates fragments emitting throttled (250ms) status: receiving
   carrying elapsedMs + cumulative charsReceived,
5. yields status: parsing then awaits the parsed completion,
6. yields a single result carrying today's QueryInsightsStage3Response payload.

A per-subscription AbortController forwards ctx.signal aborts down to the
LLM call; finally aborts on iterator.return() (panel dispose /
subscription.stop). Webview migration is deferred to WI-6.

WI-5 of the Stage 3 progressive-streaming plan.
Convert QueryInsightsTab from the buffered
collectionView.queryInsights.getQueryInsightsStage3.query() call to the
streaming collectionView.queryInsights.streamStage3.subscribe() added in
WI-5.

Behavior preserved:
- requestKey staleness guard kept on every onData/onError path so late
  or raced callbacks from the framework's unsubscribe path are silently
  discarded.
- Cancel button now calls sub.unsubscribe() which the framework
  translates into subscription.stop -> server-side AbortController abort
  + iterator.return() on the generator (per package README), so the
  underlying LLM call is cancelled in lock step.
- Unmount cleanup uses unsubscribe() for the same reason.

WI-6 deliberately ignores the coarse type: 'status' events emitted by
streamStage3; only the terminal type: 'result' event drives the final
UI (writing stage3Data + transitionToStage(3, 'success')), keeping
the rendered Stage 3 output bit-identical to today. WI-9 will start
consuming the status events for progressive rendering.

stage3Promise field is kept on the context type for now (always set to
null in this code path); it was write-only state, and removing it
would be a wider refactor outside this WI.

WI-6 of the Stage 3 progressive-streaming plan.
Add src/documentdb/queryInsights/streamingResponseParser.ts: a pure,
char-by-char state-machine tokenizer that consumes fragments of the
Stage 3 LLM response (single top-level JSON object) and emits
structured QueryInsightsStreamEvent's without waiting for the response
to complete.

Emits:
- summary / educational with cumulative markdown at \n\n paragraph
  boundaries (complete:false) and once on string close (complete:true);
- recommendationStarted on each improvements[] item open and a
  fully-parsed recommendation on item close, with monotonic indices;
- verification (once on finalize) sourced from the reconciled JSON.parse
  rather than from per-item streaming extraction (see Deviation Log).

On finalize() the parser runs JSON.parse on the full buffer and
returns {events, parsed, parseError}; parsed is the canonical view per
plan D6 (zero-regression vs. the buffered path). Handles truncated
mid-string streams (flushes a final complete:true), malformed /
empty buffers (parseError set, parsed=null), unknown top-level keys
(object / array / number / bool / null all skip-consumed), and
fragment boundaries inside escapes (between \\ and the escaped char,
mid-\u, mid-hex).

Extended QueryInsightsStreamEvent (types/queryInsightsStream.ts) with
the structured variants: summary, educational, recommendationStarted,
recommendation, verification, complete (+ a webview-safe
QueryInsightsStreamUsage mirror of CopilotTokenUsage). The WI-5
transitional 'result' variant is kept and marked @deprecated; WI-8
will remove it from the subscription path once the parser is wired in.

Added 25 Jest cases in streamingResponseParser.test.ts covering:
basic happy path, byte-at-a-time feeding, progressive emission,
simple + unicode escapes incl. fragment boundaries, multiple
improvements with stream-order indices, braces / brackets inside
shellCommand / justification / risks (no false-positive boundary),
nested arrays in indexOptions, empty improvements, verification
reconciliation, unknown extra keys, truncation tolerance,
malformed / empty / whitespace-only buffer, double-finalize +
post-finalize-feed guards, out-of-order key arrival.

WI-7 of the Stage 3 progressive-streaming plan.
…n (WI-8)

Replace the coarse-only output of the WI-5 streamStage3 subscription
with parser-driven structured domain events.

The subscription now creates a per-call StreamingResponseParser
(WI-7) and feeds each fragment to it; the parser's structured events
(summary / educational with cumulative markdown at paragraph
boundaries, recommendationStarted / recommendation per
improvements[] item) are yielded ahead of the throttled
status: receiving for the same fragment so progressive UI gets first
priority. After the stream completes, the subscription yields
status: parsing, then the parser's trailing events via
parser.finalize() (final summary / educational with complete: true
for truncation tolerance, plus verification sourced from the
reconciled JSON.parse), then a terminal complete carrying
modelDisplayName / modelId / usage from streamHandle.completion's
parsed response.

Removed the transformAIResponseForUI call from this subscription:
the per-recommendation transform that turns AIIndexRecommendation
into the UI ImprovementCard moves to the webview in WI-9 (per plan
§4 / D7 "letting the webview own card-component choice"; the
subscription speaks domain only). Also dropped the now-unused
clusterId destructure and the transformations import.

Transient by design: the result event is no longer emitted (the
union still carries it as @deprecated for one more WI so WI-6's
event.type === 'result' check stays type-safe), so the WI-6 webview
handler will not populate stage3Data until WI-9 wires the
structured events into UI state. This matches the WI-8 acceptance
criterion ("events arrive in stream order"; webview reconstruction
is WI-9's job).

WI-8 of the Stage 3 progressive-streaming plan.
…ab (WI-9)

Wire the WI-8 structured streaming events into progressive UI.

Context: add QueryInsightsStreamingState (summary, educational,
recommendations[], verification) plus stage3Streaming on the
QueryInsightsState. The subscription's onData handler routes each
structured event into the matching slot via setQueryInsightsStateHelper
(initialising the per-stream object lazily on the first event;
recommendationStarted pushes a null sentinel at the right index,
recommendation fills it in), and on the terminal complete event a new
local synthesizeStage3Data helper materialises a full
QueryInsightsStage3Response into stage3Data so byline / collapse /
"powered by" code paths reading stage3Data keep working unchanged.
transitionToStage and handleCancelAI clear stage3Streaming alongside
stage3Data on phase=1 reset, phase=2 loading, phase=3 loading, and
cancel.

Components: new ImprovementCardShell (reuses ImprovementCard's outer
Card + ArrowTrendingSparkleRegular icon per D11 so card identity
never changes when content arrives), and new
utils/createImprovementCard.ts (pure webview-side per-recommendation
transform mirroring the server-side createImprovementCard in
transformations.ts byte-for-byte). Extended
QueryInsightsStreamEvent's complete variant with modelFamily so
stage3Data.modelFamily parity is preserved for the byline / WI-10
telemetry.

Render path: phase-3 cards now read from streaming first (summary /
educational grow paragraph-by-paragraph as their cumulative markdown
events arrive; each recommendation index renders ImprovementCardShell
while streaming.recommendations[i] is null, then the filled
ImprovementCard once it fills) and fall back to stage3Data for the
post-complete window / legacy buffered callers. Every card uses a
stable ${keyPrefix}rec-${index} key so the shell-to-filled
transition is in-place (no remount). The GetPerformanceInsightsCard
collapses the moment the first structured event arrives.

OPEN-1 (layout jump under R1): accepted the plan's default
"one downward shift" mitigation — the educational block fills
first (LLM emits educationalContent before analysis) and is pushed
down once summary arrives. AnimatedCardList's existing CollapseRelaxed
insert animation softens the transition. Logged in Deviation Log.

WI-9 of the Stage 3 progressive-streaming plan.
Emit documentDB.queryInsights.stage3.completed from the Stage 3
streaming subscription's finally block via callWithTelemetryAndErrorHandling
so its lifetime spans the whole stream and is unaffected by the
trpcToTelemetry subscription-timing trap (auto rpc event resolves at
generator-creation time and carries ~0 duration / no custom keys).

Per-subscription CompletionTelemetry accumulator
({ properties, measurements }) is populated incrementally throughout the
body:
 - platform from session.getClient().getClusterMetadata() (best-effort,
   catches metadata fetch failures and falls back to 'unknown' just
   like the buffered procedure),
 - hasCachedExecutionPlan from the cached explain output,
 - hasStaticAnalysisSummary / staticAnalysisSummaryLength /
   staticAnalysisSummaryError / staticAnalysisSummaryErrorKind from the
   buildStaticAnalysisSummary path,
 - recommendationCount + actionableRecommendationCount + per-action
   counters (create/drop/modify) from aiResponse.improvements,
 - aiModelDisclosed / aiModelFamily from aiResponse.model*,
 - promptTokens / responseTokens / totalTokens / maxInputTokens /
   promptUtilizationPct from aiResponse.usage.

A flushCompletionEvent() helper — idempotent, fire-and-forget — wraps
callWithTelemetryAndErrorHandling, sets errorHandling.suppressDisplay,
copies the accumulator into context.telemetry, and adds two NEW keys:
 - measurements.durationMs (wall-clock from request start to flush),
 - properties.aborted ('true'/'false', set from the per-subscription
   AbortController). It's called from finally so it fires exactly once
   per stream on success, abort (panel dispose / subscription.stop /
   user Cancel), and the rare throw path.

vscode-documentdb.copilot.streamMessage and
vscode-documentdb.queryInsights.getOptimizationRecommendationsStreaming
continue to fire from their existing callWithTelemetryAndErrorHandling
wrappers (unchanged in WI-10).

Old → new key mapping table for the PR description is in the plan
under WI-10's outcome block. All Stage 3 properties / measurements
that the buffered procedure carried are preserved 1:1 on the new
event with identical semantics; the new event adds durationMs and
aborted. No data lost.

WI-10 of the Stage 3 progressive-streaming plan.
Tick WI-11 in the plan with an outcome block recording the final
PR-checklist run results (npm run l10n / prettier-fix / lint / jest /
build all green on a clean tree after WI-10) and notes for the
reviewer:

 - Manual verification on a live slow query (progress < 2s, paragraph
   reveal, shells fill, mid-stream cancel clears partial UI, final
   state byte-identical to the buffered procedure) is deferred to
   the reviewer — it requires a live Copilot subscription and a
   slow query against a real cluster, neither of which the agent
   has access to. Recommends exercising: success, mid-stream Cancel,
   regenerate-while-loading, and verifying
   documentDB.queryInsights.stage3.completed fires with the right
   aborted flag in both paths.
 - getQueryInsightsStage3 (buffered) is no longer called from the
   webview after WI-6 but stays on the router per WI-3's contract
   ("keep the existing buffered path working for any non-streaming
   caller"); removing it is a follow-up.
 - stage3Promise on QueryInsightsState is dead write-only state
   after WI-6; removing it is a wider context-shape refactor not
   required by the streaming work.

All WI-1 → WI-11 boxes are now ticked. Three Deviation Log entries
recorded (WI-7 verification-from-reconcile, WI-7 \n\n-only progressive
trigger, WI-9 OPEN-1 accept-one-shift). The WI-10 outcome block holds
the old → new telemetry mapping table for the PR description.

WI-11 of the Stage 3 progressive-streaming plan.
… add yield delay

Three review-driven fixes from the WI-11 manual verification run.

1) Per-yield tracing on the Stage 3 streaming subscription. Every
   yield (connecting/receiving/parsing status, parser-emitted summary
   / educational / recommendationStarted / recommendation, trailing
   verification, terminal complete) now logs a single compact line via
   ext.outputChannel.trace in the form:
     [Query Insights Stage 3 stream] [+1234ms] yield: <desc> (requestKey: <key>)
   with a low-cardinality describeEvent() summary so the full stream
   is scannable in the output channel.

2) Fix: the GetPerformanceInsightsCard progress stepper used to
   disappear the moment the first streamed event arrived (because the
   CollapseRelaxed visibility condition was extended to
   only: the inner stepper is shown via isLoading=(phase===3 &&
   status==='loading') and that stays true until the terminal complete
   event materialises stage3Data, so the progress indicator persists
   throughout the stream and collapses only once the data is fully
   in.

3) Debug aid: DEBUG_YIELD_DELAY_MS (currently 1000) sleeps before
   every yield to the webview so the progressive UI is observable when
   the LLM responds quickly. Fast-path no-op when 0 — set back to 0
   before shipping. delayYield() is awaited before each yield in the
   subscription generator; abort is re-checked after the sleep so a
   user-clicked Cancel still ends iteration promptly.

Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.
…ulse bar in stepper

Two UI tweaks driven by the WI-11 manual run.

1) MarkdownCard gains an inFlight prop. When true, two thin
   skeleton lines (80 percent then 30 percent width) render below the
   markdown content with the same shimmer animation as
   StreamingPlaceholder. QueryInsightsTab passes
   markdown cards show a clear 'more is coming' affordance during the
   gaps between paragraph-boundary updates. Lines disappear as soon as
   the parser emits complete: true.

2) StreamingPlaceholder gains barPosition (leading vs trailing) and
   barStyle (shimmer vs pulse) props. Trailing/shimmer remains the
   default for in-card placeholders. StreamingProgressStepper's active
   step now uses leading/pulse so the bar sits in the same horizontal
   column as the bullet markers (visually replacing the bullet) and
   pulsates with a calm opacity animation instead of a traveling
   highlight. The SparkleRegular icon is dropped from the active step
   since the bar already occupies the marker slot.

Both animations honor prefers-reduced-motion.

Verified: prettier / lint / jest (2014 tests pass) / build all green.
1) Restore the StreamingProgressStepper active step to its original
   left-aligned layout: spark icon, label, trailing shimmer bar, then
   the elapsed-time meta. The barPosition=leading + barStyle=pulse
   experiment from the previous commit competed with the surrounding
   pending/done bullet markers visually; the bullet style is clearer.

2) Add per-fragment tracing on the Stage 3 streaming subscription.
   Every iteration of the streamHandle.fragments loop now logs a line
   like:
     [Query Insights Stage 3 stream] [+1234ms] fragment: len=42, totalChars=512, preview="…" (requestKey: …)
   via a new previewFragment() helper that JSON-encodes a 80-char
   preview with newlines/tabs/backslashes escaped. This surfaces the
   LLM's actual chunking so we can tell whether the parser's
   paragraph-boundary detection (current trigger: cumulative value ends
   in \n\n) is granular enough — the screenshot showed two paragraphs
   rendering, then a large gap, then the rest appearing all at once,
   which suggests the LLM is sending big chunks that span several
   paragraphs and the parser's per-char advance still only emits at
   the last \n\n boundary.

Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.
Restore the original main-branch loading layout (Fluent Spinner with
'AI is analyzing…' label + Cancel button) and add ONE sub-line below
the label showing the current action (Analyzing query plan… /
Identifying issues… / Generating recommendations… / Finalizing…).

Replaces the four-step StreamingProgressStepper checklist that was
introduced in WI-1 with a much lighter visual:

  ⠋ AI is analyzing…
       Identifying issues…                          [Cancel]

New CurrentActionLine component (sibling of StreamingProgressStepper,
exported from the same barrel) keeps the same client-side timer +
4-step derivation the stepper had, just rendering a single line of
description-foreground text instead of the full bullet list.

Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.
…ng UI

Two changes.

1) StreamingResponseParser: emit summary/educational events at every
   single \n in the cumulative markdown value, not only at \n\n
   paragraph boundaries. Each list item, heading, or blank-line break
   now shows up as its own progressive event — the LLM's chunking
   spans many lines per fragment, so the previous per-paragraph
   policy emitted only ~5 events across the entire markdown value;
   the new per-line policy raises that by roughly 5×, which is what
   makes the cards visibly grow in the UI.

   Updated the affected parser tests:
   - basic happy path: "Para 1.\n\nPara 2." → 3 events (was 2)
   - explicit policy test: "P1.\n\nP2.\n\nP3." → 5 events (was 3)
   - consecutive-\n test: now asserts a distinct event per \n.
   All 25 parser tests + full 2014-test suite green.

2) GetPerformanceInsightsCard: revert the Stage 3 loading state to
   the exact main-branch layout — Fluent Spinner + 'AI is
   analyzing…' label + Cancel button, all on a single horizontal
   row. The StreamingProgressStepper introduced in WI-1 and the
   CurrentActionLine added in the previous commit are both removed
   (deleted along with StreamingProgressStepper.scss); they kept
   breaking visually and the simple spinner row was already good
   enough. StreamingPlaceholder itself is kept because
   ImprovementCardShell still uses it for the recommendation shell
   body.

Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.
Two changes wrapped in one commit because they are both responses to the\nsame 'streaming feels static' investigation:\n\n1. AnimatedCardList learns a per-item 'inFlight' flag. Items with\n   'inFlight: true' enter via Fade (200 ms opacity-only) instead of\n   CollapseRelaxed (400 ms maxHeight + overflow:hidden). CollapseRelaxed\n   captures element.scrollHeight once at mount (when the streaming card is\n   nearly empty) and then clips with overflow:hidden for 400 ms, hiding\n   most of the early stream and producing a 'title-only -> suddenly full'\n   two-frame pop. Fade does not clip, so the markdown grows visibly as\n   chunks arrive. The motion choice is captured on first mount and frozen\n   for the item's lifetime so the wrapper component never swaps (which\n   would remount and reset the inner card).\n\n   QueryInsightsTab marks the analysis ('Query Performance Analysis') and\n   educational ('Understanding Your Query Execution Plan') cards as\n   inFlight while their progressive 'complete: false' is still in effect.\n   Recommendation shells, error card, tips, and the post-success snapshot\n   re-render continue to use CollapseRelaxed unchanged.\n\n2. Drop the experimental Stage 3 stream diagnostics added in WI-5: the\n   per-stream StreamDiagnostics accumulator, the info-level Diagnostics\n   summary, both UX SIGNAL heuristics, every per-event / per-fragment\n   outputChannel.trace line, and the debug-only DEBUG_YIELD_DELAY_MS\n   artificial delay. These served their purpose (proved the per-newline\n   parser emission is healthy and the LLM does not buffer). What is left\n   in queryInsightsEventsRouter.ts is the minimum: the STATUS_EVENT_INTERVAL_MS\n   throttle, the per-stream AbortController, the WI-10 completion telemetry\n   accumulator, and the two existing trace lines bracketing the stream\n   (Started / Completed). Net -143 lines.\n\nNo behaviour change to the stream protocol or telemetry contract.
…nt in AnimatedCardList

Two latent bugs in AnimatedCardList that combined to make the Stage 3\nstreaming cards feel 'frozen until they snap to complete':\n\n1. Enter animation was silently skipped on every mount. Items mounted\n   with visible={true} on first render, and Fluent's createPresenceComponent\n   factory defaults appear={false} — so the framework treated the very\n   first mount as 'already in' and ran NO enter motion. Fix: add a\n   pendingEnter flag. New items first render with visible={false}, and a\n   requestAnimationFrame on the next frame flips them to visible={true}.\n   The presence component now sees the false → true transition it needs\n   and the enter motion (CollapseRelaxed for non-streaming cards, Fade\n   for streaming ones) actually plays.\n\n2. Same-key component updates were silently dropped. The useEffect\n   early-returned when no keys had been added or removed — but the\n   parent (QueryInsightsTab during streaming) hands us a fresh items\n   array every render, including when only an existing card's component\n   reference changes to carry new streamed markdown. With the early\n   return in place, the new component reference never landed in\n   displayItems, so the rendered tree kept showing the stale ReactNode\n   and the card looked frozen even though its source content was\n   changing. Fix: drop the early return. The general case below already\n   handles 'no additions, no removals' correctly — it walks items in\n   order, updates component for matched keys, and produces a fresh\n   displayItems array. Removals are still detected because we consume\n   displayMap entries as we walk; leftovers in displayMap are the items\n   to exit.\n\nBoth fixes are independent but ship together because the streaming UX\nneeds both: enter motion makes the card animate IN, the dropped early\nreturn makes the markdown grow visibly while it's IN.
Standalone dev-only webview wired to a new testing command\n(vscode-documentdb.command.testing.openMotionSandbox, sibling of the\nexisting startDemoTask command). Pure UI playground — no tRPC, no data,\nno telemetry, no localised strings. Never bound to any user-facing UI\nsurface; opens only from the Command Palette.\n\nThree independent labs:\n\n- CleanCollapseLab — single Collapse + Show/Hide + appear switch +\n  Remount. The smallest possible test of Fluent's enter/exit motion,\n  isolated from any queue/reducer logic. Toggle appear OFF to\n  reproduce the original 'no enter animation' bug.\n\n- StaggerLab — wraps N motion-wrapped cards in @fluentui/react-motion-\n  components-preview's new Stagger choreography helper (≥ 0.15.0).\n  Exposes itemDelay, reversed, hideMode (visibleProp /\n  visibilityStyle / unmount), and delayMode (timing / delayProp). Group\n  presence — fits 'show/hide a whole list together', NOT 'prepend one\n  new item to an existing list'.\n\n- AddFromTopLab — the production scenario. 'Add at TOP' prepends to\n  the list; siblings get pushed down. Confirms (visually) that only\n  the Collapse* variants animate height — every other variant lets the\n  newly-mounted item grab its natural height instantly, causing\n  existing siblings to jump.\n\nWiring:\n- Registered as 'motionSandbox' in WebviewRegistry.\n- MotionSandboxController extends WebviewControllerBase with the\n  minimum router context (dbExperience + webviewName).\n- Command handler in ClustersExtension lazy-imports the controller so\n  the dev tool has zero cost when not opened.\n- Command entry added to package.json under the 'DocumentDB' category.
Stage 3 has been served by the 'streamStage3' subscription since WI-6;\nthe buffered 'getQueryInsightsStage3' query has had no caller in 'src/'\nfor several commits. Removing it removes 386 lines of dead code and\nshrinks the surface a future contributor could accidentally re-wire.\n\n- Delete the 'getQueryInsightsStage3' tRPC query and its now-unused\n  imports ('QueryObject', 'buildStaticAnalysisSummary',\n  'QueryInsightsStage3Response') from queryInsightsRouter.ts. The\n  subscription 'streamStage3' (in queryInsightsEventsRouter.ts, spread\n  in via 'queryInsightsEventsRoutes') is unchanged.\n\n- Delete 'transformAIResponseForUI' and the five private helpers it\n  used (createImprovementCard, getPrimaryButtonLabel, getCardTitle,\n  getPrimaryActionId, generateIndexExplanation) from\n  src/documentdb/queryInsights/transformations.ts. They were only ever\n  consumed by the deleted procedure. The webview-side equivalent in\n  utils/createImprovementCard.ts is now the single source of truth for\n  the AIIndexRecommendation -> ImprovementCardConfig shape (also drives\n  the WI-9 progressive recommendation cards).\n\n- Drop the dead 'stage3Promise' field from QueryInsightsState. The\n  field was always written as 'null' along the streaming path and never\n  read by any UI code; it was a leftover from the buffered '.query()' /\n  promise-based path. Five dead writes removed in QueryInsightsTab.tsx,\n  the field + default removed from collectionViewContext.ts.\n\nNo behaviour change. No telemetry change.\n'documentDB.queryInsights.stage3.completed' (the WI-10 dedicated\ncompletion event) remains the canonical Stage 3 telemetry source.
…treaming state

Nothing in the UI surfaces verification items today, and stashing them\nin 'QueryInsightsStreamingState.verification' invited bugs of the form\n'why is this field populated but never shown'.\n\n- Remove the 'verification: string[] | null' slot from\n  'QueryInsightsStreamingState'.\n- Remove the reducer case that wrote to it; replace with an explicit\n  comment that the 'verification' event is intentionally ignored.\n- Drop the now-empty join in 'synthesizeStage3Data'; explicit comment\n  there too.\n\nThe parser is unchanged: it still emits a 'verification' event from\n'finalize()' (sourced from the canonical 'JSON.parse'), so a future\ncard that wants to render verification items can wire it up without a\nprotocol change. To restore: re-add the 'verification' slot to the\nstreaming state and a matching reducer case.\n\nNo wire change. No telemetry change.
Eliminates the 'streaming OR stage3Data fallback' dual-source pattern in the card render path. After cleanup #2 removed the buffered 'getQueryInsightsStage3' procedure, the fallback to 'stage3Data' was dead — 'stage3Streaming' is populated from the very first structured event and preserved past the terminal 'complete' event (the reducer writes both stage3Data and stage3Streaming together).

- Analysis Card: render gated only on streaming?.summary.

- Improvement Cards: drop the entire else-if (... stage3Data?.improvementCards) fallback branch (~25 lines) including the MarkdownCard-without-buttons edge case that mapped to the old buffered transform shape.

- Educational Card: render gated only on streaming?.educational.

stage3Data still has two legitimate readers (documented in the rewritten comment block above the render): (a) the GetPerformanceInsightsCard collapse condition uses (not stage3Data) as a 'has succeeded at least once' sentinel; (b) the byline / model-disclosure footer reads stage3Data.modelDisplayName.

No visible behaviour change; one less branch on every Stage-3 render.
…lification' in AnimatedCardList

Adds an explicit 'do NOT simplify this to appear={true} visible={true}' note next to the pendingEnter flag on ItemState. The two-step (mount with visible={false}, flip to true on rAF) exists because items are added from a parent reducer that re-renders the list synchronously: by the time Fluent's presence component effect runs, visible={true} is already the initial value, and the framework only animates on a subsequent visible change. The note saves a future maintainer the 'I tried appear=true and it didn't help, why does this trick exist' loop.
…ngle source of truth

The server-side twin (transformAIResponseForUI + createImprovementCard in src/documentdb/queryInsights/transformations.ts) was deleted as part of the buffered-Stage-3-path cleanup. Updates the file's preamble JSDoc to (a) drop the 'kept in sync with the server twin' contract that no longer applies and (b) explicitly tell future contributors: if you change this file, no server-side counterpart needs the same edit; if you ever reintroduce a server-side renderer, port this file rather than diverging from it.
…mment refresh

Two doc-only changes to queryInsightsEventsRouter.ts:

1. The 'flushCompletionEvent' helper now spells out, line by line, when the dedicated 'documentDB.queryInsights.stage3.completed' event actually reaches the wire: success / cancel / panel close / regenerate (yes), normal VS Code shutdown (usually, modulo the grace window), force-quit / OS kill / extension host crash (no). Future maintainers should not promote this to at-least-once semantics without first choosing a backing store + dedupe key. Documents the design decision made in the PR review.

2. Refresh stale references to 'getQueryInsightsStage3' and 'transformAIResponseForUI' (both deleted in the buffered-path cleanup). The streaming subscription is now the only Stage 3 entry point; the JSDoc on the procedure and the queryContext-builder comment reflect that. The CompletionTelemetry interface preamble still mentions the old procedure as the historical key-set source, since that is precisely the contract telemetry consumers care about.
…oad-bearing

Expands the inline comment next to the 'requestKey = crypto.randomUUID()' allocation in handleGetAISuggestions to spell out the exact 3-step race the guard exists to defuse (Cancel → unsubscribe → new subscribe → late trailing callback from the old subscription). The comment now explicitly tells future maintainers: do NOT remove the 'if (prev.stage3RequestKey !== requestKey) return prev;' check from the reducer in a 'because the framework promises cleanup' refactor. tRPC subscription cleanup is not strictly synchronous to unsubscribe().
A small new component that renders a Fluent Spinner next to a short localized label (e.g. 'Analyzing…'). Replaces the two-line shimmer ('streaming-content-lines') we used to put inside streaming cards. The shimmer competed with the streamed markdown for the user's attention and did not read as 'this is still working' on first glance — a spinner does.

Used in three places by upcoming commits: inside MarkdownCard while a streamed value is still growing, inside ImprovementCardShell while a recommendation is being written, and inside the Stage-3 pre-reserved slot placeholders before any structured event has arrived.

Accessibility: role='status' + aria-live='polite'. The spinner is decorative (aria-hidden); the label carries the semantic meaning.
…+ recommendation cards

MarkdownCard.inFlight no longer renders the two-line 'streaming-content-lines' skeleton; it now renders the new StreamingInlineProgress (spinner + label). Same for ImprovementCardShell, which used to wrap a StreamingPlaceholder. The spinner reads as 'still working' more clearly than the shimmer, and it stops competing with the streamed markdown for the user's attention.

MarkdownCard gains an optional 'inFlightLabel' prop so each caller can pick a descriptive verb instead of a generic 'Writing…'. Today's two callers in QueryInsightsTab pass 'Analyzing…' (for the analysis card) and 'Writing explanation…' (for the educational card).

Drops the dead '.streaming-content-lines' SCSS block from StreamingPlaceholder.scss; nothing references those class names any more.
tnaum-ms added 7 commits June 8, 2026 18:29
… (F3)

Three Stage 3 placeholder slots (analysis / first recommendation /
educational) mount simultaneously when the user starts Stage 3, each
rendering StreamingInlineProgress as role=status / aria-live=polite, on
top of Stage3AnalyzingCard's own Announcer. That queues 4+ polite
announcements per click.

Drop the implicit live-region semantics and add an opt-in `announce`
prop. No call site currently sets it: Stage3AnalyzingCard's Announcer is
already the single global "AI is working" signal.

Refs: PR #711 review finding F3.
The router header used `...queryInsights.getQueryInsightsStage3` as an
example rpc-event path, but that buffered procedure was removed in favor
of the `streamStage3` subscription. Drop the example outright; the
surrounding sentence still conveys that the path gained a queryInsights
segment.

Refs: PR #711 review finding F4 (Copilot comment 3374066078).
The two appendLine calls in readQueryInsightsDebugFile use raw,
non-localized strings with emoji prefixes, unlike the rest of the
output-channel writes in this file which go through l10n.t(). Per author
direction, this stays as-is until the broader Query Insights output-
channel localization sweep. Annotate both call sites so reviewers don't
re-flag the inconsistency.

Refs: PR #711 review finding F5 (Copilot comment 3374066190).
…teImprovementCard (F6)

The JSDoc described the streaming source as one domain object per
`recommendations[]` item, but the canonical Stage 3 JSON schema and
`AIOptimizationResponse` use `improvements[]`. Pure documentation fix.

Refs: PR #711 review finding F6 (Copilot comment 3374066252).
…writes (F8)

prefetchQueryInsights's .then and .catch unconditionally folded the
resolved/failed Stage 1 payload into the current state. If a newer query
had already kicked off a fresh pipeline by the time the in-flight call
resolved, the stale data could clobber it (the reducer's request-key
guard only covers Stage 3).

Guard both handlers with `prev.queryInsights.kind !== 's1Loading'` and
no-op if it doesn't match. Doesn't fully close a second-s1Loading race —
that would need request-key plumbing — but Stage 1 is short-lived enough
in practice that the simple guard covers the common case.

Refs: PR #711 review finding F8.
The Stage3AnalyzingCard's Announcer was locked to "AI is analyzing…",
which after F3 (StreamingInlineProgress no longer self-announces) became
the only signal screen-reader users got — and it never narrated progress.

Stack three Announcers gated on the current phase (connecting / submitted
/ receiving). Each fires once when its phase becomes active (Announcer
re-announces on `when` false→true), giving a short three-step narrative
that follows real model progress with natural spacing — no extra
throttling needed.

Refs: PR #711 review finding F9.
…ring (F10)

CardStack's store-derived-state snapshot (`items !== lastNonEmpty` guard
around an in-render setState) converges in a single render only because
the parent passes a memoized `items` array. The pattern is correct (no
infinite loop, as confirmed in the earlier Copilot thread response), but
the dependency is invisible from either file in isolation. Cross-
reference both sides so a future refactor that drops the `useMemo` in
QueryInsightsTab doesn't reintroduce per-render churn.

Refs: PR #711 review finding F10 (Copilot comment 3362126847).
@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Review finding F3 (new on the 2026-06-08 re-review pass) — fixed in 399766b.

Three Stage 3 placeholder slots (analysis / first recommendation / educational) mount simultaneously when the user starts Stage 3. Each rendered StreamingInlineProgress as role="status" aria-live="polite", on top of Stage3AnalyzingCard's own Announcer — queuing 4+ polite announcements per click. This is the same class of issue the earlier Copilot pass flagged on the now-deleted StreamingPlaceholder, resurfacing on the live component.

Dropped the implicit live-region semantics from StreamingInlineProgress and added an opt-in announce prop. No call site currently sets it; Stage3AnalyzingCard's Announcer is already the single global "AI is working" signal (and F9 makes that announcer phase-aware).

Commit: 399766b0

@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Review finding F8 (new on the 2026-06-08 re-review pass) — fixed in 4c71057.

prefetchQueryInsights's .then and .catch unconditionally folded the resolved / failed Stage 1 payload into the current state via setCurrentContext. If a newer query had already kicked off a fresh pipeline by the time the in-flight call resolved, the stale data could overwrite live state — Stage 1 had no request-key guard like Stage 3.

Per author direction (Stage 1 is short-lived enough in practice that the simple guard covers the common case), added prev.queryInsights.kind !== 's1Loading' bail-outs in both handlers. Doesn't fully close a second-s1Loading race — that would need request-key plumbing — but the perf cost vs. the engineering cost is right-sized.

Commit: 4c710571

@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Review finding F1 (new on the 2026-06-08 re-review pass) — fixed in 639935d.

streamStage3 flushed the dedicated completion event (documentDB.queryInsights.stage3.completed) from finally and sampled abortController.signal.aborted to set aborted='true'/'false'. The controller is unconditionally aborted earlier in the same finally, so every successful stream was logged as aborted — dashboards couldn't tell success from cancel/shutdown.

Switched to an explicit three-valued outcome ('success' | 'cancel' | 'error') tracked across the generator body and stamped before cleanup runs. aborted is retained as a derived back-compat property (outcome === 'cancel') so existing dashboard queries keep working.

Commit: 639935df

@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Review finding F9 (new on the 2026-06-08 re-review pass) — fixed in f02c2c4.

After F3 (StreamingInlineProgress no longer self-announces), Stage3AnalyzingCard's fixed-string Announcer ("AI is analyzing…") became the only signal screen-reader users got — and it never narrated progress, so the previously-flagged "for a11y review" caveat became material.

Stacked three Announcers gated on the current phase (connecting / submitted / receiving). Each fires once when its phase becomes active (Announcer re-announces on when false→true), giving a short three-step narrative that follows real model progress with natural spacing — no extra throttling needed since phase transitions are already paced by the model.

Commit: f02c2c4b

@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Review finding F7 (new on the 2026-06-08 re-review pass) — fixed in 8a44403.

The streamStage3 generator had only try { ... } finally { ... }. An exception escaping from streamHandle.fragments or streamHandle.completion propagated to the framework while the completion event was flushed with the default outcome ('cancel' after F1) and no error metadata — so error analytics for the streaming Stage 3 path was effectively blind.

Added a catch (err) that stamps outcome='error', records errorKind / errorMessage, then rethrows so the framework still surfaces onError to the webview reducer.

Commit: 8a44403c

The dedupe check in prefetchQueryInsights sampled `currentContext.queryInsights`
from the render's closure. When the user ran a 2nd/3rd/etc. query, the
reset effect's `setState` (queryInsights -> idle) hadn't committed by
the time `runFindQuery.then` fired this callback, so the closure still
saw the previous run's terminal state (e.g. s3Success) and the early
`return` skipped the prefetch. The user saw a Stage 1 spinner on tab
switch instead of pre-warmed data.

Move the bail-and-claim into a functional setCurrentContext updater that
reads `prev.queryInsights` (always the latest committed state). A
captured `claimed` boolean carries the outcome back out so the network
call only fires when we actually flipped the pipeline to s1Loading.

The updater stays pure-enough for StrictMode / concurrent re-invocation:
both invocations see the same `prev`, compute the same result, and
`claimed=true` is idempotent. The network call sits outside the updater,
so it fires exactly once regardless of how many times React re-runs the
reducer.

Refs: PR #711 review finding F2.
@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Review finding F2 (new on the 2026-06-08 re-review pass) — fixed in 4d398dd.

The dedupe check in prefetchQueryInsights sampled currentContext.queryInsights from the render's closure. When the user ran a 2nd/3rd/etc. query, the reset effect's setState (queryInsights → idle) hadn't committed by the time runFindQuery.then fired the callback, so the closure still saw the previous run's terminal state (e.g. s3Success) and the early return skipped the prefetch entirely. Real user impact: every query after the first paid the Stage 1 fetch latency on tab switch instead of seeing pre-warmed data — the streaming UX is unaffected, but the prefetch optimization silently no-op'd.

Moved the bail-and-claim into a functional setCurrentContext updater that reads prev.queryInsights (always the latest committed state). A captured claimed boolean carries the outcome back out so the network call only fires when we actually flipped the pipeline to s1Loading. The updater stays safe under StrictMode / concurrent re-invocation: both invocations see the same prev, compute the same result, and claimed=true is idempotent. The network call is outside the updater, so it fires exactly once.

Commit: 4d398dd1

tnaum-ms added 4 commits June 8, 2026 21:55
Adds docs/ai-and-plans/PRs/711-stream-query-insights/review-2026-06-08.md
documenting the 2026-06-08 re-review pass: 10 findings (F1-F10), each
with original-claim verification, independent severity assessment,
proposed solutions with pros/cons, and an inline Resolution block
linking the commit that landed and explaining why that option was
chosen over the alternatives. Closing summary table cross-references
all 10 commits.
The functional-setState approach landed in 4d398dd was broken in
practice: `setState((prev) => ...)` *queues* the updater — React runs
it on the next render, not synchronously inside the call. The captured
`claimed` boolean was therefore always false when the gating
The updater itself still committed though, leaving state pinned at
`s1Loading` so the Query Insights tab's fallback fetch saw "someone is
already loading" and waited forever.

Symptom reported by user: Stage 1 prefetch silently no-ops on every
query, and opening the Query Insights tab does nothing (no trace
output, no UI advance past pre-Stage-1).

Replaced with the ref-based read (option #2 from the F2 analysis):
sample `currentContextRef.current.queryInsights` for the gate. By the
time `runFindQuery.then` fires this function (network call, tens to
hundreds of ms), React has long since committed the reset effect and
the ref-updating effect has copied the post-reset state into the ref —
so the one-commit ref lag I called out is not observable for this
caller in practice.

Also updated the F2 Resolution + summary table in
docs/ai-and-plans/PRs/711-stream-query-insights/review-2026-06-08.md
to document the regression, the revert, and why ref-based read is
appropriate here.

Refs: PR #711 review finding F2 (hotfix on top of 4d398dd).
@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

F2 — regression hotfix (on top of 4d398dd).

The functional-setState approach landed in 4d398dd was broken in practice and the user hit it immediately: Stage 1 prefetch silently no-op'd on every query, and opening the Query Insights tab did nothing (no trace output, no UI advance past pre-Stage-1).

Root cause: setState((prev) => ...) queues the updater — React runs it on the next render, not synchronously inside the call. The captured claimed boolean was therefore always false when the gating if (!claimed) return; line ran, so the network call never fired. The updater itself still committed though, leaving state pinned at s1Loading — the Query Insights tab's fallback fetch then saw "someone else is loading" and waited forever. This is the standard "side-effect from setState callback" anti-pattern; the StrictMode-purity caveat I called out earlier was the right shape of concern but the wrong root cause.

Replaced with option #2 from the original F2 analysis (ref-based read) in 4a31ac1. By the time runFindQuery.then fires this function (network call, tens to hundreds of ms), React has long since committed the reset effect and the currentContextRef-updating effect has copied the post-reset state into the ref — so the one-commit ref lag I noted earlier is not observable for this caller in practice.

Updated F2's Resolution block + summary table in docs/ai-and-plans/PRs/711-stream-query-insights/review-2026-06-08.md to document the regression and the revert (commit bbec466).

Commits:

tnaum-ms added 2 commits June 8, 2026 22:24
…r (F2)

PR #711 finding F2. The prefetch gate read the pipeline state through the
render closure (original) or a ref (attempt 2, 4a31ac1) — both can observe
the pre-reset terminal state and skip the warm-up on the 2nd+ query.

Read no pipeline state in the closure at all. Gate the prefetch *call* on
executionIntent (initial/refresh), which comes from the activeQuery that
triggered this run and is therefore never stale for it, and which mirrors the
reset effect's own condition. Perform the idle->s1Loading claim *inside* the
functional setCurrentContext updater, whose `prev` composes with the reset
effect's `-> idle` update. The reset is queued synchronously in the effect
phase, strictly before this promise microtask, so `prev` is idle on a fresh
query by event-loop ordering — not by network-latency timing.

Also add an idle guard to the claim so a stale-but-passing gate can no longer
reset an already-advanced pipeline (a latent risk in the unguarded attempt-2
claim).
…claim fix

Capture the full F2 arc (PR #711): attempt #1 (4d398dd) regressed production
because the captured `claimed` flag is read before the queued setState updater
runs; attempt #2 (4a31ac1) works but is timing-dependent on a one-commit ref
lag; the final fix (968858a) gates the prefetch call on executionIntent and
claims inside the functional updater, whose `prev` composes with the reset's
already-queued `-> idle` update — a structural ordering guarantee rather than a
network-latency window. Updates the Resolutions summary row to match. Proposed-
solutions section left unchanged.
@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

F2 (Stage 1 prefetch race) — re-assessed and re-fixed

I re-derived F2 from source and concluded the shipping ref-based fix (attempt #2, 4a31ac18) is correct in production but timing-dependent, so I replaced it with a fix whose guarantee is structural rather than empirical.

Why the prior attempts were each unsatisfying

  • Attempt #1 (4d398dd1, reverted) regressed production. It carried a claimed boolean out of a functional setState updater to gate the request. setState((prev) => …) queues the updater, so claimed was always false when the gate ran → the request never fired, yet the updater still committed s1Loading, and the tab fallback (which also bails on s1Loading) then waited forever. Root cause is React's setState-batching contract — not the StrictMode/purity caveat the original note reached for.
  • Attempt #2 (4a31ac18) is timing-dependent. It reads the gate from currentContextRef.current. That ref is mirrored from state by a separate useEffect, so it lags one commit; the fix only holds because a real runFindQuery round-trip outlasts that lag. Under synchronous/mocked resolution, or a future synchronous caller, the stale read (and F2) returns. The claim was also unguarded, so a stale-but-idle ref could reset an already-advanced pipeline back to s1Loading.

The fix — 968858a7

Read no pipeline state through a closure or ref:

  1. Gate the prefetch call on executionIntent (initial/refresh) at the .then site. executionIntent belongs to the activeQuery that triggered this run's effect, so it can't be stale for this run, and it mirrors the reset effect's own condition — prefetch now fires exactly when the pipeline is being reset to idle (and no longer wastes a fetch attempt on pagination).
  2. Claim idle → s1Loading inside the functional updater, reading prev. React applies queued updaters in order, and the reset effect's → idle is queued synchronously in the same effect phase, strictly before this promise .then microtask — so prev is idle here by event-loop ordering, not network latency. An if (prev.queryInsights.kind !== 'idle') return prev; guard prevents clobbering a fallback that already claimed.

The request still fires unconditionally after the claim (threading "did we claim?" out of an updater is exactly the attempt-#1 trap). In the narrow both-paths race the fallback may also fetch; the existing F8 guards drop the second result and the stage1Succeeded guard ensures ≤ one Stage 2 — and getQueryInsightsStage1 is an idempotent cache read, so the worst case is one redundant queryPlanner read, no worse than attempt #2.

Failure modes traced by hand

  • q1 → q2 → q3: each resets to idle, claim sees idle, prefetch fires every time. ✅
  • Open Insights tab before prefetch resolves: prefetch claimed s1Loading; fallback observes it and bails (no duplicate). ✅
  • StrictMode double-invoke: prefetch runs from a .then, not an effect, so StrictMode doesn't double it; the run effect's pre-existing double-run is absorbed by the F8 + stage1Succeeded guards (≤ one Stage 2). ✅
  • Synchronous runFindQuery resolution: reset is still queued before the .then microtask → prev is idle regardless of timing. ✅

npm run lint, npm run build, and npx jest --no-coverage (2039 tests) all pass. Doc trail (full arc + summary table) updated in review-2026-06-08.md under F2 in 600ba854.

…r-format F2 review doc

- Remove two stale eslint-disable-next-line react-hooks/set-state-in-effect
  comments in CollectionView (the rule is not active in flat config, so the
  directives were unused; lint stays clean without them).
- Reflow review-2026-06-08.md to Prettier code style (emphasis markers and
  summary-table column alignment); no content changes.
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

✅ Code Quality Checks

Check Status How to fix
Localization (l10n) ✅ Passed
ESLint ✅ Passed
Prettier formatting ✅ Passed

This comment is updated automatically on each push.

@tnaum-ms tnaum-ms added the on-hold Approved but not merging yet. Reason in comments. Remove + click Ready for review to release. label Jun 8, 2026
@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

Approved.

@tnaum-ms

tnaum-ms commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

On hold for the 0.9.0 release

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

📦 Build Size Report

Metric Base (main) PR Delta
VSIX (vscode-documentdb-0.9.0-beta.vsix) 7.60 MB 7.60 MB ⬆️ +2 KB (+0.0%)
Webview bundle (views.js) 5.88 MB 5.88 MB ⬆️ +3 KB (+0.1%)

Download artifact · updated automatically on each push.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

on-hold Approved but not merging yet. Reason in comments. Remove + click Ready for review to release.

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

Query Insights Stage 3: stream AI response progressively instead of waiting for the full answer

2 participants