feat(query-insights): streaming UX for Stage 3 AI recommendations by tnaum-ms · Pull Request #711 · microsoft/vscode-documentdb

tnaum-ms · 2026-05-29T16:41:46Z

Problem

After PR #690 shipped AI-powered Query Insights, the dominant pain point was a ~15 second blank wait: the user clicked "Get AI Performance Insights", watched a spinner for ~15 s, and the entire result appeared in one drop. Time-to-first-token from the model was ~3 s; the LLM streamed for ~10–12 s, but we buffered everything before parsing and rendering once. We were strictly worse than what the model was already producing.

What changed

Phase 1 — Plumb the stream

CopilotService gains a streamMessage API that exposes the LLM response as an AsyncIterable<string> of fragments plus a completion promise (CopilotStreamHandle). The existing buffered sendMessage path survives for non-streaming callers; both share a single private streamToModel primitive to prevent drift.

A new collectionView.queryInsights.* sub-router carries the Stage 3 procedures (split out of the monolithic collectionViewRouter.ts). The streamStage3 procedure is a tRPC subscription — the first real consumer of the framework's subscription transport in src/. It is an async function* that iterates streamHandle.fragments and yields domain events directly, with automatic backpressure, linear error propagation, and one-path AbortSignal cancellation.

Phase 2 — Tolerant incremental parser

StreamingResponseParser is a char-by-char state machine that consumes the cumulative JSON buffer and emits domain events as soon as it can prove them complete:

summary / educational events with cumulative markdown on every \n boundary.
recommendationStarted{index} on each improvements[] element open, recommendation{index} on close (string-and-escape-aware brace counting).
finalize() runs a canonical JSON.parse over the full buffer for reconciliation and emits the terminal complete event.

Ships with 25 unit tests covering: byte-at-a-time feeding, escape boundaries split across fragments, multiple improvements, zero improvements, truncated stream, malformed JSON fallback, double-finalize guards, and out-of-order keys.

Phase 3 — Progressive rendering

The webview swaps from .query() to .subscribe() and routes each event into a per-stream QueryInsightsStreamingState — the sole source of truth for Stage 3 cards.

Option A layout: all three Stage 3 slots (Analysis → Recommendations → Educational) are pre-reserved in canonical order the moment the user clicks "Get AI Performance Insights". Each slot starts with a spinner placeholder and is filled in place as its event arrives — layout never reorders. Earlier behaviour let cards arrive in LLM emission order, landing in arbitrary positions and pushing content down.

CardStack replaces AnimatedCardList at the Stage 3 call site — a lighter-weight container for "cards only ever added, group disappears as a unit". Per-item <Collapse appear visible> (or <Fade> via per-item opt-out) for enter, outer <Fade> for group exit. AnimatedCardList is kept in the tree for scenarios that need per-item enter/exit.

Stage3AnalyzingCard — a slim one-line card (Spinner + "AI is analyzing…" + Cancel button) shown during loading. The full GetPerformanceInsightsCard collapses out entirely while the stream is active and animates back in on cancel or completion. A single race-free reducer commit handles all three terminal states (complete / error / cancelled), so isStage3Loading and card visibility flip atomically.

Phase 4 — Telemetry preservation

A new documentDB.queryInsights.stage3.completed event is flushed from the subscription's finally block. It carries the same 17 properties/measurements the old buffered procedure used to record, plus durationMs (wall-clock from request to flush) and aborted. The auto rpc event still fires for subscription-create-rate metrics. No data is lost.

Cleanup + dead code removal

After the streaming path was stable:

Removed the buffered getQueryInsightsStage3 procedure and its transformAIResponseForUI family (−392 LoC, no callers since WI-6).
Removed TipsCard / "DocumentDB Performance Tips" stalling card (−238 LoC, existed only to entertain during the 15 s wait).
Dropped the stage3Promise field (always null on the streaming path) and the streaming OR stage3Data dual-source fallback render.
Added inline "do not simplify" comments on load-bearing non-obvious patterns (pendingEnter rAF flip, requestKey staleness guard, telemetry delivery guarantees, etc.).

Model simplification — hard-target `copilot-utility`

PR #690 introduced a per-feature preferred-model chain (gpt-4.1 → gpt-4o → copilot-utility). Two things changed: GPT-4.1 retired 2026-06-01, and GitHub's utility-models guidance confirmed that only requests routed through the copilot-utility alias avoid consuming premium request units. Any direct family-name target (gpt-4.1, gpt-4o) from a third-party extension bills the user's premium budget.

The chain is replaced by a single selectUtilityModel() that calls vscode.lm.selectChatModels({ vendor: 'copilot', family: 'copilot-utility' }) with no fallback to a billed model. CopilotService.isAvailable() uses the same filter. The four per-feature family constants, getPreferredFamilies, selectBestModel, and the "preferred family was not used" warnings are all deleted.

modelName and modelVersion are now recorded on every shared telemetry event so per-feature events attribute the actual backing model (the alias resolves to an opaque backing model at runtime).

Telemetry mapping

Old key	New key	Notes
`documentDB.rpc.query.collectionView.getQueryInsightsStage3`	removed	Stage 3 path is now exclusively `streamStage3`
`…getQueryInsightsStage3 → properties.` / `measurements.` (17 keys)	`documentDB.queryInsights.stage3.completed → same keys`	Identical population
(none)	`…stage3.completed → measurements.durationMs`	New: wall-clock duration
(none)	`…stage3.completed → properties.aborted`	New: terminal abort state

The auto rpc event documentDB.rpc.subscription.collectionView.queryInsights.streamStage3 still fires per subscription (useful for create-rate metrics) but carries ~0 duration and no custom measurements. Switch telemetry queries targeting Stage 3 completion to documentDB.queryInsights.stage3.completed.

Original plan: docs/ai-and-plans/PRs/711-stream-query-insights/original-plan.md — 11-WI plan with per-WI outcome blocks and deviation log.
PR feat: AI model transparency for Query Insights panel #690 — added the model byline and cost-neutral disclosure that this PR's UX choreography integrates with.
PR Webview API package preview hardening and consumer reshape #676 — established the tRPC subscription transport that streamStage3 is the first real consumer of.
Tracking issue: Query Insights Stage 3: stream AI response progressively instead of waiting for the full answer #665.

…pper (WI-1) Introduce a shared StreamingPlaceholder shimmer row (variants: standalone/inline, optional icon/label/elapsed/chars meta, role=status/aria-live=polite) under src/webviews/.../queryInsightsTab/components/streamingPlaceholder/, and a client-side StreamingProgressStepper that drives a four-step checklist ('Analyzing plan' → 'Identifying issues' → 'Generating recommendations' → 'Finalizing') using the placeholder for the active step. Wire the stepper into GetPerformanceInsightsCard's loading state in place of the previous Spinner + 'AI is analyzing…' row; keep the Cancel button and the existing Announcer. No backend change. Tips/error card behavior preserved. Phase 0 / WI-1 of the Stage 3 progressive-streaming plan.

Add CopilotService.streamMessage(messages, options) that exposes the LLM response as a pull-based AsyncIterable<string> of fragments alongside a completion promise carrying the full response metadata (modelId, family, displayName, durationMs, best-effort token usage). Implementation factors the model iteration loop into a private streamToModel primitive shared with the existing sendToModel, so the buffered and streamed paths cannot drift apart. The producer runs in the background under a single callWithTelemetryAndErrorHandling('vscode-documentdb.copilot.streamMessage') wrapper and pushes each fragment into a private single-consumer FragmentChannel (supports return() for early break). The AbortSignal -> CancellationTokenSource bridge is preserved so aborting the iterator stops the model call. Adds src/services/copilotService.test.ts covering: fragments in order + usage/duration on completion, mid-stream abort rejects iteration and completion with UserCancelledError, and missing model rejects with the explanatory error. WI-2 of the Stage 3 progressive-streaming plan.

…hestration (WI-3) Add optimizeQueryStreaming(...) alongside optimizeQuery in indexAdvisorCommands.ts. Extracted a shared prepareOptimizationRequest helper (cluster info + explain + stats + prompt build + message construction + pre-LLM telemetry) plus a finalizeOptimizationResponse helper so the buffered and streaming variants cannot drift apart — only the LLM-call line differs (CopilotService.sendMessage vs. streamMessage). Add AIOptimizationStreamHandle and QueryInsightsAIService.getOptimizationRecommendationsStreaming(...) that delegates to optimizeQueryStreaming and resolves completion with the fully-parsed AIOptimizationResponse (single JSON.parse at end — WI-7/WI-8 will introduce incremental parsing). Buffered callers are unchanged: optimizeQuery and getOptimizationRecommendations still produce identical results. Adds QueryInsightsAIService.streaming.test.ts covering: end-to-end fragment iteration + parsed completion (model identity carried through), and malformed-JSON rejection on completion. WI-3 of the Stage 3 progressive-streaming plan.

…(WI-4) Create src/webviews/documentdb/collectionView/queryInsights/queryInsightsRouter.ts and move Stage 1/2/3 + executeQueryInsightsAction procedures (and the readQueryInsightsDebugFile helper) into it as a pure relocation — procedure bodies are unchanged. Mount the new sub-router under collectionView.queryInsights so the four procedures are now reachable at collectionView.queryInsights.* in the tRPC client (e.g. collectionView.queryInsights.getQueryInsightsStage3.query). Per D12 / the package README convention, the streaming subscription landing in WI-5 will live in a sibling queryInsightsEventsRouter.ts merged into queryInsightsRouter — keeping 'things the webview calls' separate from 'things the host pushes'. Update all four webview call sites (QueryInsightsTab.tsx + CollectionView.tsx) to use the new paths. Remove now-unused imports from collectionViewRouter.ts. ⚠️ Telemetry: rpc event paths for these procedures now include a 'queryInsights' segment (e.g. documentDB.rpc.query.collectionView.queryInsights.getQueryInsightsStage3 — previously documentDB.rpc.query.collectionView.getQueryInsightsStage3). Telemetry queries that hard-coded the old path must be updated. WI-4 of the Stage 3 progressive-streaming plan.

Add QueryInsightsStreamEvent discriminated union in types/queryInsightsStream.ts (coarse subset for WI-5: status + result; WI-8 will extend with per-domain events). Add queryInsightsEventsRouter.ts exporting a queryInsightsEventsRoutes record. Procedures are spread into queryInsightsRouter so the webview-visible paths stay flat (collectionView.queryInsights.streamStage3). This keeps push-style procedures in a sibling file per D12 / the package README convention, without depending on the (currently non-re-exported) t.mergeRouters helper. streamStage3 is an async function* that: 1. yields status: connecting, 2. builds queryContext + staticAnalysisSummary like the buffered procedure, 3. calls getOptimizationRecommendationsStreaming (Option A — no TypedEventSink), 4. iterates fragments emitting throttled (250ms) status: receiving carrying elapsedMs + cumulative charsReceived, 5. yields status: parsing then awaits the parsed completion, 6. yields a single result carrying today's QueryInsightsStage3Response payload. A per-subscription AbortController forwards ctx.signal aborts down to the LLM call; finally aborts on iterator.return() (panel dispose / subscription.stop). Webview migration is deferred to WI-6. WI-5 of the Stage 3 progressive-streaming plan.

Convert QueryInsightsTab from the buffered collectionView.queryInsights.getQueryInsightsStage3.query() call to the streaming collectionView.queryInsights.streamStage3.subscribe() added in WI-5. Behavior preserved: - requestKey staleness guard kept on every onData/onError path so late or raced callbacks from the framework's unsubscribe path are silently discarded. - Cancel button now calls sub.unsubscribe() which the framework translates into subscription.stop -> server-side AbortController abort + iterator.return() on the generator (per package README), so the underlying LLM call is cancelled in lock step. - Unmount cleanup uses unsubscribe() for the same reason. WI-6 deliberately ignores the coarse type: 'status' events emitted by streamStage3; only the terminal type: 'result' event drives the final UI (writing stage3Data + transitionToStage(3, 'success')), keeping the rendered Stage 3 output bit-identical to today. WI-9 will start consuming the status events for progressive rendering. stage3Promise field is kept on the context type for now (always set to null in this code path); it was write-only state, and removing it would be a wider refactor outside this WI. WI-6 of the Stage 3 progressive-streaming plan.

@deprecated

Add src/documentdb/queryInsights/streamingResponseParser.ts: a pure, char-by-char state-machine tokenizer that consumes fragments of the Stage 3 LLM response (single top-level JSON object) and emits structured QueryInsightsStreamEvent's without waiting for the response to complete. Emits: - summary / educational with cumulative markdown at \n\n paragraph boundaries (complete:false) and once on string close (complete:true); - recommendationStarted on each improvements[] item open and a fully-parsed recommendation on item close, with monotonic indices; - verification (once on finalize) sourced from the reconciled JSON.parse rather than from per-item streaming extraction (see Deviation Log). On finalize() the parser runs JSON.parse on the full buffer and returns {events, parsed, parseError}; parsed is the canonical view per plan D6 (zero-regression vs. the buffered path). Handles truncated mid-string streams (flushes a final complete:true), malformed / empty buffers (parseError set, parsed=null), unknown top-level keys (object / array / number / bool / null all skip-consumed), and fragment boundaries inside escapes (between \\ and the escaped char, mid-\u, mid-hex). Extended QueryInsightsStreamEvent (types/queryInsightsStream.ts) with the structured variants: summary, educational, recommendationStarted, recommendation, verification, complete (+ a webview-safe QueryInsightsStreamUsage mirror of CopilotTokenUsage). The WI-5 transitional 'result' variant is kept and marked @deprecated; WI-8 will remove it from the subscription path once the parser is wired in. Added 25 Jest cases in streamingResponseParser.test.ts covering: basic happy path, byte-at-a-time feeding, progressive emission, simple + unicode escapes incl. fragment boundaries, multiple improvements with stream-order indices, braces / brackets inside shellCommand / justification / risks (no false-positive boundary), nested arrays in indexOptions, empty improvements, verification reconciliation, unknown extra keys, truncation tolerance, malformed / empty / whitespace-only buffer, double-finalize + post-finalize-feed guards, out-of-order key arrival. WI-7 of the Stage 3 progressive-streaming plan.

@deprecated

…n (WI-8) Replace the coarse-only output of the WI-5 streamStage3 subscription with parser-driven structured domain events. The subscription now creates a per-call StreamingResponseParser (WI-7) and feeds each fragment to it; the parser's structured events (summary / educational with cumulative markdown at paragraph boundaries, recommendationStarted / recommendation per improvements[] item) are yielded ahead of the throttled status: receiving for the same fragment so progressive UI gets first priority. After the stream completes, the subscription yields status: parsing, then the parser's trailing events via parser.finalize() (final summary / educational with complete: true for truncation tolerance, plus verification sourced from the reconciled JSON.parse), then a terminal complete carrying modelDisplayName / modelId / usage from streamHandle.completion's parsed response. Removed the transformAIResponseForUI call from this subscription: the per-recommendation transform that turns AIIndexRecommendation into the UI ImprovementCard moves to the webview in WI-9 (per plan §4 / D7 "letting the webview own card-component choice"; the subscription speaks domain only). Also dropped the now-unused clusterId destructure and the transformations import. Transient by design: the result event is no longer emitted (the union still carries it as @deprecated for one more WI so WI-6's event.type === 'result' check stays type-safe), so the WI-6 webview handler will not populate stage3Data until WI-9 wires the structured events into UI state. This matches the WI-8 acceptance criterion ("events arrive in stream order"; webview reconstruction is WI-9's job). WI-8 of the Stage 3 progressive-streaming plan.

…ab (WI-9) Wire the WI-8 structured streaming events into progressive UI. Context: add QueryInsightsStreamingState (summary, educational, recommendations[], verification) plus stage3Streaming on the QueryInsightsState. The subscription's onData handler routes each structured event into the matching slot via setQueryInsightsStateHelper (initialising the per-stream object lazily on the first event; recommendationStarted pushes a null sentinel at the right index, recommendation fills it in), and on the terminal complete event a new local synthesizeStage3Data helper materialises a full QueryInsightsStage3Response into stage3Data so byline / collapse / "powered by" code paths reading stage3Data keep working unchanged. transitionToStage and handleCancelAI clear stage3Streaming alongside stage3Data on phase=1 reset, phase=2 loading, phase=3 loading, and cancel. Components: new ImprovementCardShell (reuses ImprovementCard's outer Card + ArrowTrendingSparkleRegular icon per D11 so card identity never changes when content arrives), and new utils/createImprovementCard.ts (pure webview-side per-recommendation transform mirroring the server-side createImprovementCard in transformations.ts byte-for-byte). Extended QueryInsightsStreamEvent's complete variant with modelFamily so stage3Data.modelFamily parity is preserved for the byline / WI-10 telemetry. Render path: phase-3 cards now read from streaming first (summary / educational grow paragraph-by-paragraph as their cumulative markdown events arrive; each recommendation index renders ImprovementCardShell while streaming.recommendations[i] is null, then the filled ImprovementCard once it fills) and fall back to stage3Data for the post-complete window / legacy buffered callers. Every card uses a stable ${keyPrefix}rec-${index} key so the shell-to-filled transition is in-place (no remount). The GetPerformanceInsightsCard collapses the moment the first structured event arrives. OPEN-1 (layout jump under R1): accepted the plan's default "one downward shift" mitigation — the educational block fills first (LLM emits educationalContent before analysis) and is pushed down once summary arrives. AnimatedCardList's existing CollapseRelaxed insert animation softens the transition. Logged in Deviation Log. WI-9 of the Stage 3 progressive-streaming plan.

Emit documentDB.queryInsights.stage3.completed from the Stage 3 streaming subscription's finally block via callWithTelemetryAndErrorHandling so its lifetime spans the whole stream and is unaffected by the trpcToTelemetry subscription-timing trap (auto rpc event resolves at generator-creation time and carries ~0 duration / no custom keys). Per-subscription CompletionTelemetry accumulator ({ properties, measurements }) is populated incrementally throughout the body: - platform from session.getClient().getClusterMetadata() (best-effort, catches metadata fetch failures and falls back to 'unknown' just like the buffered procedure), - hasCachedExecutionPlan from the cached explain output, - hasStaticAnalysisSummary / staticAnalysisSummaryLength / staticAnalysisSummaryError / staticAnalysisSummaryErrorKind from the buildStaticAnalysisSummary path, - recommendationCount + actionableRecommendationCount + per-action counters (create/drop/modify) from aiResponse.improvements, - aiModelDisclosed / aiModelFamily from aiResponse.model*, - promptTokens / responseTokens / totalTokens / maxInputTokens / promptUtilizationPct from aiResponse.usage. A flushCompletionEvent() helper — idempotent, fire-and-forget — wraps callWithTelemetryAndErrorHandling, sets errorHandling.suppressDisplay, copies the accumulator into context.telemetry, and adds two NEW keys: - measurements.durationMs (wall-clock from request start to flush), - properties.aborted ('true'/'false', set from the per-subscription AbortController). It's called from finally so it fires exactly once per stream on success, abort (panel dispose / subscription.stop / user Cancel), and the rare throw path. vscode-documentdb.copilot.streamMessage and vscode-documentdb.queryInsights.getOptimizationRecommendationsStreaming continue to fire from their existing callWithTelemetryAndErrorHandling wrappers (unchanged in WI-10). Old → new key mapping table for the PR description is in the plan under WI-10's outcome block. All Stage 3 properties / measurements that the buffered procedure carried are preserved 1:1 on the new event with identical semantics; the new event adds durationMs and aborted. No data lost. WI-10 of the Stage 3 progressive-streaming plan.

Tick WI-11 in the plan with an outcome block recording the final PR-checklist run results (npm run l10n / prettier-fix / lint / jest / build all green on a clean tree after WI-10) and notes for the reviewer: - Manual verification on a live slow query (progress < 2s, paragraph reveal, shells fill, mid-stream cancel clears partial UI, final state byte-identical to the buffered procedure) is deferred to the reviewer — it requires a live Copilot subscription and a slow query against a real cluster, neither of which the agent has access to. Recommends exercising: success, mid-stream Cancel, regenerate-while-loading, and verifying documentDB.queryInsights.stage3.completed fires with the right aborted flag in both paths. - getQueryInsightsStage3 (buffered) is no longer called from the webview after WI-6 but stays on the router per WI-3's contract ("keep the existing buffered path working for any non-streaming caller"); removing it is a follow-up. - stage3Promise on QueryInsightsState is dead write-only state after WI-6; removing it is a wider context-shape refactor not required by the streaming work. All WI-1 → WI-11 boxes are now ticked. Three Deviation Log entries recorded (WI-7 verification-from-reconcile, WI-7 \n\n-only progressive trigger, WI-9 OPEN-1 accept-one-shift). The WI-10 outcome block holds the old → new telemetry mapping table for the PR description. WI-11 of the Stage 3 progressive-streaming plan.

… add yield delay Three review-driven fixes from the WI-11 manual verification run. 1) Per-yield tracing on the Stage 3 streaming subscription. Every yield (connecting/receiving/parsing status, parser-emitted summary / educational / recommendationStarted / recommendation, trailing verification, terminal complete) now logs a single compact line via ext.outputChannel.trace in the form: [Query Insights Stage 3 stream] [+1234ms] yield: <desc> (requestKey: <key>) with a low-cardinality describeEvent() summary so the full stream is scannable in the output channel. 2) Fix: the GetPerformanceInsightsCard progress stepper used to disappear the moment the first streamed event arrived (because the CollapseRelaxed visibility condition was extended to only: the inner stepper is shown via isLoading=(phase===3 && status==='loading') and that stays true until the terminal complete event materialises stage3Data, so the progress indicator persists throughout the stream and collapses only once the data is fully in. 3) Debug aid: DEBUG_YIELD_DELAY_MS (currently 1000) sleeps before every yield to the webview so the progressive UI is observable when the LLM responds quickly. Fast-path no-op when 0 — set back to 0 before shipping. delayYield() is awaited before each yield in the subscription generator; abort is re-checked after the sleep so a user-clicked Cancel still ends iteration promptly. Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.

…ulse bar in stepper Two UI tweaks driven by the WI-11 manual run. 1) MarkdownCard gains an inFlight prop. When true, two thin skeleton lines (80 percent then 30 percent width) render below the markdown content with the same shimmer animation as StreamingPlaceholder. QueryInsightsTab passes markdown cards show a clear 'more is coming' affordance during the gaps between paragraph-boundary updates. Lines disappear as soon as the parser emits complete: true. 2) StreamingPlaceholder gains barPosition (leading vs trailing) and barStyle (shimmer vs pulse) props. Trailing/shimmer remains the default for in-card placeholders. StreamingProgressStepper's active step now uses leading/pulse so the bar sits in the same horizontal column as the bullet markers (visually replacing the bullet) and pulsates with a calm opacity animation instead of a traveling highlight. The SparkleRegular icon is dropped from the active step since the bar already occupies the marker slot. Both animations honor prefers-reduced-motion. Verified: prettier / lint / jest (2014 tests pass) / build all green.

1) Restore the StreamingProgressStepper active step to its original left-aligned layout: spark icon, label, trailing shimmer bar, then the elapsed-time meta. The barPosition=leading + barStyle=pulse experiment from the previous commit competed with the surrounding pending/done bullet markers visually; the bullet style is clearer. 2) Add per-fragment tracing on the Stage 3 streaming subscription. Every iteration of the streamHandle.fragments loop now logs a line like: [Query Insights Stage 3 stream] [+1234ms] fragment: len=42, totalChars=512, preview="…" (requestKey: …) via a new previewFragment() helper that JSON-encodes a 80-char preview with newlines/tabs/backslashes escaped. This surfaces the LLM's actual chunking so we can tell whether the parser's paragraph-boundary detection (current trigger: cumulative value ends in \n\n) is granular enough — the screenshot showed two paragraphs rendering, then a large gap, then the rest appearing all at once, which suggests the LLM is sending big chunks that span several paragraphs and the parser's per-char advance still only emits at the last \n\n boundary. Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.

Restore the original main-branch loading layout (Fluent Spinner with 'AI is analyzing…' label + Cancel button) and add ONE sub-line below the label showing the current action (Analyzing query plan… / Identifying issues… / Generating recommendations… / Finalizing…). Replaces the four-step StreamingProgressStepper checklist that was introduced in WI-1 with a much lighter visual: ⠋ AI is analyzing… Identifying issues… [Cancel] New CurrentActionLine component (sibling of StreamingProgressStepper, exported from the same barrel) keeps the same client-side timer + 4-step derivation the stepper had, just rendering a single line of description-foreground text instead of the full bullet list. Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.

…ng UI Two changes. 1) StreamingResponseParser: emit summary/educational events at every single \n in the cumulative markdown value, not only at \n\n paragraph boundaries. Each list item, heading, or blank-line break now shows up as its own progressive event — the LLM's chunking spans many lines per fragment, so the previous per-paragraph policy emitted only ~5 events across the entire markdown value; the new per-line policy raises that by roughly 5×, which is what makes the cards visibly grow in the UI. Updated the affected parser tests: - basic happy path: "Para 1.\n\nPara 2." → 3 events (was 2) - explicit policy test: "P1.\n\nP2.\n\nP3." → 5 events (was 3) - consecutive-\n test: now asserts a distinct event per \n. All 25 parser tests + full 2014-test suite green. 2) GetPerformanceInsightsCard: revert the Stage 3 loading state to the exact main-branch layout — Fluent Spinner + 'AI is analyzing…' label + Cancel button, all on a single horizontal row. The StreamingProgressStepper introduced in WI-1 and the CurrentActionLine added in the previous commit are both removed (deleted along with StreamingProgressStepper.scss); they kept breaking visually and the simple spinner row was already good enough. StreamingPlaceholder itself is kept because ImprovementCardShell still uses it for the recommendation shell body. Verified: l10n / prettier / lint / jest (2014 ✓) / build all pass.

Two changes wrapped in one commit because they are both responses to the\nsame 'streaming feels static' investigation:\n\n1. AnimatedCardList learns a per-item 'inFlight' flag. Items with\n 'inFlight: true' enter via Fade (200 ms opacity-only) instead of\n CollapseRelaxed (400 ms maxHeight + overflow:hidden). CollapseRelaxed\n captures element.scrollHeight once at mount (when the streaming card is\n nearly empty) and then clips with overflow:hidden for 400 ms, hiding\n most of the early stream and producing a 'title-only -> suddenly full'\n two-frame pop. Fade does not clip, so the markdown grows visibly as\n chunks arrive. The motion choice is captured on first mount and frozen\n for the item's lifetime so the wrapper component never swaps (which\n would remount and reset the inner card).\n\n QueryInsightsTab marks the analysis ('Query Performance Analysis') and\n educational ('Understanding Your Query Execution Plan') cards as\n inFlight while their progressive 'complete: false' is still in effect.\n Recommendation shells, error card, tips, and the post-success snapshot\n re-render continue to use CollapseRelaxed unchanged.\n\n2. Drop the experimental Stage 3 stream diagnostics added in WI-5: the\n per-stream StreamDiagnostics accumulator, the info-level Diagnostics\n summary, both UX SIGNAL heuristics, every per-event / per-fragment\n outputChannel.trace line, and the debug-only DEBUG_YIELD_DELAY_MS\n artificial delay. These served their purpose (proved the per-newline\n parser emission is healthy and the LLM does not buffer). What is left\n in queryInsightsEventsRouter.ts is the minimum: the STATUS_EVENT_INTERVAL_MS\n throttle, the per-stream AbortController, the WI-10 completion telemetry\n accumulator, and the two existing trace lines bracketing the stream\n (Started / Completed). Net -143 lines.\n\nNo behaviour change to the stream protocol or telemetry contract.

…nt in AnimatedCardList Two latent bugs in AnimatedCardList that combined to make the Stage 3\nstreaming cards feel 'frozen until they snap to complete':\n\n1. Enter animation was silently skipped on every mount. Items mounted\n with visible={true} on first render, and Fluent's createPresenceComponent\n factory defaults appear={false} — so the framework treated the very\n first mount as 'already in' and ran NO enter motion. Fix: add a\n pendingEnter flag. New items first render with visible={false}, and a\n requestAnimationFrame on the next frame flips them to visible={true}.\n The presence component now sees the false → true transition it needs\n and the enter motion (CollapseRelaxed for non-streaming cards, Fade\n for streaming ones) actually plays.\n\n2. Same-key component updates were silently dropped. The useEffect\n early-returned when no keys had been added or removed — but the\n parent (QueryInsightsTab during streaming) hands us a fresh items\n array every render, including when only an existing card's component\n reference changes to carry new streamed markdown. With the early\n return in place, the new component reference never landed in\n displayItems, so the rendered tree kept showing the stale ReactNode\n and the card looked frozen even though its source content was\n changing. Fix: drop the early return. The general case below already\n handles 'no additions, no removals' correctly — it walks items in\n order, updates component for matched keys, and produces a fresh\n displayItems array. Removals are still detected because we consume\n displayMap entries as we walk; leftovers in displayMap are the items\n to exit.\n\nBoth fixes are independent but ship together because the streaming UX\nneeds both: enter motion makes the card animate IN, the dropped early\nreturn makes the markdown grow visibly while it's IN.

Standalone dev-only webview wired to a new testing command\n(vscode-documentdb.command.testing.openMotionSandbox, sibling of the\nexisting startDemoTask command). Pure UI playground — no tRPC, no data,\nno telemetry, no localised strings. Never bound to any user-facing UI\nsurface; opens only from the Command Palette.\n\nThree independent labs:\n\n- CleanCollapseLab — single Collapse + Show/Hide + appear switch +\n Remount. The smallest possible test of Fluent's enter/exit motion,\n isolated from any queue/reducer logic. Toggle appear OFF to\n reproduce the original 'no enter animation' bug.\n\n- StaggerLab — wraps N motion-wrapped cards in @fluentui/react-motion-\n components-preview's new Stagger choreography helper (≥ 0.15.0).\n Exposes itemDelay, reversed, hideMode (visibleProp /\n visibilityStyle / unmount), and delayMode (timing / delayProp). Group\n presence — fits 'show/hide a whole list together', NOT 'prepend one\n new item to an existing list'.\n\n- AddFromTopLab — the production scenario. 'Add at TOP' prepends to\n the list; siblings get pushed down. Confirms (visually) that only\n the Collapse* variants animate height — every other variant lets the\n newly-mounted item grab its natural height instantly, causing\n existing siblings to jump.\n\nWiring:\n- Registered as 'motionSandbox' in WebviewRegistry.\n- MotionSandboxController extends WebviewControllerBase with the\n minimum router context (dbExperience + webviewName).\n- Command handler in ClustersExtension lazy-imports the controller so\n the dev tool has zero cost when not opened.\n- Command entry added to package.json under the 'DocumentDB' category.

…iments" This reverts commit d4979c5.

Stage 3 has been served by the 'streamStage3' subscription since WI-6;\nthe buffered 'getQueryInsightsStage3' query has had no caller in 'src/'\nfor several commits. Removing it removes 386 lines of dead code and\nshrinks the surface a future contributor could accidentally re-wire.\n\n- Delete the 'getQueryInsightsStage3' tRPC query and its now-unused\n imports ('QueryObject', 'buildStaticAnalysisSummary',\n 'QueryInsightsStage3Response') from queryInsightsRouter.ts. The\n subscription 'streamStage3' (in queryInsightsEventsRouter.ts, spread\n in via 'queryInsightsEventsRoutes') is unchanged.\n\n- Delete 'transformAIResponseForUI' and the five private helpers it\n used (createImprovementCard, getPrimaryButtonLabel, getCardTitle,\n getPrimaryActionId, generateIndexExplanation) from\n src/documentdb/queryInsights/transformations.ts. They were only ever\n consumed by the deleted procedure. The webview-side equivalent in\n utils/createImprovementCard.ts is now the single source of truth for\n the AIIndexRecommendation -> ImprovementCardConfig shape (also drives\n the WI-9 progressive recommendation cards).\n\n- Drop the dead 'stage3Promise' field from QueryInsightsState. The\n field was always written as 'null' along the streaming path and never\n read by any UI code; it was a leftover from the buffered '.query()' /\n promise-based path. Five dead writes removed in QueryInsightsTab.tsx,\n the field + default removed from collectionViewContext.ts.\n\nNo behaviour change. No telemetry change.\n'documentDB.queryInsights.stage3.completed' (the WI-10 dedicated\ncompletion event) remains the canonical Stage 3 telemetry source.

…treaming state Nothing in the UI surfaces verification items today, and stashing them\nin 'QueryInsightsStreamingState.verification' invited bugs of the form\n'why is this field populated but never shown'.\n\n- Remove the 'verification: string[] | null' slot from\n 'QueryInsightsStreamingState'.\n- Remove the reducer case that wrote to it; replace with an explicit\n comment that the 'verification' event is intentionally ignored.\n- Drop the now-empty join in 'synthesizeStage3Data'; explicit comment\n there too.\n\nThe parser is unchanged: it still emits a 'verification' event from\n'finalize()' (sourced from the canonical 'JSON.parse'), so a future\ncard that wants to render verification items can wire it up without a\nprotocol change. To restore: re-add the 'verification' slot to the\nstreaming state and a matching reducer case.\n\nNo wire change. No telemetry change.

Eliminates the 'streaming OR stage3Data fallback' dual-source pattern in the card render path. After cleanup #2 removed the buffered 'getQueryInsightsStage3' procedure, the fallback to 'stage3Data' was dead — 'stage3Streaming' is populated from the very first structured event and preserved past the terminal 'complete' event (the reducer writes both stage3Data and stage3Streaming together). - Analysis Card: render gated only on streaming?.summary. - Improvement Cards: drop the entire else-if (... stage3Data?.improvementCards) fallback branch (~25 lines) including the MarkdownCard-without-buttons edge case that mapped to the old buffered transform shape. - Educational Card: render gated only on streaming?.educational. stage3Data still has two legitimate readers (documented in the rewritten comment block above the render): (a) the GetPerformanceInsightsCard collapse condition uses (not stage3Data) as a 'has succeeded at least once' sentinel; (b) the byline / model-disclosure footer reads stage3Data.modelDisplayName. No visible behaviour change; one less branch on every Stage-3 render.

…lification' in AnimatedCardList Adds an explicit 'do NOT simplify this to appear={true} visible={true}' note next to the pendingEnter flag on ItemState. The two-step (mount with visible={false}, flip to true on rAF) exists because items are added from a parent reducer that re-renders the list synchronously: by the time Fluent's presence component effect runs, visible={true} is already the initial value, and the framework only animates on a subsequent visible change. The note saves a future maintainer the 'I tried appear=true and it didn't help, why does this trick exist' loop.

…ngle source of truth The server-side twin (transformAIResponseForUI + createImprovementCard in src/documentdb/queryInsights/transformations.ts) was deleted as part of the buffered-Stage-3-path cleanup. Updates the file's preamble JSDoc to (a) drop the 'kept in sync with the server twin' contract that no longer applies and (b) explicitly tell future contributors: if you change this file, no server-side counterpart needs the same edit; if you ever reintroduce a server-side renderer, port this file rather than diverging from it.

…mment refresh Two doc-only changes to queryInsightsEventsRouter.ts: 1. The 'flushCompletionEvent' helper now spells out, line by line, when the dedicated 'documentDB.queryInsights.stage3.completed' event actually reaches the wire: success / cancel / panel close / regenerate (yes), normal VS Code shutdown (usually, modulo the grace window), force-quit / OS kill / extension host crash (no). Future maintainers should not promote this to at-least-once semantics without first choosing a backing store + dedupe key. Documents the design decision made in the PR review. 2. Refresh stale references to 'getQueryInsightsStage3' and 'transformAIResponseForUI' (both deleted in the buffered-path cleanup). The streaming subscription is now the only Stage 3 entry point; the JSDoc on the procedure and the queryContext-builder comment reflect that. The CompletionTelemetry interface preamble still mentions the old procedure as the historical key-set source, since that is precisely the contract telemetry consumers care about.

…oad-bearing Expands the inline comment next to the 'requestKey = crypto.randomUUID()' allocation in handleGetAISuggestions to spell out the exact 3-step race the guard exists to defuse (Cancel → unsubscribe → new subscribe → late trailing callback from the old subscription). The comment now explicitly tells future maintainers: do NOT remove the 'if (prev.stage3RequestKey !== requestKey) return prev;' check from the reducer in a 'because the framework promises cleanup' refactor. tRPC subscription cleanup is not strictly synchronous to unsubscribe().

A small new component that renders a Fluent Spinner next to a short localized label (e.g. 'Analyzing…'). Replaces the two-line shimmer ('streaming-content-lines') we used to put inside streaming cards. The shimmer competed with the streamed markdown for the user's attention and did not read as 'this is still working' on first glance — a spinner does. Used in three places by upcoming commits: inside MarkdownCard while a streamed value is still growing, inside ImprovementCardShell while a recommendation is being written, and inside the Stage-3 pre-reserved slot placeholders before any structured event has arrived. Accessibility: role='status' + aria-live='polite'. The spinner is decorative (aria-hidden); the label carries the semantic meaning.

…+ recommendation cards MarkdownCard.inFlight no longer renders the two-line 'streaming-content-lines' skeleton; it now renders the new StreamingInlineProgress (spinner + label). Same for ImprovementCardShell, which used to wrap a StreamingPlaceholder. The spinner reads as 'still working' more clearly than the shimmer, and it stops competing with the streamed markdown for the user's attention. MarkdownCard gains an optional 'inFlightLabel' prop so each caller can pick a descriptive verb instead of a generic 'Writing…'. Today's two callers in QueryInsightsTab pass 'Analyzing…' (for the analysis card) and 'Writing explanation…' (for the educational card). Drops the dead '.streaming-content-lines' SCSS block from StreamingPlaceholder.scss; nothing references those class names any more.

… (F3) Three Stage 3 placeholder slots (analysis / first recommendation / educational) mount simultaneously when the user starts Stage 3, each rendering StreamingInlineProgress as role=status / aria-live=polite, on top of Stage3AnalyzingCard's own Announcer. That queues 4+ polite announcements per click. Drop the implicit live-region semantics and add an opt-in `announce` prop. No call site currently sets it: Stage3AnalyzingCard's Announcer is already the single global "AI is working" signal. Refs: PR #711 review finding F3.

The router header used `...queryInsights.getQueryInsightsStage3` as an example rpc-event path, but that buffered procedure was removed in favor of the `streamStage3` subscription. Drop the example outright; the surrounding sentence still conveys that the path gained a queryInsights segment. Refs: PR #711 review finding F4 (Copilot comment 3374066078).

The two appendLine calls in readQueryInsightsDebugFile use raw, non-localized strings with emoji prefixes, unlike the rest of the output-channel writes in this file which go through l10n.t(). Per author direction, this stays as-is until the broader Query Insights output- channel localization sweep. Annotate both call sites so reviewers don't re-flag the inconsistency. Refs: PR #711 review finding F5 (Copilot comment 3374066190).

…teImprovementCard (F6) The JSDoc described the streaming source as one domain object per `recommendations[]` item, but the canonical Stage 3 JSON schema and `AIOptimizationResponse` use `improvements[]`. Pure documentation fix. Refs: PR #711 review finding F6 (Copilot comment 3374066252).

…writes (F8) prefetchQueryInsights's .then and .catch unconditionally folded the resolved/failed Stage 1 payload into the current state. If a newer query had already kicked off a fresh pipeline by the time the in-flight call resolved, the stale data could clobber it (the reducer's request-key guard only covers Stage 3). Guard both handlers with `prev.queryInsights.kind !== 's1Loading'` and no-op if it doesn't match. Doesn't fully close a second-s1Loading race — that would need request-key plumbing — but Stage 1 is short-lived enough in practice that the simple guard covers the common case. Refs: PR #711 review finding F8.

The Stage3AnalyzingCard's Announcer was locked to "AI is analyzing…", which after F3 (StreamingInlineProgress no longer self-announces) became the only signal screen-reader users got — and it never narrated progress. Stack three Announcers gated on the current phase (connecting / submitted / receiving). Each fires once when its phase becomes active (Announcer re-announces on `when` false→true), giving a short three-step narrative that follows real model progress with natural spacing — no extra throttling needed. Refs: PR #711 review finding F9.

…ring (F10) CardStack's store-derived-state snapshot (`items !== lastNonEmpty` guard around an in-render setState) converges in a single render only because the parent passes a memoized `items` array. The pattern is correct (no infinite loop, as confirmed in the earlier Copilot thread response), but the dependency is invisible from either file in isolation. Cross- reference both sides so a future refactor that drops the `useMemo` in QueryInsightsTab doesn't reintroduce per-render churn. Refs: PR #711 review finding F10 (Copilot comment 3362126847).

tnaum-ms · 2026-06-08T16:34:29Z

Review finding F3 (new on the 2026-06-08 re-review pass) — fixed in 399766b.

Three Stage 3 placeholder slots (analysis / first recommendation / educational) mount simultaneously when the user starts Stage 3. Each rendered StreamingInlineProgress as role="status" aria-live="polite", on top of Stage3AnalyzingCard's own Announcer — queuing 4+ polite announcements per click. This is the same class of issue the earlier Copilot pass flagged on the now-deleted StreamingPlaceholder, resurfacing on the live component.

Dropped the implicit live-region semantics from StreamingInlineProgress and added an opt-in announce prop. No call site currently sets it; Stage3AnalyzingCard's Announcer is already the single global "AI is working" signal (and F9 makes that announcer phase-aware).

Commit: 399766b0

tnaum-ms · 2026-06-08T16:34:29Z

Review finding F8 (new on the 2026-06-08 re-review pass) — fixed in 4c71057.

prefetchQueryInsights's .then and .catch unconditionally folded the resolved / failed Stage 1 payload into the current state via setCurrentContext. If a newer query had already kicked off a fresh pipeline by the time the in-flight call resolved, the stale data could overwrite live state — Stage 1 had no request-key guard like Stage 3.

Per author direction (Stage 1 is short-lived enough in practice that the simple guard covers the common case), added prev.queryInsights.kind !== 's1Loading' bail-outs in both handlers. Doesn't fully close a second-s1Loading race — that would need request-key plumbing — but the perf cost vs. the engineering cost is right-sized.

Commit: 4c710571

tnaum-ms · 2026-06-08T16:34:29Z

Review finding F1 (new on the 2026-06-08 re-review pass) — fixed in 639935d.

streamStage3 flushed the dedicated completion event (documentDB.queryInsights.stage3.completed) from finally and sampled abortController.signal.aborted to set aborted='true'/'false'. The controller is unconditionally aborted earlier in the same finally, so every successful stream was logged as aborted — dashboards couldn't tell success from cancel/shutdown.

Switched to an explicit three-valued outcome ('success' | 'cancel' | 'error') tracked across the generator body and stamped before cleanup runs. aborted is retained as a derived back-compat property (outcome === 'cancel') so existing dashboard queries keep working.

Commit: 639935df

tnaum-ms · 2026-06-08T16:34:29Z

Review finding F9 (new on the 2026-06-08 re-review pass) — fixed in f02c2c4.

After F3 (StreamingInlineProgress no longer self-announces), Stage3AnalyzingCard's fixed-string Announcer ("AI is analyzing…") became the only signal screen-reader users got — and it never narrated progress, so the previously-flagged "for a11y review" caveat became material.

Stacked three Announcers gated on the current phase (connecting / submitted / receiving). Each fires once when its phase becomes active (Announcer re-announces on when false→true), giving a short three-step narrative that follows real model progress with natural spacing — no extra throttling needed since phase transitions are already paced by the model.

Commit: f02c2c4b

tnaum-ms · 2026-06-08T16:34:29Z

Review finding F7 (new on the 2026-06-08 re-review pass) — fixed in 8a44403.

The streamStage3 generator had only try { ... } finally { ... }. An exception escaping from streamHandle.fragments or streamHandle.completion propagated to the framework while the completion event was flushed with the default outcome ('cancel' after F1) and no error metadata — so error analytics for the streaming Stage 3 path was effectively blind.

Added a catch (err) that stamps outcome='error', records errorKind / errorMessage, then rethrows so the framework still surfaces onError to the webview reducer.

Commit: 8a44403c

The dedupe check in prefetchQueryInsights sampled `currentContext.queryInsights` from the render's closure. When the user ran a 2nd/3rd/etc. query, the reset effect's `setState` (queryInsights -> idle) hadn't committed by the time `runFindQuery.then` fired this callback, so the closure still saw the previous run's terminal state (e.g. s3Success) and the early `return` skipped the prefetch. The user saw a Stage 1 spinner on tab switch instead of pre-warmed data. Move the bail-and-claim into a functional setCurrentContext updater that reads `prev.queryInsights` (always the latest committed state). A captured `claimed` boolean carries the outcome back out so the network call only fires when we actually flipped the pipeline to s1Loading. The updater stays pure-enough for StrictMode / concurrent re-invocation: both invocations see the same `prev`, compute the same result, and `claimed=true` is idempotent. The network call sits outside the updater, so it fires exactly once regardless of how many times React re-runs the reducer. Refs: PR #711 review finding F2.

tnaum-ms · 2026-06-08T19:52:15Z

Review finding F2 (new on the 2026-06-08 re-review pass) — fixed in 4d398dd.

The dedupe check in prefetchQueryInsights sampled currentContext.queryInsights from the render's closure. When the user ran a 2nd/3rd/etc. query, the reset effect's setState (queryInsights → idle) hadn't committed by the time runFindQuery.then fired the callback, so the closure still saw the previous run's terminal state (e.g. s3Success) and the early return skipped the prefetch entirely. Real user impact: every query after the first paid the Stage 1 fetch latency on tab switch instead of seeing pre-warmed data — the streaming UX is unaffected, but the prefetch optimization silently no-op'd.

Moved the bail-and-claim into a functional setCurrentContext updater that reads prev.queryInsights (always the latest committed state). A captured claimed boolean carries the outcome back out so the network call only fires when we actually flipped the pipeline to s1Loading. The updater stays safe under StrictMode / concurrent re-invocation: both invocations see the same prev, compute the same result, and claimed=true is idempotent. The network call is outside the updater, so it fires exactly once.

Commit: 4d398dd1

Adds docs/ai-and-plans/PRs/711-stream-query-insights/review-2026-06-08.md documenting the 2026-06-08 re-review pass: 10 findings (F1-F10), each with original-claim verification, independent severity assessment, proposed solutions with pros/cons, and an inline Resolution block linking the commit that landed and explaining why that option was chosen over the alternatives. Closing summary table cross-references all 10 commits.

The functional-setState approach landed in 4d398dd was broken in practice: `setState((prev) => ...)` *queues* the updater — React runs it on the next render, not synchronously inside the call. The captured `claimed` boolean was therefore always false when the gating The updater itself still committed though, leaving state pinned at `s1Loading` so the Query Insights tab's fallback fetch saw "someone is already loading" and waited forever. Symptom reported by user: Stage 1 prefetch silently no-ops on every query, and opening the Query Insights tab does nothing (no trace output, no UI advance past pre-Stage-1). Replaced with the ref-based read (option #2 from the F2 analysis): sample `currentContextRef.current.queryInsights` for the gate. By the time `runFindQuery.then` fires this function (network call, tens to hundreds of ms), React has long since committed the reset effect and the ref-updating effect has copied the post-reset state into the ref — so the one-commit ref lag I called out is not observable for this caller in practice. Also updated the F2 Resolution + summary table in docs/ai-and-plans/PRs/711-stream-query-insights/review-2026-06-08.md to document the regression, the revert, and why ref-based read is appropriate here. Refs: PR #711 review finding F2 (hotfix on top of 4d398dd).

tnaum-ms · 2026-06-08T20:03:50Z

F2 — regression hotfix (on top of 4d398dd).

The functional-setState approach landed in 4d398dd was broken in practice and the user hit it immediately: Stage 1 prefetch silently no-op'd on every query, and opening the Query Insights tab did nothing (no trace output, no UI advance past pre-Stage-1).

Root cause: setState((prev) => ...) queues the updater — React runs it on the next render, not synchronously inside the call. The captured claimed boolean was therefore always false when the gating if (!claimed) return; line ran, so the network call never fired. The updater itself still committed though, leaving state pinned at s1Loading — the Query Insights tab's fallback fetch then saw "someone else is loading" and waited forever. This is the standard "side-effect from setState callback" anti-pattern; the StrictMode-purity caveat I called out earlier was the right shape of concern but the wrong root cause.

Replaced with option #2 from the original F2 analysis (ref-based read) in 4a31ac1. By the time runFindQuery.then fires this function (network call, tens to hundreds of ms), React has long since committed the reset effect and the currentContextRef-updating effect has copied the post-reset state into the ref — so the one-commit ref lag I noted earlier is not observable for this caller in practice.

Updated F2's Resolution block + summary table in docs/ai-and-plans/PRs/711-stream-query-insights/review-2026-06-08.md to document the regression and the revert (commit bbec466).

Commits:

Hotfix: 4a31ac18
Doc SHA wire-up: bbec4666

…r (F2) PR #711 finding F2. The prefetch gate read the pipeline state through the render closure (original) or a ref (attempt 2, 4a31ac1) — both can observe the pre-reset terminal state and skip the warm-up on the 2nd+ query. Read no pipeline state in the closure at all. Gate the prefetch *call* on executionIntent (initial/refresh), which comes from the activeQuery that triggered this run and is therefore never stale for it, and which mirrors the reset effect's own condition. Perform the idle->s1Loading claim *inside* the functional setCurrentContext updater, whose `prev` composes with the reset effect's `-> idle` update. The reset is queued synchronously in the effect phase, strictly before this promise microtask, so `prev` is idle on a fresh query by event-loop ordering — not by network-latency timing. Also add an idle guard to the claim so a stale-but-passing gate can no longer reset an already-advanced pipeline (a latent risk in the unguarded attempt-2 claim).

…claim fix Capture the full F2 arc (PR #711): attempt #1 (4d398dd) regressed production because the captured `claimed` flag is read before the queued setState updater runs; attempt #2 (4a31ac1) works but is timing-dependent on a one-commit ref lag; the final fix (968858a) gates the prefetch call on executionIntent and claims inside the functional updater, whose `prev` composes with the reset's already-queued `-> idle` update — a structural ordering guarantee rather than a network-latency window. Updates the Resolutions summary row to match. Proposed- solutions section left unchanged.

tnaum-ms · 2026-06-08T20:29:52Z

F2 (Stage 1 prefetch race) — re-assessed and re-fixed

I re-derived F2 from source and concluded the shipping ref-based fix (attempt #2, 4a31ac18) is correct in production but timing-dependent, so I replaced it with a fix whose guarantee is structural rather than empirical.

Why the prior attempts were each unsatisfying

Attempt #1 (4d398dd1, reverted) regressed production. It carried a claimed boolean out of a functional setState updater to gate the request. setState((prev) => …) queues the updater, so claimed was always false when the gate ran → the request never fired, yet the updater still committed s1Loading, and the tab fallback (which also bails on s1Loading) then waited forever. Root cause is React's setState-batching contract — not the StrictMode/purity caveat the original note reached for.
Attempt #2 (4a31ac18) is timing-dependent. It reads the gate from currentContextRef.current. That ref is mirrored from state by a separate useEffect, so it lags one commit; the fix only holds because a real runFindQuery round-trip outlasts that lag. Under synchronous/mocked resolution, or a future synchronous caller, the stale read (and F2) returns. The claim was also unguarded, so a stale-but-idle ref could reset an already-advanced pipeline back to s1Loading.

The fix — `968858a7`

Read no pipeline state through a closure or ref:

Gate the prefetch call on executionIntent (initial/refresh) at the .then site. executionIntent belongs to the activeQuery that triggered this run's effect, so it can't be stale for this run, and it mirrors the reset effect's own condition — prefetch now fires exactly when the pipeline is being reset to idle (and no longer wastes a fetch attempt on pagination).
Claim idle → s1Loading inside the functional updater, reading prev. React applies queued updaters in order, and the reset effect's → idle is queued synchronously in the same effect phase, strictly before this promise .then microtask — so prev is idle here by event-loop ordering, not network latency. An if (prev.queryInsights.kind !== 'idle') return prev; guard prevents clobbering a fallback that already claimed.

The request still fires unconditionally after the claim (threading "did we claim?" out of an updater is exactly the attempt-#1 trap). In the narrow both-paths race the fallback may also fetch; the existing F8 guards drop the second result and the stage1Succeeded guard ensures ≤ one Stage 2 — and getQueryInsightsStage1 is an idempotent cache read, so the worst case is one redundant queryPlanner read, no worse than attempt #2.

Failure modes traced by hand

q1 → q2 → q3: each resets to idle, claim sees idle, prefetch fires every time. ✅
Open Insights tab before prefetch resolves: prefetch claimed s1Loading; fallback observes it and bails (no duplicate). ✅
StrictMode double-invoke: prefetch runs from a .then, not an effect, so StrictMode doesn't double it; the run effect's pre-existing double-run is absorbed by the F8 + stage1Succeeded guards (≤ one Stage 2). ✅
Synchronous runFindQuery resolution: reset is still queued before the .then microtask → prev is idle regardless of timing. ✅

npm run lint, npm run build, and npx jest --no-coverage (2039 tests) all pass. Doc trail (full arc + summary table) updated in review-2026-06-08.md under F2 in 600ba854.

…r-format F2 review doc - Remove two stale eslint-disable-next-line react-hooks/set-state-in-effect comments in CollectionView (the rule is not active in flat config, so the directives were unused; lint stays clean without them). - Reflow review-2026-06-08.md to Prettier code style (emphasis markers and summary-table column alignment); no content changes.

github-actions · 2026-06-08T20:51:07Z

✅ Code Quality Checks

Check	Status	How to fix
Localization (`l10n`)	✅ Passed
ESLint	✅ Passed
Prettier formatting	✅ Passed

This comment is updated automatically on each push.

tnaum-ms · 2026-06-08T20:53:11Z

Approved.

tnaum-ms · 2026-06-08T20:53:23Z

On hold for the 0.9.0 release

github-actions · 2026-06-08T20:54:30Z

📦 Build Size Report

Metric	Base (`main`)	PR	Delta
VSIX (`vscode-documentdb-0.9.0-beta.vsix`)	7.60 MB	7.60 MB	⬆️ +2 KB (+0.0%)
Webview bundle (`views.js`)	5.88 MB	5.88 MB	⬆️ +3 KB (+0.1%)

Download artifact · updated automatically on each push.

…in query analysis templates

…ing the default _id_ index in recommendations

tnaum-ms added 30 commits May 29, 2026 09:52

initial plan - reviewed, iterated over, and approved

84e7152

Revert "feat(dev): add Motion Sandbox webview for Fluent motion exper…

6862999

…iments" This reverts commit d4979c5.

tnaum-ms added 7 commits June 8, 2026 18:29

tnaum-ms added 4 commits June 8, 2026 21:55

fix: prettier / l10n

c521153

docs(review): wire F2 hotfix SHA (4a31ac1) into review-2026-06-08.md

bbec466

tnaum-ms added 2 commits June 8, 2026 22:24

tnaum-ms added the on-hold Approved but not merging yet. Reason in comments. Remove + click Ready for review to release. label Jun 8, 2026

tnaum-ms added 2 commits June 9, 2026 18:36

fix(prompts): add protection instructions for the default _id_ index …

d913a58

…in query analysis templates

feat(query-insights): implement protection against dropping or modify…

e5a52fa

…ing the default _id_ index in recommendations

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(query-insights): streaming UX for Stage 3 AI recommendations#711

feat(query-insights): streaming UX for Stage 3 AI recommendations#711
tnaum-ms wants to merge 110 commits into
mainfrom
dev/tnaum/stream-query-insights

tnaum-ms commented May 29, 2026 •

edited

Loading

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tnaum-ms commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

What changed

Phase 1 — Plumb the stream

Phase 2 — Tolerant incremental parser

Phase 3 — Progressive rendering

Phase 4 — Telemetry preservation

Cleanup + dead code removal

Model simplification — hard-target copilot-utility

Telemetry mapping

Related

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

F2 (Stage 1 prefetch race) — re-assessed and re-fixed

Why the prior attempts were each unsatisfying

The fix — 968858a7

Failure modes traced by hand

Uh oh!

github-actions Bot commented Jun 8, 2026

✅ Code Quality Checks

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

tnaum-ms commented Jun 8, 2026

Uh oh!

github-actions Bot commented Jun 8, 2026

📦 Build Size Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tnaum-ms commented May 29, 2026 •

edited

Loading

Model simplification — hard-target `copilot-utility`

The fix — `968858a7`