feat: AI model transparency for Query Insights panel by tnaum-ms · Pull Request #690 · microsoft/vscode-documentdb

tnaum-ms · 2026-05-28T14:16:20Z

Summary

Adds model transparency and cost-neutral disclosure to the Query Insights AI feature. Users now see which model processed their request and are informed upfront that this feature uses a utility model (a Copilot tier that does not count against the premium request quota).

What was changed

Model disclosure pipeline (backend → webview)

copilotService.ts: CopilotResponse now carries modelUsed (the id of the selected LanguageModelChat instance).
indexAdvisorCommands.ts → QueryInsightsAIService.ts → transformations.ts: modelUsed is threaded through each layer and surfaced to the webview via QueryInsightsStage3Response.
collectionViewRouter.ts (Stage 3 router): emits aiModelDisclosed telemetry property when modelUsed is present.
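
As a rough illustration of the threading described above, the sketch below carries a model id from a service response to a webview-facing response and emits the telemetry property only when the id is present. The type and function names (`CopilotResponseSketch`, `Stage3ResponseSketch`, `toStage3Response`, `stage3TelemetryProps`) are hypothetical simplifications; only `modelUsed` and `aiModelDisclosed` come from the description.

```typescript
// Hypothetical, simplified shapes; the real interfaces carry more fields.
interface CopilotResponseSketch {
    text: string;
    modelUsed?: string; // id of the selected model, when known
}

interface Stage3ResponseSketch {
    recommendations: string;
    modelUsed: string | undefined;
}

// Carry modelUsed from the service response to the webview-facing response.
function toStage3Response(res: CopilotResponseSketch): Stage3ResponseSketch {
    return { recommendations: res.text, modelUsed: res.modelUsed };
}

// Emit the aiModelDisclosed property only when a model id is present.
function stage3TelemetryProps(res: Stage3ResponseSketch): Record<string, string> {
    const props: Record<string, string> = {};
    if (res.modelUsed) {
        props.aiModelDisclosed = res.modelUsed;
    }
    return props;
}
```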

Cost-neutral disclosure UI

Pre-invocation card (GetPerformanceInsightsCard): a persistent info row beneath the action buttons reads "No additional cost for most GitHub Copilot subscribers. [Learn more about the utility model used.]" — shown before the user clicks, and during loading.
Post-response byline (QueryInsightsTab): after a successful Stage 3 response, two lines appear:
1. "No additional cost for most GitHub Copilot subscribers. [Learn more about the utility model used.]"
2. "Powered by {modelId} via GitHub Copilot" — concrete model attribution.
Both "Learn more" links open https://aka.ms/vscode-documentdb-copilot-utility-model (⚠️ slug must be registered before shipping).

Token usage tracking (trace/telemetry only, not in UI)

copilotService.ts: parallel countTokens calls measure prompt and response tokens, from which the total and the utilization percentage are derived. All measurements are emitted to telemetry and written to the trace output channel, where a formatTokenCount helper renders counts in compact K/M notation.
CopilotTokenUsage interface added; CopilotResponse carries an optional usage field. The measurements flow all the way to the indexOptimization telemetry event.
Token counts are intentionally not rendered in the UI (see design decisions below).
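
A minimal sketch of compact K/M formatting and of a prompt-utilization percentage follows. The rounding rules, the one-decimal precision, and both function names are assumptions for illustration; the PR's actual formatTokenCount helper may behave differently.

```typescript
// Assumed rules: plain integer below 1,000, one decimal with a K suffix up to
// (but not including) 1,000K after rounding, one decimal with an M suffix above.
function formatTokenCountSketch(count: number): string {
    if (count < 1_000) return String(count);
    const thousands = Math.round(count / 100) / 10; // rounded to one decimal place
    if (thousands < 1_000) return `${thousands.toFixed(1)}K`;
    return `${(count / 1_000_000).toFixed(1)}M`;
}

// Share of the model's input window consumed by the prompt, as a percentage
// rounded to one decimal place. Returns undefined when the window is unknown.
function promptUtilizationPctSketch(promptTokens: number, maxInputTokens: number): number | undefined {
    if (maxInputTokens <= 0) return undefined;
    return Math.round((promptTokens / maxInputTokens) * 1000) / 10;
}
```

Rounding to thousands before choosing the suffix avoids labels such as "1000.0K" near the K/M boundary.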

Model selection tracing

selectBestModel: traces each candidate model as requested / accepted / rejected so diagnostics show the full selection chain.
sendMessage: records modelPreferenceChain, modelsAvailable, modelSelectionOutcome, and modelsAvailableCount in telemetry.
dumpModelMetadata: traces all stable model fields (id, vendor, family, version, name, maxInputTokens) plus any additional own enumerable non-function properties — diagnostic only.
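
The "additional own enumerable non-function properties" part of dumpModelMetadata could be approximated as below. This is a sketch under assumptions: the function name and the plain-record input type are hypothetical, and the real helper operates on a LanguageModelChat instance rather than an arbitrary record.

```typescript
// Fields the description lists as stable on the model object.
const STABLE_MODEL_FIELDS = ['id', 'vendor', 'family', 'version', 'name', 'maxInputTokens'];

// Collect any other own enumerable, non-function properties for diagnostics.
function collectExtraModelProps(model: Record<string, unknown>): Record<string, unknown> {
    const extras: Record<string, unknown> = {};
    for (const key of Object.keys(model)) {
        const isStable = STABLE_MODEL_FIELDS.indexOf(key) !== -1;
        if (!isStable && typeof model[key] !== 'function') {
            extras[key] = model[key];
        }
    }
    return extras;
}
```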

Model fallback chain

promptTemplates.ts: FALLBACK_MODELS extended with copilot-utility as the final fallback after gpt-4o and gpt-4o-mini.

Minor fixes

Link components inside Text size={200} rows now carry style={{ fontSize: tokens.fontSizeBase200, lineHeight: tokens.lineHeightBase200 }} to override the default 14px Fluent v9 fui-Link class.

Design decisions

Why credits used are NOT shown

GitHub Copilot's billing model assigns a credit cost per model request. The stable VS Code Language Model API (vscode.lm) does not expose pricing or credit data. A proposed API (vscode.proposed.languageModelPricing.d.ts) exists in the VS Code source but:

It is a proposed (pre-release) API — subject to breaking changes without notice.
Shipping an extension that depends on proposed APIs requires special opt-in (enabledApiProposals in package.json) and is not permitted for extensions published to the Marketplace without Microsoft sign-off.

Decision: credits are not surfaced in the UI. The extension stays entirely on stable VS Code APIs. If the pricing API graduates to stable in a future VS Code release, this feature can be revisited. A GitHub issue has been filed to track this. Token counts (via countTokens, which is stable) are captured in trace/telemetry as a proxy for cost awareness without making any binding cost claim to users.

Why token counts are not in the UI

countTokens gives a token count, not a cost. Displaying raw token numbers to users risks misleading them (tokens ≠ credits, and the conversion ratio is model-dependent and may change). The appropriate audience for token numbers is telemetry and diagnostic traces. Keeping them out of the UI avoids a support burden around questions like "why did this cost 2400 tokens?".

Why "most GitHub Copilot subscribers" not "all"

Copilot utility model access is documented as included for subscribers, but enterprise agreements and custom billing arrangements can differ. "Most" is a deliberate hedge that is accurate without overpromising.

Why `aka.ms/vscode-documentdb-copilot-utility-model` and not the learn.microsoft.com index-advisor URL

The index-advisor docs page covers the feature end-to-end; it does not specifically explain the utility model tier or its billing implications. A dedicated aka.ms redirect allows the docs team to point this link at the most relevant GitHub Copilot pricing/model-tier page without a code change.

Commits (newest first)

Hash	Message
`0a012266`	chore: regenerate l10n bundle
`2f481b8c`	feat(ui): refine cost-neutral disclosure wording and split post-response byline
`43b260b4`	chore: regenerate l10n bundle
`71c5aaea`	fix(ui): match Link font size to parent Text size in disclosure rows
`c41e9d2b`	feat: enhance token usage tracking and model metadata logging in Copilot service
`7b822b68`	wip: feat(ui): add Utility Model badge to AI Performance Insights card and update powered-by text
`8fa152fe`	chore: regenerate l10n bundle
`7c077e8c`	feat(ui): expand Powered-by byline with icon, cost-neutral wording, and token usage
`54b3d32e`	feat: capture token usage from Copilot responses and emit measurements
`aa3791d5`	feat: log AI model selection chain (requested/accepted/rejected)
`272d21b1`	chore: regenerate l10n bundle
`39006b5b`	feat(ui): add post-response Powered-by byline + shared learn-more handler
`67cd3349`	feat(ui): add cost-neutral disclosure row to AI Performance Insights card
`0477b4e3`	feat: surface AI model id to Query Insights webview
`9046c857`	chore: add copilot-utility as final AI fallback model

Pre-merge checklist

Register https://aka.ms/vscode-documentdb-copilot-utility-model in the Microsoft URL shortener
Verify disclosure wording with GitHub Copilot billing docs team
Squash the wip: commit (7b822b68) before merge
Run full CI

Append the Copilot internal 'copilot-utility' alias to the Query Insights AI fallback chain so the feature falls back to whichever chat model CAPI marks as is_chat_fallback when gpt-4o and gpt-4o-mini are unavailable. This keeps the AI Performance Insights flow on a model intended to be cost-neutral for GitHub Copilot subscribers.

Propagate the language model id returned by CopilotService through AIOptimizationResponse, transformAIResponseForUI, and QueryInsightsStage3Response so the webview can disclose which model actually produced the AI Performance Insights response. Also record the disclosed id under the aiModelDisclosed telemetry property on the Stage 3 router event for correlation with UI exposure.

…card

Adds a small InfoRegular + Text caption beneath the action buttons in GetPerformanceInsightsCard explaining that the feature uses a utility model intended to be cost-neutral for GitHub Copilot subscribers, plus an inline Learn more link that reuses the existing onLearnMore callback. A new optional modelHint prop (defaults to 'GPT-4o') labels the model so the disclosure can adapt if the preferred model ever changes.

…dler

Renders a small caption beneath the AI suggestions list once Stage 3 succeeds, surfacing the actual model id returned by CopilotService so users see when fallbacks (gpt-4o-mini, copilot-utility) kick in. The doc URL and openUrl call are extracted into a single handleLearnMore callback so the brand card button, the cost-disclosure row, and the new byline all open the same page.

Picks up the new strings introduced by the AI Performance Insights model-transparency UI ('Uses a utility model …', 'Powered by {0} via GitHub Copilot.', 'Learn more').

Refactors CopilotService.selectBestModel to emit a structured trace of the model selection process: it logs the available models from VS Code, the requested preference chain, accepted/rejected status per preferred id, and the final selection. When the Copilot vendor returns no models at all the no-models-available branch is logged as a warning so users debugging 'AI insights unavailable' see the root cause without enabling verbose logging in the LM API itself. Adds three telemetry properties (modelPreferenceChain, modelsAvailable, modelSelectionOutcome) and one measurement (modelsAvailableCount) on the copilot.sendMessage event to allow offline monitoring of how often each fallback level is hit.

Adds a CopilotTokenUsage type and computes prompt/response/context-window token counts client-side via LanguageModelChat.countTokens after each request. Counts are best-effort: failures fall back to undefined so telemetry never blocks the user flow. The usage object is propagated through OptimizationResult, AIOptimizationResponse, transformAIResponseForUI, and QueryInsightsStage3Response so the webview can surface it. Telemetry measurements (promptTokens, responseTokens, totalTokens, maxInputTokens, promptUtilizationPct) are emitted on three events for offline monitoring: copilot.sendMessage, indexOptimization (inside optimizeQuery), and the Stage 3 router.
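
The best-effort counting described in this commit message might look roughly like the sketch below. `TokenCounterLike` is a hypothetical stand-in for the single method of LanguageModelChat that the sketch needs (the real countTokens also accepts chat messages and an optional cancellation token), and `TokenUsageSketch` and `measureUsageSketch` are illustrative names, not the PR's CopilotTokenUsage definition.

```typescript
// Minimal stand-in for the part of the model object this sketch relies on.
interface TokenCounterLike {
    countTokens(text: string): Promise<number>;
}

interface TokenUsageSketch {
    promptTokens: number | undefined;
    responseTokens: number | undefined;
    totalTokens: number | undefined;
}

// Count prompt and response tokens in parallel. A failed count resolves to
// undefined instead of rejecting, so measurement never blocks the user flow.
async function measureUsageSketch(
    model: TokenCounterLike,
    prompt: string,
    response: string,
): Promise<TokenUsageSketch> {
    const safeCount = (text: string): Promise<number | undefined> =>
        model.countTokens(text).catch(() => undefined);
    const [promptTokens, responseTokens] = await Promise.all([safeCount(prompt), safeCount(response)]);
    const totalTokens =
        promptTokens !== undefined && responseTokens !== undefined ? promptTokens + responseTokens : undefined;
    return { promptTokens, responseTokens, totalTokens };
}
```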

…nd token usage

Aligns the post-response byline with the pre-invocation card: prefixes the line with InfoRegular, mirrors the 'utility model intended to be cost-neutral for GitHub Copilot subscribers' wording, and appends a localised token-usage summary built from QueryInsightsStage3Response.usage. The token summary degrades gracefully across three cases (prompt + response + utilisation %, prompt + response only, prompt only) and is omitted entirely when countTokens did not return anything.

Picks up the strings added by the model-selection trace logging, token-usage measurement, and the expanded Powered-by byline ('[Copilot] Available models...', 'Used {0} prompt + {1} response tokens...', etc.).

…d and update powered-by text

…lot service

Fluent v9 Link does not inherit font size from its parent Text component. Both the pre-invocation cost-neutral disclosure row in GetPerformanceInsightsCard and the post-response Powered-by byline in QueryInsightsTab rendered the 'Learn more' link at the default 14px instead of the surrounding Text size={200} 12px. Fixed by adding an explicit inline style with tokens.fontSizeBase200 and tokens.lineHeightBase200 to each Link, overriding the fui-Link class.

…nse byline

- Pre-invocation disclosure: 'No additional cost for most GitHub Copilot subscribers. Learn more about the utility model used.' (link goes to dedicated utility model doc page via aka.ms slug)
- Post-response byline split into two lines: cost-neutral disclosure (line 1) + 'Powered by {model} via GitHub Copilot' attribution (line 2)
- Add onLearnMoreUtilityModel prop to GetPerformanceInsightsCard; remove unused modelHint prop
- Add utilityModelUrl constant and handleLearnMoreUtilityModel callback in QueryInsightsTab

…st disclosure and token tracking

New user-manual page docs/user-manual/ai-utility-model.md explains what utility models are in the GitHub Copilot context, which models the extension uses and in what order, what each billing tier means for users (paid, Free, enterprise, and the June 2026 usage-based billing transition), how prompts are optimised to stay cost-neutral, and how to find the model attribution in the post-response byline. Registered in the user manual index under a new AI Features section. Links to the canonical GitHub Copilot billing and plans documentation.

Copilot

Pull request overview

This PR adds transparency around the AI model used by the Query Insights “AI Performance Insights” feature and introduces cost-neutral disclosure messaging, while also enhancing diagnostic/telemetry signals for model selection and token usage.

Changes:

Thread modelUsed (and best-effort token usage metrics) from the Copilot service through the optimization pipeline to the webview and telemetry.
Add pre-invocation and post-response UI disclosures including a utility-model “Learn more” link and a “Powered by {modelId}” byline.
Add diagnostics for model selection (requested/accepted/rejected) and token usage counting/tracing; extend fallback model chain to include copilot-utility.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

File	Description
src/webviews/documentdb/collectionView/types/queryInsights.ts	Extends Stage 3 webview response type with `modelUsed` and `usage`.
src/webviews/documentdb/collectionView/components/queryInsightsTab/QueryInsightsTab.tsx	Adds disclosure links/handlers and renders the post-response “Powered by” byline.
src/webviews/documentdb/collectionView/components/queryInsightsTab/components/optimizationCards/custom/GetPerformanceInsightsCard.tsx	Adds persistent cost-neutral disclosure row and a utility-model learn-more callback.
src/webviews/documentdb/collectionView/collectionViewRouter.ts	Mirrors `modelUsed` and token usage into Stage 3 telemetry properties/measurements.
src/utils/formatTokenCount.ts	New helper to compact-format token counts for trace output.
src/services/copilotService.ts	Adds token usage measurement, model selection tracing, metadata dumping, and telemetry fields.
src/services/ai/types.ts	Threads `modelUsed` and `usage` through AI optimization response types.
src/services/ai/QueryInsightsAIService.ts	Carries `modelUsed`/`usage` from optimization results into parsed response.
src/documentdb/queryInsights/transformations.ts	Surfaces `modelUsed`/`usage` into the UI transform result.
src/commands/llmEnhancedCommands/promptTemplates.ts	Extends fallback model chain to include `copilot-utility`.
src/commands/llmEnhancedCommands/indexAdvisorCommands.ts	Propagates token usage and records token usage measurements in telemetry.
l10n/bundle.l10n.json	Regenerates localization bundle with new UI/trace strings.
docs/user-manual/ai-utility-model.md	Adds documentation for model selection and billing/cost rationale.
docs/index.md	Adds the new AI documentation page to the docs index.
docs/ai-and-plans/PRs/690-ai-model-transparency.md	Adds internal PR notes describing rationale, pipeline, and decisions.

…tility

…pt-4o -> copilot-utility

Addresses PR #690 review Low #5. The 'We fall back gracefully' section still advertised GPT-4o -> GPT-4o-mini -> copilot-utility, but the implemented chain in promptTemplates.ts is GPT-4.1 -> GPT-4o -> copilot-utility. Update the manual so the cost-disclosure page does not ship a stale fallback policy.

tnaum-ms · 2026-05-28T17:49:57Z

Re: Low finding #5 (docs disagree with implemented fallback chain) — addressed in 125a234.

docs/user-manual/ai-utility-model.md advertised GPT-4o → GPT-4o-mini → copilot-utility, but the implemented chain in promptTemplates.ts is gpt-4.1 → gpt-4o → copilot-utility. Updated the cost-disclosure page to match the code so we don't ship a stale fallback policy on the page users are most likely to consult.

Addresses GitHub Copilot reviewer comment 3318538081 on PR #690. The JSDoc on QueryInsightsStage3Response.usage said the values are 'Surfaced in the post-response byline', but the byline component renders only the model display name. The token-usage fields are forwarded purely so the extension host can mirror them onto telemetry alongside Stage-3 properties without a second event. Rewrote the JSDoc to say so.

Addresses GitHub Copilot reviewer comment 3318538176 on PR #690. The inline comment on the cost-neutral disclosure row claimed the 'Learn more' link 'reuses onLearnMore so the doc URL stays in one place'. In fact the disclosure-row link is wired to onLearnMoreUtilityModel (the utility-model cost-disclosure page) and is intentionally separate from the feature's general onLearnMore. Rewrote the comment to reflect that the two URLs are kept separate by design.

Addresses GitHub Copilot reviewer comment 3318538227 on PR #690. The JSDoc on aiInsightsDocsUrl claimed the brand-card 'Learn more' button, the cost-disclosure row, and the byline all open the same page. They do not: only the brand-card button uses aiInsightsDocsUrl; the disclosure row uses utilityModelUrl, and the byline currently has no link. Rewrote the JSDoc to explain the two URLs are intentionally separate.

…y size limit

Addresses PR #690 review Additional Low finding. The 'modelsAvailable' property previously joined every available LanguageModelChat.id verbatim with commas. With long ids like 'copilot-gpt-4o-mini-2024-07-18' and Copilot extensions surfacing 10+ models, the property routinely exceeded downstream telemetry size caps and got truncated, hiding the very data the field was meant to expose. Switch to LanguageModelChat.family (well-known short names like 'gpt-4o'), dedupe with a Set, sort for stable ordering, cap to 8 entries, and append a '+N-more' suffix when truncated so analytics can still see that the list was capped.

tnaum-ms · 2026-05-28T17:52:43Z

Re: Additional Low finding (unbounded modelsAvailable telemetry property) — addressed in 4e7c17d.

The modelsAvailable property previously joined every LanguageModelChat.id verbatim with commas. With long ids like copilot-gpt-4o-mini-2024-07-18 and Copilot extensions returning 10+ models, the property routinely exceeded downstream telemetry property-size caps and got truncated, hiding the very data the field was meant to expose. Now:

Use LanguageModelChat.family (short well-known names like gpt-4o) instead of opaque ids.
Dedupe via Set, sort for stable ordering.
Cap at 8 entries; append a +N-more suffix when truncated so analytics still sees the list was capped.
Keep modelsAvailableCount as a measurement for the true count.
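
The four steps above can be sketched as a single pure function. The function name and the comma separator before the suffix are assumptions; the cap of 8 and the +N-more suffix follow the description.

```typescript
// Summarise available model families for a size-capped telemetry property:
// dedupe, sort for stable ordering, keep at most `cap` entries, and append a
// "+N-more" marker when entries were dropped.
function summarizeFamiliesSketch(families: string[], cap: number = 8): string {
    const unique = Array.from(new Set(families)).sort();
    if (unique.length <= cap) {
        return unique.join(',');
    }
    const kept = unique.slice(0, cap);
    return `${kept.join(',')},+${unique.length - cap}-more`;
}
```

The true number of models would still be reported separately, as the modelsAvailableCount measurement does, since the capped string only records that truncation happened.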

Addresses PR #690 review Additional Low finding. dumpModelMetadata emitted the same static metadata block to the trace output channel on every Copilot request (per-message tokens + own-property enumeration). For a given LanguageModelChat.id the metadata is static for the lifetime of the extension host, so emit it at most once per id. Keeps the trace stream readable without losing first-seen visibility into what the runtime exposes.

tnaum-ms · 2026-05-28T17:53:20Z

Re: Additional Low finding (dumpModelMetadata runs on every request) — addressed in 5f51b3d.

dumpModelMetadata emitted the same static stable-fields block and own-property enumeration to the trace output channel on every Copilot request. For a given LanguageModelChat.id the metadata is static for the extension-host lifetime, so it's now memoised: the first time we see a new id we dump it; subsequent calls early-return. Keeps the trace stream readable without losing first-seen visibility.
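
A minimal sketch of the once-per-id memoisation, assuming a module-level Set and a caller-supplied trace function. The names `dumpedModelIdsSketch` and `dumpModelMetadataOnce`, the boolean return value, and the trace line format are all hypothetical.

```typescript
interface ModelMetaLike {
    id: string;
    family: string;
}

// Ids whose metadata has already been traced during this session.
const dumpedModelIdsSketch = new Set<string>();

// Trace metadata the first time an id is seen; return false and do nothing
// on every later call for the same id.
function dumpModelMetadataOnce(model: ModelMetaLike, trace: (line: string) => void): boolean {
    if (dumpedModelIdsSketch.has(model.id)) {
        return false;
    }
    dumpedModelIdsSketch.add(model.id);
    trace(`[Copilot] model id=${model.id} family=${model.family}`);
    return true;
}
```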

Addresses PR #690 review Additional Nit finding. When selectChatModels returns an empty array because the user dismissed VS Code's one-time language-model access consent prompt, the previous error message only suggested checking Copilot install/subscription status, leading users on a wild goose chase. Add explicit mention of the consent prompt so users know to re-run the feature to re-trigger it.

tnaum-ms · 2026-05-28T17:53:42Z

Re: Additional Nit finding (no-model error message doesn't mention consent) — addressed in 65bb97c.

selectChatModels returns an empty array when the user has dismissed VS Code's one-time language-model access consent prompt. The previous error message only nudged users toward checking Copilot install/subscription, leading them on a wild goose chase. The message now explicitly mentions the consent prompt and notes that re-running the feature will re-trigger it.

Addresses PR #690 review Additional Nit finding. The cancellation trace inside CopilotService used the '[Query Insights AI]' prefix used by the index-advisor caller, even though CopilotService is shared across query generation and any future AI feature. Use '[Copilot]' inside the service so the prefix is consistent with the other service-internal traces (token usage, model metadata) and so the message stays accurate when query generation or another caller is the source.

tnaum-ms · 2026-05-28T17:54:08Z

Re: Additional Nit finding (inconsistent trace prefix) — addressed in dfdb0a1.

The cancellation trace inside CopilotService used [Query Insights AI], the index-advisor caller's prefix, even though the service is shared across query generation and any future AI feature. Switched the service-internal trace to [Copilot] so it's consistent with the other service-internal traces (token usage, model metadata) and stays accurate regardless of which feature triggered the request. Caller-side [Query Insights AI] / [Query Generation] prefixes are unchanged.

Adds a 'Review-feedback follow-up' section to the PR design doc covering the significant changes made after the initial push to address PR #690 review findings:
- CopilotResponse split into modelId / modelFamily / modelDisplayName
- Per-feature model constants + featureSource telemetry plumbing
- Manual softened: post-response token measurement, display-name byline prose, fallback chain update
- modelsAvailable telemetry capped and deduped
- dumpModelMetadata memoised per model id
- Consent-aware no-model error message
- Service-internal trace prefix unified to [Copilot]

Comment-only / wording-only fixes are deliberately not re-listed here; they are recorded in the per-fix PR comments.

tnaum-ms · 2026-05-28T17:55:08Z

Re: PR description update (PRs/690-ai-model-transparency.md) — addressed in 9964d21.

Added a "Review-feedback follow-up" section to the PR design doc covering the significant code/contract/doc changes from this review pass (CopilotResponse split, per-feature constants + featureSource telemetry, manual softening, fallback-chain update, modelsAvailable cap/dedupe, dumpModelMetadata memoisation, consent-aware error message, unified trace prefix). Comment-only and wording-only fixes are intentionally not re-listed in the design doc — they live in their per-fix PR comments and reviewer-thread replies.

Picks up the two user-facing string changes from this review pass:
- Updated 'No suitable language model' error message to mention the language-model access consent prompt.
- Renamed the cancellation trace key from '[Query Insights AI] Copilot call cancelled during streaming' to '[Copilot] Call cancelled during streaming' to match the unified service-internal trace prefix.

Resolves PR #690 review High finding #1. Renames the preferred/fallback model surface so the unit of selection is unambiguously LanguageModelChat.family rather than LanguageModelChat.id:

promptTemplates.ts:
- INDEX_OPTIMIZATION_PREFERRED_MODEL -> INDEX_OPTIMIZATION_PREFERRED_FAMILY
- INDEX_OPTIMIZATION_FALLBACK_MODELS -> INDEX_OPTIMIZATION_FALLBACK_FAMILIES
- QUERY_GENERATION_PREFERRED_MODEL -> QUERY_GENERATION_PREFERRED_FAMILY
- QUERY_GENERATION_FALLBACK_MODELS -> QUERY_GENERATION_FALLBACK_FAMILIES
CopilotMessageOptions:
- preferredModel -> preferredFamily
- fallbackModels -> fallbackFamilies
CopilotService:
- getPreferredModels -> getPreferredFamilies
- selectBestModel matcher: m.id === preferredId -> m.family === preferredFamily
OptimizeQueryContext:
- preferredModel -> preferredFamily
- fallbackModels -> fallbackFamilies

Why family, not id:
- LanguageModelChat.id is documented as opaque and can change between Copilot extension versions (or carry date-stamped suffixes like 'copilot-gpt-4o-mini-2024-07-18').
- LanguageModelChat.family is the documented stable well-known name and is what the official VS Code LM API examples use.
- copilot-utility safely matches via family too: verified against the Copilot Chat extension source that alias entries are registered with the alias string used as BOTH id and family. So gpt-4.1, gpt-4o, and copilot-utility all match via family with no special-case needed.

Before this commit, selectBestModel matched on m.id === preferredId, so 'gpt-4.1' / 'gpt-4o' chain entries never matched (real ids are 'copilot-gpt-4.1' / 'copilot-gpt-4o') and the code silently fell through to availableModels[0]. The copilot-utility entry coincidentally matched because aliases register the alias string as the id, which masked the bug in casual testing.

Warning-toast checks in indexAdvisorCommands.ts / queryGenerationCommands.ts now also compare strictly on family. The earlier defensive '|| modelId === ...' branch was removed once aliases were confirmed to register family alongside id, because it could not fire in practice for any entry in our chain.

Trace messages updated to say 'family' instead of 'model' in the selection log so the output channel reflects what is actually being matched.

PR design doc (docs/ai-and-plans/PRs/690-ai-model-transparency.md) extended with a 'Family-based model selection' section recording the rationale and the alias-registration evidence.

tnaum-ms · 2026-05-29T05:55:11Z

Re: High finding #1 (family-vs-id model matching) — fix landed in 0f098aa.

Took the direction recorded in the earlier research note and made family-based selection unambiguous throughout the contract:

promptTemplates.ts:
- INDEX_OPTIMIZATION_PREFERRED_MODEL → INDEX_OPTIMIZATION_PREFERRED_FAMILY
- INDEX_OPTIMIZATION_FALLBACK_MODELS → INDEX_OPTIMIZATION_FALLBACK_FAMILIES
- QUERY_GENERATION_PREFERRED_MODEL → QUERY_GENERATION_PREFERRED_FAMILY
- QUERY_GENERATION_FALLBACK_MODELS → QUERY_GENERATION_FALLBACK_FAMILIES
CopilotMessageOptions.preferredModel → preferredFamily; fallbackModels → fallbackFamilies.
CopilotService.selectBestModel now matches m.family === preferredFamily. The id-based matcher is gone.
Warning-toast checks in indexAdvisorCommands.ts / queryGenerationCommands.ts also compare strictly on family — the earlier defensive || modelId === ... branch was dropped once aliases were confirmed to register family alongside id (it could not fire in practice for any entry in our chain).
Trace messages updated to say family instead of model in the selection log so the output channel reflects what is actually being matched.

Why family

LanguageModelChat.id is documented as opaque and can change between Copilot extension versions or carry date-stamped suffixes like copilot-gpt-4o-mini-2024-07-18. LanguageModelChat.family is the well-known stable name and is what the official VS Code LM API examples use:

const [model] = await vscode.lm.selectChatModels({ vendor: 'copilot', family: 'gpt-4o' });

Why this is safe for copilot-utility

Confirmed directly against the Copilot Chat extension source (microsoft/vscode-copilot-chat, src/extension/conversation/vscode-node/languageModelAccess.ts): alias entries are registered with the alias string used as both id and family. So gpt-4.1, gpt-4o, and copilot-utility all match via family with no special-case needed.

The bug this closes

The old m.id === preferredId matcher never matched gpt-4.1 / gpt-4o (real ids are copilot-gpt-4.1 / copilot-gpt-4o) and silently fell through to availableModels[0]. The copilot-utility entry coincidentally matched because aliases register the alias string as the id, which masked the bug in casual testing.
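
The difference between the two matchers can be reproduced with plain objects. In this sketch, `ChatModelLike` is a hypothetical stand-in for LanguageModelChat, the function names are illustrative, and keeping the fall-through to the first available model in the family-based version is an assumption carried over from the old behaviour.

```typescript
interface ChatModelLike {
    id: string;
    family: string;
}

// Old behaviour as described: chain entries are compared against the opaque id.
function selectByIdSketch(available: ChatModelLike[], chain: string[]): ChatModelLike | undefined {
    for (const entry of chain) {
        const match = available.find((m) => m.id === entry);
        if (match) return match;
    }
    return available[0]; // silent fall-through
}

// Fixed behaviour: chain entries are compared against the stable family name.
function selectByFamilySketch(available: ChatModelLike[], chain: string[]): ChatModelLike | undefined {
    for (const family of chain) {
        const match = available.find((m) => m.family === family);
        if (match) return match;
    }
    return available[0]; // assumption: same last-resort fallback as before
}

// Aliases register the alias string as both id and family; regular models do not.
const sampleModels: ChatModelLike[] = [
    { id: 'copilot-utility', family: 'copilot-utility' },
    { id: 'copilot-gpt-4o', family: 'gpt-4o' },
];
const sampleChain = ['gpt-4.1', 'gpt-4o', 'copilot-utility'];
```

With this data the id-based matcher skips gpt-4o and lands on copilot-utility, which is the masking effect described above, while the family-based matcher picks the gpt-4o model.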

The PR design doc (docs/ai-and-plans/PRs/690-ai-model-transparency.md) now has a Family-based model selection section that records the rationale and the alias-registration evidence for future reference. The 5-step PR checklist passes (l10n regenerated, prettier clean, lint clean, 1984/1984 jest, build clean).

…annel warnings

The preferred-model-not-used warning was previously surfaced via vscode.window.showWarningMessage as a notification toast in both indexAdvisorCommands.ts and queryGenerationCommands.ts. The fallback to the next available family is automatic and there is nothing the user can act on, so a popup only adds confusion. Surface the same information as ext.outputChannel.warn entries with '[Query Insights AI]' / '[Query Generation]' prefixes, including both the requested family and the actually-used family alongside the display name. The data is still captured in telemetry via modelSelectionOutcome on the shared sendMessage event for analytics follow-up.

tnaum-ms · 2026-05-29T06:36:12Z

Follow-up: demoted "preferred model not used" warnings to output-channel entries in b650f0b.

Both indexAdvisorCommands.ts and queryGenerationCommands.ts previously surfaced the "not using preferred model" warning as a vscode.window.showWarningMessage notification toast. The fallback to the next available family is fully automatic and there is nothing the user can act on, so the popup was adding confusion rather than value.

Same information is now an ext.outputChannel.warn line (with the existing [Query Insights AI] / [Query Generation] prefixes), recording both the requested family and the actually-used family alongside the display name. The data is still captured in telemetry via modelSelectionOutcome on the shared sendMessage event for analytics follow-up.

5-step PR checklist passes: l10n regenerated, prettier clean, lint clean, 1984/1984 jest, build clean.

…cing section

Restructures and rewrites the manual page to match the conversational second-person style of the other user-manual pages. Key changes:
- Retitles page 'Model and Pricing' (was 'Model and Billing')
- Removes the standalone 'What is a utility model?' intro section; the utility-model concept is now woven into 'Which model does the extension use?' as a natural first paragraph
- Renames 'How we optimize prompts for the utility model' to 'How we keep prompts lean' and tightens the prose
- Adds a note about fallback diagnostics going to the output channel (not a popup) in the fallback section
- Updates 'Which model was actually used?' to mention the copilot-utility byline case and removes raw API property names from body text
- Adds a dedicated 'Pricing' section covering: what GitHub documents about 0x multiplier models, a per-plan table, the 'most subscribers' hedge explanation, and a note on billing evolution, all sourced to official GitHub docs
- Fixes header breadcrumb dash to an em dash to match other pages

docs(manual): drop technical section, remove dashes, add testing note

docs(manual): restructure ai-utility-model for all AI features + pricing first

docs(manual): improve clarity and formatting in AI utility model documentation

github-actions · 2026-05-29T08:02:23Z

✅ Code Quality Checks

Check	Status	How to fix
Localization (`l10n`)	✅ Passed
ESLint	✅ Passed
Prettier formatting	✅ Passed

This comment is updated automatically on each push.

github-actions · 2026-05-29T08:06:16Z

📦 Build Size Report

Metric	Base (`main`)	PR	Delta
VSIX (`vscode-documentdb-0.8.0.vsix`)	7.53 MB	7.54 MB	⬆️ +2 KB (+0.0%)
Webview bundle (`views.js`)	5.88 MB	5.88 MB	⬆️ +1 KB (+0.0%)

Updated automatically on each push.

tnaum-ms added 15 commits May 28, 2026 11:14

chore: regenerate l10n bundle

272d21b

chore: regenerate l10n bundle

8fa152f

wip: feat(ui): add Utility Model badge to AI Performance Insights car…

7b822b6

…d and update powered-by text

feat: enhance token usage tracking and model metadata logging in Copi…

c41e9d2

…lot service

chore: regenerate l10n bundle

43b260b

chore: regenerate l10n bundle

0a01226

tnaum-ms linked an issue May 28, 2026 that may be closed by this pull request

feat: show which LLM model is used by AI features in the UI #681

Closed

4 tasks

tnaum-ms added this to the 0.8.1 milestone May 28, 2026

tnaum-ms added 3 commits May 28, 2026 14:19

feat: implement AI model transparency in Query Insights panel with co…

cceaf9f

…st disclosure and token tracking

docs: update PR #690 summary

3ecdb79

tnaum-ms force-pushed the dev/tnaum/query-insights-model-transparency branch from 990cf71 to 3ecdb79 May 28, 2026 14:29

Merge branch 'main' into dev/tnaum/query-insights-model-transparency

cbea605

tnaum-ms marked this pull request as ready for review May 28, 2026 14:30

tnaum-ms requested a review from a team as a code owner May 28, 2026 14:30

Copilot AI review requested due to automatic review settings May 28, 2026 14:30

Copilot started reviewing on behalf of tnaum-ms May 28, 2026 14:30

Copilot AI reviewed May 28, 2026

tnaum-ms added 3 commits May 28, 2026 14:38

chore: update preferred model chain to gpt-4.1 -> gpt-4o -> copilot-u…

2a892f2

…tility

fix: downgrade fallback-model trace from warn to trace

539c2fa

feat(ui): show model display name in Powered-by byline instead of raw id

e8fe379

tnaum-ms added 4 commits May 28, 2026 17:50

tnaum-ms added 2 commits May 28, 2026 17:58

tnaum-ms force-pushed the dev/tnaum/query-insights-model-transparency branch from 69eebd3 to 9acb245 May 29, 2026 07:58

sajeetharan approved these changes May 29, 2026

tnaum-ms merged commit 413dbcd into main May 29, 2026
8 checks passed

tnaum-ms deleted the dev/tnaum/query-insights-model-transparency branch May 29, 2026 08:07

tnaum-ms linked an issue May 29, 2026 that may be closed by this pull request

Track LLM token usage #341

Closed

This was referenced Jun 2, 2026

docs: release notes and changelog for v0.8.1 #727

Merged

feat(query-insights): streaming UX for Stage 3 AI recommendations #711

Open


