microsoft
diff --git a/docs/ai-and-plans/PRs/690-ai-model-transparency.md b/docs/ai-and-plans/PRs/690-ai-model-transparency.md
Lines changed: 312 additions & 0 deletions
diff --git a/docs/index.md b/docs/index.md
Lines changed: 4 additions & 0 deletions
diff --git a/docs/user-manual/ai-utility-model.md b/docs/user-manual/ai-utility-model.md
Lines changed: 87 additions & 0 deletions
diff --git a/l10n/bundle.l10n.json b/l10n/bundle.l10n.json
Lines changed: 22 additions & 6 deletions
@@ -69,6 +69,10 @@ The User Manual provides guidance on using DocumentDB for VS Code. It contains d
 - [Interactive Shell](./user-manual/interactive-shell)
 - [How It Works Behind the Scenes](./user-manual/query-runtime)
 
+### AI Features
+
+- [AI Performance Insights: Model and Billing](./user-manual/ai-utility-model)
+
 ## Release Notes
 
 Explore the history of updates and improvements to the DocumentDB for VS Code extension. Each release brings new features, enhancements, and fixes to improve your experience.
 
@@ -0,0 +1,87 @@
+> **User Manual** - [Back to User Manual](../index#user-manual)
+
+---
+
+# AI Features: Utility Model and Pricing
+
+The AI-assisted features in this extension, such as AI Performance Insights in the Query Insights panel, use GitHub Copilot to analyze your queries and return optimization recommendations. All of these features are designed to be **cost-neutral for most GitHub Copilot subscribers**: they exclusively use utility (included) models that do not consume your monthly premium request allowance.
+
+**Table of Contents**
+
+- [Utility Model Pricing](#utility-model-pricing)
+- [Which model does the extension use?](#which-model-does-the-extension-use)
+- [Which model was actually used?](#which-model-was-actually-used)
+- [Further reading](#further-reading)
+
+---
+
+## Utility Model Pricing
+
+The extension targets utility models specifically because they are **included in all paid GitHub Copilot plans at no extra charge**.
+
+### What GitHub documents about utility model pricing
+
+GitHub designates a set of models as **included models** with a request multiplier of **0x**. Requests to these models do not consume your monthly premium request allowance. As of mid-2026, GPT-4o, GPT-4.1, and GPT-5 mini are all listed as 0x multiplier models.
+
+> **Source**: [Requests in GitHub Copilot - Model multipliers](https://docs.github.com/en/copilot/concepts/billing/copilot-requests#model-multipliers)
+
+Under the usage-based billing model that GitHub began rolling out in June 2026, included models keep a 0x per-token rate, so they remain effectively free to paid subscribers, just as they were under the earlier request-based model.
+
+> **Source**: [Usage-based billing for individuals](https://docs.github.com/en/copilot/concepts/billing/usage-based-billing-for-individuals)
+
+### What this means for each plan
+
+| Plan                                        | Cost of AI features                                                                                                                         |
+| ------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Copilot Pro, Pro+, Business, Enterprise** | No additional cost; all preferred models are 0x multiplier                                                                                  |
+| **Copilot Free**                            | Counts against your 50 monthly premium requests; the extension tries to use 0x models, but Copilot Free may route some requests differently |
+| **Custom enterprise agreements**            | Follows your organization's Copilot billing policy; check with your Copilot administrator if unsure                                         |
+
+### The disclosure label in the extension
+
+The cost-neutral disclosure row that appears in AI results panels ("No additional cost for most GitHub Copilot subscribers") reflects the 0x multiplier designation described above. We say "most" rather than "all" because enterprise agreements, educational licenses, and reseller-managed accounts may have different terms.
+
+### A note on billing changes
+
+GitHub's billing model is evolving. If GitHub changes the pricing tier for any of the models in the preference chain, we will update both the extension and this page. Follow the [Further reading](#further-reading) links below for the latest GitHub documentation.
+
+---
+
+## Which model does the extension use?
+
+The extension always targets **utility (included) models**: the tier of GitHub Copilot models that are covered by a standard subscription and do not consume your monthly premium request allowance. These models are fast, efficient, and well-suited to structured, bounded tasks like query analysis.
+
+Rather than hard-coding a single model, the extension tries to pick the best utility model available in your environment. It works through a preference chain, stopping at the first match:
+
+| Priority | Model family      | Notes                                                                |
+| -------- | ----------------- | -------------------------------------------------------------------- |
+| 1        | `gpt-4.1`         | Preferred; strong reasoning, 0x multiplier on paid plans             |
+| 2        | `gpt-4o`          | Fallback; also a 0x multiplier model on paid plans                   |
+| 3        | `copilot-utility` | Last resort; the generic utility alias exposed by the VS Code LM API |
+
+All AI-assisted features in the extension are optimized and tested against the preferred model family (`gpt-4.1`). If that family is not available, the extension automatically steps down to the next entry in the chain without showing a popup, and output quality may vary slightly. Each fallback is recorded in the extension's output channel (under **DocumentDB** in the Output panel) for diagnostics.
+
+If none of these families is available in your environment (for example, GitHub Copilot is not signed in, or your organization has disabled LM API access for third-party extensions), the AI features will not activate.
+
+---
+
+## Which model was actually used?
+
+After a successful AI analysis, a small **"Powered by"** byline appears at the bottom of the results panel:
+
+> _No additional cost for most GitHub Copilot subscribers. [Learn more about the utility model used.](https://aka.ms/vscode-documentdb-copilot-utility-model)_
+> _Powered by GPT-4o via GitHub Copilot_
+
+The name shown is the human-readable display name of the model that actually produced the response, not a pre-invocation guess. The stable internal identifier (e.g. `copilot-gpt-4o`) is captured in the extension's output channel and telemetry for diagnostics.
+
+If the byline reads `copilot-utility` or shows an unfamiliar name, the extension used the generic utility fallback rather than one of the preferred families. This typically means the preferred models were not available at the time of the request.
+
+---
+
+## Further reading
+
+- [Requests in GitHub Copilot](https://docs.github.com/en/copilot/concepts/billing/copilot-requests): GitHub's authoritative reference for request types, model multipliers, and plan allowances
+- [Plans for GitHub Copilot](https://docs.github.com/en/copilot/get-started/plans): Compare what is included in each plan
+- [Usage-based billing for individuals](https://docs.github.com/en/copilot/concepts/billing/usage-based-billing-for-individuals): How AI Credits work under the June 2026 billing model
+- [Usage-based billing for organizations and enterprises](https://docs.github.com/en/copilot/concepts/billing/usage-based-billing-for-organizations-and-enterprises): The same for Business and Enterprise plans
+- [VS Code Language Model API](https://code.visualstudio.com/api/extension-guides/language-model): The stable API this extension uses to talk to Copilot
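
The preference chain described under "Which model does the extension use?" can be illustrated with a small, self-contained TypeScript sketch. It is not the extension's actual implementation: `ModelInfo`, `Selection`, `selectModel`, and `formatByline` are hypothetical names, and the list of available models is passed in as plain data instead of being fetched from the VS Code Language Model API. The sketch follows the documented behavior in which no matching family means no model is selected; the log strings added to `bundle.l10n.json` also mention a "first available" fallback, which is not modeled here.

```typescript
// Hypothetical shape for illustration; it loosely mirrors the id, family, and
// name fields of a chat model in the VS Code Language Model API.
interface ModelInfo {
    id: string;
    family: string;
    name: string;
}

interface Selection {
    model: ModelInfo;
    // True when a lower-priority family than the first preference matched.
    isFallback: boolean;
}

// Order taken from the preference table in ai-utility-model.md.
const PREFERENCE_CHAIN: string[] = ['gpt-4.1', 'gpt-4o', 'copilot-utility'];

// Walk the chain in priority order and stop at the first family that has an
// available model. Returns undefined when no family in the chain matches.
function selectModel(available: ModelInfo[], chain: string[] = PREFERENCE_CHAIN): Selection | undefined {
    for (let i = 0; i < chain.length; i++) {
        for (let j = 0; j < available.length; j++) {
            if (available[j].family === chain[i]) {
                return { model: available[j], isFallback: i > 0 };
            }
        }
    }
    return undefined;
}

// Byline text following the "Powered by {0} via GitHub Copilot" string.
function formatByline(selection: Selection): string {
    return `Powered by ${selection.model.name} via GitHub Copilot`;
}

// Example: only gpt-4o is available, so the second entry in the chain matches.
const picked = selectModel([{ id: 'copilot-gpt-4o', family: 'gpt-4o', name: 'GPT-4o' }]);
if (picked) {
    console.log(formatByline(picked), picked.isFallback ? '(fallback)' : '');
}
```

Keeping the chain as an ordered array means a future change to the preferred families would only touch one constant, and the `isFallback` flag gives the caller enough information to decide whether to write a diagnostic line to the output channel.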
@@ -14,6 +14,20 @@
   "[BatchSizeAdapter] Success: Growing batch size {0} → {1} (mode: {2}, growth: {3}%)": "[BatchSizeAdapter] Success: Growing batch size {0} → {1} (mode: {2}, growth: {3}%)",
   "[BatchSizeAdapter] Throttle with no progress: Halving batch size {0} → {1}": "[BatchSizeAdapter] Throttle with no progress: Halving batch size {0} → {1}",
   "[BatchSizeAdapter] Throttle: Adjusting batch size {0} → {1} (proven capacity: {2})": "[BatchSizeAdapter] Throttle: Adjusting batch size {0} → {1} (proven capacity: {2})",
+  "[Copilot] Available models from VS Code: {0}": "[Copilot] Available models from VS Code: {0}",
+  "[Copilot] Call cancelled during streaming": "[Copilot] Call cancelled during streaming",
+  "[Copilot] Could not enumerate model own properties: {0}": "[Copilot] Could not enumerate model own properties: {0}",
+  "[Copilot] countTokens failed; usage metric will be omitted: {0}": "[Copilot] countTokens failed; usage metric will be omitted: {0}",
+  "[Copilot] Model family preference chain: {0}": "[Copilot] Model family preference chain: {0}",
+  "[Copilot] No family preferences supplied; using first available: {0}": "[Copilot] No family preferences supplied; using first available: {0}",
+  "[Copilot] No language models available from vendor \"copilot\". Aborting selection.": "[Copilot] No language models available from vendor \"copilot\". Aborting selection.",
+  "[Copilot] No preferred family matched; falling back to first available: {0}": "[Copilot] No preferred family matched; falling back to first available: {0}",
+  "[Copilot] Requested family \"{0}\" → accepted (matched id: {1})": "[Copilot] Requested family \"{0}\" → accepted (matched id: {1})",
+  "[Copilot] Requested family \"{0}\" → rejected (no available model in this family)": "[Copilot] Requested family \"{0}\" → rejected (no available model in this family)",
+  "[Copilot] Selected model metadata: {0}": "[Copilot] Selected model metadata: {0}",
+  "[Copilot] Selected model own enumerable properties: {0}": "[Copilot] Selected model own enumerable properties: {0}",
+  "[Copilot] Selected model: {0}": "[Copilot] Selected model: {0}",
+  "[Copilot] Tokens: prompt={0}, response={1}, total={2}, context={3}, utilization={4}%": "[Copilot] Tokens: prompt={0}, response={1}, total={2}, context={3}, utilization={4}%",
   "[CopyPasteTask] onProgress: {0}% ({1}/{2} docs) - {3}": "[CopyPasteTask] onProgress: {0}% ({1}/{2} docs) - {3}",
   "[DocumentDbStreamingWriter/Abort Strategy] Conflict for document with _id: {0}. {1}": "[DocumentDbStreamingWriter/Abort Strategy] Conflict for document with _id: {0}. {1}",
   "[DocumentDbStreamingWriter/Abort Strategy] Operation aborted due to {0} duplicate key conflict(s)": "[DocumentDbStreamingWriter/Abort Strategy] Operation aborted due to {0} duplicate key conflict(s)",
@@ -25,10 +39,11 @@
   "[KeepAlive] Background read: count={0}, buffer length={1}": "[KeepAlive] Background read: count={0}, buffer length={1}",
   "[KeepAlive] Read from buffer, remaining: {0} documents": "[KeepAlive] Read from buffer, remaining: {0} documents",
   "[KeepAlive] Skipped: only {0}s since last read (interval: {1}s)": "[KeepAlive] Skipped: only {0}s since last read (interval: {1}s)",
-  "[Query Generation] Calling Copilot (model: {model})...": "[Query Generation] Calling Copilot (model: {model})...",
+  "[Query Generation] Calling Copilot (family: {family})...": "[Query Generation] Calling Copilot (family: {family})...",
   "[Query Generation] Completed successfully": "[Query Generation] Completed successfully",
   "[Query Generation] Copilot response received in {ms}ms (model: {model})": "[Query Generation] Copilot response received in {ms}ms (model: {model})",
   "[Query Generation] listCollections completed in {ms}ms ({count} collections)": "[Query Generation] listCollections completed in {ms}ms ({count} collections)",
+  "[Query Generation] Preferred model family \"{preferredFamily}\" was not available; used \"{actualModel}\" (family: {actualFamily}) instead. Results may vary.": "[Query Generation] Preferred model family \"{preferredFamily}\" was not available; used \"{actualModel}\" (family: {actualFamily}) instead. Results may vary.",
   "[Query Generation] Schema sampling completed in {ms}ms": "[Query Generation] Schema sampling completed in {ms}ms",
   "[Query Generation] Schema sampling for {collection} completed in {ms}ms": "[Query Generation] Schema sampling for {collection} completed in {ms}ms",
   "[Query Generation] Started: type={type}, targetQueryType={targetQueryType}": "[Query Generation] Started: type={type}, targetQueryType={targetQueryType}",
@@ -49,20 +64,20 @@
   "[Query Insights Action] Modify index action error: {error}": "[Query Insights Action] Modify index action error: {error}",
   "[Query Insights Action] Modify index action failed: {error}": "[Query Insights Action] Modify index action failed: {error}",
   "[Query Insights Action] Session ID is required": "[Query Insights Action] Session ID is required",
-  "[Query Insights AI] Calling Copilot (model: {model})...": "[Query Insights AI] Calling Copilot (model: {model})...",
+  "[Query Insights AI] Calling Copilot (family: {family})...": "[Query Insights AI] Calling Copilot (family: {family})...",
   "[Query Insights AI] Cancelled after getCollectionStats": "[Query Insights AI] Cancelled after getCollectionStats",
   "[Query Insights AI] Cancelled after listIndexes": "[Query Insights AI] Cancelled after listIndexes",
   "[Query Insights AI] Cancelled before calling Copilot": "[Query Insights AI] Cancelled before calling Copilot",
   "[Query Insights AI] Cancelled before fetching stats": "[Query Insights AI] Cancelled before fetching stats",
   "[Query Insights AI] Cancelled before running explain queries": "[Query Insights AI] Cancelled before running explain queries",
   "[Query Insights AI] Cancelled before starting optimization": "[Query Insights AI] Cancelled before starting optimization",
-  "[Query Insights AI] Copilot call cancelled during streaming": "[Query Insights AI] Copilot call cancelled during streaming",
   "[Query Insights AI] Copilot response received in {ms}ms (model: {model})": "[Query Insights AI] Copilot response received in {ms}ms (model: {model})",
   "[Query Insights AI] explain({commandType}) completed in {ms}ms": "[Query Insights AI] explain({commandType}) completed in {ms}ms",
   "[Query Insights AI] Failed to retrieve collection/index stats (non-critical): {message}": "[Query Insights AI] Failed to retrieve collection/index stats (non-critical): {message}",
   "[Query Insights AI] getCollectionStats completed in {ms}ms": "[Query Insights AI] getCollectionStats completed in {ms}ms",
   "[Query Insights AI] getIndexStats completed in {ms}ms": "[Query Insights AI] getIndexStats completed in {ms}ms",
   "[Query Insights AI] listIndexes completed in {ms}ms": "[Query Insights AI] listIndexes completed in {ms}ms",
+  "[Query Insights AI] Preferred model family \"{preferredFamily}\" was not available; used \"{actualModel}\" (family: {actualFamily}) instead. Recommendations may be less optimal.": "[Query Insights AI] Preferred model family \"{preferredFamily}\" was not available; used \"{actualModel}\" (family: {actualFamily}) instead. Recommendations may be less optimal.",
   "[Query Insights AI] Using preloaded collection stats": "[Query Insights AI] Using preloaded collection stats",
   "[Query Insights AI] Using preloaded execution plan": "[Query Insights AI] Using preloaded execution plan",
   "[Query Insights AI] Using preloaded index stats": "[Query Insights AI] Using preloaded index stats",
@@ -650,7 +665,6 @@
   "Index key covers {0}% of the collection per bucket": "Index key covers {0}% of the collection per bucket",
   "Index modification cancelled": "Index modification cancelled",
   "Index Name": "Index Name",
-  "Index optimization is using model \"{actualModel}\" instead of preferred \"{preferredModel}\". Recommendations may be less optimal.": "Index optimization is using model \"{actualModel}\" instead of preferred \"{preferredModel}\". Recommendations may be less optimal.",
   "Index Options": "Index Options",
   "Index used": "Index used",
   "Index Used": "Index Used",
@@ -708,6 +722,7 @@
   "Learn more about integrating your cloud provider.": "Learn more about integrating your cloud provider.",
   "Learn more about integrating your cloud providers.": "Learn more about integrating your cloud providers.",
   "Learn more about local connections.": "Learn more about local connections.",
+  "Learn more about the utility model used.": "Learn more about the utility model used.",
   "Learn more…": "Learn more…",
   "Length must be greater than 1": "Length must be greater than 1",
   "Level up": "Level up",
@@ -772,6 +787,7 @@
   "Next tip": "Next tip",
   "No": "No",
   "No Action": "No Action",
+  "No additional cost for most GitHub Copilot subscribers.": "No additional cost for most GitHub Copilot subscribers.",
   "No authenticated tenants found. Use \"Manage Azure Accounts\" in the Discovery View to sign in to tenants.": "No authenticated tenants found. Use \"Manage Azure Accounts\" in the Discovery View to sign in to tenants.",
   "No authentication method selected.": "No authentication method selected.",
   "No authentication methods available for \"{cluster}\".": "No authentication methods available for \"{cluster}\".",
@@ -807,7 +823,7 @@
   "No sorting required": "No sorting required",
   "No subscriptions found": "No subscriptions found",
   "No subscriptions found for the selected tenants. Please adjust your tenant selection or check your Azure permissions.": "No subscriptions found for the selected tenants. Please adjust your tenant selection or check your Azure permissions.",
-  "No suitable language model found. Please ensure GitHub Copilot is installed and you have an active subscription.": "No suitable language model found. Please ensure GitHub Copilot is installed and you have an active subscription.",
+  "No suitable language model is available. Please ensure GitHub Copilot is installed and signed in with an active subscription, and that you accepted the language-model access consent prompt the first time this feature was used (you can re-trigger it by running the feature again).": "No suitable language model is available. Please ensure GitHub Copilot is installed and signed in with an active subscription, and that you accepted the language-model access consent prompt the first time this feature was used (you can re-trigger it by running the feature again).",
   "No target node selected.": "No target node selected.",
   "No tenants available": "No tenants available",
   "No tenants available for this account": "No tenants available for this account",
@@ -872,6 +888,7 @@
   "Port number is required": "Port number is required",
   "Port number must be a number": "Port number must be a number",
   "Port number must be between 1 and 65535": "Port number must be between 1 and 65535",
+  "Powered by {0} via GitHub Copilot": "Powered by {0} via GitHub Copilot",
   "Press Escape to exit editor": "Press Escape to exit editor",
   "Previous tip": "Previous tip",
   "Privacy Statement": "Privacy Statement",
@@ -893,7 +910,6 @@
   "Query filters on a boolean field, which has only two distinct values": "Query filters on a boolean field, which has only two distinct values",
   "Query generation failed": "Query generation failed",
   "Query generation failed with the error: {0}": "Query generation failed with the error: {0}",
-  "Query generation is using model \"{actualModel}\" instead of preferred \"{preferredModel}\". Results may vary.": "Query generation is using model \"{actualModel}\" instead of preferred \"{preferredModel}\". Results may vary.",
   "query insights": "query insights",
   "Query Insights APIs not initialized. Client may not be properly connected.": "Query Insights APIs not initialized. Client may not be properly connected.",
   "Query Insights feature is in preview": "Query Insights feature is in preview",