perf(tree): throttle background document-count fetches by tnaum-ms · Pull Request #685 · microsoft/vscode-documentdb

tnaum-ms · 2026-05-28T08:10:36Z

Summary

When a database node is expanded in the tree, every CollectionItem fires its own estimateDocumentCount request in parallel (fire-and-forget, from DatabaseItem.getChildren). For databases with many collections this produces a burst of concurrent requests that:

opens many sockets in the MongoDB driver connection pool (default maxPoolSize: 100), and
competes with foreground operations (queries, the collection view) for pool slots and server resources.

This PR adds a small in-house concurrency limiter and applies it to the background count fetches so this work stays unobtrusive.

Changes

New utility src/utils/concurrencyLimiter.ts exposing createConcurrencyLimiter({ concurrency, interTaskDelayMs }). Caps in-flight tasks and optionally inserts a delay before dispatching the next queued task after one completes.
CollectionItem.fetchAndUpdateCount now runs through a per-cluster limiter (keyed by clusterId) with concurrency = 5 and interTaskDelayMs = 250. Each cluster gets its own pool so different clusters never share a queue.

Behaviour: tree expansion still returns immediately and descriptions still fill in as counts arrive, but at most 5 count requests are in flight per cluster and the next dispatch waits 250 ms after a completion. UX impact is negligible (counts arrive slightly later for the 6th and later collections), while the load on the connection pool and the server drops substantially.
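The cap described above can be sketched as a plain promise semaphore with a FIFO queue. This is a minimal illustration, not the code in src/utils/concurrencyLimiter.ts: only createConcurrencyLimiter and the concurrency option come from this PR, while the Task and Limiter types, limitersByCluster, and getLimiterForCluster are hypothetical names. The 250 ms pacing is left out to keep the sketch short.

```typescript
// Hypothetical sketch: only `createConcurrencyLimiter` and `concurrency`
// are names from the PR; everything else is illustrative.
type Task<T> = () => Promise<T>;
type Limiter = <T>(task: Task<T>) => Promise<T>;

function createConcurrencyLimiter(options: { concurrency: number }): Limiter {
    const concurrency = options.concurrency;
    let active = 0;
    const waiters: Array<() => void> = []; // FIFO queue of blocked callers

    const release = (): void => {
        const next = waiters.shift();
        if (next) {
            // Hand the slot straight to the oldest waiter; `active` is unchanged.
            next();
        } else {
            active--;
        }
    };

    return async <T>(task: Task<T>): Promise<T> => {
        if (active < concurrency) {
            active++;
        } else {
            // Wait until a finishing task hands its slot over.
            await new Promise<void>((resolve) => waiters.push(resolve));
        }
        try {
            return await task();
        } finally {
            release(); // runs on success and on rejection
        }
    };
}

// One limiter per cluster, so different clusters never share a queue.
const limitersByCluster = new Map<string, Limiter>();

function getLimiterForCluster(clusterId: string): Limiter {
    let limiter = limitersByCluster.get(clusterId);
    if (!limiter) {
        limiter = createConcurrencyLimiter({ concurrency: 5 });
        limitersByCluster.set(clusterId, limiter);
    }
    return limiter;
}
```

A caller would wrap each background fetch, for example `void getLimiterForCluster(clusterId)(() => fetchCount())`, where fetchCount stands in for whatever issues the estimateDocumentCount request.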

Why not `p-limit`?

p-limit is the de facto standard for this and the obvious first choice. It has, however, been pure ESM since v4.0.0. This extension is built and bundled as CommonJS (tsconfig module: "commonjs", webpack-bundled dist/extension.js, VS Code extension host loads via require). Using current p-limit in CJS code requires one of the following:

Pin to p-limit@3.1.0 (last CJS release, Oct 2020). Still works but stale, and we would still need to wrap it to add the inter-task delay used by low-priority background work.
Use dynamic import('p-limit') at every call site. Awkward in synchronous code paths and adds an async boundary at module load.
Migrate the whole codebase to ESM. Touches webpack output format, tsconfig, jest/ts-jest, every relative import (ESM requires .js extensions), eslint config. Not worth it for one dependency.

The in-house implementation is ~80 lines including JSDoc, has no runtime dependency, fits the CJS bundle, and gives us a clean place to add the interTaskDelayMs knob (which p-limit does not provide). If the extension ever migrates to ESM we can drop this and swap to p-limit mechanically.

Test plan

npm run prettier-fix, npm run lint, npm run build all clean.
npx jest --no-coverage: 1900/1900 tests pass. (Three test suites were SIGKILL'd by the OS in CI-like conditions; unrelated to this change.)
No user-facing strings changed, so no npm run l10n needed.

Follow-ups (out of scope)

Apply the same limiter to other tree-driven background loads (e.g. lazy index counts) once they exist.
Add cancellation on tree collapse via the limiter's queue (would need a clearQueue() method similar to p-limit's).

Copilot

Pull request overview

This PR reduces load spikes caused by tree expansion by introducing a small in-house concurrency limiter and routing background per-collection estimateDocumentCount calls through a per-cluster limiter, keeping the work low-priority and less disruptive to foreground operations.

Changes:

Added createConcurrencyLimiter({ concurrency, interTaskDelayMs }) utility to cap in-flight async tasks and optionally pace dispatch.
Applied a per-clusterId limiter (concurrency 5, 250ms delay) to CollectionItem.fetchAndUpdateCount background count fetches.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/utils/concurrencyLimiter.ts` | Adds a reusable promise concurrency limiter with optional inter-task pacing. |
| `src/tree/documentdb/CollectionItem.ts` | Routes background document count fetches through a per-cluster limiter to reduce request bursts. |

Cap concurrent estimateDocumentCount calls per cluster and add a small delay between dispatches so lazy tree metadata loads do not monopolize the MongoDB driver connection pool or burst the server.

- Add createConcurrencyLimiter() in src/utils/concurrencyLimiter.ts (in-house, CJS-friendly alternative to p-limit).
- Wrap fetchAndUpdateCount() in CollectionItem with a per-cluster limiter (concurrency=5, interTaskDelayMs=250).

Drop interBatchDelayMs from the CollectionItem fetch. The batch-then-rest pattern was overkill: the slowest count in a batch held up the next batch, producing visible 'first N, then M' gaps in the tree. A plain semaphore (concurrency: 5) keeps the pipe smoothly busy without ever exceeding the cap, which is the right shape for slow background work where individual task latencies vary. The interTaskDelayMs and interBatchDelayMs knobs remain available on the limiter for callers that genuinely want trickle or burst-rest behaviour. Also clarify the sort-then-enqueue contract in DatabaseItem and expand the concurrencyLimiter JSDoc with mode-selection guidance and examples.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

Remove interTaskDelayMs and interBatchDelayMs from ConcurrencyLimiterOptions and from the implementation. Both were unused by the only caller (per-cluster document-count fetches use a plain semaphore) and the delayed dispatch paths had two known bugs:

- F1: the timer-based release path decremented 'active' before the delay, so new callers could observe 'active < concurrency' and start during the delay window. When the queued waiter resumed it also incremented 'active', exceeding the configured cap.
- F7: interBatchDelayMs documentation described batch semantics, but the implementation refilled one slot per completion, behaving like continuous refill. The batch delay almost never fired.

We can re-add a pacing knob later with proper slot-reservation semantics and tests if a real use case appears.
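The F1 accounting problem can be reproduced with a small synchronous model. This is a reconstruction from the description above, not the removed code: SlotAccounting, its methods, and peakWithLateArrival are hypothetical names, and the real limiter used timers and promises rather than explicit steps. The reserveDuringDelay flag switches between the buggy accounting and one possible reading of "slot-reservation semantics".

```typescript
// Hypothetical model of the slot counter only; not the removed implementation.
class SlotAccounting {
    active = 0;
    peak = 0;
    private readonly cap: number;
    private readonly reserveDuringDelay: boolean;

    constructor(cap: number, reserveDuringDelay: boolean) {
        this.cap = cap;
        this.reserveDuringDelay = reserveDuringDelay;
    }

    /** A new caller arrives; returns true if it may start immediately. */
    tryStart(): boolean {
        if (this.active >= this.cap) return false;
        this.occupy();
        return true;
    }

    /** A task finishes while a waiter is queued and a delay is configured. */
    completeWithQueuedWaiter(): void {
        if (!this.reserveDuringDelay) {
            this.active--; // F1: the slot looks free for the whole delay window
        }
        // With reservation, the slot stays counted on behalf of the waiter.
    }

    /** The delay has elapsed and the queued waiter resumes. */
    resumeWaiter(): void {
        if (!this.reserveDuringDelay) {
            this.occupy(); // F1: the waiter counts itself in again
        }
        // With reservation, the waiter inherits the slot that was never freed.
    }

    private occupy(): void {
        this.active++;
        this.peak = Math.max(this.peak, this.active);
    }
}

/** Replays: A runs, A finishes with B queued, C arrives mid-delay, B resumes. */
function peakWithLateArrival(reserveDuringDelay: boolean): number {
    const slots = new SlotAccounting(1, reserveDuringDelay);
    slots.tryStart(); // task A takes the only slot
    slots.completeWithQueuedWaiter(); // A finishes, the delay starts
    slots.tryStart(); // task C arrives during the delay
    slots.resumeWaiter(); // the delay ends, B resumes
    return slots.peak;
}
```

With a cap of 1, peakWithLateArrival(false) returns 2 (the cap is exceeded), while peakWithLateArrival(true) returns 1 because the late arrival is refused until the waiter has run.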

Math.floor(NaN) is NaN, and Math.max(1, NaN) is also NaN. With concurrency set to NaN, 'active >= concurrency' is always false, so the limiter silently stops limiting. Guard with Number.isFinite and fall back to 1. The only current caller passes a literal 5, so this is pure hardening of an exported utility.
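A minimal sketch of the guard, combined with the flooring and clamping rules listed in the test commit. The function name normalizeConcurrency is hypothetical; the PR may well inline this logic.

```typescript
// Hypothetical helper name; the rules (finite check, floor, minimum of 1)
// are the ones described in the commit messages.
function normalizeConcurrency(requested: number): number {
    // NaN and +/-Infinity would otherwise make `active >= concurrency`
    // useless as a limit, so fall back to the most conservative value.
    if (!Number.isFinite(requested)) {
        return 1;
    }
    // Floor fractional values and never allow fewer than one slot.
    return Math.max(1, Math.floor(requested));
}
```

Checking Number.isFinite first matters because Math.max(1, NaN) evaluates to NaN, so clamping alone does not recover from a NaN input.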

…ks (N5)

The release path resumes the next waiter inside a try/catch. The current body cannot throw, but a future change (telemetry, logging, an extra callback) could. If release ever threw, the queued waiter would never be resumed and the limiter would deadlock for the lifetime of the process. Swallowing here is the right tradeoff: a misbehaving callback should not wedge the whole limiter.

tnaum-ms · 2026-05-28T11:45:57Z

N5 (defensive dispatch guard): addressed in 7635a56.

Action: wrapped the waiter-resume step inside release() in a try/catch that swallows.

Reason: the current body cannot throw, but if a future change ever adds a throwing call (telemetry, logging, etc.) inside release, the queued waiter would never be resumed and the limiter would deadlock for the rest of the process lifetime. The cost of the guard is a few lines and one extra try block. The benefit is that a misbehaving callback can never wedge the limiter.
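The shape of the guard, isolated from the rest of the limiter, might look like the following. This is an illustrative sketch only: the LimiterState interface and the free-standing release function are assumptions, since the actual release() lives inside the limiter closure.

```typescript
// Hypothetical state shape; in the real utility these would be closure variables.
interface LimiterState {
    active: number;
    waiters: Array<() => void>;
}

function release(state: LimiterState): void {
    const next = state.waiters.shift();
    if (!next) {
        state.active--; // nobody is waiting, so the slot becomes free
        return;
    }
    try {
        next(); // hand the slot to the oldest queued waiter
    } catch {
        // Swallow on purpose: an exception here must not escape into the
        // finishing task's cleanup path and stall the rest of the queue.
    }
}
```

Because the error is swallowed, a later release() call still reaches the next waiter in the queue.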

Covers:

- concurrency cap is never exceeded for synchronous and asynchronous task shapes
- concurrency is clamped to at least 1 (0 and negative values)
- fractional concurrency is floored
- FIFO dispatch order matches enqueue order
- a rejected task releases its slot and queued tasks proceed
- the cap is preserved across mixed success / rejection workloads
- non-finite concurrency (NaN, +/-Infinity) falls back to 1 instead of silently disabling the limit

These tests would have caught the F1 and F5 issues before review.

Before this change, when the user refreshed, collapsed, or re-expanded a database, DatabaseItem.getChildren constructed fresh CollectionItem instances. The old instances were dropped from the tree but their queued or in-flight estimateDocumentCount work continued, eventually writing to documentCount on the stale instance and firing notifyChildrenChanged on ids that no longer mattered. That work also competed with foreground operations for connection pool slots, which is what this PR is supposed to prevent.

Approach: DatabaseItem maintains a monotonic expansionGeneration counter, bumped on each getChildren call. The current generation value is captured into a closure that is handed to CollectionItem as isCurrent(). The CollectionItem checks isCurrent() twice:

1. At dispatch time inside the limiter task: if stale, return null without issuing the estimateDocumentCount request at all.
2. After the await returns: if stale, do not write back documentCount and do not call notifyChildrenChanged.

The constructor parameter defaults to () => true so any direct callers of CollectionItem are unaffected.

tnaum-ms · 2026-05-28T11:50:45Z

F2 (stale-tree-item guard for queued document counts): addressed in 38a78e1.

Action: DatabaseItem now keeps a monotonic expansionGeneration counter that is bumped on every getChildren call. The current value is captured into an isCurrent() closure passed to each CollectionItem. The collection's background fetch checks isCurrent() twice: once at dispatch time inside the limiter task (skip the request entirely if stale), and once after the await returns (skip the writeback and the tree refresh notification).

Reason: before this change, queued or in-flight count work on stale CollectionItem instances would run, hit the server, and write to the now-dropped instance's documentCount field. That defeats the throttling intent (work piles up across refreshes) and consumes connection pool slots that should serve foreground operations. The generation token is the cheapest fix that addresses both: no signal plumbing, no limiter API change, default () => true keeps direct callers of CollectionItem unaffected.
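The generation token can be sketched roughly as follows. Only expansionGeneration and isCurrent are names from this PR; DatabaseNode, beginExpansion, fetchCountIfCurrent, and the callback parameters are hypothetical stand-ins for DatabaseItem, CollectionItem, and the driver call, and the limiter is left out.

```typescript
// Hypothetical stand-in for DatabaseItem.
class DatabaseNode {
    private expansionGeneration = 0;

    /** Would be called from getChildren: bumps the generation, returns its token. */
    beginExpansion(): () => boolean {
        const generation = ++this.expansionGeneration;
        // The token stays true only until the next expansion bumps the counter.
        return () => generation === this.expansionGeneration;
    }
}

// Hypothetical stand-in for the body of the background count fetch.
async function fetchCountIfCurrent(
    isCurrent: () => boolean,
    estimate: () => Promise<number>,
    writeBack: (count: number) => void,
): Promise<void> {
    // Check 1, at dispatch time: skip the request entirely if stale.
    if (!isCurrent()) {
        return;
    }
    const count = await estimate();
    // Check 2, after the await: skip the writeback and the tree notification.
    if (!isCurrent()) {
        return;
    }
    writeBack(count);
}
```

The second check is what covers a refresh that happens while a request is already in flight: the request cannot be recalled, but its result is discarded.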

The inner task we hand to the document-count limiter previously captured `this`, transitively pinning the CollectionItem (and through it the TreeCluster, DatabaseItemModel, and CollectionItemModel) until the queued work either ran or the outer chain completed. Hoist clusterId, dbName, collName, and the isCurrent closure into local variables before the await. The inner task now captures only those few strings plus the small isCurrent closure. The outer async frame still references `this` (it is an instance method) but that frame is short-lived once the post-await stale-check fires. Behaviour is unchanged.

tnaum-ms · 2026-05-28T11:51:41Z

N2 (capture primitives in queued limiter closure): addressed in f9380b7.

Action: hoisted clusterId, dbName, collName, and the isCurrent closure into locals at the top of fetchAndUpdateCount and used those locals inside the limiter task. The inner closure no longer references this.

Reason: the closure handed to the limiter is alive for as long as the await is pending. Previously it captured this, transitively pinning the TreeCluster, DatabaseItemModel, and CollectionItemModel. With this change the queued task pins only a few strings plus the small isCurrent closure. Behavior is unchanged.

… (F8)

DatabaseItem: shorten the comment above the alphabetical sort. The old wording promised counts would 'populate predictably from the top of the visible list downward'. With concurrency > 1, request latency variance makes completion order non-deterministic even though dispatch order is FIFO. State only what is true: sorting fixes the dispatch (request) order; completion order may still differ.

CollectionItem: remove the reference to DOCUMENT_COUNT_CONCURRENCY (the constant was inlined in an earlier commit) and trim the surrounding prose. The limit value '5' is now stated directly in the comment so it matches the code.

CollectionItem and DatabaseItem each carried their own copy of an escapeMarkdown helper. There is already an exported version in src/webviews/utils/escapeMarkdown.ts with its own tests. Replace both local copies with imports of the shared util. The shared regex escapes a slightly larger set of characters (adds <, >, &) which is strictly safer for MarkdownString tooltips. Behavior on tooltip text containing only the previously-handled characters is unchanged. Two other duplicates remain in DocumentDBClusterItem.ts and PlaygroundHoverProvider.ts; those files are outside this PR's scope and can be consolidated in a follow-up.

tnaum-ms · 2026-05-28T11:53:49Z

N4 (consolidate duplicated escapeMarkdown): addressed in 81ceabc.

Action: replaced the local escapeMarkdown copies in CollectionItem.ts and DatabaseItem.ts with imports from src/webviews/utils/escapeMarkdown.ts (which already has tests).

Reason: identical helper, two copies, no upside. The shared util's regex covers a slightly larger character set (adds <, >, &), which is strictly safer for MarkdownString tooltips. Two duplicates remain (DocumentDBClusterItem.ts, PlaygroundHoverProvider.ts); those are outside this PR's scope and noted in the commit for a follow-up.

github-actions · 2026-05-28T11:57:31Z

✅ Code Quality Checks

| Check | Status | How to fix |
| --- | --- | --- |
| Localization (`l10n`) | ✅ Passed | |
| ESLint | ✅ Passed | |
| Prettier formatting | ✅ Passed | |

This comment is updated automatically on each push.

github-actions · 2026-05-28T12:01:06Z

📦 Build Size Report

| Metric | Base (`main`) | PR | Delta |
| --- | --- | --- | --- |
| VSIX (`vscode-documentdb-0.8.0.vsix`) | 7.53 MB | 7.53 MB | ⬆️ +0 KB (+0.0%) |
| Webview bundle (`views.js`) | 5.88 MB | 5.88 MB | ✅ 0 KB (0.0%) |

Download artifact · updated automatically on each push.

Copilot AI review requested due to automatic review settings May 28, 2026 08:10

tnaum-ms requested a review from a team as a code owner May 28, 2026 08:10

Copilot started reviewing on behalf of tnaum-ms May 28, 2026 08:10

Copilot AI reviewed May 28, 2026

tnaum-ms force-pushed the dev/tnaum/limit-concurrent-counts branch from f64db45 to d18180e Compare May 28, 2026 08:16

This was referenced May 28, 2026

perf: investigate schema sampling fan-out in LLM query generation #686

Open

build: migrate from CommonJS to ESM #687

Open

perf: throttle parallel Azure tenant sign-in checks #688

Open

tnaum-ms added this to the 0.8.1 milestone May 28, 2026

tnaum-ms added this to DocumentDB for VSCode: Release Plan May 28, 2026

docs(concurrencyLimiter): trim verbose comments to essentials

4f616ea

tnaum-ms requested a review from Copilot May 28, 2026 10:47

Copilot started reviewing on behalf of tnaum-ms May 28, 2026 10:47

Copilot AI reviewed May 28, 2026

tnaum-ms added 4 commits May 28, 2026 13:15

fix: replace constant DOCUMENT_COUNT_CONCURRENCY with literal value 5

e7e5fd8

tnaum-ms added 2 commits May 28, 2026 13:47

tnaum-ms added 2 commits May 28, 2026 13:52

languy approved these changes May 28, 2026

tnaum-ms merged commit 1199e68 into main May 28, 2026
8 checks passed

tnaum-ms deleted the dev/tnaum/limit-concurrent-counts branch May 28, 2026 13:15

github-project-automation Bot moved this to Done in DocumentDB for VSCode: Release Plan May 28, 2026

hanhan761 mentioned this pull request May 29, 2026

perf(azure): throttle parallel tenant sign-in checks with shared concurrency limiter #694

Open

3 tasks

tnaum-ms mentioned this pull request Jun 2, 2026

docs: release notes and changelog for v0.8.1 #727

Merged

3 participants