perf: exclude node_modules from content globbing in monorepos #12199

Draft
srpatcha wants to merge 1 commit into
facebook:mainfrom
srpatcha:fix/globby-ignore-node-modules

@srpatcha

Contributor

Motivation

Fixes #12128

In monorepo setups with hoisted/linked workspace dependencies (pnpm, yarn workspaces), Globby scans through `node_modules` directories when sourcing docs/blog/pages content. This causes extreme startup slowness: the reporter measured startup dropping from 34 seconds to 330ms after adding node_modules to the ignore list.

Changes

  1. `packages/docusaurus-utils/src/globUtils.ts`: Add `'**/node_modules/**'` to `GlobExcludeDefault`, which is used by the docs, blog, and pages plugins when globbing content files.

  2. `packages/docusaurus-plugin-content-docs/src/sidebars/index.ts`: Add `ignore: ['**/node_modules/**']` to `readCategoriesMetadata()`. This Globby call had no ignore patterns at all, meaning it scanned every `_category_.{json,yml,yaml}` inside `node_modules`.

  3. `packages/docusaurus-utils/src/__tests__/globUtils.test.ts`: Add test coverage for node_modules exclusion in both `createMatcher` and `createAbsoluteFilePathMatcher`.
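
To make the intended effect of the new exclude pattern concrete, here is a minimal sketch. It does not use Docusaurus's matcher or Globby; `isUnderNodeModules` and `filterContentFiles` are hypothetical helpers that only approximate what `**/node_modules/**` excludes, by checking whether any directory segment of a content-relative path is `node_modules`.

```typescript
// Hypothetical stand-in for the effect of the '**/node_modules/**' exclude.
// This is an approximation for illustration, not the glob matcher itself.
function isUnderNodeModules(relativePath: string): boolean {
  const segments = relativePath.split('/');
  // Only directory segments count; the last segment is the file name.
  return segments.slice(0, -1).includes('node_modules');
}

// Keep only the paths that are not inside a node_modules directory.
function filterContentFiles(relativePaths: string[]): string[] {
  return relativePaths.filter((p) => !isUnderNodeModules(p));
}

// Paths are relative to a content directory such as docs/.
const candidates: string[] = [
  'intro.md',
  'guides/_category_.json',
  'node_modules/some-pkg/README.md',
  'guides/node_modules/other-pkg/docs/_category_.yml',
];

const kept = filterContentFiles(candidates);
console.log(kept);
```

In this sketch only `intro.md` and `guides/_category_.json` survive the filter, which mirrors the "no behavioral change" expectation for sites that keep no `node_modules` inside their content directories.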

Test Plan

  • Added unit tests verifying node_modules paths are excluded by the default glob patterns
  • No behavioral change for users without node_modules inside their content directories

In monorepo setups with hoisted/linked dependencies, Globby scans
through node_modules directories when sourcing content, causing
extreme startup slowness (34s to 330ms in the reported case).

Changes:
- Add **/node_modules/** to GlobExcludeDefault in globUtils.ts
- Add ignore pattern to readCategoriesMetadata() in sidebars/index.ts
  which had no ignore patterns at all
- Add test coverage for node_modules exclusion

Fixes facebook#12128
@meta-cla meta-cla Bot added the CLA Signed Signed Facebook CLA label Jun 25, 2026
@netlify

netlify Bot commented Jun 25, 2026


Built without sensitive environment variables

Name Link
🔨 Latest commit 5799513
🔍 Latest deploy log https://app.netlify.com/projects/docusaurus-2/deploys/6a3ccd285c6d6a00084366fb
😎 Deploy Preview https://deploy-preview-12199--docusaurus-2.netlify.app

async function readCategoriesMetadata(contentPath: string) {
  const categoryFiles = await Globby('**/_category_.{json,yml,yaml}', {
    cwd: contentPath,
    ignore: ['**/node_modules/**'],

Collaborator

See my comment here: #12129 (comment)

This will make another legit use-case impossible.

We'd like to optimize without preventing the content path from being node_modules/@myCompany/docs. I'd like to see a test covering this edge case that keeps passing after the performance optimization.
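
The edge case can be pictured as a question of which path the exclusion is evaluated against. The sketch below is an illustration under assumptions, not Docusaurus code: `hasNodeModulesDir`, `isExcludedRelative`, and `isExcludedAbsolute` are hypothetical helpers, and the `/repo/website` prefix is made up. It contrasts checking the path relative to the content root with checking the full absolute path.

```typescript
// Hypothetical helpers for illustration only.
function hasNodeModulesDir(filePath: string): boolean {
  // Ignore the final segment (the file name); look at directories only.
  return filePath.split('/').slice(0, -1).includes('node_modules');
}

// Strategy A: evaluate the exclusion against the path relative to the content root.
function isExcludedRelative(contentRoot: string, absoluteFile: string): boolean {
  const prefix = contentRoot.endsWith('/') ? contentRoot : contentRoot + '/';
  if (!absoluteFile.startsWith(prefix)) {
    return true; // outside the content root, so not a content file at all
  }
  return hasNodeModulesDir(absoluteFile.slice(prefix.length));
}

// Strategy B: evaluate the exclusion against the full absolute path.
function isExcludedAbsolute(absoluteFile: string): boolean {
  return hasNodeModulesDir(absoluteFile);
}

// Assumed layout: the content path itself lives inside node_modules.
const contentRoot = '/repo/website/node_modules/@myCompany/docs';
const docFile = `${contentRoot}/intro.md`;
const nestedDep = `${contentRoot}/node_modules/dep/README.md`;

console.log(isExcludedRelative(contentRoot, docFile));   // false: doc is kept
console.log(isExcludedRelative(contentRoot, nestedDep)); // true: nested dependency is skipped
console.log(isExcludedAbsolute(docFile));                // true: the whole content path is lost
```

A regression test for the requested edge case would presumably assert the Strategy A behavior: files directly under a content path located in `node_modules` are still found, while `node_modules` directories nested below that content path are skipped.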

@slorber slorber marked this pull request as draft June 25, 2026 12:14
Development

Successfully merging this pull request may close these issues.

Docusaurus slow on a monorepo site due to globbing node_modules