feat(parser-pipeline): create @endo/parser-pipeline #3158

Draft
boneskull wants to merge 1 commit into boneskull/export-find-canonical-names2 from boneskull/ast-service


boneskull (Member) commented Apr 4, 2026

Introduces @endo/parser-pipeline — a new package that eliminates redundant
Babel AST parsing for @endo/compartment-mapper consumers. Tools like LavaMoat
have historically parsed each module 2–3 times (once for import/export analysis,
once for evasive transforms, once for policy-globals analysis). This package
composes all those passes into a single parse → analyze → transform → generate
cycle per module.
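
To make the saving concrete, here is a minimal sketch of the idea, assuming a toy `parse` / `generate` pair that splits on semicolons. These are hypothetical stand-ins, not Babel and not this package's API; the point is only to count parses when three tools run separately versus when they share one AST.

```javascript
// Toy stand-ins: `parse` and `generate` are hypothetical, NOT Babel.
let parseCount = 0;

/** Pretend parser: splits source into statement "nodes" and counts calls. */
const parse = source => {
  parseCount += 1;
  return source
    .split(';')
    .map(s => s.trim())
    .filter(Boolean);
};

/** Pretend generator: joins statement nodes back into source text. */
const generate = ast => ast.map(s => `${s};`).join('\n');

// Three independent tools, each parsing on its own (the historical pattern).
const findImports = source =>
  parse(source).filter(s => s.startsWith('import '));
const findGlobals = source =>
  parse(source).filter(s => s.startsWith('globalThis.'));
const stripDebugger = source =>
  generate(parse(source).filter(s => s !== 'debugger'));

// The same three passes sharing a single AST.
const runOnce = source => {
  const ast = parse(source); // single parse
  const imports = ast.filter(s => s.startsWith('import ')); // analyze
  const globals = ast.filter(s => s.startsWith('globalThis.')); // analyze
  const transformed = ast.filter(s => s !== 'debugger'); // transform
  return { imports, globals, code: generate(transformed) }; // generate
};

const source = "import 'a'; globalThis.x = 1; debugger; foo()";

parseCount = 0;
findImports(source);
findGlobals(source);
stripDebugger(source);
const separateParses = parseCount; // three parses of the same module

parseCount = 0;
const result = runOnce(source);
const composedParses = parseCount; // one parse
```

With a real parser the per-parse cost is far higher than in this toy, which is where composing the passes pays off.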

Core API

createParsers(configs, options?) is the primary entry point. It accepts a
language-keyed config map and returns { sync, async } parser maps compatible
with @endo/compartment-mapper's parserForLanguage option. The
@endo/module-source analysis step is handled implicitly by the pipeline based on the
source language (mjs or cjs); consumers layer in user-defined analyzer and
transform factories.

moduleSourceConfigs(extras?) builds a pre-wired config map for both ESM and
CJS by calling analyzeModule() / analyzeCjs() from @endo/module-source
internally. Extra factories and per-language finalizeRecord hooks are merged
on top.

createEvasiveTransformPass(options?) wraps @endo/evasive-transform's
visitor as a pipeline-compatible TransformPass owned by this package (so
structural compatibility is verified at compile time, not by duck typing).

Workers

runPipelineInWorker(port, configs) powers the async path. Consumer worker
scripts call this to handle parse messages dispatched by WorkerParserPool.
Workers are unref()'d at spawn time; an idle timeout terminates them after
inactivity. Pending dispatches hold the event loop via their main-thread
promises, so the process exits cleanly once the last dispatch settles — no
explicit terminate() call required.
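
The unref-plus-idle-timeout behavior can be sketched roughly as follows. `manageIdle` is a hypothetical helper written for illustration, not part of this package; it assumes only that `worker` exposes `unref()` and `terminate()`, as a `node:worker_threads` `Worker` does, and it relies on Node's timer methods `unref()` and `refresh()`.

```javascript
/**
 * Hypothetical helper (not this package's API). `worker` is anything with
 * unref() and terminate(), such as a node:worker_threads Worker.
 */
const manageIdle = (worker, idleTimeoutMs) => {
  // Unref at spawn time so an idle worker never keeps the process alive.
  worker.unref();

  // The idle timer is unref'd too; a ref'd timer would itself hold the
  // event loop open until it fired.
  const timer = setTimeout(() => worker.terminate(), idleTimeoutMs);
  timer.unref();

  return {
    // Call whenever the worker receives or finishes a task: restarts the
    // idle countdown from now.
    touch: () => timer.refresh(),
    // Call if the worker exits on its own, so terminate() is not scheduled.
    cancel: () => clearTimeout(timer),
    timer,
  };
};
```

A pool would call `touch()` around each dispatch and `cancel()` from the worker's exit handler. How in-flight work keeps the process alive is outside this sketch.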

WorkerParserPool manages worker spawning, queueing when at capacity, idle
timeouts, error handling, and result routing. Tests cover pool exhaustion,
worker crash, unexpected exit, and malformed response shapes.

Internals

runPipeline(params) is the shared core (used by both sync and async paths)
that owns the Babel parse → implicit module-source analyze → user analyzers →
user transforms → implicit module-source transform → generate → buildRecord
sequence.
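
The stage ordering can be pictured with the following sketch. Every name in it (`runPipelineSketch`, the parameter names, the shape of factories and passes) is hypothetical and chosen for illustration; the real `runPipeline(params)` signature and pass interfaces may differ. Toy functions stand in for Babel and `@endo/module-source`, and a `trace` array records the order in which stages run.

```javascript
// Hypothetical sketch of the stage order only; not the package's signature.
const runPipelineSketch = ({
  source,
  parse,
  implicitAnalyze,
  analyzerFactories = [],
  transformFactories = [],
  implicitTransform,
  generate,
  buildRecord,
}) => {
  const ast = parse(source); // 1. parse once
  const implicit = implicitAnalyze(ast); // 2. implicit module-source analysis
  // 3. user analyzers: each factory yields a fresh pass for this module
  const analyses = analyzerFactories.map(makePass => makePass()(ast));
  // 4. user transforms: each pass mutates the shared AST in place
  for (const makePass of transformFactories) makePass()(ast);
  implicitTransform(ast); // 5. implicit module-source transform
  const code = generate(ast); // 6. generate
  return buildRecord({ code, implicit, analyses }); // 7. buildRecord
};

// Toy stages that log their names so the order is observable.
const trace = [];
const record = runPipelineSketch({
  source: 'a;b',
  parse: src => (trace.push('parse'), src.split(';')),
  implicitAnalyze: ast => (
    trace.push('implicit-analyze'), { statements: ast.length }
  ),
  analyzerFactories: [
    () => ast => (trace.push('user-analyze'), ast.includes('b')),
  ],
  transformFactories: [
    () => ast => {
      trace.push('user-transform');
      ast.push('c');
    },
  ],
  implicitTransform: ast => {
    trace.push('implicit-transform');
    ast.unshift('"use strict"');
  },
  generate: ast => (trace.push('generate'), ast.join(';')),
  buildRecord: parts => (trace.push('buildRecord'), parts),
});
```

Because analyzers run before any transform in this ordering, they observe the AST as parsed rather than as rewritten.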

The package exports AnalyzerFactory, TransformFactory, AnalyzerPass,
TransformPass, PipelineConfigs, CreateParsersOptions, CreateParsersResult,
and the worker-pool types.

boneskull (Member, Author) commented Apr 4, 2026

Warning

This PR is part of a stack and targets branch boneskull/export-find-canonical-names2, not master.
DO NOT MERGE until feat(compartment-mapper): expose findUnknownCanonicalNames #3221 is merged into master.

📚 Pull Request Stack


Managed by gh-stack

changeset-bot (Bot) commented Apr 4, 2026

🦋 Changeset detected

Latest commit: 11a3feb

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name | Type |
| --- | --- |
| @endo/parser-pipeline | Minor |


boneskull (Member, Author) commented

TODO: figure out why the docs build is failing

Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces a new @endo/parser-pipeline workspace package to avoid redundant Babel AST parsing by composing multiple analyzer/transform visitor passes into a single parse→traverse→generate cycle, and adds an async worker-pool-backed parser option for parallel policy-generation workflows. It also updates @endo/compartment-mapper’s parser typing and mapping logic to support asynchronous parsers alongside synchronous ones.

Changes:

  • Add @endo/parser-pipeline with createComposedParser (sync) and createWorkerParser (async worker pool), plus worker-side runPipelineInWorker.
  • Refactor @endo/compartment-mapper parser types and makeMapParsers to support async parsers and async module transforms.
  • Add tests for composed parsing and worker-pool behavior.

Reviewed changes

Copilot reviewed 23 out of 25 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| `yarn.lock` | Adds new workspace package and Babel-related deps/types. |
| `typedoc.json` | Formatting tweaks and excludes the new package from typedoc output. |
| `packages/parser-pipeline/package.json` | Defines the new package, dependencies, and exports. |
| `packages/parser-pipeline/index.js` | Public entrypoint exporting types + parser constructors. |
| `packages/parser-pipeline/src/composed-parser.js` | Sync composed parser implementation. |
| `packages/parser-pipeline/src/worker-parser.js` | Async parser wrapper backed by worker pool. |
| `packages/parser-pipeline/src/worker-pool.js` | Worker thread pool and dispatch/queue/terminate logic. |
| `packages/parser-pipeline/src/worker-runner.js` | Worker-side message listener that runs the pipeline per task. |
| `packages/parser-pipeline/src/types/external.ts` | Public types for pipeline configuration and worker protocol. |
| `packages/parser-pipeline/src/external.types.*` | Type-export shim for JS consumers. |
| `packages/parser-pipeline/test/*` | AVA tests for composed parser, worker runner, and worker pool. |
| `packages/parser-pipeline/README.md` | New package documentation and usage examples. |
| `packages/parser-pipeline/SECURITY.md` | Package security policy doc. |
| `packages/parser-pipeline/LICENSE` | Apache-2.0 license file. |
| `packages/parser-pipeline/tsconfig*.json` | Typechecking/build config for the new package. |
| `packages/compartment-mapper/src/types/external.ts` | Splits sync vs async parser types and updates ParserForLanguage typing. |
| `packages/compartment-mapper/src/types/internal.ts` | Adjusts internal operator typings to accept ParseFn or AsyncParseFn. |
| `packages/compartment-mapper/src/map-parser.js` | Refactors parser selection/trampolines for sync+async support. |
| `packages/compartment-mapper/src/import-hook.js` | Adds a sync-parse type guard for importNow/dynamic-require constraints. |
| `packages/compartment-mapper/src/link.js` | Simplifies heuristicImports access with updated parser types. |


Comment thread packages/parser-pipeline/src/worker-runner.js
Comment thread packages/parser-pipeline/src/worker-runner.js Outdated
Comment thread packages/parser-pipeline/src/worker-runner.js Outdated
Comment thread packages/parser-pipeline/src/worker-pool.js
Comment thread packages/compartment-mapper/src/map-parser.js
Comment thread packages/parser-pipeline/README.md Outdated
Comment thread packages/parser-pipeline/README.md Outdated
Comment thread packages/parser-pipeline/package.json Outdated
Comment thread packages/parser-pipeline/src/worker-parser.js Outdated
Comment thread packages/parser-pipeline/SECURITY.md
@boneskull boneskull force-pushed the boneskull/ast-service branch 3 times, most recently from 9abf181 to 460895c Compare April 14, 2026 02:22
@boneskull boneskull force-pushed the boneskull/ast-service branch from 460895c to 1c799b4 Compare April 16, 2026 01:36
@boneskull boneskull changed the base branch from master to boneskull/fix-module-source-types April 16, 2026 01:37
@boneskull boneskull force-pushed the boneskull/fix-module-source-types branch from 41cb080 to 79aae05 Compare April 16, 2026 01:40
@boneskull boneskull force-pushed the boneskull/ast-service branch from 1c799b4 to 941d4ae Compare April 16, 2026 01:40
@boneskull boneskull force-pushed the boneskull/fix-module-source-types branch from 79aae05 to 6184e36 Compare April 16, 2026 01:46
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 63dc745 to 698be08 Compare April 30, 2026 02:06
@boneskull boneskull force-pushed the boneskull/ast-service branch from 72b6396 to 2a5c3f3 Compare April 30, 2026 02:06
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 698be08 to 1249834 Compare May 5, 2026 18:00
@boneskull boneskull force-pushed the boneskull/ast-service branch from 2a5c3f3 to 5d3399a Compare May 5, 2026 18:00
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 1249834 to 4ad0448 Compare May 5, 2026 20:07
@boneskull boneskull force-pushed the boneskull/ast-service branch 4 times, most recently from cf98d6b to e391226 Compare May 7, 2026 17:40
@boneskull boneskull changed the base branch from boneskull/export-find-canonical-names2 to boneskull/cjs-ast-parser2 May 7, 2026 17:45
@boneskull boneskull changed the base branch from boneskull/cjs-ast-parser2 to boneskull/export-find-canonical-names2 May 7, 2026 17:49
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 68db8b2 to 24aa05e Compare May 12, 2026 23:27
@boneskull boneskull force-pushed the boneskull/ast-service branch from e391226 to 4e36dcc Compare May 12, 2026 23:27
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 24aa05e to 40fbe27 Compare May 14, 2026 18:59
@boneskull boneskull force-pushed the boneskull/ast-service branch from 4e36dcc to aded8fc Compare May 14, 2026 18:59
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 40fbe27 to 3bb2910 Compare May 14, 2026 19:00
@boneskull boneskull force-pushed the boneskull/ast-service branch from aded8fc to c769875 Compare May 14, 2026 19:00
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 3bb2910 to b78ae44 Compare May 14, 2026 19:03
@boneskull boneskull force-pushed the boneskull/ast-service branch from c769875 to b1d71b9 Compare May 14, 2026 19:03
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from b78ae44 to 34550d5 Compare May 14, 2026 19:06
@boneskull boneskull force-pushed the boneskull/ast-service branch from b1d71b9 to cdb6dc1 Compare May 14, 2026 19:06
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from 34550d5 to c4db6f6 Compare May 14, 2026 19:15
@boneskull boneskull force-pushed the boneskull/ast-service branch from cdb6dc1 to c7830b2 Compare May 14, 2026 19:15
@boneskull boneskull force-pushed the boneskull/export-find-canonical-names2 branch from c4db6f6 to 9f9df2c Compare May 20, 2026 01:21
@boneskull boneskull force-pushed the boneskull/ast-service branch from c7830b2 to 06b06c2 Compare May 20, 2026 01:21
Introduces `@endo/parser-pipeline`, a new package that eliminates redundant Babel AST parsing when multiple consumers need to analyze or transform the same JavaScript module source.

The core problem: tools built on `@endo/compartment-mapper` (such as LavaMoat) have historically parsed each module two or three times — once for import/export analysis, once for evasive transforms, and once for policy-relevant globals analysis. This package composes those passes into a single parse-traverse-generate cycle.

**`createParsers(config?)`** is the primary entry point. It accepts a single flat configuration object that combines pipeline options (`analyzerFactories`, `transformFactories`, per-language `mjs`/`cjs`/`mts` overrides, lifecycle hooks) with worker-pool options (`workerScript`, `workerData`, `maxWorkers`, `idleTimeout`). It returns `{ sync, async }` parser maps that are drop-in replacements for `parserForLanguage` in `@endo/compartment-mapper`. The module-source analysis step is handled implicitly by the pipeline; consumers only supply user-defined analyzer and transform factories.

Async-only consumers (e.g. policy generation) need only supply the worker/pool options and lifecycle hooks — they do not need to pass factory configs that only run inside the worker.

**`runPipelineInWorker(port, config)`** powers the async path. It accepts the same pre-merge `PipelineConfig` shape as `createParsers`, performing the merge internally. Consumer-provided worker scripts call this to listen for parse tasks dispatched by the worker pool, run the full pipeline in a worker thread, and post results back. The worker pool (`WorkerParserPool`) manages spawning, queuing, idle timeouts, and unref'd workers so the process can exit cleanly once all in-flight dispatches settle.