feat(parser-pipeline): create @endo/parser-pipeline#3158
Warning This PR is part of a stack and targets branch 📚 Pull Request Stack
Managed by gh-stack
🦋 Changeset detected. Latest commit: 11a3feb. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
TODO: figure out why the docs build is failing
Pull request overview
This PR introduces a new @endo/parser-pipeline workspace package to avoid redundant Babel AST parsing by composing multiple analyzer/transform visitor passes into a single parse→traverse→generate cycle, and adds an async worker-pool-backed parser option for parallel policy-generation workflows. It also updates @endo/compartment-mapper’s parser typing and mapping logic to support asynchronous parsers alongside synchronous ones.
Changes:
- Add `@endo/parser-pipeline` with `createComposedParser` (sync) and `createWorkerParser` (async worker pool), plus worker-side `runPipelineInWorker`.
- Refactor `@endo/compartment-mapper` parser types and `makeMapParsers` to support async parsers and async module transforms.
- Add tests for composed parsing and worker-pool behavior.
Reviewed changes
Copilot reviewed 23 out of 25 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| yarn.lock | Adds new workspace package and Babel-related deps/types. |
| typedoc.json | Formatting tweaks and excludes the new package from typedoc output. |
| packages/parser-pipeline/package.json | Defines the new package, dependencies, and exports. |
| packages/parser-pipeline/index.js | Public entrypoint exporting types + parser constructors. |
| packages/parser-pipeline/src/composed-parser.js | Sync composed parser implementation. |
| packages/parser-pipeline/src/worker-parser.js | Async parser wrapper backed by worker pool. |
| packages/parser-pipeline/src/worker-pool.js | Worker thread pool and dispatch/queue/terminate logic. |
| packages/parser-pipeline/src/worker-runner.js | Worker-side message listener that runs the pipeline per task. |
| packages/parser-pipeline/src/types/external.ts | Public types for pipeline configuration and worker protocol. |
| packages/parser-pipeline/src/external.types.* | Type-export shim for JS consumers. |
| packages/parser-pipeline/test/* | AVA tests for composed parser, worker runner, and worker pool. |
| packages/parser-pipeline/README.md | New package documentation and usage examples. |
| packages/parser-pipeline/SECURITY.md | Package security policy doc. |
| packages/parser-pipeline/LICENSE | Apache-2.0 license file. |
| packages/parser-pipeline/tsconfig*.json | Typechecking/build config for the new package. |
| packages/compartment-mapper/src/types/external.ts | Splits sync vs async parser types and updates ParserForLanguage typing. |
| packages/compartment-mapper/src/types/internal.ts | Adjusts internal operator typings to accept ParseFn or AsyncParseFn. |
| packages/compartment-mapper/src/map-parser.js | Refactors parser selection/trampolines for sync+async support. |
| packages/compartment-mapper/src/import-hook.js | Adds a sync-parse type guard for importNow/dynamic-require constraints. |
| packages/compartment-mapper/src/link.js | Simplifies heuristicImports access with updated parser types. |
Introduces `@endo/parser-pipeline`, a new package that eliminates redundant Babel AST parsing when multiple consumers need to analyze or transform the same JavaScript module source.
The core problem: tools built on `@endo/compartment-mapper` (such as LavaMoat) have historically parsed each module two or three times — once for import/export analysis, once for evasive transforms, and once for policy-relevant globals analysis. This package composes those passes into a single parse-traverse-generate cycle.
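To make the single-cycle idea concrete, here is a minimal, dependency-free sketch. It is not the package's implementation: `parse`, `generate`, `makeImportAnalyzer`, `makeRenameTransform`, and `runComposed` are hypothetical stand-ins for Babel's parser, generator, and the analyzer/transform passes, and the "AST" is just a list of statement strings. The point it illustrates is that every pass visits the same parsed tree, so the source is parsed once no matter how many passes run.

```javascript
// Toy stand-ins only: a real pipeline would use Babel's parser,
// traverse, and generator instead of these hypothetical helpers.
let parseCount = 0;

// Hypothetical "parser": splits source into statement nodes.
const parse = source => {
  parseCount += 1;
  return source
    .split(';')
    .map(s => s.trim())
    .filter(Boolean)
    .map(text => ({ text }));
};

// Hypothetical "generator": joins nodes back into source text.
const generate = ast => ast.map(node => `${node.text};`).join('\n');

// An analyzer pass only reads nodes and accumulates results.
const makeImportAnalyzer = () => {
  const imports = [];
  return {
    visit: node => {
      const match = /^import\s+'([^']+)'$/.exec(node.text);
      if (match) imports.push(match[1]);
    },
    getResults: () => imports,
  };
};

// A transform pass may mutate nodes in place.
const makeRenameTransform = (from, to) => ({
  visit: node => {
    node.text = node.text.split(from).join(to);
  },
});

// Compose: one parse, one traversal feeding every pass, one generate.
const runComposed = (source, analyzers, transforms) => {
  const ast = parse(source);
  for (const node of ast) {
    for (const analyzer of analyzers) analyzer.visit(node);
    for (const transform of transforms) transform.visit(node);
  }
  return {
    code: generate(ast),
    results: analyzers.map(analyzer => analyzer.getResults()),
  };
};

const importAnalyzer = makeImportAnalyzer();
const output = runComposed(
  "import 'fs'; const x = evil(1)",
  [importAnalyzer],
  [makeRenameTransform('evil', 'safe')],
);
```

In this sketch, adding another analyzer or transform to the arrays does not change `parseCount`; running each pass as a separate tool would call `parse` once per tool.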
**`createParsers(config?)`** is the primary entry point. It accepts a single flat configuration object that combines pipeline options (`analyzerFactories`, `transformFactories`, per-language `mjs`/`cjs`/`mts` overrides, lifecycle hooks) with worker-pool options (`workerScript`, `workerData`, `maxWorkers`, `idleTimeout`). It returns `{ sync, async }` parser maps that are drop-in replacements for `parserForLanguage` in `@endo/compartment-mapper`. The module-source analysis step is handled implicitly by the pipeline; consumers only supply user-defined analyzer and transform factories.
Async-only consumers (e.g. policy generation) need only supply the worker/pool options and lifecycle hooks — they do not need to pass factory configs that only run inside the worker.
**`runPipelineInWorker(port, config)`** powers the async path. It accepts the same pre-merge `PipelineConfig` shape as `createParsers`, performing the merge internally. Consumer-provided worker scripts call this to listen for parse tasks dispatched by the worker pool, run the full pipeline in a worker thread, and post results back. The worker pool (`WorkerParserPool`) manages spawning, queuing, idle timeouts, and unref'd workers so the process can exit cleanly once all in-flight dispatches settle.
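The queueing behavior described above can be sketched in a few lines. This is an illustrative model under stated assumptions, not the code of `WorkerParserPool`: `makePool`, `spawnWorker`, `stats`, and the fake worker are hypothetical names, and idle timeouts and crash handling are left out. A real `spawnWorker` might wrap `new Worker(workerScript)` from `node:worker_threads` and call `worker.unref()` on it.

```javascript
// Minimal sketch of bounded dispatch with a FIFO queue.
// `spawnWorker` is a hypothetical factory supplied by the caller.
const makePool = ({ spawnWorker, maxWorkers }) => {
  const idle = [];
  const queue = [];
  let spawned = 0;
  let busy = 0;

  const assign = (worker, job) => {
    busy += 1;
    worker.run(job.task, (error, result) => {
      busy -= 1;
      if (error) job.reject(error);
      else job.resolve(result);
      // Hand the freed worker the oldest queued job, if any.
      const next = queue.shift();
      if (next) assign(worker, next);
      else idle.push(worker);
    });
  };

  const dispatch = task =>
    new Promise((resolve, reject) => {
      const job = { task, resolve, reject };
      if (idle.length > 0) {
        assign(idle.pop(), job);
      } else if (spawned < maxWorkers) {
        spawned += 1;
        assign(spawnWorker(), job);
      } else {
        queue.push(job); // at capacity: wait for a worker to free up
      }
    });

  return {
    dispatch,
    stats: () => ({ spawned, busy, queued: queue.length }),
  };
};

// Fake worker for demonstration: it records a completion callback so
// the caller decides when each task "finishes".
const pending = [];
const spawnFakeWorker = () => ({
  run: (task, done) => pending.push(() => done(null, task.toUpperCase())),
});

const pool = makePool({ spawnWorker: spawnFakeWorker, maxWorkers: 2 });
const results = [];
for (const source of ['a', 'b', 'c']) {
  pool.dispatch(source).then(result => results.push(result));
}
const statsAtCapacity = pool.stats(); // two busy, one queued
pending.shift()(); // first task finishes; the queued task is assigned
const statsAfterFirst = pool.stats();
```

With `maxWorkers: 2`, the third dispatch waits in the queue until the first task completes, at which point the freed worker picks it up without spawning a third worker.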
Introduces `@endo/parser-pipeline`, a new package that eliminates redundant Babel AST parsing for `@endo/compartment-mapper` consumers. Tools like LavaMoat have historically parsed each module 2–3 times (once for import/export analysis, once for evasive transforms, once for policy-globals analysis). This package composes all those passes into a single parse → analyze → transform → generate cycle per module.
Core API
`createParsers(configs, options?)` is the primary entry point. It accepts a language-keyed config map and returns `{ sync, async }` parser maps compatible with `@endo/compartment-mapper`'s `parserForLanguage` option. The `@endo/module-source` analysis step is handled implicitly by the pipeline based on the source language (`mjs` or `cjs`); consumers layer in user-defined analyzer and transform factories.
`moduleSourceConfigs(extras?)` builds a pre-wired config map for both ESM and CJS by calling `analyzeModule()` / `analyzeCjs()` from `@endo/module-source` internally. Extra factories and per-language `finalizeRecord` hooks are merged on top.
`createEvasiveTransformPass(options?)` wraps `@endo/evasive-transform`'s visitor as a pipeline-compatible `TransformPass` owned by this package (so structural compatibility is verified at compile time, not by duck typing).
Workers
`runPipelineInWorker(port, configs)` powers the async path. Consumer worker scripts call this to handle parse messages dispatched by `WorkerParserPool`. Workers are `unref()`'d at spawn time; an idle timeout terminates them after inactivity. Pending dispatches hold the event loop via their main-thread promises, so the process exits cleanly once the last dispatch settles; no explicit `terminate()` call is required.
`WorkerParserPool` manages worker spawning, queueing when at capacity, idle timeouts, error handling, and result routing. Tests cover pool exhaustion, worker crash, unexpected exit, and malformed response shapes.
Internals
`runPipeline(params)` is the shared core (used by both sync and async paths) that owns the Babel parse → implicit module-source analyze → user analyzers → user transforms → implicit module-source transform → generate → `buildRecord` sequence.
The package exports `AnalyzerFactory`, `TransformFactory`, `AnalyzerPass`, `TransformPass`, `PipelineConfigs`, `CreateParsersOptions`, `CreateParsersResult`, and the worker-pool types.