perf: stream potential tokens in OriginalSource, avoid discarded slices #246
alexander-akait wants to merge 4 commits into
OriginalSource.streamChunks built the full splitIntoPotentialTokens array of substrings and iterated it, even though map()/sourceAndMap() run with finalSource:true and discard every chunk substring.

Refactor splitIntoPotentialTokens into a streaming core, eachPotentialToken, that reports each token by [start,end) offset; the array-returning helper becomes a thin wrapper (its unit test and benchmark are unchanged). OriginalSource consumes the streaming core and slices only when a chunk is actually emitted, never on the final-source map/sourceAndMap paths.

Measured (interleaved in-process A/B vs current main):
- OriginalSource.map(): ~+15-18% CPU, -38..-46% allocation
- OriginalSource.sourceAndMap(): ~+37-40% CPU, -38..-46% allocation
- streamChunks (slices needed): ~+30% CPU (no intermediate array)

All 89,876 tests (incl. Fuzzy + 1373 snapshots) pass; output is byte-identical.
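As a rough illustration of the offset-streaming idea in this commit, here is a minimal sketch. The names `eachTokenSketch`, `splitTokensSketch` and `countTokensSketch`, and the simplified rule that a token ends right after `;`, `{`, `}` or a newline, are assumptions made for illustration; they are not the library's actual `eachPotentialToken` logic.

```javascript
// Hypothetical, simplified token rule: a token ends right after
// ';' (59), '{' (123), '}' (125) or '\n' (10).
function eachTokenSketch(str, onToken) {
  let start = 0;
  for (let i = 0; i < str.length; i++) {
    const cc = str.charCodeAt(i);
    if (cc === 10 || cc === 59 || cc === 123 || cc === 125) {
      // Report the half-open range [start, i + 1) without slicing.
      onToken(start, i + 1);
      start = i + 1;
    }
  }
  // Trailing text with no terminating delimiter is the last token.
  if (start < str.length) onToken(start, str.length);
}

// Array-returning wrapper: this is the only place substrings are created.
function splitTokensSketch(str) {
  const out = [];
  eachTokenSketch(str, (s, e) => out.push(str.slice(s, e)));
  return out;
}

// A consumer that, like a final-source pass, needs positions only
// and therefore never allocates a substring.
function countTokensSketch(str) {
  let n = 0;
  eachTokenSketch(str, () => {
    n++;
  });
  return n;
}
```

A caller that needs the text uses `splitTokensSketch`; a caller that only needs positions passes its own callback and skips every `slice` call.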
🦋 Changeset detected
Latest commit: ec58f39
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main     #246      +/-   ##
==========================================
+ Coverage   98.11%   98.13%   +0.01%
==========================================
  Files          25       25
  Lines        2068     2089      +21
  Branches      669      674       +5
==========================================
+ Hits         2029     2050      +21
  Misses         37       37
  Partials        2        2
Merging this PR will not alter performance
…sion)

The previous commit turned the array-returning splitIntoPotentialTokens into a thin wrapper over eachPotentialToken, driving it with a per-token callback. CodSpeed flagged the helper's own benchmark regressing ~11-15% (instruction count): the callback indirection stops V8 inlining the slice/push of the hot scan.

Restore the standalone direct loop for splitIntoPotentialTokens and keep eachPotentialToken separate for OriginalSource. They share only the small classification table, so there is no behavioural duplication. The OriginalSource map()/sourceAndMap() win is unchanged (it uses eachPotentialToken), and the helper benchmark is back to parity.
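The split described in this commit, two separate loops that share only a classification table, might look roughly like the sketch below. The table contents and the names `TOKEN_END`, `splitDirectSketch` and `eachOffsetsSketch` are hypothetical; the library's real classification rules are not reproduced here.

```javascript
// Hypothetical shared classification table: 1 marks a char code that ends a token.
const TOKEN_END = new Uint8Array(128);
for (const ch of ";{}\n") TOKEN_END[ch.charCodeAt(0)] = 1;

const endsToken = (cc) => cc < 128 && TOKEN_END[cc] === 1;

// Direct loop: slice and push happen inline, with no per-token callback,
// which leaves the engine free to optimise the hot scan as one unit.
function splitDirectSketch(str) {
  const out = [];
  let start = 0;
  for (let i = 0; i < str.length; i++) {
    if (endsToken(str.charCodeAt(i))) {
      out.push(str.slice(start, i + 1));
      start = i + 1;
    }
  }
  if (start < str.length) out.push(str.slice(start));
  return out;
}

// Streaming loop: same table, but only [start, end) offsets are reported,
// so callers that discard the text never pay for a slice.
function eachOffsetsSketch(str, onToken) {
  let start = 0;
  for (let i = 0; i < str.length; i++) {
    if (endsToken(str.charCodeAt(i))) {
      onToken(start, i + 1);
      start = i + 1;
    }
  }
  if (start < str.length) onToken(start, str.length);
}
```

The scanning logic appears twice, but both loops read the same table, so the token boundaries cannot drift apart unless one loop's control flow is changed.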
splitIntoPotentialTokens is now a standalone loop (production goes through eachPotentialToken), so its phase-2 end-of-string break and phase-3 newline branches were no longer exercised by the OriginalSource suite, dropping coverage. Add round-trip and branch-targeted cases so the helper is fully covered on its own.
5bbef09 to 8d47d1f
8d47d1f to 62d5e9b
CodSpeed flags "Different runtime environments detected" and shows phantom regressions across untouched benchmarks because `ubuntu-latest` migrates between underlying images (22.04 -> 24.04), so the stored main BASE and the PR HEAD can run on different system libraries. Pin the OS image to ubuntu-24.04 for both benchmark jobs so base and head share an identical runtime environment. Node is deliberately left at `lts/*` rather than pinned: main and PRs resolve it to the same release on a given day, whereas pinning a specific Node would itself create a base/head mismatch until main is re-benchmarked under the pin.
Summary
`OriginalSource.streamChunks` (and therefore `map()` / `sourceAndMap()`) built the full `splitIntoPotentialTokens` array of substrings and iterated it, even though `getMap` / `getSourceAndMap` run `streamChunks` with `finalSource: true`, where every chunk substring is dropped (`chunk = finalSource ? undefined : match`). On the dominant `map()` / `sourceAndMap()` paths the code was allocating the whole token array and every per-token slice, only to discard them.

This refactors `splitIntoPotentialTokens` into a streaming core, `eachPotentialToken(str, onToken)`, that reports each token by `[start, end)` offset instead of materialising substrings. The array-returning `splitIntoPotentialTokens` becomes a thin wrapper over it (its unit test and benchmark are unchanged). `OriginalSource` consumes the streaming core and slices a chunk only when one is actually emitted, never on the final-source `map()` / `sourceAndMap()` paths.

This is the same class of fix as #240 (lookup-table token classification) and #226 (single-line `ReplaceSource` fast path), sitting right on top of the path #240 just optimized.

Measured impact

In-process interleaved A/B (both lib versions loaded in one process, alternated each round so shared-host CPU drift cancels in the ratio; CPU = min/median over 80 rounds, allocation = single-call `gc()` heap delta). This methodology was chosen because separate-process wall-clock has a ±17% noise floor on the measurement host.

Paths measured:
- `OriginalSource.map()`
- `OriginalSource.sourceAndMap()`
- `streamChunks()` (chunks genuinely needed, non-final)

Correctness

- `lint` (eslint + `tsc` + types generation) is clean.
- …patch).

Notes

I also prototyped extending the same idea to the non-final `streamChunksOfSourceMap` variants, but measured it as perf-neutral (0.0% allocation, <2% CPU) and dropped it: V8 represents `String.slice` of long strings as a zero-copy `SlicedString`, so `splitIntoLines`'s array was already cheap, and in the non-final path the emitted chunks are retained into the concatenated output. The gain here is specifically from not producing strings that get discarded (`finalSource` mode), which is unique to the `OriginalSource` map/sourceAndMap paths.
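The interleaved A/B timing described under "Measured impact" could be approximated with a small Node.js harness like the one below. This is a sketch under stated assumptions, not the script used for the PR's numbers: the helper names (`timeOnce`, `median`, `interleavedAB`) are hypothetical, swapping the run order on every round is one interpretation of "alternated each round", and the allocation measurement via `gc()` is left out.

```javascript
// Time a single call in nanoseconds using Node's monotonic clock.
function timeOnce(fn) {
  const t0 = process.hrtime.bigint();
  fn();
  return Number(process.hrtime.bigint() - t0);
}

// Median of a numeric array (does not mutate the input).
function median(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const mid = s.length >> 1;
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Run both candidates in the same process, swapping which goes first each
// round, so slow drift in host load affects both sample sets about equally.
function interleavedAB(fnA, fnB, rounds = 80) {
  const a = [];
  const b = [];
  for (let r = 0; r < rounds; r++) {
    if (r % 2 === 0) {
      a.push(timeOnce(fnA));
      b.push(timeOnce(fnB));
    } else {
      b.push(timeOnce(fnB));
      a.push(timeOnce(fnA));
    }
  }
  return {
    samplesA: a,
    samplesB: b,
    // Ratios below 1 mean B took less time than A.
    minRatio: Math.min(...b) / Math.min(...a),
    medianRatio: median(b) / median(a),
  };
}
```

Comparing ratios taken within one process, rather than absolute timings from separate runs, is what lets this kind of harness stay useful on a noisy shared host.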