perf(json): drop the (Double, StringView?) tuple from number lexing#3639
Open
mizchi wants to merge 1 commit into
Open
perf(json): drop the (Double, StringView?) tuple from number lexing#3639mizchi wants to merge 1 commit into
mizchi wants to merge 1 commit into
Conversation
The number-lex chain (lex_zero / lex_decimal_* / lex_number_end / lex_integer_end) returned (Double, StringView?), where the second component is the source-text view that's only ever Some(_) on the infinity-overflow path. Native boxes every tuple, so each parsed JSON number cost one heap allocation just to carry a value that was structurally None (1.2 M / 18 MB on the 10k-integer bench). Side-channel the optional view through a new private ParseContext.last_number_repr field. Each leaf return writes it (usually None); lex_main.mbt reads + clears when constructing the Number token. Split out of the original combined PR per review feedback (3/3). https://claude.ai/code/session_017MVaMFPQYuvxLCudMEsQr2
Contributor
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Refactors the JSON number lexer to avoid per-call tuple allocations by replacing (Double, StringView?) return values with a Double return plus a side-channel field last_number_repr on ParseContext.
Changes:
- Added mutable
last_number_repr : StringView?field toParseContext. - Changed all
lex_*number functions to returnDoubleonly, writing the optional string repr to the new field. - Updated
lex_valuecallers to read and clearlast_number_reprafter each number lex.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| json/internal_types.mbt | Adds the last_number_repr side-channel field with initialization. |
| json/lex_number.mbt | Removes tuple returns from number lexing functions; writes string repr to ctx field instead. |
| json/lex_main.mbt | Updates number-token call sites to read/clear last_number_repr from ctx. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
189
to
195
| ///| | ||
| // Returns the parsed `Double` and writes the optional string form to | ||
| // `ctx.last_number_repr` (instead of returning a `(Double, StringView?)` | ||
| // tuple). The tuple used to dominate native-target allocation in | ||
| // integer-heavy parses — one heap object per integer, 1.2 M / 18 MB on | ||
| // the 10 k-integer bench. See `internal_types.mbt` for the rationale. | ||
| fn ParseContext::lex_integer_end( |
Comment on lines
+52
to
55
| let n = ctx.lex_zero(start=ctx.offset - 2) | ||
| let repr = ctx.last_number_repr | ||
| ctx.last_number_repr = None | ||
| return Number(n, repr.map(repr => repr.to_owned())) |
Comment on lines
+378
to
+379
| ctx.last_number_repr = None | ||
| return signed.to_double() |
Contributor
|
@mizchi this is an invasive change. did you try to define a valtype for (Double,StringView?) or do you have numbers on how much perf is improved? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
#3633
The number-lex chain (lex_zero / lex_decimal_* / lex_number_end / lex_integer_end) returned (Double, StringView?), where the second component is the source-text view that's only ever Some(_) on the infinity-overflow path. Native boxes every tuple, so each parsed JSON number cost one heap allocation just to carry a value that was structurally None (1.2 M / 18 MB on the 10k-integer bench).
Side-channel the optional view through a new private ParseContext.last_number_repr field. Each leaf return writes it (usually None); lex_main.mbt reads + clears when constructing the Number token.