BE-607: HashQL: Refactor ModuleRegistry into builder/immutable split with allocator-generic stdlib construction#8887
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
|
PR SummaryMedium Risk Overview Module registry: Construction moves to Stdlib: Tests / misc: Namespace and resolver tests build custom modules via Reviewed by Cursor Bugbot for commit df8bde2. Bugbot is set up for automated code reviews on this repo. Configure here. |
8dd3f9b to
929d239
Compare
cf0c212 to
e59222a
Compare
e59222a to
00cf591
Compare
00cf591 to
85a5c9a
Compare
81f4f86 to
0f60303
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 85a5c9a. Configure here.
85a5c9a to
ceb05c9
Compare
0f60303 to
5a0812f
Compare
ceb05c9 to
df8bde2
Compare
| b"a15" => Some(Self::M2), | ||
| b"a16" | b"as1" | b"as2" | b"as3" => Some(Self::M3), | ||
| b"as4" | b"as4-1" | b"as4-2" => Some(Self::M4), | ||
| b"as5" | b"as5-1" | b"as5-2" => Some(Self::M5), | ||
| b"as5" | b"as5-2" => Some(Self::M5), | ||
| _ => None, |


🌟 What is the purpose of this PR?
Replaces the
StandardLibrarytype (12.9KB on the stack due to nestedSmallVec) with a flatModuleCachebacked byMaybeUninitslots and aFiniteBitSet, and restructuresModuleRegistryconstruction around a builder pattern (PartialModuleRegistry). Reduces stdlib construction cost by ~20% in instructions and ~60% in L1D cache misses.🔍 What does this change?
Module cache and construction:
SmallVec<(TypeId, ModuleDef)>inStandardLibrarywith a flatModuleCacheusingBox<[MaybeUninit<ModuleDef>]>andFiniteBitSet<CacheId, u64>for initialization trackingStandardLibraryintoStandardLibrary(coordinator),StandardLibraryContext(passed todefinemethods), andModuleCache(shared across modules)CacheIdenum with one variant per stdlib module in preorder traversal order, replacing runtimeTypeIdlinear scan with compile-time indicesModuleDeftracksemitted_lenfor exact output capacity when building item slicesmatchinbuild()instead of[Option<ItemKind>; 2].flatten()for item emissionInternSet) since module item slices are provably unique byModuleIdownershipRegistry restructuring:
PartialModuleRegistryas a mutable builder that collects modules during constructionModuleRegistryis now immutable after construction, backed by a contiguousIdSliceinstead ofInternMapRwLockfrom the root namespace (mutable access during construction, shared reference after)ModuleRegistry::builder()test helper andPartialModuleRegistry::insert_root_module()for tests that inject custom modulesSymbol and type cleanup:
heap.intern_symbol("...")calls in stdlib modules with pre-internedsym::constants.opaque("::...")calls withsym::path::nested constantsInterned::empty()for zero-genericdecl!expansions and special form typesModuleDefparameterized by allocatorSfor bump-scope 2.x supportBenchmark results (Apple Silicon, bump-scope 2.x):
Pre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:
🐾 Next steps
Potential further optimizations identified but not implemented:
TypeIds (number, boolean, etc.) onTypeBuilderto avoid repeated intern round-tripsTypeDefs in modules likemathandspecial_formdecl!macro🛡 What tests cover this?
hashql-coretest suite (880 unit + 326 integration + 20 doc tests) all passmodule::namespaceandmodule::resolvertests exercise custom module construction through the newPartialModuleRegistrybuilder API❓ How to test this?
cargo test --package hashql-core