BE-607: HashQL: Refactor ModuleRegistry into builder/immutable split with allocator-generic stdlib construction #8887

Open
indietyp wants to merge 12 commits into bm/be-538-hashql-permission-system-integration from bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module
@indietyp indietyp commented Jun 19, 2026

🌟 What is the purpose of this PR?

Replaces the StandardLibrary type (12.9 KB on the stack due to nested SmallVec) with a flat ModuleCache backed by MaybeUninit slots and a FiniteBitSet, and restructures ModuleRegistry construction around a builder pattern (PartialModuleRegistry). This reduces stdlib construction cost by ~20% in instructions and ~60% in L1D cache misses.

🔍 What does this change?

Module cache and construction:

  • Replaced nested SmallVec<(TypeId, ModuleDef)> in StandardLibrary with a flat ModuleCache using Box<[MaybeUninit<ModuleDef>]> and FiniteBitSet<CacheId, u64> for initialization tracking
  • Split StandardLibrary into StandardLibrary (coordinator), StandardLibraryContext (passed to define methods), and ModuleCache (shared across modules)
  • Added CacheId enum with one variant per stdlib module in preorder traversal order, replacing runtime TypeId linear scan with compile-time indices
  • ModuleDef tracks emitted_len for exact output capacity when building item slices
  • build() now uses a direct match instead of [Option<ItemKind>; 2].flatten() for item emission
  • Item slices allocated directly on the heap (bypassing InternSet) since module item slices are provably unique by ModuleId ownership
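The slot-plus-bitmask layout described above can be sketched as follows. This is a minimal illustration, not the PR's code: `SlotCache`, `get_or_insert_with`, the three `CacheId` variants, and the plain `u64` mask (standing in for `FiniteBitSet<CacheId, u64>`) are all assumptions made for the example.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for the PR's `CacheId`: one variant per module,
/// used as a dense, compile-time index into the slot array.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum CacheId {
    Core = 0,
    Math = 1,
    Graph = 2,
}

impl CacheId {
    pub const COUNT: usize = 3;

    const fn index(self) -> usize {
        self as usize
    }
}

// The mask below is a single `u64`, so at most 64 slots can be tracked.
const _: () = assert!(CacheId::COUNT <= 64);

/// Flat cache: uninitialized slots plus a bitmask recording which are live.
pub struct SlotCache<T> {
    slots: Box<[MaybeUninit<T>]>,
    initialized: u64, // bit `i` set <=> `slots[i]` holds a valid `T`
}

impl<T> SlotCache<T> {
    pub fn new() -> Self {
        let slots = (0..CacheId::COUNT).map(|_| MaybeUninit::uninit()).collect();
        Self { slots, initialized: 0 }
    }

    pub fn is_initialized(&self, id: CacheId) -> bool {
        self.initialized & (1 << id.index()) != 0
    }

    /// Returns the cached value, building it on the first request only.
    pub fn get_or_insert_with(&mut self, id: CacheId, build: impl FnOnce() -> T) -> &T {
        let index = id.index();
        if !self.is_initialized(id) {
            self.slots[index].write(build());
            self.initialized |= 1 << index;
        }
        // SAFETY: the bit for `index` is set, so the slot was written above
        // or by an earlier call, and it is only dropped in `Drop`.
        unsafe { self.slots[index].assume_init_ref() }
    }
}

impl<T> Drop for SlotCache<T> {
    fn drop(&mut self) {
        for (index, slot) in self.slots.iter_mut().enumerate() {
            if self.initialized & (1 << index) != 0 {
                // SAFETY: the bit guarantees this slot is initialized.
                unsafe { slot.assume_init_drop() };
            }
        }
    }
}
```

Calling `get_or_insert_with(CacheId::Math, ...)` twice runs the builder closure once; the second call is an index plus a bit test, which is the property that replaces a linear scan over `(TypeId, ModuleDef)` pairs.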

Registry restructuring:

  • Introduced PartialModuleRegistry as a mutable builder that collects modules during construction
  • ModuleRegistry is now immutable after construction, backed by a contiguous IdSlice instead of InternMap
  • Removed RwLock from the root namespace (mutable access during construction, shared reference after)
  • Added ModuleRegistry::builder() test helper and PartialModuleRegistry::insert_root_module() for tests that inject custom modules
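A rough sketch of the builder/immutable split, under stated assumptions: the types below (`PartialRegistry`, `Registry`, `Module`, `ModuleId`) are simplified stand-ins, and the signatures of `builder`, `insert_root_module`, and `finish` are guesses for illustration rather than the actual hashql-core API.

```rust
use std::collections::HashMap;

/// Hypothetical dense module identifier (index into the module slice).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ModuleId(pub usize);

#[derive(Debug)]
pub struct Module {
    pub name: &'static str,
    pub items: Box<[&'static str]>,
}

/// Mutable builder: the only place where modules can be added.
#[derive(Default)]
pub struct PartialRegistry {
    modules: Vec<Module>,
    root: HashMap<&'static str, ModuleId>,
}

impl PartialRegistry {
    /// Registers a module in the root namespace and returns its id.
    pub fn insert_root_module(
        &mut self,
        name: &'static str,
        items: &[&'static str],
    ) -> ModuleId {
        let id = ModuleId(self.modules.len());
        self.modules.push(Module { name, items: items.into() });
        self.root.insert(name, id);
        id
    }

    /// Consumes the builder and freezes its contents.
    pub fn finish(self) -> Registry {
        Registry {
            modules: self.modules.into_boxed_slice(),
            root: self.root,
        }
    }
}

/// Immutable registry: exposes only `&self` methods, so no lock is needed
/// to share it after construction.
pub struct Registry {
    modules: Box<[Module]>,
    root: HashMap<&'static str, ModuleId>,
}

impl Registry {
    pub fn builder() -> PartialRegistry {
        PartialRegistry::default()
    }

    pub fn lookup_root(&self, name: &str) -> Option<&Module> {
        let id = *self.root.get(name)?;
        self.modules.get(id.0)
    }
}
```

Because `finish` takes `self` by value, the compiler rejects any insertion after the registry exists, which is what makes dropping the `RwLock` safe in this sketch.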

Symbol and type cleanup:

  • Replaced all heap.intern_symbol("...") calls in stdlib modules with pre-interned sym:: constants
  • Replaced all raw string .opaque("::...") calls with sym::path:: nested constants
  • Used Interned::empty() for zero-generic decl! expansions and special form types
  • ModuleDef parameterized by allocator S for bump-scope 2.x support

Benchmark results (Apple Silicon, bump-scope 2.x):

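| Counter               | Before   | After    | Delta  |
|-----------------------|----------|----------|--------|
| Instructions          | 440.0K   | 351.2K   | -20.2% |
| L1D cache misses      | 1,213    | 487      | -59.9% |
| Branch mispredictions | 420      | 347      | -17.4% |
| Walltime              | 23.49 us | 19.57 us | -16.7% |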
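The Delta column is the relative change from Before to After. A small helper (the name `percent_delta` is made up for this example) reproduces the reported figures from the raw counters:

```rust
/// Relative change from `before` to `after`, in percent.
/// Negative values mean the counter went down.
pub fn percent_delta(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Formats a delta the way the results are reported, e.g. "-20.2%".
pub fn format_delta(before: f64, after: f64) -> String {
    format!("{:+.1}%", percent_delta(before, after))
}
```

For instance, `percent_delta(440.0, 351.2)` is about -20.18, which rounds to the -20.2% listed for instructions.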

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

🐾 Next steps

Potential further optimizations identified but not implemented:

  • Cache primitive TypeIds (number, boolean, etc.) on TypeBuilder to avoid repeated intern round-trips
  • Hoist repeated monomorphic TypeDefs in modules like math and special_form
  • Pre-size hash maps for the finite stdlib construction
  • Batch generic declaration construction in the decl! macro

🛡 What tests cover this?

  • Existing hashql-core test suite (880 unit + 326 integration + 20 doc tests) all pass
  • The module::namespace and module::resolver tests exercise custom module construction through the new PartialModuleRegistry builder API

❓ How to test this?

  1. cargo test --package hashql-core
  2. Verify all 1226 tests pass

cursor Bot commented Jun 19, 2026

PR Summary

Medium Risk
Large refactor of core module resolution and stdlib registration; behavior should be equivalent but any mistake in builder invariants or cache indexing could break name resolution across the compiler.

Overview
Refactors how HashQL builds and stores the module graph and standard library, aiming for faster startup and simpler lookups after construction.

Module registry: Construction moves to PartialModuleRegistry (provision_module → insert_module → register → finish). The finished ModuleRegistry is immutable: modules live in a contiguous IdSlice, root names in a plain FastHashMap (no RwLock), and module items are heap slices instead of Interned slices. search_by_name uses DenseBitSet for visited modules. Item::absolute_path is removed in favor of absolute_path_rev (expander diagnostics reverse that when formatting paths).
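As an illustration of the last point, a caller holding a leaf-to-root sequence of path segments can restore root-to-leaf order when formatting. This is only a sketch: the function name, the `&str` segment type, and the leading `::` separator are assumptions, not taken from the expander's code.

```rust
/// Formats a path whose segments arrive in reverse (leaf first, root last).
/// Hypothetical helper; the real diagnostics code may differ.
pub fn format_absolute_path<'a>(rev_segments: impl Iterator<Item = &'a str>) -> String {
    // Buffer the segments so they can be emitted root-first.
    let mut segments: Vec<&str> = rev_segments.collect();
    segments.reverse();

    let mut out = String::new();
    for segment in segments {
        out.push_str("::");
        out.push_str(segment);
    }
    out
}
```

Producing the reversed order is cheap when each item only knows its parent, and the reversal cost is paid only on the diagnostic path that actually prints the name.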

Stdlib: StandardLibrary no longer keeps a large nested SmallVec of ModuleDefs. Each stdlib module gets a compile-time CacheId; ModuleCache holds MaybeUninit<ModuleDef> slots with a FiniteBitSet, and cross-module types use cache.request instead of runtime TypeId scans. define takes StandardLibraryContext + allocator-generic scratch; ModuleDef tracks emitted_len for exact item buffers. Stdlib code switches from ad-hoc heap.intern_symbol / string opaque paths to sym:: and sym::path:: constants; zero-generic decl! uses Interned::empty().

Tests / misc: Namespace and resolver tests build custom modules via ModuleRegistry::builder() and insert_root_module. darwin-kperf drops as5-1 from the M5 CPU name mapping.

Reviewed by Cursor Bugbot for commit df8bde2.

indietyp commented Jun 19, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@vercel vercel Bot temporarily deployed to Preview – petrinaut June 22, 2026 10:51 Inactive
Copilot AI review requested due to automatic review settings June 22, 2026 12:02
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from 8dd3f9b to 929d239 Compare June 22, 2026 12:02
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from cf0c212 to e59222a Compare June 22, 2026 12:02
@vercel vercel Bot temporarily deployed to Preview – petrinaut June 22, 2026 12:02 Inactive

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 1 comment.

Comment thread libs/@local/hashql/core/src/module/std_lib/mod.rs
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from e59222a to 00cf591 Compare June 22, 2026 12:48
Copilot AI review requested due to automatic review settings June 23, 2026 09:41
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from 00cf591 to 85a5c9a Compare June 23, 2026 09:41
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from 81f4f86 to 0f60303 Compare June 23, 2026 09:41

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 85a5c9a.

Comment thread libs/darwin-kperf/events/src/lib.rs

Copilot AI left a comment

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 1 comment.

Comment thread libs/@local/hashql/core/src/module/std_lib/mod.rs
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from 85a5c9a to ceb05c9 Compare June 23, 2026 09:52
@indietyp indietyp force-pushed the bm/be-538-hashql-permission-system-integration branch from 0f60303 to 5a0812f Compare June 23, 2026 09:52
Copilot AI review requested due to automatic review settings June 23, 2026 10:02
@indietyp indietyp force-pushed the bm/be-607-hashql-remove-smallvec-from-the-standardlibrary-for-module branch from ceb05c9 to df8bde2 Compare June 23, 2026 10:02

Copilot AI left a comment

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 1 comment.

Comment on lines 59 to 63
              b"a15" => Some(Self::M2),
              b"a16" | b"as1" | b"as2" | b"as3" => Some(Self::M3),
              b"as4" | b"as4-1" | b"as4-2" => Some(Self::M4),
-             b"as5" | b"as5-1" | b"as5-2" => Some(Self::M5),
+             b"as5" | b"as5-2" => Some(Self::M5),
              _ => None,
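To make the effect of the mapping with `as5-1` dropped concrete, here is a self-contained sketch of such a lookup. The enum name `Chip` and the function name `from_cpu_name` are hypothetical; only the match arms are taken from the snippet in this thread.

```rust
/// Hypothetical chip generation enum; only the variants visible in the
/// reviewed snippet are included.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Chip {
    M2,
    M3,
    M4,
    M5,
}

impl Chip {
    /// Maps a raw CPU name to a chip generation, using the arms from the
    /// version of the match that omits `as5-1`.
    pub fn from_cpu_name(name: &[u8]) -> Option<Self> {
        match name {
            b"a15" => Some(Self::M2),
            b"a16" | b"as1" | b"as2" | b"as3" => Some(Self::M3),
            b"as4" | b"as4-1" | b"as4-2" => Some(Self::M4),
            b"as5" | b"as5-2" => Some(Self::M5),
            // Anything else, including `as5-1`, is treated as unknown.
            _ => None,
        }
    }
}
```

With these arms, `Chip::from_cpu_name(b"as5-1")` falls through to `None`, while `b"as4-1"` still resolves to `M4`.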
Labels

area/libs Relates to first-party libraries/crates/packages (area) type/eng > backend Owned by the @backend team
