perf(hgraph): skip mutex acquisition on immutable index search #2154

Open
jfeng18 wants to merge 3 commits into antgroup:main from jfeng18:fix/hgraph-immutable-search

jfeng18 commented Jun 8, 2026

Summary

  • Skip global_mutex_ and force_remove_mutex_ acquisition in all three HGraph search paths (KnnSearch, RangeSearch, SearchWithRequest) when the index is immutable
  • Change immutable_ from bool to std::atomic<bool> with acquire/release semantics for ARM correctness
  • Have SetImmutable() acquire add_mutex_ before global_mutex_ to prevent TOCTOU race with Tune()
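
The acquire/release pairing in the second bullet can be illustrated with a minimal sketch. Only the name immutable_ comes from this PR; FrozenFlagDemo, codes, BuildAndFreeze, SumIfFrozen and FreezeAndReadFromAnotherThread are hypothetical names made up for the example, not code from the repository.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for an index; only the flag name mirrors the PR.
struct FrozenFlagDemo {
    std::vector<int> codes;               // plain (non-atomic) shared state
    std::atomic<bool> immutable_{false};

    // Writer side: finish every mutation, then publish with a release store.
    void BuildAndFreeze() {
        codes.assign(1000, 7);
        immutable_.store(true, std::memory_order_release);
    }

    // Reader side: an acquire load that observes `true` also observes all
    // writes sequenced before the release store. Returns -1 if not frozen.
    long SumIfFrozen() const {
        if (!immutable_.load(std::memory_order_acquire)) {
            return -1;
        }
        long sum = 0;
        for (int c : codes) {
            sum += c;
        }
        return sum;
    }
};

// A reader thread polls until the flag becomes visible, then reads the data.
inline long FreezeAndReadFromAnotherThread() {
    FrozenFlagDemo index;
    long result = -1;
    std::thread reader([&] {
        while ((result = index.SumIfFrozen()) < 0) {
            std::this_thread::yield();
        }
    });
    index.BuildAndFreeze();
    reader.join();
    return result;
}
```

With a plain bool, the reader's access to the flag would be a data race, and on weakly ordered hardware nothing would order the reads of codes after the flag check.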

Motivation

After SetImmutable(), no writer can modify shared state (all mutations guarded by CHECK_IMMUTABLE_INDEX). Yet every search unconditionally acquired shared_lock(global_mutex_), causing atomic CAS cache-line contention across all search threads on read-only indexes.
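
As a rough sketch of the idea, assuming a much simplified index: the names global_mutex_, immutable_ and SetImmutable follow the PR, while ToyIndex, Add, Contains and values_ are hypothetical and only stand in for the real search and mutation paths.

```cpp
#include <atomic>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical toy index, not the HGraph implementation.
class ToyIndex {
public:
    // Mutations take the exclusive lock and are rejected once frozen.
    bool Add(int value) {
        std::unique_lock<std::shared_mutex> wlock(global_mutex_);
        if (immutable_.load(std::memory_order_acquire)) {
            return false;
        }
        values_.push_back(value);
        return true;
    }

    // The exclusive lock waits for in-flight writers before the flag flips.
    void SetImmutable() {
        std::unique_lock<std::shared_mutex> wlock(global_mutex_);
        immutable_.store(true, std::memory_order_release);
    }

    // Search path: take the shared lock only while writers may still exist.
    bool Contains(int value) const {
        std::shared_lock<std::shared_mutex> rlock;  // owns nothing yet
        if (!immutable_.load(std::memory_order_acquire)) {
            rlock = std::shared_lock<std::shared_mutex>(global_mutex_);
        }
        for (int v : values_) {
            if (v == value) {
                return true;
            }
        }
        return false;
    }

private:
    mutable std::shared_mutex global_mutex_;
    std::atomic<bool> immutable_{false};
    std::vector<int> values_;
};
```

Once the flag reads true, Contains() performs no write to the mutex state at all, which is where the cache-line contention described above would otherwise come from.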

Test plan

  • Unit tests: 441 cases / 85M assertions passed (release)
  • PR-level HGraph functional tests: 38 cases / 827K assertions passed (release)
  • CI format/lint check (no local clang-format-15)
  • Adversarial review by 3 independent agents (concurrency specialist, memory model specialist, edge case hunter)

🤖 Generated with Claude Code

When an HGraph index is marked immutable via SetImmutable(), no writer
(Add/Build/Remove/Tune) can modify shared state. The search paths
previously acquired global_mutex_ and force_remove_mutex_ unconditionally,
causing unnecessary atomic CAS cache-line contention across all search
threads on read-only indexes.

This change skips both locks when immutable_ is true, eliminating the
contention on high-concurrency search workloads.

Additionally fixes two pre-existing correctness issues:
- immutable_ was a plain bool read without synchronization from search
  threads while written under exclusive lock by SetImmutable(). Changed
  to std::atomic<bool> with acquire/release semantics for ARM safety.
- SetImmutable() now acquires add_mutex_ before global_mutex_ to prevent
  a TOCTOU race where Tune() could pass CHECK_IMMUTABLE_INDEX, then
  SetImmutable() completes, then search skips the lock while Tune()
  modifies basic_flatten_codes_.

Signed-off-by: 水芝 <fengjiangtian.fjt@alibaba-inc.com>
Assisted-by: Claude:claude-opus-4-6
Contributor

mergify Bot commented Jun 8, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Require kind label

Wonderful, this rule succeeded.
  • label~=^kind/

🟢 Require version label

Wonderful, this rule succeeded.
  • label~=^version/

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request makes the index immutability check thread-safe by converting immutable_ to an atomic boolean and using acquire-release memory ordering. It also optimizes search paths by skipping global lock acquisitions when the index is immutable. The review feedback highlights a critical TOCTOU race condition in HGraph::Tune() where the index could still be modified after being marked immutable, and suggests re-checking the immutability status after acquiring add_mutex_. Additionally, the reviewer recommends using std::defer_lock for std::shared_lock initialization in search methods to avoid move-assignment overhead.

Comment on lines +439 to 440
std::scoped_lock<std::shared_mutex> add_lock(this->add_mutex_);
std::scoped_lock<std::shared_mutex> wlock(this->global_mutex_);
Contributor

critical

While acquiring add_mutex_ in SetImmutable() is intended to prevent a TOCTOU race with Tune(), it is currently insufficient because CHECK_IMMUTABLE_INDEX is checked outside of any lock in IndexImpl::Tune().

If Tune() passes the check before SetImmutable() runs, it will block on add_mutex_ inside HGraph::Tune(). Once SetImmutable() completes and releases the locks, Tune() will proceed to modify the index even though it has been made immutable. Since search threads skip acquiring global_mutex_ when immutable_ is true, this will lead to concurrent reads and writes on basic_flatten_codes_ and other member variables, causing a severe data race and undefined behavior.

To completely resolve this, HGraph::Tune() must re-check immutable_ after acquiring add_mutex_:

std::scoped_lock lock(this->add_mutex_);
if (this->immutable_.load(std::memory_order_acquire)) {
    return false;
}

Author

Thanks for catching this! Already addressed in commit f895451 — Tune() now re-checks immutable_ after acquiring add_mutex_:

std::scoped_lock lock(this->add_mutex_);
if (this->immutable_.load(std::memory_order_acquire)) {
    return false;
}
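
For readers following the thread, the combined fix can be sketched as below. This is a simplified model under stated assumptions: add_mutex_, global_mutex_, immutable_, SetImmutable and Tune are names from the PR, but TunableIndex, codes_, Codes() and the body of Tune() are hypothetical, and whether the real Tune() also takes global_mutex_ is an assumption of this sketch.

```cpp
#include <atomic>
#include <mutex>
#include <shared_mutex>

// Hypothetical model of the race fix, not the HGraph code.
class TunableIndex {
public:
    void SetImmutable() {
        // Lock order matches Tune(): add_mutex_ first, then global_mutex_.
        std::scoped_lock<std::shared_mutex> add_lock(add_mutex_);
        std::scoped_lock<std::shared_mutex> wlock(global_mutex_);
        immutable_.store(true, std::memory_order_release);
    }

    bool Tune(int new_codes) {
        // Early unlocked check (in the spirit of CHECK_IMMUTABLE_INDEX);
        // it can be stale by the time add_mutex_ is acquired.
        if (immutable_.load(std::memory_order_acquire)) {
            return false;
        }
        std::scoped_lock<std::shared_mutex> add_lock(add_mutex_);
        // Re-check under add_mutex_: SetImmutable() cannot run concurrently
        // with the code below because it needs the same mutex.
        if (immutable_.load(std::memory_order_acquire)) {
            return false;
        }
        std::scoped_lock<std::shared_mutex> wlock(global_mutex_);
        codes_ = new_codes;
        return true;
    }

    int Codes() const {
        std::shared_lock<std::shared_mutex> rlock(global_mutex_);
        return codes_;
    }

private:
    std::shared_mutex add_mutex_;
    mutable std::shared_mutex global_mutex_;
    std::atomic<bool> immutable_{false};
    int codes_ = 0;
};
```

The second check is what closes the window: a Tune() call that was blocked on add_mutex_ while SetImmutable() ran sees the flag as soon as it gets the mutex and returns without touching shared state.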

Comment on lines 84 to 91
 std::shared_lock<std::shared_mutex> force_remove_rlock;
-if (this->support_force_remove()) {
-    force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
+std::shared_lock<std::shared_mutex> shared_lock;
+if (!this->immutable_.load(std::memory_order_acquire)) {
+    if (this->support_force_remove()) {
+        force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
+    }
+    shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
 }
Contributor

medium

Using std::defer_lock is more idiomatic and efficient than default-constructing std::shared_lock and then move-assigning a temporary lock object inside the conditional block. This avoids the overhead of move construction/assignment.

Suggested change
std::shared_lock<std::shared_mutex> force_remove_rlock(this->force_remove_mutex_, std::defer_lock);
std::shared_lock<std::shared_mutex> shared_lock(this->global_mutex_, std::defer_lock);
if (!this->immutable_.load(std::memory_order_acquire)) {
    if (this->support_force_remove()) {
        force_remove_rlock.lock();
    }
    shared_lock.lock();
}

Author

Thanks for the suggestion. We intentionally use default-construct + conditional move-assign because force_remove_rlock is only conditionally associated with its mutex (when support_force_remove() is true). Applying defer_lock to it would unconditionally bind it to force_remove_mutex_, which is semantically imprecise. The current approach handles both lock variables uniformly and is correct (default-constructed shared_lock destructor is a no-op).
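
The difference between the two styles discussed in this thread can be checked directly against the standard library. The sketch below uses only std::shared_lock members (mutex() and owns_lock()); the LockState struct and the three helper functions are hypothetical names for the example.

```cpp
#include <mutex>
#include <shared_mutex>

// Snapshot of what a std::shared_lock reports about itself.
struct LockState {
    bool has_mutex;  // mutex() != nullptr
    bool owns;       // owns_lock()
};

// Default-constructed: no associated mutex, nothing owned.
inline LockState DefaultConstructedState() {
    std::shared_lock<std::shared_mutex> lock;
    return {lock.mutex() != nullptr, lock.owns_lock()};
}

// defer_lock: associated with the mutex, but not locked.
inline LockState DeferredState(std::shared_mutex& m) {
    std::shared_lock<std::shared_mutex> lock(m, std::defer_lock);
    return {lock.mutex() != nullptr, lock.owns_lock()};
}

// Default-construct, then move-assign a locking temporary (the PR's style).
inline LockState MoveAssignedState(std::shared_mutex& m) {
    std::shared_lock<std::shared_mutex> lock;
    lock = std::shared_lock<std::shared_mutex>(m);
    return {lock.mutex() != nullptr, lock.owns_lock()};
}
```

In both the default-constructed and the deferred case the destructor releases nothing, so either style is safe on the immutable path; the observable difference is only whether the lock object is bound to a mutex it never locks.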

Comment on lines 362 to 369
 std::shared_lock<std::shared_mutex> force_remove_rlock;
-if (this->support_force_remove()) {
-    force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
+std::shared_lock<std::shared_mutex> shared_lock;
+if (!this->immutable_.load(std::memory_order_acquire)) {
+    if (this->support_force_remove()) {
+        force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
+    }
+    shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
 }
Contributor

medium

Using std::defer_lock is more idiomatic and efficient than default-constructing std::shared_lock and then move-assigning a temporary lock object inside the conditional block. This avoids the overhead of move construction/assignment.

Suggested change
std::shared_lock<std::shared_mutex> force_remove_rlock(this->force_remove_mutex_, std::defer_lock);
std::shared_lock<std::shared_mutex> shared_lock(this->global_mutex_, std::defer_lock);
if (!this->immutable_.load(std::memory_order_acquire)) {
    if (this->support_force_remove()) {
        force_remove_rlock.lock();
    }
    shared_lock.lock();
}

Author

Thanks for the suggestion. We intentionally use default-construct + conditional move-assign because force_remove_rlock is only conditionally associated with its mutex (when support_force_remove() is true). Applying defer_lock to it would unconditionally bind it to force_remove_mutex_, which is semantically imprecise. The current approach handles both lock variables uniformly and is correct (default-constructed shared_lock destructor is a no-op).

Comment on lines 463 to 470
 std::shared_lock<std::shared_mutex> force_remove_rlock;
-if (this->support_force_remove()) {
-    force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
+std::shared_lock<std::shared_mutex> shared_lock;
+if (!this->immutable_.load(std::memory_order_acquire)) {
+    if (this->support_force_remove()) {
+        force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
+    }
+    shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
 }
Contributor

medium

Using std::defer_lock is more idiomatic and efficient than default-constructing std::shared_lock and then move-assigning a temporary lock object inside the conditional block. This avoids the overhead of move construction/assignment.

Suggested change
std::shared_lock<std::shared_mutex> force_remove_rlock(this->force_remove_mutex_, std::defer_lock);
std::shared_lock<std::shared_mutex> shared_lock(this->global_mutex_, std::defer_lock);
if (!this->immutable_.load(std::memory_order_acquire)) {
    if (this->support_force_remove()) {
        force_remove_rlock.lock();
    }
    shared_lock.lock();
}

Author

Thanks for the suggestion. We intentionally use default-construct + conditional move-assign because force_remove_rlock is only conditionally associated with its mutex (when support_force_remove() is true). Applying defer_lock to it would unconditionally bind it to force_remove_mutex_, which is semantically imprecise. The current approach handles both lock variables uniformly and is correct (default-constructed shared_lock destructor is a no-op).

Prevents a TOCTOU race where Tune() passes CHECK_IMMUTABLE_INDEX before
SetImmutable() runs, then proceeds to modify basic_flatten_codes_ after
SetImmutable() completes, while search threads have already stopped
acquiring global_mutex_.

Signed-off-by: 水芝 <fengjiangtian.fjt@alibaba-inc.com>
Assisted-by: Claude:claude-opus-4-6
Author

jfeng18 commented Jun 8, 2026

Hi maintainers, could you please add kind/improvement and version/1.0 labels? I don't have label permissions as a fork contributor. Thanks!

wxyucs added the kind/improvement and version/1.0 labels Jun 8, 2026
wxyucs self-assigned this Jun 8, 2026
Collaborator

wxyucs commented Jun 8, 2026

> Hi maintainers, could you please add kind/improvement and version/1.0 labels? I don't have label permissions as a fork contributor. Thanks!

@jfeng18 Hi! Thanks for your contribution! I have already added labels. Please check the comments from the AI reviewers.

Collaborator

wxyucs commented Jun 8, 2026

Hi @jfeng18, the Format check has failed. Please format the code locally with make fmt and push to GitHub again.

Signed-off-by: 水芝 <fengjiangtian.fjt@alibaba-inc.com>
Assisted-by: Claude:claude-opus-4-6
Author

jfeng18 commented Jun 8, 2026

Hi @wxyucs, thanks for adding the labels! I've addressed all review comments:

  • Format check: Fixed and pushed (commit 4256894)
  • Tune TOCTOU (critical): Already fixed in commit f895451 — re-checks immutable_ after acquiring add_mutex_
  • defer_lock (medium x3): Explained why we prefer the current approach (replied inline)

All CI checks should be green now.

Labels

kind/improvement, module/index, size/M, version/1.0
