Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 6 additions & 2 deletions src/algorithm/hgraph/hgraph.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,9 @@ HGraph::Tune(const std::string& parameters, bool disable_future_tuning) {
}

std::scoped_lock lock(this->add_mutex_);
if (this->immutable_.load(std::memory_order_acquire)) {
return false;
}

// check which code need to tune and update create_param_ptr_
bool is_tune_base_code = false;
Expand Down Expand Up @@ -433,14 +436,15 @@ HGraph::GetVectorByInnerId(InnerIdType inner_id, float* data) const {

void
HGraph::SetImmutable() {
if (this->immutable_) {
if (this->immutable_.load(std::memory_order_acquire)) {
return;
}
std::scoped_lock<std::shared_mutex> add_lock(this->add_mutex_);
std::scoped_lock<std::shared_mutex> wlock(this->global_mutex_);
Comment on lines +442 to 443
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

While acquiring add_mutex_ in SetImmutable() is intended to prevent a TOCTOU race with Tune(), it is currently insufficient because CHECK_IMMUTABLE_INDEX is checked outside of any lock in IndexImpl::Tune().

If Tune() passes the check before SetImmutable() runs, it will block on add_mutex_ inside HGraph::Tune(). Once SetImmutable() completes and releases the locks, Tune() will proceed to modify the index even though it has been made immutable. Since search threads skip acquiring global_mutex_ when immutable_ is true, this will lead to concurrent reads and writes on basic_flatten_codes_ and other member variables, causing a severe data race and undefined behavior.

To completely resolve this, HGraph::Tune() must re-check immutable_ after acquiring add_mutex_:

std::scoped_lock lock(this->add_mutex_);
if (this->immutable_.load(std::memory_order_acquire)) {
    return false;
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this! Already addressed in commit f895451 — Tune() now re-checks immutable_ after acquiring add_mutex_:

std::scoped_lock lock(this->add_mutex_);
if (this->immutable_.load(std::memory_order_acquire)) {
    return false;
}

this->neighbors_mutex_.reset();
this->neighbors_mutex_ = std::make_shared<EmptyMutex>();
this->searcher_->SetMutexArray(this->neighbors_mutex_);
this->immutable_ = true;
this->immutable_.store(true, std::memory_order_release);
}

void
Expand Down
27 changes: 18 additions & 9 deletions src/algorithm/hgraph/hgraph_search.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -82,10 +82,13 @@ HGraph::KnnSearch(const DatasetPtr& query,
fmt::format("ef_search({}) must in range[1, {}]", params.ef_search, ef_search_threshold));

std::shared_lock<std::shared_mutex> force_remove_rlock;
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
std::shared_lock<std::shared_mutex> shared_lock;
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
}
shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
}
Comment on lines 84 to 91
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using std::defer_lock is more idiomatic and efficient than default-constructing std::shared_lock and then move-assigning a temporary lock object inside the conditional block. This avoids the overhead of move construction/assignment.

Suggested change
std::shared_lock<std::shared_mutex> force_remove_rlock;
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
std::shared_lock<std::shared_mutex> shared_lock;
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
}
shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
}
std::shared_lock<std::shared_mutex> force_remove_rlock(this->force_remove_mutex_, std::defer_lock);
std::shared_lock<std::shared_mutex> shared_lock(this->global_mutex_, std::defer_lock);
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock.lock();
}
shared_lock.lock();
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. We intentionally use default-construct + conditional move-assign because force_remove_rlock is only conditionally associated with its mutex (when support_force_remove() is true). Applying defer_lock to it would unconditionally bind it to force_remove_mutex_, which is semantically imprecise. The current approach handles both lock variables uniformly and is correct (default-constructed shared_lock destructor is a no-op).

std::shared_lock shared_lock(this->global_mutex_);
k = std::min(k, GetNumElements());

FilterPtr ft = this->create_search_filter(filter, params.use_extra_info_filter);
Expand Down Expand Up @@ -357,10 +360,13 @@ HGraph::RangeSearch(const DatasetPtr& query,
this->validate_range_args(query, radius, limited_size);

std::shared_lock<std::shared_mutex> force_remove_rlock;
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
std::shared_lock<std::shared_mutex> shared_lock;
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
}
shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
}
Comment on lines 362 to 369
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using std::defer_lock is more idiomatic and efficient than default-constructing std::shared_lock and then move-assigning a temporary lock object inside the conditional block. This avoids the overhead of move construction/assignment.

Suggested change
std::shared_lock<std::shared_mutex> force_remove_rlock;
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
std::shared_lock<std::shared_mutex> shared_lock;
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
}
shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
}
std::shared_lock<std::shared_mutex> force_remove_rlock(this->force_remove_mutex_, std::defer_lock);
std::shared_lock<std::shared_mutex> shared_lock(this->global_mutex_, std::defer_lock);
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock.lock();
}
shared_lock.lock();
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. We intentionally use default-construct + conditional move-assign because force_remove_rlock is only conditionally associated with its mutex (when support_force_remove() is true). Applying defer_lock to it would unconditionally bind it to force_remove_mutex_, which is semantically imprecise. The current approach handles both lock variables uniformly and is correct (default-constructed shared_lock destructor is a no-op).

std::shared_lock shared_lock(this->global_mutex_);

InnerSearchParam search_param;
search_param.ep = this->entry_point_id_;
Expand Down Expand Up @@ -455,10 +461,13 @@ HGraph::SearchWithRequest(const SearchRequest& request) const {
fmt::format("ef_search({}) must in range[1, {}]", params.ef_search, ef_search_threshold));

std::shared_lock<std::shared_mutex> force_remove_rlock;
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
std::shared_lock<std::shared_mutex> shared_lock;
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
}
shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
}
Comment on lines 463 to 470
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using std::defer_lock is more idiomatic and efficient than default-constructing std::shared_lock and then move-assigning a temporary lock object inside the conditional block. This avoids the overhead of move construction/assignment.

Suggested change
std::shared_lock<std::shared_mutex> force_remove_rlock;
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
std::shared_lock<std::shared_mutex> shared_lock;
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock = std::shared_lock<std::shared_mutex>(this->force_remove_mutex_);
}
shared_lock = std::shared_lock<std::shared_mutex>(this->global_mutex_);
}
std::shared_lock<std::shared_mutex> force_remove_rlock(this->force_remove_mutex_, std::defer_lock);
std::shared_lock<std::shared_mutex> shared_lock(this->global_mutex_, std::defer_lock);
if (!this->immutable_.load(std::memory_order_acquire)) {
if (this->support_force_remove()) {
force_remove_rlock.lock();
}
shared_lock.lock();
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. We intentionally use default-construct + conditional move-assign because force_remove_rlock is only conditionally associated with its mutex (when support_force_remove() is true). Applying defer_lock to it would unconditionally bind it to force_remove_mutex_, which is semantically imprecise. The current approach handles both lock variables uniformly and is correct (default-constructed shared_lock destructor is a no-op).

std::shared_lock shared_lock(this->global_mutex_);
k = std::min(k, GetNumElements());

// Setup reasoning context if expected labels are provided.
Expand Down
2 changes: 1 addition & 1 deletion src/algorithm/inner_index_interface.h
Original file line number Diff line number Diff line change
Expand Up @@ -567,7 +567,7 @@ class InnerIndexInterface {
IndexFeatureListUPtr index_feature_list_{nullptr};

const InnerIndexParameterPtr create_param_ptr_{nullptr};
bool immutable_{false};
std::atomic<bool> immutable_{false};

protected:
std::atomic<int64_t> current_memory_usage_{0};
Expand Down
2 changes: 1 addition & 1 deletion src/algorithm/sindi/sindi.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -709,7 +709,7 @@ SINDI::CalDistanceById(const DatasetPtr& query,
void
SINDI::SetImmutable() {
std::scoped_lock wlock(this->global_mutex_);
this->immutable_ = true;
this->immutable_.store(true, std::memory_order_release);
}

void
Expand Down
2 changes: 1 addition & 1 deletion src/index/index_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ class IndexImpl : public Index {
return DatasetImpl::MakeEmptyDataset(); \
}
#define CHECK_IMMUTABLE_INDEX(operation_str) \
if (this->inner_index_->immutable_) { \
if (this->inner_index_->immutable_.load(std::memory_order_acquire)) { \
return tl::unexpected(Error(ErrorType::UNSUPPORTED_INDEX_OPERATION, \
"immutable index no support " operation_str)); \
}
Expand Down
Loading