[wip][core] Relieve lock contention on reference removal by offloading to the IO thread #63049
…thread Signed-off-by: yicheng <yicheng@anyscale.com>
Code Review
This pull request updates the RemoveLocalReference method in CoreWorker to execute its logic asynchronously on an IO thread, which is intended to reduce lock contention. A critical safety concern was raised regarding the use of a raw this pointer within the asynchronous lambda, as it could lead to a use-after-free error if the CoreWorker instance is destroyed before the task executes. The reviewer suggested capturing shared_from_this() to safely extend the object's lifetime during the operation.
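The lifetime concern above can be illustrated with a small, self-contained sketch. It is not Ray's code: `Worker`, `TaskQueue`, and `PostThenDestroyOwner` are hypothetical names, the queue is a single-threaded stand-in for the IO thread's event loop, and the sketch assumes the object is owned by a `std::shared_ptr` (a precondition of `shared_from_this()`).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <queue>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for an IO-thread event loop: tasks are queued now and run later.
using TaskQueue = std::queue<std::function<void()>>;

// Hypothetical worker used only for illustration; this is not Ray's CoreWorker.
class Worker : public std::enable_shared_from_this<Worker> {
 public:
  explicit Worker(std::shared_ptr<std::set<std::string>> refs) : refs_(std::move(refs)) {}

  void RemoveLocalReferenceAsync(TaskQueue &io, const std::string &object_id) {
    // Capturing `self` (a shared_ptr) instead of raw `this` keeps the Worker
    // alive until the queued task has finished running.
    io.push([self = shared_from_this(), object_id]() { self->refs_->erase(object_id); });
  }

 private:
  std::shared_ptr<std::set<std::string>> refs_;
};

// Posts a removal, drops the last external owner, then drains the queue.
// Returns the number of references left afterwards.
inline std::size_t PostThenDestroyOwner() {
  auto refs = std::make_shared<std::set<std::string>>();
  refs->insert("obj1");
  refs->insert("obj2");

  TaskQueue io;
  auto worker = std::make_shared<Worker>(refs);
  worker->RemoveLocalReferenceAsync(io, "obj1");
  worker.reset();  // the queued lambda's copy is now the only owner

  while (!io.empty()) {
    io.front()();  // safe: `self` still owns the Worker
    io.pop();      // destroying the task releases the Worker
  }
  return refs->size();
}
```

With a raw `this` capture, the call after `worker.reset()` would dereference a destroyed object. With the `shared_ptr` capture, the task runs against a live object and the object is released once the task itself is destroyed.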
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit a1c06bf. Configure here.
      // properly from reference counter.
      memory_store_->Delete(deleted);
    },
    "CoreWorker.RemoveLocalReference");
Race: async removal causes spurious leak warning
Medium Severity
Making RemoveLocalReference asynchronous introduces a race in SealOwned. At line 1184 of core_worker.cc, RemoveLocalReference is called and then HasReference is checked immediately after on the calling thread. Since the removal is now posted to the IO thread but hasn't executed yet, HasReference will almost always return true, emitting spurious "Object may leak" warnings on every SealOwned failure path.
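The ordering problem described above can be reproduced deterministically with a toy model. This is only a sketch under stated assumptions: `RefTable`, `TaskQueue`, and `CheckBeforeAndAfterDrain` are hypothetical names, and a single-threaded queue stands in for the IO thread so the result does not depend on scheduling.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the IO thread: posted tasks run only when drained.
using TaskQueue = std::queue<std::function<void()>>;

// Hypothetical reference table; not Ray's ReferenceCounter.
struct RefTable {
  std::set<std::string> refs;
  bool HasReference(const std::string &id) const { return refs.count(id) > 0; }
};

// Posts a removal, checks HasReference immediately, drains the queue, checks again.
// Returns {result of the immediate check, result after the queue has drained}.
inline std::pair<bool, bool> CheckBeforeAndAfterDrain() {
  RefTable table;
  table.refs.insert("obj");

  TaskQueue io;
  io.push([&table]() { table.refs.erase("obj"); });  // removal is only queued here

  const bool before = table.HasReference("obj");  // still present: task has not run

  while (!io.empty()) {
    io.front()();
    io.pop();
  }
  const bool after = table.HasReference("obj");  // removed once the task has run
  return {before, after};
}
```

In this model the immediate check sees the reference and would trigger the warning, while a check made after the posted task has run does not. One possible remedy, stated here as an assumption rather than the PR's chosen fix, is to perform the check inside the posted task or to keep this particular call path synchronous.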
PR description is empty, needs clarification
Low Severity
The PR description contains only the ## Description header with no content. It does not explain what problem is present or how it is fixed. This violates the clear PR descriptions rule.
⚠️ This PR needs a clearer title and/or description. To help reviewers, please ensure your PR includes:
- Title: A concise summary of the change
- Description:
  - What problem does this solve?
  - How does this PR solve it?
  - Any relevant context for reviewers such as:
    - Why is the problem important to solve?
    - Why was this approach chosen over others?
See this list of PRs as examples for PRs that have gone above and beyond:
- [Core] Introduce local port service discovery #59613
- [Core] Improve Large-Scale Resource View Synchronization Through Sync Message Batching #57641
- Remove node observability information from hot path of core components #56474
- [core][rdt] Support out-of-order actors by extracting metadata when creating #59610
- [core] fix open leak for plasma store memory (shm/fallback) by workers #52622
Triggered by project rule: Bugbot Rules


Description