[wip][core] Relieve lock contention on reference removal by offloading to the IO thread #63049

Open
Yicheng-Lu-llll wants to merge 3 commits into ray-project:master from Yicheng-Lu-llll:offload-ref-cleanup-to-io-thread
@Yicheng-Lu-llll
Member

Description

…thread

Signed-off-by: yicheng <yicheng@anyscale.com>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the RemoveLocalReference method in CoreWorker to execute its logic asynchronously on an IO thread, which is intended to reduce lock contention. A critical safety concern was raised regarding the use of a raw this pointer within the asynchronous lambda, as it could lead to a use-after-free error if the CoreWorker instance is destroyed before the task executes. The reviewer suggested capturing shared_from_this() to safely extend the object's lifetime during the operation.
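
The lifetime concern can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: `TaskQueue` and `Worker` are hypothetical stand-ins, not Ray's IO service or `CoreWorker`, and the sketch assumes the object is owned by a `std::shared_ptr`, which `shared_from_this()` requires. The queued lambda holds its own `shared_ptr`, so the object stays alive until the task has run, even after the caller releases its handle.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IO-thread task queue: posted tasks run later,
// when Drain() is called.
class TaskQueue {
 public:
  void Post(std::function<void()> task) { tasks_.push_back(std::move(task)); }

  void Drain() {
    while (!tasks_.empty()) {
      auto task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }  // `task` (and anything it captured) is destroyed at the end of each iteration.
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Hypothetical worker, not Ray's CoreWorker.
class Worker : public std::enable_shared_from_this<Worker> {
 public:
  explicit Worker(TaskQueue &queue) : queue_(queue) {}

  void AddLocalReference(const std::string &id) { refs_.insert(id); }

  // Posts the removal instead of running it inline. Capturing
  // shared_from_this() rather than a raw `this` keeps the Worker alive
  // until the task has executed.
  void RemoveLocalReferenceAsync(const std::string &id,
                                 std::vector<std::string> *removed) {
    queue_.Post([self = shared_from_this(), id, removed]() {
      if (self->refs_.erase(id) > 0) {
        removed->push_back(id);
      }
    });
  }

 private:
  TaskQueue &queue_;
  std::set<std::string> refs_;
};

// Drops the caller's last shared_ptr before the queued task runs and returns
// how many removals were still carried out.
inline std::size_t RunLifetimeDemo() {
  TaskQueue queue;
  std::vector<std::string> removed;
  std::weak_ptr<Worker> weak;
  {
    auto worker = std::make_shared<Worker>(queue);
    weak = worker;
    worker->AddLocalReference("obj-1");
    worker->RemoveLocalReferenceAsync("obj-1", &removed);
  }  // Caller's handle is gone; the queued lambda still owns one.
  assert(!weak.expired());  // Worker is kept alive by the pending task.
  queue.Drain();
  assert(weak.expired());   // Worker is destroyed once the task is done.
  return removed.size();
}
```

With a raw `this` capture, the same sequence would dereference a destroyed object when the task finally runs. Whether `shared_from_this()` is usable in the actual code depends on how the real object is owned, which this page does not show.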

Comment thread src/ray/core_worker/core_worker.h
@Yicheng-Lu-llll marked this pull request as ready for review on May 12, 2026 19:49
@Yicheng-Lu-llll requested a review from a team as a code owner on May 12, 2026 19:49
@Yicheng-Lu-llll changed the title from "[core] Relieve lock contention on reference removal by offloading to the IO thread" to "[wip][core] Relieve lock contention on reference removal by offloading to the IO thread" on May 12, 2026

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit a1c06bf.

// properly from reference counter.
memory_store_->Delete(deleted);
},
"CoreWorker.RemoveLocalReference");

Race: async removal causes spurious leak warning

Medium Severity

Making RemoveLocalReference asynchronous introduces a race in SealOwned. At line 1184 of core_worker.cc, RemoveLocalReference is called and then HasReference is checked immediately after on the calling thread. Since the removal is now posted to the IO thread but hasn't executed yet, HasReference will almost always return true, emitting spurious "Object may leak" warnings on every SealOwned failure path.
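
The ordering problem described above can be sketched with a single-threaded model. The names below (`TaskQueue`, `RunRaceDemo`, `RunCheckInsideTaskDemo`) are hypothetical and do not come from Ray; the queue stands in for the IO thread, and draining it stands in for the moment the posted task finally executes. In the real multi-threaded case the outcome depends on timing, whereas this model makes the stale read deterministic. The second function shows one possible remedy, offered only as an assumption about what a fix could look like: perform the check inside the posted task, after the removal.

```cpp
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the IO thread: tasks run only when Drain() is called.
class TaskQueue {
 public:
  void Post(std::function<void()> task) { tasks_.push_back(std::move(task)); }

  void Drain() {
    while (!tasks_.empty()) {
      auto task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

struct Observation {
  bool has_reference_before_task;  // What the caller sees right after posting.
  bool has_reference_after_task;   // What is true once the task has run.
};

// Models "post the removal, then check immediately on the calling thread".
inline Observation RunRaceDemo() {
  TaskQueue queue;
  std::set<std::string> refs{"obj-1"};

  queue.Post([&refs]() { refs.erase("obj-1"); });

  Observation obs;
  // The removal has only been queued, so the reference still appears present
  // and a leak warning based on this check would be spurious.
  obs.has_reference_before_task = refs.count("obj-1") > 0;
  queue.Drain();
  obs.has_reference_after_task = refs.count("obj-1") > 0;
  return obs;
}

// One possible remedy (an assumption, not the PR's code): run the check in
// the same posted task, after the removal. Returns whether a leak warning
// would be raised.
inline bool RunCheckInsideTaskDemo() {
  TaskQueue queue;
  std::set<std::string> refs{"obj-1"};
  bool leak_warning = false;

  queue.Post([&refs, &leak_warning]() {
    refs.erase("obj-1");
    leak_warning = refs.count("obj-1") > 0;
  });
  queue.Drain();
  return leak_warning;
}
```

In this model `RunRaceDemo()` observes the reference as still present before the task runs and absent afterwards, while `RunCheckInsideTaskDemo()` raises no warning. Keeping the removal synchronous on that particular failure path would be another option.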


PR description is empty, needs clarification

Low Severity

The PR description contains only the ## Description header with no content. It does not explain what problem is present or how it is fixed. This violates the clear PR descriptions rule.

⚠️ This PR needs a clearer title and/or description.

To help reviewers, please ensure your PR includes:

  • Title: A concise summary of the change
  • Description:
    • What problem does this solve?
    • How does this PR solve it?
    • Any relevant context for reviewers such as:
      • Why is the problem important to solve?
      • Why was this approach chosen over others?


Triggered by project rule: Bugbot Rules

@ray-gardener (bot) added the "core" label (Issues that should be addressed in Ray Core) on May 13, 2026