Reduce the memory highwatermark in DistributedClosestPoint::computeClosestPoints by MrBurmark · Pull Request #1889 · llnl/axom

MrBurmark · 2026-06-18T18:44:56Z

Reduce the memory high-water mark in DistributedClosestPoint::computeClosestPoints. In DistributedClosestPoint::computeClosestPoints, clean up conduit nodes as soon as they are no longer needed instead of waiting until the end of the routine.
Also refactor storage to use unique_ptr instead of shared_ptr.

Summary

This PR is a refactoring/bugfix.
It does the following:
- Modifies/refactors DistributedClosestPoint::computeClosestPoints to use unique_ptr
- Fixes an issue where conduit Nodes lived longer than necessary causing increased memory usage
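
The effect of releasing nodes early on the memory high-water mark can be illustrated with a minimal, self-contained sketch. `FakeNode` and `runRounds` are hypothetical names invented for this illustration. They are not conduit or Axom types, and the control flow of the real routine is assumed rather than taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a conduit node: counts live instances so the
// peak number of simultaneously allocated nodes can be observed.
struct FakeNode
{
  static int live;
  static int peak;
  std::vector<double> payload;

  explicit FakeNode(std::size_t n) : payload(n)
  {
    ++live;
    if(live > peak) { peak = live; }
  }
  ~FakeNode() { --live; }
  FakeNode(const FakeNode&) = delete;
  FakeNode& operator=(const FakeNode&) = delete;
};
int FakeNode::live = 0;
int FakeNode::peak = 0;

// Runs `rounds` iterations, creating one node per iteration, and returns the
// peak number of live nodes. With releaseEarly == false, every node survives
// until the container goes out of scope at the end of the routine.
int runRounds(int rounds, bool releaseEarly)
{
  assert(rounds >= 0);
  FakeNode::live = 0;
  FakeNode::peak = 0;
  {
    std::vector<std::unique_ptr<FakeNode>> nodes;
    for(int i = 0; i < rounds; ++i)
    {
      nodes.push_back(std::make_unique<FakeNode>(1024));
      // ... the node would be used here ...
      if(releaseEarly) { nodes.back().reset(); }  // free as soon as it is done
    }
  }  // any nodes still held are freed here
  return FakeNode::peak;
}
```

Under these assumptions, `runRounds(8, false)` peaks at 8 live nodes while `runRounds(8, true)` peaks at 1. Because `unique_ptr` has a single owner, `reset()` frees the node at a known point, whereas a `shared_ptr` only frees it once the last copy is gone.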



rhornung67 · 2026-06-18T19:48:53Z

@MrBurmark thanks for this. I ran clang-format on your branch so the CI checks will run.

MrBurmark · 2026-06-23T00:00:10Z

    std::vector<MPI_Request> reqs;
-    for(auto& isr : isendRequests)
+    reqs.reserve(isendRequests.size());
+    for(auto const& isr : isendRequests)
    {
-      reqs.push_back(isr.m_request);
+      reqs.push_back(isr.first.m_request);
    }


Allocating and freeing a vector of requests every time this function is called seems suboptimal, but I didn't want to change too much at once.
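
One possible way to avoid the repeated allocation, sketched here under assumptions, is to let the caller own a scratch vector that is cleared and refilled on each call. `FakeSend`, its `int` handle, and `gatherHandles` are hypothetical placeholders for this illustration. The real code collects `MPI_Request` values, and nothing below is part of this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the object that wraps a non-blocking send.
// An int is used in place of MPI_Request to keep the sketch self-contained.
struct FakeSend
{
  int handle;
};

// Collects the handles into a caller-owned scratch vector.
// clear() removes the elements but leaves the capacity unchanged, so calls
// with inputs no larger than a previous call do not reallocate.
void gatherHandles(const std::vector<FakeSend>& sends, std::vector<int>& scratch)
{
  scratch.clear();
  scratch.reserve(sends.size());
  for(const auto& s : sends)
  {
    scratch.push_back(s.handle);
  }
}
```

The trade-off is that the scratch vector's storage stays allocated between calls, which matters when the goal is a lower high-water mark.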

The Request object holds onto the packed data so we don't need to hold onto the original node while the isends are processing.


MrBurmark · 2026-06-24T17:39:49Z

I had Codex take a look, and it pointed out that the requests own their own packed buffers, so I am now removing the nodes even earlier.
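
The ownership argument can be sketched as follows. `FakeNode`, `FakeRequest`, and `startSend` are hypothetical names for this illustration only. They model the idea that a request holds its own packed copy of the data, and they do not reproduce the conduit relay API.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the source node that holds the unpacked data.
struct FakeNode
{
  std::vector<char> data;
};

// Hypothetical stand-in for a send request that owns its packed buffer.
struct FakeRequest
{
  std::vector<char> packed;  // the in-flight send would read from this copy
};

// Packs the node's data into the request's own buffer, then frees the node.
// Taking the unique_ptr by value transfers ownership into this function.
FakeRequest startSend(std::unique_ptr<FakeNode> node)
{
  assert(node != nullptr);
  FakeRequest req;
  req.packed.assign(node->data.begin(), node->data.end());  // "pack" step
  node.reset();  // the source node is not needed once the data is packed
  return req;
}
```

A caller would write `FakeRequest r = startSend(std::move(node));`, after which `node` is empty and only the packed copy remains alive until the send completes.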

edponce

These are some general observations on how check_send_requests() is implemented and used, which can be discussed in another PR. Ideally, the argument isendRequests would be a std::vector that can be used directly in MPI_[Wait|Test]some(). When using these MPI functions, there is no need to resize the requests array, and inCount can remain invariant, though both can change if needed. Completed requests are nullified in the input requests array and ignored when the array is used again. Another observation: the MPI standard defines MPI_Request as an opaque handle, and treating it as a copyable datatype is not recommended, although many MPI implementations implement it as an integer.

Also, when waiting to complete all remaining non-blocking sends, these can be handled with MPI_Waitall() instead of using a while loop and invoking check_send_requests() multiple times.

publixsubfan · 2026-06-24T23:25:14Z

@MrBurmark - are the host-side memory allocations that aren't made via Umpire the primary concern/target of these code changes?

MrBurmark · 2026-06-25T14:38:09Z

> @MrBurmark - are the host-side memory allocations that aren't made via Umpire the primary concern/target of these code changes?

@publixsubfan Yes, the memory in the conduit nodes and requests is my main concern here. In this PR I reduce the lifetime of the nodes but do not change the lifetime of the requests.
I would like to pass an allocator to the conduit nodes and requests so we can use Umpire for pools and counting, but I did not see an existing mechanism in Axom for passing in allocators for different use cases, only the allocator that the parallel computation uses.

MrBurmark · 2026-06-25T14:41:38Z

> These are some general observations on how check_send_requests() is implemented and used, which can be discussed in another PR. Ideally, the argument isendRequests would be a std::vector that can be used directly in MPI_[Wait|Test]some(). When using these MPI functions, there is no need to resize the requests array, and inCount can remain invariant, though both can change if needed. Completed requests are nullified in the input requests array and ignored when the array is used again. Another observation: the MPI standard defines MPI_Request as an opaque handle, and treating it as a copyable datatype is not recommended, although many MPI implementations implement it as an integer.
>
> Also, when waiting to complete all remaining non-blocking sends, these can be handled with MPI_Waitall() instead of using a while loop and invoking check_send_requests() multiple times.

Given the request abstraction, I'm not sure that will be an easy thing to do in general. Perhaps if there were an abstraction for collections of requests?

rhornung67 · 2026-06-25T14:59:16Z

@publixsubfan I have been working through different approaches to making the Axom host allocation interface more consistent and flexible. You may recall that I put up a couple of PRs recently looking for comments and feedback. After discussions with several folks, I closed those and went back to the drawing board.

I am starting to work through a new PR where all host allocations will require an explicit allocation mechanism to be provided (e.g., Axom malloc, Umpire Host, or something else such as Umpire Pinned). Should we discuss this ASAP, or would it be better to talk about it when I have something close to done? I'm hoping to have a good draft by early next week.

MrBurmark · 2026-06-25T15:03:56Z

> @publixsubfan I have been working through different approaches to making the Axom host allocation interface more consistent and flexible. You may recall that I put up a couple of PRs recently looking for comments and feedback. After discussions with several folks, I closed those and went back to the drawing board.
>
> I am starting to work through a new PR where all host allocations will require an explicit allocation mechanism to be provided (e.g., Axom malloc, Umpire Host, or something else such as Umpire Pinned). Should we discuss this ASAP, or would it be better to talk about it when I have something close to done? I'm hoping to have a good draft by early next week.

@rhornung67 I would be interested in talking earlier rather than later.

publixsubfan · 2026-06-25T15:14:15Z

> @rhornung67 I would be interested in talking earlier rather than later.

I’ll be back next week if you want to take this offline. That said, presuming it’s just the changes here, I don’t see anything too controversial in this PR; long-term, a broader solution might be along the lines of what @rhornung67 is proposing with a new host memory interface.


rhornung67 · 2026-06-25T20:05:13Z

@MrBurmark I merged Axom develop into your PR branch and ran clang-format on it. Now all Axom tests pass.

We should probably figure out what we are not testing in your use case.


Free conduit nodes earlier

cc44e6f

In DistributedClosestPoint::computeClosestPoints cleanup conduit nodes as soon as they are no longer needed instead of waiting until the end of the routine. Also refactor storage to use unique_ptr instead of shared_ptr.

MrBurmark requested review from bmhan12 and kennyweiss June 18, 2026 18:44

MrBurmark added the Quest Issues related to Axom's 'quest' component label Jun 18, 2026

MrBurmark and others added 2 commits June 18, 2026 11:45

Merge branch 'develop' into bugfix/burmark1/DistributedClosestPointSendCompletionCleanUp

4ac28cc

Clang format

d5960d8

MrBurmark changed the title ~~Free conduit nodes earlier in DistributedClosestPoint~~ Reduce the memory highwatermark in DistributedClosestPoint::computeClosestPoints Jun 18, 2026

kennyweiss requested review from Arlie-Capps, BradWhitlock, cyrush, jcs15c, nselliott, publixsubfan and rhornung67 June 18, 2026 20:11

MrBurmark commented Jun 23, 2026


MrBurmark added 4 commits June 24, 2026 10:31

Remove Nodes even sooner

e966109

The Request object holds onto the packed data so we don't need to hold onto the original node while the isends are processing.

Merge branch 'bugfix/burmark1/DistributedClosestPointSendCompletionCleanUp' of github.com:llnl/axom into bugfix/burmark1/DistributedClosestPointSendCompletionCleanUp

274b063

Fix indenting

8735137

fix compile

75f6ca0

edponce reviewed Jun 24, 2026


rhornung67 added 2 commits June 25, 2026 08:43

clang-format

993710a

Merge branch 'develop' into bugfix/burmark1/DistributedClosestPointSendCompletionCleanUp

d819ae4

Merge branch 'develop' into bugfix/burmark1/DistributedClosestPointSendCompletionCleanUp

7d98a05

MrBurmark wants to merge 10 commits into develop from bugfix/burmark1/DistributedClosestPointSendCompletionCleanUp
