Add post-solve per-process Rosenbrock tendency diagnostics#732
Draft
haakon-e wants to merge 1 commit into
This change is part of the following stack:
Change managed by git-spice.
This was referenced Jun 14, 2026
3877f3b to
060db2d
060db2d to
bc7f5f3
Adds a diagnostic RosenbrockAverageVerbose mode for the 1M and 2M+P3 schemes. Each substep solve is linear in the raw tendency, so pushing each process's instantaneous contribution f_p through the same equilibrated solve operator (one factorization context reused per substep) gives a realized per-process increment, and these increments sum exactly to the unclamped Δx. The positivity clamp max.(x + Δx, 0) is not linear, so its effect is reported as a separate, non-attributable clamp-correction term rather than folded into the per-process tendencies: the per-process parts sum to the unclamped increment and the clamp correction carries the remainder. The non-verbose hot loop is untouched and stays 0-alloc: the kernel is split into _rosenbrock_system + _rosenbrock_solve, with the existing update solving against the same context. Verbose per-process functors re-group the existing physics calls and sum to the existing instantaneous total exactly.
bc7f5f3 to
258ece7
What this adds
This adds post-solve per-process tendency diagnostics for the Rosenbrock averaged-tendency modes, as a verbose variant of the mode, for both the 1-moment and 2-moment + P3 configurations. The substep solve is linear in the tendency. The equilibrated system is built once per substep, each process tendency is pushed through that same linear solve, and the per-process realized increments sum exactly to the full (unclamped) increment.
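The exactness claim follows from linearity alone. As a sketch, assume the substep solves a system of the form A Δx = h f, where A is the system matrix built once per substep, h is the substep length, and f is the sum of the per-process tendencies f_p. The symbols A and h and the exact form of the system are assumptions for illustration, not taken from the code:

```latex
\Delta x \;=\; A^{-1}\, h f
        \;=\; A^{-1}\, h \sum_p f_p
        \;=\; \sum_p A^{-1}\, h f_p
        \;=\; \sum_p \Delta x_p
```

Any operator that is linear in the right-hand side gives the same identity, so it does not depend on how the system is equilibrated or factorized, provided every solve uses the same system.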
How
The substep kernel is split into a system-build step (equilibration and the system matrix, computed once per substep) and a solve step that is linear in the right-hand side, so the full update and each per-process attribution solve against the same system. The positivity clamp is nonlinear, so it is returned as a separate, non-attributable correction term rather than folded into the per-process sums. The verbose tendency functors regroup the existing process calls instead of recomputing the physics, so the per-process parts sum to the existing instantaneous total.
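The split described above can be illustrated with a small, self-contained Python sketch. It is not the PR's implementation: the function names only echo the system/solve split, and the matrix form A = I - gamma*h*J, the row equilibration, the process names, and all numbers are assumptions chosen so that the clamp is triggered.

```python
# Illustrative sketch only. The names echo the PR's kernel split, but the
# bodies, the form A = I - gamma*h*J, and every number here are assumptions.

def rosenbrock_system(J, h, gamma):
    """Build the row-equilibrated system matrix once and LU-factorize it."""
    n = len(J)
    A = [[(1.0 if i == j else 0.0) - gamma * h * J[i][j] for j in range(n)]
         for i in range(n)]
    # Row equilibration: scale each row by its largest absolute entry.
    scale = [1.0 / max(abs(v) for v in row) for row in A]
    LU = [[scale[i] * A[i][j] for j in range(n)] for i in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry of column k up.
        p = max(range(k, n), key=lambda r: abs(LU[r][k]))
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, perm, scale


def rosenbrock_solve(ctx, h, f):
    """Solve A * dx = h * f with the stored factorization (linear in f)."""
    LU, perm, scale = ctx
    n = len(f)
    # Apply the same row scaling and row permutation to the right-hand side.
    y = [scale[p] * h * f[p] for p in perm]
    for i in range(n):                      # forward substitution (unit L)
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    for i in reversed(range(n)):            # back substitution (U)
        for j in range(i + 1, n):
            y[i] -= LU[i][j] * y[j]
        y[i] /= LU[i][i]
    return y


# Hypothetical two-species state, Jacobian, and per-process tendencies.
x = [1e-4, 5e-4]
J = [[-2.0, 0.5], [2.0, -0.5]]
h, gamma = 1.0, 1.0
processes = {
    "condensation": [-4e-4, 4e-4],
    "autoconversion": [-6e-4, 6e-4],
}

ctx = rosenbrock_system(J, h, gamma)        # built once per substep
f_total = [sum(fp[i] for fp in processes.values()) for i in range(len(x))]
dx = rosenbrock_solve(ctx, h, f_total)      # full unclamped increment
per_process = {name: rosenbrock_solve(ctx, h, fp)
               for name, fp in processes.items()}

# The clamp is nonlinear, so its effect is kept as a separate correction.
x_new = [max(xi + di, 0.0) for xi, di in zip(x, dx)]
clamp = [xn - (xi + di) for xn, xi, di in zip(x_new, x, dx)]
```

Because the factorization is reused, each additional process costs one forward and one back substitution rather than a new factorization, and the clamp correction is simply whatever remains after the unclamped increment is applied.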
Testing
Exactness tests confirm that the per-process realized tendencies plus the clamp correction reconstruct the net tendency to roundoff (relative error around 1e-13 in Float64), and that the verbose net equals the non-verbose net, for both schemes and both precisions. The non-verbose hot path is unchanged and still allocation-free, and the verbose entries are JET-clean.
Notes
Draft; builds on the PRs below. One internal helper is intentionally not inlined, to sidestep a pre-existing type-inference false positive in the P3 state constructor that also affects the non-verbose path; this is documented in the code and has no cost on the hot loop.