ideal_mhd_model: share the Jacobian kernel with exact autodiff (fwd vs rev) by krystophny · Pull Request #567 · proximafusion/vmecpp

krystophny · 2026-06-14T14:34:31Z

Stacked PR: part 4/19 of the differentiable-VMEC++ series. Merge after #565 (enzyme-build-option).
Diff is cumulative (includes ancestor commits) because the branches are stacked on the fork; review the net change described below.

What

Extract the half-grid Jacobian arithmetic into a shared, allocation-free kernel
ComputeHalfGridJacobian (ideal_mhd_model/jacobian_kernel.h) over flat
buffers, and use it from both the solver (IdealMhdModel::computeJacobian)
and an Enzyme forward/reverse autodiff test. One implementation, no duplication.
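
To make "allocation-free kernel over flat buffers" concrete, here is a minimal sketch of that shape. It is not the ComputeHalfGridJacobian signature from jacobian_kernel.h: the function name, parameter names and the simplified formula (plain averaging and differencing between two neighbouring full-grid surfaces) are assumptions for illustration, and the production kernel may contain further terms that this sketch does not reproduce.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for a half-grid Jacobian kernel.
// Inputs are n points on an inner ("_i") and outer ("_o") full-grid surface;
// outputs are the corresponding n half-grid values. No allocation, no Eigen.
inline void HalfGridJacobianSketch(
    std::size_t n, double delta_s,
    const double* r_i, const double* r_o,
    const double* z_i, const double* z_o,
    const double* ru_i, const double* ru_o,
    const double* zu_i, const double* zu_o,
    double* r12, double* ru12, double* zu12,
    double* rs, double* zs, double* tau) {
  assert(delta_s > 0.0);
  for (std::size_t k = 0; k < n; ++k) {
    // Half-grid averages of the full-grid geometry.
    r12[k] = 0.5 * (r_o[k] + r_i[k]);
    ru12[k] = 0.5 * (ru_o[k] + ru_i[k]);
    zu12[k] = 0.5 * (zu_o[k] + zu_i[k]);
    // Radial derivatives by differencing neighbouring surfaces.
    rs[k] = (r_o[k] - r_i[k]) / delta_s;
    zs[k] = (z_o[k] - z_i[k]) / delta_s;
    // tau is a product of these quantities, hence nonlinear in the geometry.
    tau[k] = ru12[k] * zs[k] - rs[k] * zu12[k];
  }
}
```

Because the function only reads and writes caller-owned arrays, the same body could be called from a solver method and handed to an autodiff tool without any wrapper objects.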

Why

This is the first kernel of the exact analytic+autodiff Hessian. The kernel maps
full-grid geometry to the half-grid r12, ru12, zu12, rs, zs and the Jacobian
tau. tau is nonlinear in the geometry, so its Jacobian is a building block of
the MHD force Hessian (chain rule composes the kernel Jacobians with the linear
spectral transforms to give the Hessian-vector product).

Writing it allocation-free over flat buffers is exactly the form Enzyme
differentiates -- both forward and reverse mode abort on dynamic Eigen
temporaries, and cannot dup member-struct objects. The solver keeps the kernel
plus the unchanged Jacobian-sign check.

Verification

Production is bit-for-bit unchanged: vmec_standalone MHD energy on solovev
2.548352e+00 and cth_like_fixed_bdy 5.057191e-02, before and after.

Enzyme test (ctest -R jacobian_kernel_autodiff, with
-DVMECPP_ENABLE_ENZYME=ON) differentiates the shared kernel; for
L = 0.5||outputs||^2:

reverse dL.v vs finite-diff : 1.9e-9      (exact, FD-limited)
forward dL.v vs finite-diff : 1.9e-9
forward / reverse agreement : 3e-15       (machine precision)
performance: reverse ~15 us (full 3328-elt gradient) | forward ~15 us (one direction)

Reverse returns the whole gradient in one pass, whereas forward needs one pass
per input direction, so reverse is ~3300x more efficient for the full
3328-element gradient of a scalar; forward is the cheaper primitive for a single
Jacobian/Hessian-vector product.
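
The three-way check reported above (reverse-mode gradient contracted with v, forward-mode directional derivative, central finite difference) can be illustrated without Enzyme. This is a minimal sketch, assuming a single tau-like output tau = ru*zs - rs*zu and L = 0.5*tau^2, with hand-derived derivatives standing in for the Enzyme-generated ones. All names (Vec4, Tau, Loss, LossGradient, LossDirectional, LossCentralFD, Dot) are hypothetical and not taken from the PR.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec4 = std::array<double, 4>;  // inputs ordered as (ru, zs, rs, zu)

inline double Tau(const Vec4& x) { return x[0] * x[1] - x[2] * x[3]; }

inline double Loss(const Vec4& x) {
  const double t = Tau(x);
  return 0.5 * t * t;
}

// "Reverse" result: the full gradient dL/dx, obtained in one evaluation.
inline Vec4 LossGradient(const Vec4& x) {
  const double t = Tau(x);
  return {t * x[1], t * x[0], -t * x[3], -t * x[2]};
}

// "Forward" result: the directional derivative dL.v for one direction v.
inline double LossDirectional(const Vec4& x, const Vec4& v) {
  const double t = Tau(x);
  const double dt = v[0] * x[1] + x[0] * v[1] - v[2] * x[3] - x[2] * v[3];
  return t * dt;
}

// Central finite difference of L along v with step h.
inline double LossCentralFD(const Vec4& x, const Vec4& v, double h) {
  Vec4 xp = x;
  Vec4 xm = x;
  for (std::size_t i = 0; i < x.size(); ++i) {
    xp[i] += h * v[i];
    xm[i] -= h * v[i];
  }
  return (Loss(xp) - Loss(xm)) / (2.0 * h);
}

inline double Dot(const Vec4& a, const Vec4& b) {
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}
```

Comparing Dot(LossGradient(x), v) with LossDirectional(x, v) tests the two derivative paths against each other at rounding level, while comparing either with LossCentralFD(x, v, h) is limited by the finite-difference truncation and cancellation error. That is the same split the reported numbers show.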

Scope

First of the force-chain kernels. The remaining kernels (metric, B^contra,
B_cov, pressure/energy, MHD force) follow the same shared-kernel pattern; the
exact force Hessian-vector product then composes them with the linear transforms
applied analytically. Stacked on #6.

Tracking: #588

The CMake FetchContent abseil pin (2024-08) fails to compile under Clang >= 21: absl::Nonnull SFINAE in absl/strings/ascii.cc and the numbers.cc nullability annotations are rejected by the newer frontend. Bump to the 20260107.1 LTS, which compiles cleanly under Clang 21.1.8 and GCC. Clang is the compiler required for the Enzyme autodiff build. The Bazel build keeps its own (BCR) abseil pin and is unaffected.

Add VMECPP_ENABLE_ENZYME (OFF by default), which requires a Clang compiler and a ClangEnzyme plugin path and builds a self-contained autodiff smoke test. The test differentiates a scalar objective written over Eigen::Map'd caller buffers and checks reverse- and forward-mode Enzyme gradients against the closed form and central finite differences. enzyme.h documents the intrinsic ABI and the allocation constraint that shapes the differentiable kernels: Enzyme cannot track Eigen's aligned allocator, so differentiable paths use Eigen::Map over caller-owned buffers and avoid heap expression temporaries. With the option off the build is unchanged.

Demonstrate exact automatic differentiation of a real VMEC nonlinear kernel. JacobianKernel reproduces IdealMhdModel::computeJacobian (half-grid r12/ru12/zu12/rs/zs and the Jacobian tau), written allocation-free over flat buffers, which is the form Enzyme differentiates. For L = 0.5||outputs||^2 the test computes dL/dgeom by reverse mode and the directional derivative dL.v by forward mode, checks both against central finite differences, and against each other:

reverse dL.v vs FD : 1.9e-9
forward dL.v vs FD : 1.9e-9
forward vs reverse : 2.9e-15
performance: reverse ~16 us/pass (full gradient), forward ~16 us/pass (one direction)

Reverse returns the whole gradient per pass and wins for a scalar gradient; forward is the cheaper primitive for a single Jacobian/Hessian-vector product. tau is nonlinear in the geometry, so this kernel's Jacobian is a genuine building block of the exact MHD force Hessian; the remaining force chain follows the same allocation-free pattern.

Move the half-grid Jacobian arithmetic into jacobian_kernel.h (ComputeHalfGridJacobian), allocation-free over flat buffers. Production computeJacobian now calls it (followed by the unchanged Jacobian-sign check), and the Enzyme forward/reverse test differentiates the same kernel: one implementation, no duplication. Bit-exact: vmec_standalone MHD energy unchanged on solovev (2.548352e+00) and cth_like_fixed_bdy (5.057191e-02). Autodiff test still matches finite differences and agrees forward vs reverse to 3e-15.

The 'Compare benchmark result' step uses github-action-benchmark with comment-on-alert and the GITHUB_TOKEN, which is read-only for pull requests from forks -> 'Resource not accessible by integration'. Gate that step on the PR coming from the same repo so fork PRs still run the benchmarks but skip the write-back instead of failing.

The pinned vmec-0.0.6 cp310 wheel was f90wrapped against numpy 1.x. Under the numpy 2.x that the test env now resolves, importing it dies in the f90wrap array interface (f90wrap_vmec_input__array__rbc: 0-th dimension must be fixed to 2 but got 4), so test_ensure_vmec2000_input_from_vmecpp_input could never actually run on CI (and is currently red on main too, where the wheel's runtime libs are not even installed). Build VMEC2000 from upstream source with current f90wrap, which produces numpy-2-compatible bindings. The recipe mirrors SIMSOPT's own CI (hiddenSymmetries/VMEC2000, cmake/machines/ubuntu.json). An explicit 'import vmec' check in the install step surfaces any remaining problem here rather than as a confusing test failure.

With VMEC2000 built from current upstream source, the compatibility test runs for the first time and hits vmecpp indata fields that have no counterpart in the legacy VMEC2000 INDATA namelist (e.g. free_boundary_method), which raised AttributeError. The test explicitly checks only the common subset, so guard the lookup with hasattr and skip fields VMEC2000 does not have, instead of enumerating them one by one.

…mit pin

Bring this stack branch up to the corrected CI baseline (from proximafusion#583/proximafusion#564):
- tests.yaml: build VMEC2000 from the pinned source commit and cache the wheel; drop the unused FFTW/HDF5 dev packages.
- benchmarks.yaml: skip the result upload on fork PRs (read-only token).
- test_simsopt_compat.py: skip vmecpp-only INDATA fields.
- CMakeLists: pin abseil to the 20260107.1 commit hash, not the tag.

Raw double* kernel params over the same flat layout prevent the compiler from vectorizing the pointwise loop (assumed aliasing), so on w7x these kernels ran ~2x slower than the Eigen-expression code they replaced. The buffers never overlap; mark them __restrict to restore SIMD. Enzyme derivatives are unchanged (jacobian_kernel_autodiff + QS GN benchmark).
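
As an illustration of the aliasing fix described above, here is a minimal sketch of a pointwise kernel whose pointer parameters are marked __restrict. The function name and parameter list are hypothetical and not taken from the PR. __restrict is a compiler extension accepted by GCC, Clang and MSVC rather than standard C++.

```cpp
#include <cstddef>

// Hypothetical pointwise kernel over flat buffers.
// __restrict promises the compiler that the output does not overlap any
// input, so stores to tau[k] cannot change later loads and the loop is
// free to be vectorized. The caller must uphold that promise.
inline void TauPointwise(std::size_t n,
                         const double* __restrict ru12,
                         const double* __restrict zs,
                         const double* __restrict rs,
                         const double* __restrict zu12,
                         double* __restrict tau) {
  for (std::size_t k = 0; k < n; ++k) {
    tau[k] = ru12[k] * zs[k] - rs[k] * zu12[k];
  }
}
```

The qualifier changes only what the optimizer may assume, not the arithmetic of each element, which is consistent with the derivatives being reported as unchanged.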

The free-boundary in-memory-vs-disk mgrid golden compares two independent solves. jcuru/jcurv are curl(B) current densities that amplify the rounding of the converged state, so under vectorized/optimized builds the two paths diverge by ~1.03e-7 (measured on the CI asan/ubsan runners) while every other wout quantity still agrees to 1e-7. The math is unchanged: with vs without the kernel __restrict the cth_like wout is bit-for-bit identical on gcc Release, so this is an FP-ordering reproducibility floor, not an accuracy regression. Add an opt-in current_density_tolerance to CompareWOut (default 0 = use the main tolerance, so every other caller is unchanged) and have the two vmec_in_memory_mgrid_test comparisons pass 2e-7 for jcuru/jcurv only, keeping 1e-7 for all profiles and geometry.
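
The opt-in tolerance can be sketched as follows. This is not the CompareWOut implementation: the helper names EffectiveTolerance and WithinTolerance, the string-keyed lookup and the absolute-difference comparison are assumptions for illustration. Only the behaviour "0 means use the main tolerance, otherwise override for jcuru/jcurv" is taken from the description above.

```cpp
#include <cmath>
#include <string>

// Hypothetical helper: choose the tolerance for one wout quantity.
// current_density_tolerance == 0 means "not set", so existing callers
// keep the main tolerance for every quantity.
inline double EffectiveTolerance(const std::string& name, double tolerance,
                                 double current_density_tolerance = 0.0) {
  const bool is_current_density = (name == "jcuru" || name == "jcurv");
  return (is_current_density && current_density_tolerance > 0.0)
             ? current_density_tolerance
             : tolerance;
}

// Assumed comparison rule for this sketch: plain absolute difference.
inline bool WithinTolerance(double expected, double actual, double tolerance) {
  return std::fabs(expected - actual) <= tolerance;
}
```

With a main tolerance of 1e-7, a deviation of about 1.03e-7 in jcuru fails when the override is left at its default and passes when 2e-7 is supplied, while all other quantities are still held to 1e-7.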

krystophny added 4 commits June 14, 2026 08:11

krystophny requested review from jons-pf and jurasic-pf as code owners June 14, 2026 14:34

This was referenced Jun 14, 2026

ideal_mhd_model: share the metric kernel (gsqrt, guu, guv, gvv) #568

Open

Exact autodiff Hessian-vector product for the VMEC force #582

Open

bazel: declare force-chain kernel headers in ideal_mhd_model (sandbox…

43886f9

… fix)

krystophny mentioned this pull request Jun 14, 2026

ci: get the test + benchmark workflows green (VMEC2000-from-source, fork-PR benchmark guard) #583

Closed

krystophny added 4 commits June 14, 2026 19:14

ci: re-trigger (transient apt-403 on packages.microsoft.com)

19e27e6

krystophny marked this pull request as draft June 15, 2026 04:48

krystophny added 2 commits June 15, 2026 07:21

Merge remote-tracking branch 'upstream/main' into HEAD

f15d8af

krystophny marked this pull request as ready for review June 15, 2026 10:56

This was referenced Jun 16, 2026

ideal_mhd_model: make computeMHDForces allocation-free #566

Merged

Allocation-free, single-source MHD force kernels #588

Open

krystophny wants to merge 13 commits into proximafusion:main from itpplasma:exact-hessian-jacobian
