perf: optimize oriented BV traversal (precompute transform, early exits) #842
Closed
facontidavide wants to merge 1 commit into
Three thematically-related micro-optimizations to the BVH traversal hot path for oriented bounding volumes (OBB, RSS, kIOS, OBBRSS):
- Precompute inverse relative transform (RT._RTranspose / RT._InvT) once at the top of MeshCollisionTraversalNode and reuse it via the new overlapPrecomputedRTranspose() overloads. Replaces the per-leaf R^T * (a - T) computation with a direct R'_a + T'_a evaluation.
- Skip the redundant oriented-mesh "seed" pass that computed an initial triangle-pair distance bound before the recursion. The recursion's first leaf already establishes a usable bound at no extra cost; the seed was duplicating work and adding ~30 lines of code.
- Skip the leaf sqrt() and result update entirely when the squared distance already exceeds the current min. The hardened guard added in the previous SIMD commit handles the negative / Scalar::max edge cases.
These three changes together cut a measurable fraction of per-pair work in the polso/gen4 mesh-vs-mesh distance benchmark.
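To illustrate the first item, here is a minimal, self-contained sketch of why the inverse transform can be computed once and reused. It is not the code from this PR: the names PrecomputedInverse, makeInverse, applyNaive and applyPrecomputed are hypothetical, plain std::array stands in for the library's matrix types, and it assumes the cached translation is -R^T * T, so that R^T * (a - T) equals R^T * a + T'.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major: m[row][col]

// Transpose of a 3x3 matrix.
Mat3 transpose(const Mat3& m) {
  Mat3 t{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j) t[i][j] = m[j][i];
  return t;
}

// Matrix-vector product m * v.
Vec3 mul(const Mat3& m, const Vec3& v) {
  Vec3 r{};
  for (int i = 0; i < 3; ++i)
    r[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
  return r;
}

// Hypothetical cache of the inverse relative transform (illustration only).
struct PrecomputedInverse {
  Mat3 Rt;    // R^T
  Vec3 invT;  // assumed to be -R^T * T
};

// Build the cache once, e.g. at traversal-node setup.
PrecomputedInverse makeInverse(const Mat3& R, const Vec3& T) {
  PrecomputedInverse inv;
  inv.Rt = transpose(R);
  const Vec3 RtT = mul(inv.Rt, T);
  inv.invT = {-RtT[0], -RtT[1], -RtT[2]};
  return inv;
}

// Baseline: recompute R^T * (a - T) on every call.
Vec3 applyNaive(const Mat3& R, const Vec3& T, const Vec3& a) {
  const Vec3 d = {a[0] - T[0], a[1] - T[1], a[2] - T[2]};
  return mul(transpose(R), d);
}

// Cached form: R^T * a + invT, with no transpose or subtraction per call.
Vec3 applyPrecomputed(const PrecomputedInverse& inv, const Vec3& a) {
  const Vec3 r = mul(inv.Rt, a);
  return {r[0] + inv.invT[0], r[1] + inv.invT[1], r[2] + inv.invT[2]};
}

// Largest component-wise absolute difference between two vectors.
double maxAbsDiff(const Vec3& x, const Vec3& y) {
  double m = 0.0;
  for (int i = 0; i < 3; ++i) m = std::fmax(m, std::fabs(x[i] - y[i]));
  return m;
}
```

In this sketch makeInverse would be called once per traversal and applyPrecomputed once per node pair or leaf. The two forms agree up to floating-point rounding, not bit for bit.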
This was referenced Apr 30, 2026
Contributor
See #858
Summary
Three thematically-related micro-optimisations in the BVH-traversal hot path for oriented bounding volumes (OBB, RSS, kIOS, OBBRSS):
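- Precompute the inverse relative transform once at the top of MeshCollisionTraversalNode (RT._RTranspose / RT._InvT) and reuse it via new overlapPrecomputedRTranspose() overloads on each oriented BV. Replaces the per-leaf R^T * (a - T) computation with a single direct evaluation per node.
- Skip the redundant oriented-mesh "seed" pass that computed an initial triangle-pair distance bound before the recursion. The recursion's first leaf already establishes a usable bound at no extra cost.
- Skip the leaf sqrt() and result update entirely when the squared distance already exceeds the current min. The hardened guard added in the SIMD-support commit handles the negative / Scalar::max edge cases.
Implementation notes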
- include/coal/BV/BV.h, OBB.h, OBBRSS.h, RSS.h, kIOS.h: each gains an overlapPrecomputedRTranspose(const Matrix3s& Rt, const Vec3s& InvT, const BV& other) const overload (free for OBBRSS: internal compose; one-line wrapper for the others).
- src/BV/{OBB,RSS,kIOS}.cpp: implement the new overloads; reuse the same SAT bodies as the existing overlap() but read from the precomputed Rt / InvT arguments.
- include/coal/internal/traversal.h gains a RelativeTransformation::precomputeInverse() helper.
- include/coal/internal/traversal_node_bvhs.h: caches the inverse at node setup; the traversal's per-node-pair BVDisjoints now calls overlapPrecomputedRTranspose instead of re-computing from scratch. Drops the seed-pass code path. The leaf sqrt-skip uses the same guard logic that landed with the SIMD-support commit.
- include/coal/internal/traversal_node_setup.h: shaves 4 lines that initialised the now-unused seed scratch state.
Performance impact
Methodology
- taskset -c 4), turbo locked.
- cmake --build build -j12 -DCMAKE_BUILD_TYPE=Release in isolated worktrees; separate libcoal.so per variant. Both base and variant compiled with stock upstream flags.
- coal-test-benchmark (test/benchmark.cpp): exercises all 4 oriented BV types for both collision and distance queries.
coal-test-benchmark total
Correctness gate: full ctest suite passes on the variant build (excluding python/nanobind tests).
Tests
coal-collision, coal-distance, and coal-distance_lower_bound suites which already exercise all 4 oriented BV types (OBB, RSS, kIOS, OBBRSS) with strict numerical tolerances. The "skip seed pass" change is implicitly covered by the distance-lower-bound assertions in those suites; if the seed were load-bearing, those bounds would shift detectably. The coal-distance test in particular runs end-to-end mesh-distance over env.obj vs rob.obj for each BV type and asserts results match within a tight epsilon.
Risk & regression analysis
- R^T is identity-preserving; the seed pass was redundant (the recursion's first leaf produces an equally tight bound); the sqrt-skip only fires when the result would be discarded anyway.
- sqrt skip: with the hardened squared-distance guard from earlier work in this branch, negative or Scalar::max sentinel values are caught upstream, so the skip never sees them.
- overlapPrecomputedRTranspose API surface: new public method on each oriented BV. Header-only; no ABI break.
- MeshCollisionTraversalNode<BV> exposes the same BVDisjoints and leafCollides semantics; only the implementation changes.
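The sqrt-skip and its guard can be sketched as follows. This is an illustration under stated assumptions, not the guard that landed in the branch: LeafResult, kNoBound and updateMinDistance are hypothetical names, double stands in for Scalar, and the handling of negative and sentinel values is one plausible reading of "hardened guard".

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Sentinel meaning "no distance bound found yet" (assumed to mirror Scalar::max).
constexpr double kNoBound = std::numeric_limits<double>::max();

// Hypothetical running result of a distance traversal (illustration only).
struct LeafResult {
  double minDistance = kNoBound;
  int sqrtCalls = 0;  // counts how often the expensive path actually ran
};

// Offer a candidate squared distance to the running minimum.
// Returns true when the result was updated, false when the leaf was skipped.
bool updateMinDistance(double sqrDist, LeafResult& res) {
  // Assumed guard: clamp round-off negatives so sqrt stays well defined.
  if (sqrDist < 0.0) sqrDist = 0.0;

  const double cur = res.minDistance;
  if (cur < kNoBound) {
    // Assumed guard: a non-positive bound cannot be improved by a
    // non-negative candidate, and squaring it would flip the comparison.
    if (cur <= 0.0) return false;
    // Early exit: compare in squared space, no sqrt and no result update.
    if (sqrDist >= cur * cur) return false;
  }

  // Only candidates that improve the bound pay for the sqrt.
  res.minDistance = std::sqrt(sqrDist);
  ++res.sqrtCalls;
  return true;
}
```

Checking the sentinel explicitly avoids relying on cur * cur overflowing to infinity, and the counter makes it easy to confirm in a test that rejected leaves never reach the sqrt.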