Fix pybind module link leak when testing+pybind enabled (2.15.0 regression)#75
Merged
Merged
Conversation
Regression caught by xmscore CI in 2.15.0: under the new ``pybind=True+testing=True+Debug`` coverage configuration, the recipe's ``build()`` runs ``run_python_tests`` which loads the pybind module — but the .so fails dlopen with ``undefined symbol: CxxTest::charToString(char, char*)``. Root cause is a latent bug in ``CMakeLists.txt.jinja``: it appends ``testing_sources`` (e.g. xmscore's ``TestTools.cpp``) to the main library's source list. The main library is what the pybind module links against. ``TestTools.cpp`` includes ``cxxtest/ValueTraits.h`` and uses ``ValueTraits<char>`` whose inline ctor calls ``CxxTest::charToString`` — declared in that header but defined only in ``cxxtest/ValueTraits.cpp`` (which is included via ``cxxtest/Root.cpp`` only by the cxxtestgen-generated runner.cpp). Pre-2.15.0 the combination ``BUILD_TESTING+IS_PYTHON_BUILD`` was impossible so the leak was latent; 2.15.0 made it reachable and the canary missed it because the stub fixture had no testing_sources. Fix the template to compile testing_sources directly into the runner target (matching the documented contract: "testing_sources are compiled into the test runner only"). The main library no longer carries those translation units so the pybind module's link closure contains no CxxTest references. Also: * Update the stub fixture: add ``stub/testing/test_tools.cpp`` whose body forces a CxxTest::charToString reference via ``ValueTraits<char>``, and call it from ``stub.t.h`` so the test_tools .o gets pulled into the runner's link closure. With the pre-fix template a coverage build of the stub would fail with the same dlopen error; with the fix it succeeds. Closes the canary fidelity gap saved in feedback_canary_test_fidelity.md two messages before the regression shipped. * Default ``library_sources``, ``library_headers``, ``testing_headers``, ``pybind_sources``, and ``pybind_headers`` to ``[]`` in the generator. The docs already promise this; the code did not. Downstream impact: any xms library that was relying on the buggy behavior (linking against xmscore's static lib to consume xmscore's TestTools.cpp symbols) will now fail to find those symbols. The documented contract has always been that testing_sources are runner- only — such libraries are reaching outside the contract and need to either duplicate the testing helpers or wait for a follow-up that exposes a separate testing static lib. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
2.15.0 enabled the combined
pybind=True+testing=True+Debugconfiguration so coverage runs would exercise bothrun_cxx_testsandrun_python_testsagainst the same instrumented binary. xmscore's first coverage run on 2.15.0 immediately failed:```
ImportError: ... /xms/core/_xmscore.cpython-313-x86_64-linux-gnu.so:
undefined symbol: _ZN7CxxTest12charToStringEcPc
```
That's
CxxTest::charToString(char, char*). It's declared incxxtest/ValueTraits.h:85but defined only incxxtest/ValueTraits.cpp(a .cpp file shipped inside cxxtest's include dir), which is in turn pulled bycxxtest/Root.cpp— andRoot.cppis only included by the cxxtestgen-generatedrunner.cpp.xmscore's
TestTools.cppusesCxxTest::ValueTraits<char>(whose inline ctor callscharToString). The template was compiling testing_sources likeTestTools.cppinto the main static library — exactly the library the pybind module links. The pybind module link closure therefore contained an unresolved CxxTest symbol that the dynamic loader couldn't find at Python import time.Pre-2.15.0 this was a latent bug:
BUILD_TESTING+IS_PYTHON_BUILDwas an impossible combination so the leak never reached daylight. 2.15.0 made the combination reachable, and the integration test (which the user explicitly approved despite my "land #68 first" advice) didn't catch it because the stub fixture had no testing_sources at all.Fix
CMakeLists.txt.jinja— remove thelist(APPEND ${library_name}_sources testing_sources)block. Plumb testing_sources into a${library_name}_testing_sourcesvariable instead, and add it to the runner target viatarget_sources(runner PRIVATE ...)(cxxtest path) or directly inadd_executable(runner ...)(gtest path). This aligns with the documented contract indocs/USAGE.md§5.2: "testing_sources are compiled into the test runner only."stub/testing/test_tools.cppwhose body usesValueTraits<char>to force aCxxTest::charToStringreference, plustest_tools.hand a call fromstub.t.hso the .o gets pulled into the runner's link closure. With the pre-fix template, a coverage build of this stub would fail with the same dlopen error; with the fix, it succeeds. This closes the canary fidelity gap I saved to memory two messages before the regression shipped, and immediately violated.build_file_generator.py— defaultlibrary_sources,library_headers,testing_headers,pybind_sources,pybind_headersto[]. The docs already promised this; the code did not. (Caught while reproducing in WSL: a stub without all five lists tripped onStrictUndefinedbefore even reaching the linker bug.)Downstream impact
Any xms library that was implicitly relying on the buggy behavior — depending on xmscore as a Conan dep and using
xmscore::TestToolssymbols via the linked static lib — will now find those symbols missing. The documented contract has always been runner-only; such libraries are reaching outside the contract. If this turns out to bite more than xmscore, the follow-up is to expose a separate${library}_testingstatic lib inpackage_infowhentesting=True. Not bundled here to keep the patch small.Test plan
pytest tests/ -v— 532 unit tests pass, 2 expected skips.flake8 .clean.set(stub_testing_sources stub/testing/test_tools.cpp)lands;add_library(${PROJECT_NAME} STATIC ${stub_sources} ${stub_headers})no longer includes testing_sources;target_sources(runner PRIVATE ${stub_testing_sources})wires them into the runner.🤖 Generated with Claude Code