Phase 7: test coverage for visualization, model_checking, distance1D + interval_rescaling fix#37
Open
edeno wants to merge 5 commits into
Open
Phase 7: test coverage for visualization, model_checking, distance1D + interval_rescaling fix#37edeno wants to merge 5 commits into
edeno wants to merge 5 commits into
Conversation
These modules had ~0% test coverage. Add focused unit/smoke tests: - tests/visualization/: smoke tests for static plotting (Agg backend), the movie writer (tmp_path, ffmpeg-skipped), and the figurl 1D/2D builders (sortingview-importorskip). Exercises each module's public surface without pixel comparison. - tests/model_checking/: unit tests for clusterless.py (empirical_cdf, rosenblatt_transform, interval_rescaling_transform) and sorted_spikes.py (the TimeRescaling Brown-Barbieri-Ventura-Kass-Frank implementation + module functions). Homogeneous-Poisson-in -> uniform-rescaled-out checks, hand-computed constant-intensity values, and the Wiener-2003 short-trial correction direction. The 7 NotImplementedError stubs in clusterless.py are pinned with raises-NotImplementedError tests (their removal is a separate decision). test_shape_mismatch_is_a_bug pins a real interval_rescaling_transform broadcasting bug (joint_mark_intensity (n_spikes, n_features) / ground_process_intensity (n_time,)) for a future source fix. - tests/test_analysis_distance_1d.py: get_trajectory_data, get_ahead_behind_distance (graph-based 1D), and the get_map_speed n_time==1 branch. (The plan assumed get_1D_distance / maximum_a_posteriori_estimate_1D, which do not exist; tested the actual public surface.) Coverage: model_checking 100%/94%, distance1D 89%, visualization exercised via smoke tests. Tests only; no source changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- tests/core/test_hmm_algorithms.py: add TestViterbiGlobalOptimum (a deterministic forced-transition chain makes the global Viterbi path [0,1,2] differ from the greedy per-step argmax [0,0,0], proving Viterbi honors the transition structure) and TestSingleStateHMM (n_states=1: filtered prob is 1.0 everywhere and the marginal log-likelihood equals the sum of per-step log-likelihoods). - tests/properties/test_probability_properties.py: fix two flaky / tautological property tests. test_normalize_preserves_probability_property fed an already-normalized distribution into _normalize and checked it stayed normalized (testing the strategy, not the function), and flaked on float32 rounding under Hypothesis exploration. Replaced with test_normalize_yields_unit_sum, which feeds an UN-normalized positive array (new positive_array strategy) and asserts a float32-safe unit sum. Loosened three normalization-comparison rtol=1e-6 assertions (idempotence, scale-invariance, divide_safe) to float32-appropriate tolerances; rtol=1e-6 was below float32 precision and was the source of the intermittent failures (the stored example lived only in the gitignored local .hypothesis cache, so CI was never affected). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
model_checking.clusterless.interval_rescaling_transform divided joint_mark_intensity (n_spikes, n_features) by ground_process_intensity (n_time,), which only broadcasts in the degenerate n_features == n_time case and otherwise raised ValueError — i.e. the function was broken for its documented shape regime (n_time >> n_features). It has no production callers yet; the bug surfaced while adding test coverage (Phase 7b). The conditional mark intensity given the spike time is joint_mark_intensity (at the spikes) divided by the ground intensity AT THOSE SAME SPIKE TIMES. Evaluate the (n_time,) ground intensity at the spike times via np.interp (the same idiom _compute_rescaled_isi already uses), then divide with broadcasting against (n_spikes, 1). Rather than ship a test that enshrined the bug (the original test_shape_mismatch_is_a_bug asserted it raised ValueError), fix the source and convert the test into real correctness checks: test_realistic_shapes_run (the previously-broken n_time >> n_features case now runs and yields valid uniform outputs) and test_conditional_mark_uses_ground_intensity_at_spike_times (verifies the quotient uses the spike-time-interpolated ground intensity, not the raw (n_time,) array). 17 clusterless tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erty test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k tests - interval_rescaling_transform: raise on non-positive ground intensity at a spike time instead of silently producing inf/nan that the Rosenblatt rank transform would launder into a plausible [0,1] goodness-of-fit result. - Add edge-case tests: non-positive ground intensity raises; n_features == n_time (the shape the old broadcast bug silently mis-divided) matches the reference. - discretize_and_trim test: use a one-hot posterior with known zero-mass bins and assert they are actually dropped, instead of restating the > 0 filter. - Add a soft-transition Viterbi test where greedy [0,1,0] differs from the global optimum [0,0,0], exercising real backtracking discrimination. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 7 of the review-findings stack: closes test-coverage gaps the full-repo review flagged for previously-untested modules, and fixes one genuine bug discovered while writing those tests.
Stacked on #36 (
docstring-scaffolding-hygiene). Review/merge #29 → #30 → #33 → #34 → #35 → #36 first.New test coverage (+1630 LOC, 15 files)
visualization/— smoke tests forstatic,movie,figurl_1D,figurl_2D(figurl tests skip cleanly whensortingviewis absent).model_checking/— unit tests forclusterless.py(empirical_cdf,rosenblatt_transform,interval_rescaling_transform) andsorted_spikes.py, plus status-pinning for the 7 documentedNotImplementedErrorstubs (intended-future Yousefi-2020 API).analysis/distance1D.py— direct coverage of the track-graph distance helpers (exercises the Phase 5_setup_track_graphper-call-copy fix).core/test_hmm_algorithms.py— non-trivial Viterbi backtracking (forced 0→1→2 chain vs greedy[0,0,0]) and a single-state HMM certainty/marginal check.properties/test_probability_properties.py— made three property tests robust to float32 (JAX default):normalize-yields-unit-sum on un-normalized input, and loosenedrtolfrom1e-6(below float32 precision) to1e-4/1e-5.Fixed
model_checking.clusterless.interval_rescaling_transformdivided the(n_spikes, n_features)joint mark intensity by the full(n_time,)ground-process intensity, raising a broadcastingValueErrorfor any realistic input (n_time != n_spikes). The ground intensity is now interpolated at the spike times (np.interp → (n_spikes,)) before the division, matching the docstring. (Found while writing tests; fixed in source rather than pinned.)Test plan
pytest -m snapshot— 8 pass, no diffspytest -m "not slow"— 895 passed, 5 skipped, 1 xfailed, 0 failuresruff check+ruff format --checkclean🤖 Generated with Claude Code