Skip to content

Phase 7: test coverage for visualization, model_checking, distance1D + interval_rescaling fix#37

Open
edeno wants to merge 5 commits into
docstring-scaffolding-hygienefrom
test-coverage-gaps
Open

Phase 7: test coverage for visualization, model_checking, distance1D + interval_rescaling fix#37
edeno wants to merge 5 commits into
docstring-scaffolding-hygienefrom
test-coverage-gaps

Conversation

@edeno

@edeno edeno commented May 28, 2026

Copy link
Copy Markdown
Contributor

Summary

Phase 7 of the review-findings stack: closes test-coverage gaps the full-repo review flagged for previously-untested modules, and fixes one genuine bug discovered while writing those tests.

Stacked on #36 (docstring-scaffolding-hygiene). Review/merge #29#30#33#34#35#36 first.

New test coverage (+1630 LOC, 15 files)

  • visualization/ — smoke tests for static, movie, figurl_1D, figurl_2D (figurl tests skip cleanly when sortingview is absent).
  • model_checking/ — unit tests for clusterless.py (empirical_cdf, rosenblatt_transform, interval_rescaling_transform) and sorted_spikes.py, plus status-pinning for the 7 documented NotImplementedError stubs (intended-future Yousefi-2020 API).
  • analysis/distance1D.py — direct coverage of the track-graph distance helpers (exercises the Phase 5 _setup_track_graph per-call-copy fix).
  • core/test_hmm_algorithms.py — non-trivial Viterbi backtracking (forced 0→1→2 chain vs greedy [0,0,0]) and a single-state HMM certainty/marginal check.
  • properties/test_probability_properties.py — made three property tests robust to float32 (JAX default): normalize-yields-unit-sum on un-normalized input, and loosened rtol from 1e-6 (below float32 precision) to 1e-4/1e-5.

Fixed

  • model_checking.clusterless.interval_rescaling_transform divided the (n_spikes, n_features) joint mark intensity by the full (n_time,) ground-process intensity, raising a broadcasting ValueError for any realistic input (n_time != n_spikes). The ground intensity is now interpolated at the spike times (np.interp → (n_spikes,)) before the division, matching the docstring. (Found while writing tests; fixed in source rather than pinned.)

Test plan

  • 111 new/modified tests pass
  • pytest -m snapshot — 8 pass, no diffs
  • golden regression — 4 pass, no diffs
  • broad pytest -m "not slow" — 895 passed, 5 skipped, 1 xfailed, 0 failures
  • ruff check + ruff format --check clean

🤖 Generated with Claude Code

edeno and others added 5 commits May 28, 2026 14:44
These modules had ~0% test coverage. Add focused unit/smoke tests:

- tests/visualization/: smoke tests for static plotting (Agg backend),
  the movie writer (tmp_path, ffmpeg-skipped), and the figurl 1D/2D
  builders (sortingview-importorskip). Exercises each module's public
  surface without pixel comparison.

- tests/model_checking/: unit tests for clusterless.py (empirical_cdf,
  rosenblatt_transform, interval_rescaling_transform) and
  sorted_spikes.py (the TimeRescaling Brown-Barbieri-Ventura-Kass-Frank
  implementation + module functions). Homogeneous-Poisson-in ->
  uniform-rescaled-out checks, hand-computed constant-intensity values,
  and the Wiener-2003 short-trial correction direction. The 7
  NotImplementedError stubs in clusterless.py are pinned with
  raises-NotImplementedError tests (their removal is a separate
  decision). test_shape_mismatch_is_a_bug pins a real
  interval_rescaling_transform broadcasting bug (joint_mark_intensity
  (n_spikes, n_features) / ground_process_intensity (n_time,)) for a
  future source fix.

- tests/test_analysis_distance_1d.py: get_trajectory_data,
  get_ahead_behind_distance (graph-based 1D), and the get_map_speed
  n_time==1 branch. (The plan assumed get_1D_distance /
  maximum_a_posteriori_estimate_1D, which do not exist; tested the
  actual public surface.)

Coverage: model_checking 100%/94%, distance1D 89%, visualization
exercised via smoke tests. Tests only; no source changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- tests/core/test_hmm_algorithms.py: add TestViterbiGlobalOptimum
  (a deterministic forced-transition chain makes the global Viterbi
  path [0,1,2] differ from the greedy per-step argmax [0,0,0], proving
  Viterbi honors the transition structure) and TestSingleStateHMM
  (n_states=1: filtered prob is 1.0 everywhere and the marginal
  log-likelihood equals the sum of per-step log-likelihoods).

- tests/properties/test_probability_properties.py: fix two flaky /
  tautological property tests. test_normalize_preserves_probability_property
  fed an already-normalized distribution into _normalize and checked it
  stayed normalized (testing the strategy, not the function), and flaked
  on float32 rounding under Hypothesis exploration. Replaced with
  test_normalize_yields_unit_sum, which feeds an UN-normalized positive
  array (new positive_array strategy) and asserts a float32-safe unit
  sum. Loosened three normalization-comparison rtol=1e-6 assertions
  (idempotence, scale-invariance, divide_safe) to float32-appropriate
  tolerances; rtol=1e-6 was below float32 precision and was the source
  of the intermittent failures (the stored example lived only in the
  gitignored local .hypothesis cache, so CI was never affected).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
model_checking.clusterless.interval_rescaling_transform divided
joint_mark_intensity (n_spikes, n_features) by ground_process_intensity
(n_time,), which only broadcasts in the degenerate n_features == n_time
case and otherwise raised ValueError — i.e. the function was broken for
its documented shape regime (n_time >> n_features). It has no production
callers yet; the bug surfaced while adding test coverage (Phase 7b).

The conditional mark intensity given the spike time is
joint_mark_intensity (at the spikes) divided by the ground intensity AT
THOSE SAME SPIKE TIMES. Evaluate the (n_time,) ground intensity at the
spike times via np.interp (the same idiom _compute_rescaled_isi already
uses), then divide with broadcasting against (n_spikes, 1).

Rather than ship a test that enshrined the bug (the original
test_shape_mismatch_is_a_bug asserted it raised ValueError), fix the
source and convert the test into real correctness checks:
test_realistic_shapes_run (the previously-broken n_time >> n_features
case now runs and yields valid uniform outputs) and
test_conditional_mark_uses_ground_intensity_at_spike_times (verifies the
quotient uses the spike-time-interpolated ground intensity, not the raw
(n_time,) array). 17 clusterless tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erty test

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k tests

- interval_rescaling_transform: raise on non-positive ground intensity at a
  spike time instead of silently producing inf/nan that the Rosenblatt rank
  transform would launder into a plausible [0,1] goodness-of-fit result.
- Add edge-case tests: non-positive ground intensity raises; n_features == n_time
  (the shape the old broadcast bug silently mis-divided) matches the reference.
- discretize_and_trim test: use a one-hot posterior with known zero-mass bins and
  assert they are actually dropped, instead of restating the > 0 filter.
- Add a soft-transition Viterbi test where greedy [0,1,0] differs from the global
  optimum [0,0,0], exercising real backtracking discrimination.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant