Correctness bug bundle: distance, posterior, transitions, sim, viz, HPD (+ position-dim refactor) by edeno · Pull Request #29 · LorenFrankLab/non_local_detector

edeno · 2026-05-21T12:24:12Z

Summary

A bundle of correctness fixes plus small follow-ups surfaced by a full-repo review and three filed issues (#40–#42). Changes are organized into focused, independently-revertible commits. No public API removals; model snapshot and golden-regression outputs are unchanged.

Closes #40
Closes #41
Closes #42

Fixed

Review findings

analysis.distance1D.get_map_speed: trailing boundary speed was inserted before the last sample (np.insert(arr, -1, val)), misordering the final two speeds for any chunk with ≥3 time bins. Now appended.
_DetectorBase.estimate_parameters: post-EM cleanup of _encoding_model_data never ran (hasattr attribute-name mismatch); fitted models retained spike-time/waveform arrays.
ContFrag{SortedSpikes,Clusterless}Classifier.get_posterior: raised on 2D environments (summed a position dim that only exists in 1D). Now collapses every position/*_position dim, guards against an empty dim list, and the two identical bodies are de-duplicated into one helper. state output dim preserved.
Environment.fit_place_grid: a tuple place_bin_size with a track_graph crashed deep inside np.linspace; now raises a clear ValidationError with an example.
RandomWalk direction-aware transitions: crashed with AttributeError in networkx when no N-D track graph was available; now raises ConfigurationError.
figurl_2D.process_decoded_data: the uint16 grid guard had an off-by-one — it rejected valid 256×256 grids (65536 bins, max index 65535 fits). Fixed to key on the largest index (overflow begins at 65537 bins) and extracted into a sortingview-free, unit-testable validate_grid_within_uint16 helper.
make_simulated_run_data: hardcoded mark_spacing=10 while replay generators default to MARK_SPACING=5; unified on the module constant.
get_HPD_spatial_coverage: only handled 1D; now integrates over the product of per-axis bin widths, supporting 1D (length), 2D (area), 3D (volume), and beyond. 1D/2D results numerically identical to before.

Filed issues

get_ahead_behind_distance2D(precomputed_distance=True) crashes under networkx>=3 (.items() on generator) #40 distance2D.get_map_estimate_direction_from_track_graph (precomputed_distance=True): the all-pairs nx.shortest_path is a generator under networkx≥3, not a dict — .items() raised AttributeError. Now iterated directly (also avoids materializing the full O(n_nodes²) dict).
make_single_environment_movie: multiunit panel disappears after first parallel chunk (ax1 limits gated on frame_idx==0) #42 make_single_environment_movie: the multiunit panel went blank after the first parallel chunk because ax1 limits were gated on frame_idx == 0. Now initialized per-worker.

Latent

base._convert_seq_to_df position naming capped labels at 4 (x/y/z/w) and silently dropped position columns for 5-6D environments. Convention centralized in non_local_detector._position_dims and shared by all consumers; truncation fixed.

Comprehensive-review follow-ups (silent → loud)

distance2D.get_map_estimate_direction_from_track_graph (precomputed_distance=True): on a disconnected track graph an unreachable head/MAP pair kept the -1 path sentinel and silently indexed node_positions[-1], emitting a wrong heading. Unreachable pairs now warn and return NaN (mirrors get_2D_distance).
RandomWalk.make_state_transition: a direction= argument was silently ignored unless use_manifold_distance=True on an N-D grid (the default use_manifold_distance=False, or a track_graph, dropped it). Now raises ConfigurationError up front rather than returning an unconstrained walk.

Added

get_HPD_spatial_coverage gained an optional bin_width parameter for non-uniform 1D (linearized track-graph) grids. Multi-segment tracks with unequal arms produce on-track bins of different widths (plus edge_spacing gap bins); the uniform default used the first bin's width for every bin, biasing coverage. Pass per-bin widths — e.g. np.diff(environment.edges_[0]) — for the exact integral. Default unchanged (uniform), restricted to a single position dim since multidimensional grids are uniformly spaced. (Determined by tracing every grid path: open-field/single-segment tracks are uniform → center-derived widths equal the true edges; only unequal-arm track graphs are non-uniform, and there the widths must come from the real edges.)

Changed

head_direction_simliarity / get_ahead_behind_distance2D: head_direction docstring says (n_time, 2) but code requires a 1-D angle array #41 head_direction_simliarity / get_ahead_behind_distance2D: corrected the head_direction docstring from (n_time, 2) to a 1-D heading-angle array (n_time,). No code change.
posterior_consistency_hpd_overlap: Notes block clarifies the asymmetric containment coefficient |A∩B| / min(|A|,|B|). No code change.
posterior_consistency_{kl_divergence,hpd_overlap}: corrected the input-shape docstrings (both require flattened (n_time, n_position_bins), not the previously-documented (n_time, n_x_bins, n_y_bins) which raises) and added the missing factor of 2 to the Sørensen-Dice formula. Docstrings only.

Removed

environment.order_boundary and get_track_boundary_points: relied on nx.from_scipy_sparse_matrix (removed in NetworkX 3.0) and were unused; the unused NearestNeighbors import was removed too.

Refactor / internal

New non_local_detector._position_dims centralizes the position-dim naming convention (get_position_dim_names / get_position_dims), shared by both base.py results constructors, ContFrag*.get_posterior, and get_HPD_spatial_coverage (previously re-derived with three different rules). Direct unit tests pin the 5-6D and >6D branches.
Shared synthetic-results test factory (conftest.make_state_bins_results) replaces near-duplicate builders across two test modules.

Test plan

uv run pytest -m "not slow" — 849 passed, 3 skipped, 1 xfailed (pre-existing)
uv run pytest -m snapshot and test_golden_regression.py — no diffs
HPD 1D/2D coverage numerically identical to the prior formula (max|diff| = 0); added 3D and non-uniform-bin_width regression tests
ruff check src/ and ruff format --check src/ — clean
Each fix has an isolated regression test; the figurl off-by-one, distance2D get_ahead_behind_distance2D(precomputed_distance=True) crashes under networkx>=3 (.items() on generator) #40, disconnected-graph, RandomWalk-direction, and movie make_single_environment_movie: multiunit panel disappears after first parallel chunk (ax1 limits gated on frame_idx==0) #42 tests were verified RED against the pre-fix code
CI green

🤖 Generated with Claude Code

@staticmethod

…erior 2D Three independently-revertible bug fixes plus their behavioral tests. - analysis.distance1D.get_map_speed: trailing boundary speed was inserted before the last array element instead of appended; for every chunk with three or more time bins, the final two speed samples were misordered. Switch np.insert(arr, -1, val) to np.append(arr, val). - models.base._DetectorBase.estimate_parameters: post-EM cleanup of the encoding-model data did nothing because the hasattr check used trailing-underscore form while write sites use leading-underscore. Spike-time and waveform-feature arrays were retained for the lifetime of every fitted model. - models.cont_frag_model.ContFrag{SortedSpikes,Clusterless}Classifier .get_posterior: raised for 2D environments because it summed over the dim name "position" which only exists for 1D. Collapse every dim whose name matches "position" or ends with "_position", preserving the @staticmethod decorator and the singular "state" output dim name. Tests: - Parameterized boundary-handling tests for get_map_speed across n_time in {3, 4, 5, 100}, plus a regression test that distinguishes the np.append output from the buggy np.insert(-1) output. - Assertion that _encoding_model_data is no longer present after estimate_parameters completes. - 2D and 1D get_posterior tests for both ContFrag classifiers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Seven small correctness fixes plus tests; one docstring clarification; one CHANGELOG file. - environment.Environment.fit_place_grid: tuple place_bin_size with a track_graph previously dereferenced a dead local and crashed inside np.linspace. Raises ValidationError with example. - environment: remove order_boundary and get_track_boundary_points (both called the removed nx.from_scipy_sparse_matrix, both unused outside the module). Drop the now-unused NearestNeighbors import. - continuous_state_transitions.RandomWalk._handle_no_track_graph: direction-aware mode on environments without an N-D track graph previously crashed inside networkx; raises ConfigurationError with a hint pointing to the two valid configurations. - visualization.figurl_2D.process_decoded_data: linearized bin indices stored as uint16 silently wrapped for grids larger than 65535 cells (e.g., 256x256). Raise ValueError at the boundary; keep the uint16 dtype because the downstream sortingview schema receives the field as uint16. - simulate.clusterless_simulation: replace both hardcoded mark_spacing=10 occurrences with the module-level MARK_SPACING constant. Encoding marks previously landed at {0,10,20,30} while replay generators used {0,5,10,15}, silently invalidating any test mixing the two. - model_checking.highest_posterior_density.get_HPD_spatial_coverage: docstring advertised 2D support but the implementation only used a 1D bin width. Branch on dim names and use the 2D cell area (bin_width_x * bin_width_y) when x_position and y_position are present. Signature preserved. - model_checking.posterior_consistency.posterior_consistency_hpd_overlap: docstring expanded to flag the asymmetric containment-coefficient semantics (|A intersect B| / min(|A|,|B|)) and point readers at Sorensen-Dice / Jaccard for symmetric overlap. No code change. Tests cover each fix in isolation. The figurl overflow tests use pytest.importorskip("sortingview"); the mark-spacing test asserts exact replay-side spacing and bounds encoding-side spread below the pre-fix value (encoding marks have additive Gaussian noise, so the literal grid is not directly observable). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The overflow guard rejected n_position_bins > 65535, but the values stored as uint16 are bin *indices* (max index = n_position_bins - 1), so a 256x256 grid (65536 bins, max index 65535) fits in uint16 and was wrongly rejected. Compare the largest index instead; true overflow begins at 65537 bins. Extract the check into a module-level validate_grid_within_uint16 helper so it is importable and unit-testable without sortingview, and rewrite the guard test to pin the true pass/raise boundary (65536 passes, 65537 raises). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ContFrag{SortedSpikes,Clusterless}Classifier.get_posterior: - Guard against an empty position-dim list (unstacked.sum([]) would silently return the full per-bin array instead of (time, state)). - Extract the byte-identical body of both classifiers into a shared module helper _state_probabilities_from_results. RandomWalk: reword the direction-aware ConfigurationError hint, which had referenced a non-existent Environment(use_manifold_distance=...) argument (it is a RandomWalk parameter). Tests: add nD (dim{i}_position) and empty-dim coverage and consolidate the 1D/2D/nD get_posterior tests into one parametrized test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- HPD 2D coverage: add a multi-bin test exercising the sum([x_position, y_position]) reduction; the existing test put all mass on a single bin and could not distinguish a multi-axis sum from a single-axis one. - Simulator MARK_SPACING: derive the regression boundary from PLACE_FIELD_MEANS / N_TETRODES / MARK_SPACING instead of hardcoded 4/10/30. - _encoding_model_data cleanup: add a fast, isolated unit test so the cleanup contract is checked even when slow/integration tests are deselected, and drop the assertion that was riding along on the slow EM-consistency test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…nel) - #40 distance2D.get_map_estimate_direction_from_track_graph: the all-pairs nx.shortest_path is a generator under networkx>=3 (not a dict), so .items() raised AttributeError. Iterate it directly (also avoids materializing the full O(n_nodes^2) dict). - #41 head_direction_simliarity / get_ahead_behind_distance2D: correct the head_direction docstring from (n_time, 2) to a 1-D heading-angle array (n_time,), matching the implementation and the 1-D sibling. - #42 make_single_environment_movie: the multiunit panel went blank after the first parallel chunk because the bottom-axis limits were gated on frame_idx == 0; only the worker owning frame 0 set them. Initialize on the first frame each worker renders via an _ax1_init flag (mirrors the mesh pattern). Adds a regression test. Also adds distance2D track-graph regression tests and strengthens the distance1D get_map_speed boundary test (asymmetric final edge so the pre-fix np.insert(-1) swap is observable at every n_time, replacing a degenerate uniform-edge case), and updates the CHANGELOG for the above plus the figurl uint16 wording. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… N-D The canonical position-dim names (position; x/y/z/w/v/u_position for 2-6D; dim{i}_position beyond six) were re-derived independently in base.py (two sites), cont_frag_model.py, and highest_posterior_density.py. One base.py site (_convert_seq_to_df) capped its labels at four (x/y/z/w) and silently dropped the extra position columns for 5- and 6-D environments. - Add non_local_detector._position_dims with get_position_dim_names (build the names) and get_position_dims (detect them on a DataArray); depends only on xarray so both models and model_checking can import it without a cycle. - Route both base.py constructors, ContFrag*.get_posterior, and get_HPD_spatial_coverage through it. Fixes the 5-6D truncation. - get_HPD_spatial_coverage now integrates over the product of per-axis bin widths, supporting 1D (length), 2D (area), 3D (volume) and beyond, instead of only 1D/2D. 1D/2D results are numerically identical to before (verified max|diff| = 0); adds a 3D regression test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The state_bins-MultiIndex results Dataset was built by near-identical helpers in two test files (_make_results_dataset_1d/2d and _make_cont_frag_results). Extract the shared construction into conftest.make_state_bins_results (keyed on per-state position columns) and make both files' builders thin wrappers over it — the paired-column variant directly, the ContFrag variant via a meshgrid. Same seeds and layouts, so all existing assertions are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The comment implied the sum is over named position dims (x_position/y_position), but after .sel(state=...) the position bins remain stacked under a single 'state_bins' dim, so it sums that. Document the stacked layout and warn against swapping in get_position_dims (which would return [] and break normalization). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- distance2D.get_map_estimate_direction_from_track_graph (precomputed_distance =True): on a disconnected track graph an unreachable head/MAP pair kept the -1 path sentinel and silently indexed node_positions[-1], emitting a wrong heading. Unreachable pairs now warn and return NaN, mirroring get_2D_distance. - RandomWalk.make_state_transition: a direction= argument was silently ignored unless use_manifold_distance=True on an N-D grid (the default use_manifold_distance=False, or a track_graph, dropped it). Now raises ConfigurationError up front instead of returning an unconstrained walk. Both have regression tests verified RED against the prior code; the connected- graph and direction=None (all model-default) paths are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Both posterior_consistency_kl_divergence and posterior_consistency_hpd_overlap documented an (n_time, n_x_bins, n_y_bins) input alternative that the implementations do not support (they operate on axis=1 / axis=-1 and raise on a true 3D array); corrected to flattened (n_time, n_position_bins). Also added the missing factor of 2 to the Sørensen-Dice formula in the Notes block. Docstrings only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Pins get_position_dim_names at n=5,6,7 (the v/u_position and dim{i}_position branches where the latent 5-6D column-truncation lived) and the get_position_dims round-trip / non-position-dim exclusion, locking the single-source-of-truth convention at the seam rather than only via consumers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rids Linearized track-graph environments with unequal segment lengths produce on-track bins of different widths (and edge_spacing gap bins), so the uniform default — which uses the first bin's width for every bin — biases the coverage integral. Add an optional bin_width parameter (per-bin widths, e.g. np.diff(environment.edges_[0])) that integrates the exact bin measure; it is restricted to a single position dimension since multidimensional grids are uniformly spaced. The default path is unchanged and verified numerically identical (max|diff|=0) for 1D/2D; non-uniform widths cannot be recovered from bin centers, so passing real edges is the only exact route. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The all-pairs form of nx.shortest_path returns {source: {target: path}} as a dict in networkx 3.0-3.4 but a generator of (source, paths) pairs in >=3.5. The /simplify pass dropped the dict() wrapper to iterate directly, which is only valid for the generator form; under the older networkx on the Python 3.10 CI runner it iterated the dict's keys and raised 'cannot unpack non-iterable int object'. dict(...) normalizes both (the package pins networkx>=3.0). Verified the three track-graph tests pass against both networkx 3.2.1 (dict) and 3.6.1 (generator). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Replaces the unconditional dict(...) materialization with an isinstance branch: the all-pairs nx.shortest_path is a dict in networkx 3.0-3.4 (use .items()) and a generator in >=3.5 (iterate lazily). Iterating the generator one source at a time avoids holding all O(n_nodes^2) shortest paths simultaneously, which matters for large track graphs, while staying correct on the older dict form. Verified the track-graph tests pass against both networkx 3.2.1 (dict) and 3.6.1 (generator). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

analysis.distance2D.get_map_estimate_direction_from_track_graph with precomputed_distance=False: a disconnected head/MAP pair made nx.shortest_path(source, target) raise nx.NetworkXNoPath, which was uncaught and crashed. The sibling precomputed_distance=True branch (and get_2D_distance) already warn and return NaN for unreachable pairs; the non-precomputed branch now does the same, keeping the existing source==target IndexError catch. Tests: - test_nonprecomputed_direction_disconnected_graph_is_nan (RED against the prior code, which raised NetworkXNoPath). - test_precomputed_distance_dict_branch_matches_generator: monkeypatch the all-pairs nx.shortest_path to its networkx<3.5 dict shape and assert the .items() fallback matches the >=3.5 generator path, so that branch is covered regardless of the installed networkx version. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

make_simulated_run_data again passes mark_spacing=10 at both call sites (matching origin/main), so this branch no longer alters simulator output. The earlier change unified encoding on the module MARK_SPACING (=5), which silently halved the encoding mark-center spacing -- a behavioral change that does not belong in an output-preserving correctness bundle. Also drops the now-inapplicable module-docstring note, the CHANGELOG entry, and test_mark_spacing_consistent_between_encoding_and_replay, whose premise (encoding and replay share a spacing) no longer holds. The encoding/replay mark-spacing mismatch is a pre-existing latent issue left for a separate, deliberate change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

highest_posterior_density.get_HPD_spatial_coverage: - Raise a descriptive ValueError for a length-1 position axis (was a bare numpy IndexError from np.diff(...)[0]) and for non-monotonic/descending position coordinates (previously produced a silent negative/wrong spatial measure). - bin_width branch: reduce over the position axis by name rather than the positional axis=-1, so the result no longer silently assumes a time-leading layout (the uniform branch already reduced by name). - Docstring: "total spatial area" -> "total spatial measure" (length/area/volume) and document the strictly-increasing requirement. posterior_consistency.kl_divergence / hpd_overlap: add a shared _validate_consistency_inputs guard that rejects non-2-D or shape-mismatched input. The docstrings already require flattened (n_time, n_position_bins); 3-D input previously either reduced only the last axis (kl) or ran on axis=1 (overlap), silently returning a wrong-shaped/incorrect result. Tests cover the length-1, descending, and 1D bin_width shape-mismatch ValueErrors and the 2-D/shape enforcement (all RED against the prior code). Valid 1D/2D/3D coverage results are numerically unchanged (golden, snapshot, and the exact nonuniform-bin_width regression all pass). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…e update visualization.movie._render_single_env_frame: the per-worker ax1 limit init ran after multiunit_line.set_data, whose rate-window indexing can raise IndexError near a recording boundary (swallowed by the surrounding except). A worker whose opening frames all land in that edge region would skip the init and keep default ax1 limits, re-introducing the off-screen firing-rate bug for boundary chunks. Move the init ahead of set_data (the limits depend only on window_ind and rate.max(), not the frame), and log the swallowed IndexError at debug level instead of passing silently. Tests: a boundary chunk whose first frame overruns the rate array still sets the limits (RED against the prior ordering), and the limits are set once per worker (not reset on later frames). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…cabulary _position_dims.get_position_dim_names now raises ValueError for n_position_dims < 1 instead of silently returning [] (which would build a degenerate empty MultiIndex level set downstream). get_position_dims now matches only the closed vocabulary it emits -- "position", the labelled x/y/z/w/v/u_position names, and the numbered dim{i}_position fallback -- so a stray *_position dim (e.g. head_position, map_position, linear_position) is no longer mistaken for a decoder position axis and wrongly marginalized. Adds direct tests for the non-positive-dimension rejection and the underscore-suffix false-positive exclusion. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

test_random_walk_inward_no_trackgraphDD_raises matched "direction-aware RandomWalk", a phrase present in both the up-front make_state_transition guard and the deeper _handle_no_track_graph guard, so it did not pin which one fired. Match "requires an N-D track graph", unique to the deep guard this test exercises. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ut case TestPosteriorConsistencyHPDOverlap::test_rejects_3d_input used shape (2, 3, 3), where n_time != n_x made the pre-guard threshold broadcast *incidentally* raise -- so the test was RED for the wrong reason (a cryptic broadcast error), not because it exercised the genuinely silent path. Use (3, 3, 3) (n_time == n_x): the pre-guard code's `posterior >= threshold[:, None]` then broadcasts successfully and the function silently returns a wrong-shaped (3, 3) array, which is the silent failure _validate_consistency_inputs prevents. Verified RED against the pre-guard source (DID NOT RAISE) and green after. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

edeno and others added 2 commits May 21, 2026 05:23

edeno mentioned this pull request May 21, 2026

Surface silent convergence and optimizer failures during fitting #30

Merged

5 tasks

edeno and others added 6 commits June 10, 2026 09:12

edeno changed the title ~~Fix correctness bugs in distance, posterior, transitions, sim, viz~~ Correctness bug bundle: distance, posterior, transitions, sim, viz, HPD (+ position-dim refactor) Jun 10, 2026

edeno and others added 14 commits June 10, 2026 09:59

edeno merged commit e79a7b6 into main Jun 11, 2026
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Correctness bug bundle: distance, posterior, transitions, sim, viz, HPD (+ position-dim refactor)#29

Correctness bug bundle: distance, posterior, transitions, sim, viz, HPD (+ position-dim refactor)#29
edeno merged 22 commits into
mainfrom
correctness-bug-bundle

edeno commented May 21, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

edeno commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fixed

Added

Changed

Removed

Refactor / internal

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

edeno commented May 21, 2026 •

edited

Loading