Remove duplicate code paths in transitions, GMM, simulators #33

Merged: edeno merged 4 commits into main from duplicate-path-cleanup on Jun 11, 2026

edeno commented May 21, 2026

Summary

Removes parallel implementations and dead code that CLAUDE.md prohibits ("don't leave parallel v1 / v2 implementations indefinitely"). This is the bare-deletes portion of the planned cleanup — the clusterless KDE consolidation is intentionally deferred (see below).

Removed

  • discrete_state_transitions.estimate_joint_distribution, multinomial_neg_log_likelihood, multinomial_gradient, multinomial_hessian, set_initial_discrete_transition: orphaned v1 M-step paths superseded by _aggregate_xi_by_state_jax / _transition_pair_stats_jax in c339fa9 (Implement exact discrete-transition M-step #24) and a19deb1. No remaining production callers.
  • likelihoods.clusterless_gmm.EncodingModel dataclass: declared but never returned; comment marked it deprecated. Return-type annotation on fit_clusterless_gmm_encoding_model updated to dict to match the docstring.
  • Duplicate simulate_time, simulate_place_field_firing_rate, simulate_neuron_with_place_field definitions in simulate/sorted_spikes_simulation.py: hoisted bit-identical copies into simulate/_common.py as the single source of truth.
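
For readers who have not seen what replaced the v1 paths, the exact discrete-transition M-step can be sketched as follows. This is a generic NumPy illustration of the textbook update (sum the pairwise posteriors over time, then normalize each row), not the package's `_aggregate_xi_by_state_jax` / `_transition_pair_stats_jax` code. The function name, the array layout, and the uniform fallback for empty rows are assumptions made for the example.

```python
import numpy as np


def estimate_transition_matrix(xi: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Row-normalize expected transition counts (hypothetical helper).

    xi has shape (n_time - 1, n_states, n_states); xi[t, i, j] is the
    posterior probability of state i at time t and state j at time t + 1.
    """
    expected_counts = xi.sum(axis=0)  # shape (n_states, n_states)
    row_totals = expected_counts.sum(axis=1, keepdims=True)
    n_states = expected_counts.shape[0]
    # Assumed guard: a row with no posterior mass falls back to uniform.
    uniform = np.full_like(expected_counts, 1.0 / n_states)
    normalized = expected_counts / np.maximum(row_totals, eps)
    return np.where(row_totals > eps, normalized, uniform)


xi = np.array(
    [
        [[0.6, 0.2], [0.1, 0.1]],
        [[0.3, 0.1], [0.2, 0.4]],
    ]
)
transition = estimate_transition_matrix(xi)
```

Each row of `transition` sums to one, so it can be used directly as a discrete state transition matrix in the next E-step.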

Changed

  • simulate.simulate_poisson_spikes now returns Poisson counts consistently across the package. Previously two same-named functions disagreed (counts vs binary indicators). Callers needing a boolean indicator should apply > 0; all in-repo consumers already do this.
  • simulate.get_trajectory_direction now consistently returns (direction_label, is_inbound). The one-value variant has been removed.
  • simulate.simulate_position retains the velocity-based semantics from simulate/simulate.py. The period-based sinusoid that previously shared the name in simulate/sorted_spikes_simulation.py is renamed to a private _simulate_sinusoidal_position_by_period (used only by the sorted-spikes simulator flows). Preserves golden-regression trajectories.
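
The two caller-side conventions above can be sketched like this. Both helpers are hypothetical stand-ins written for the example (the real signatures and direction labels may differ); only the `> 0` conversion and the two-value return shape come from this PR.

```python
import numpy as np


def spike_indicator(spike_counts: np.ndarray) -> np.ndarray:
    """Collapse per-bin Poisson counts to a boolean 'any spike' indicator."""
    return np.asarray(spike_counts) > 0


def fake_trajectory_direction(position: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical stand-in that mimics the (direction_label, is_inbound) shape."""
    is_inbound = np.insert(np.diff(position) < 0, 0, False)
    direction_label = np.where(is_inbound, "Inbound", "Outbound")
    return direction_label, is_inbound


rng = np.random.default_rng(0)
spike_counts = rng.poisson(lam=0.5, size=1_000)  # counts, so values above 1 can occur
has_spike = spike_indicator(spike_counts)

position = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
# A caller that only needs the label unpacks and discards the second value.
direction_label, _ = fake_trajectory_direction(position)
```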

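Deferred

  • Clusterless KDE (clusterless_kde) consolidation: deferred pending the log-KDE Local NaN fix (#32).
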
Test plan

  • uv run ruff check src/ and uv run ruff format --check src/ clean
  • uv run pytest src/non_local_detector/tests/test_golden_regression.py -v — 4 pass, no diffs
  • uv run pytest -m snapshot -v — 8 pass, no diffs
  • uv run pytest src/non_local_detector/tests/test_discrete_state_transitions.py src/non_local_detector/tests/test_discrete_transitions_estimation.py — 70 pass
  • uv run pytest src/non_local_detector/tests/test_simulator_contract.py src/non_local_detector/tests/test_simulator_properties.py — 21 pass
  • grep -rE "estimate_joint_distribution|multinomial_neg_log_likelihood|set_initial_discrete_transition|class EncodingModel" src/non_local_detector/ --include="*.py" returns zero matches
  • notebooks/02_likelihood_models/weighted_place_fields.ipynb import updated to from non_local_detector.simulate import …; new imports verified to resolve and run
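
The removed-symbol check in the `grep` bullet above can also be run as a small Python script, which avoids shell-specific regex quirks (plain `grep` needs `-E` for `|` alternation and `-r` to descend into a directory). This is a minimal sketch; the function name and return shape are assumptions made for the example, and only the four search strings come from the test plan.

```python
import re
from pathlib import Path

REMOVED_SYMBOLS = (
    "estimate_joint_distribution",
    "multinomial_neg_log_likelihood",
    "set_initial_discrete_transition",
    "class EncodingModel",
)
PATTERN = re.compile("|".join(re.escape(symbol) for symbol in REMOVED_SYMBOLS))


def find_removed_symbols(root: str) -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every match in *.py files under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), line_number, line.strip()))
    return hits
```

Calling `find_removed_symbols("src/non_local_detector/")` should return an empty list once the deletions are in place.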

Stacked on #30 (Phase 2 EM convergence surfacing).

🤖 Generated with Claude Code

edeno and others added 3 commits June 11, 2026 16:17
The class was declared but never returned (the comment said
"DEPRECATED: Use dictionary for now") and had no external callers.
Remove the class, drop the now-unused `from dataclasses import
dataclass` import, and fix the leftover `-> EncodingModel` return
annotation on `fit_clusterless_gmm_encoding_model` to `-> dict`
(matching the docstring and the actual returned shape).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The exact-discrete-transition M-step landed in c339fa9 (#24) and the
aggregated path was removed in a19deb1. The original v1
entry-points (estimate_joint_distribution,
multinomial_neg_log_likelihood + its jax.grad/hessian wrappers, and
set_initial_discrete_transition) were left behind with no remaining
production callers, only referenced by tests that exercised them or
used them as a v1-vs-new oracle.

Delete the three functions and the module-level grad/hessian
bindings. Delete the test classes that exclusively exercised the v1
paths; rewrite the v1-as-oracle tests in
test_discrete_transitions_estimation.py to call the canonical
estimate_discrete_transition_counts_from_expanded_posteriors /
..._from_responses_... directly. 70 discrete-transition tests still
pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

simulate/simulate.py and simulate/sorted_spikes_simulation.py both
defined functions with the same names but, in three cases, different
semantics:

- simulate_poisson_spikes was binary indicators in simulate.py vs
  raw Poisson counts in sorted_spikes_simulation.py. Keep the
  counts semantics (closer to spike-time events; indicator =
  count > 0). All in-repo consumers already compose correctly with
  counts.

- get_trajectory_direction returned one value in simulate.py vs
  two values (direction_label, is_inbound) in
  sorted_spikes_simulation.py. Keep the two-value form (strictly
  more informative). Update the single one-value caller to
  unpack-and-discard.

- simulate_position is a velocity-parameterized sinusoid in
  simulate.py vs a period-parameterized sinusoid in
  sorted_spikes_simulation.py. These have genuinely different
  trajectories and consolidating them would change goldens. Keep
  the velocity form as the canonical public name; rename the
  period form to the private _simulate_sinusoidal_position_by_period
  used only by the sorted-spikes simulator flows.
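
To illustrate why the two simulate_position variants cannot simply be merged, here is a sketch of a speed-parameterized and a period-parameterized sinusoid. The formulas, function names, and parameter values are assumptions made for this example (in particular, "speed" is read as average lap speed) and are not taken from the package.

```python
import numpy as np


def position_from_period(time, track_height, period):
    """Sinusoid between 0 and track_height with an explicit period (hypothetical)."""
    half_height = track_height / 2.0
    return half_height * np.sin(2.0 * np.pi * time / period - np.pi / 2.0) + half_height


def position_from_speed(time, track_height, running_speed):
    """Same sinusoid, with the period implied by speed (hypothetical).

    One full lap covers 2 * track_height, so period = 2 * track_height / speed.
    """
    period = 2.0 * track_height / running_speed
    return position_from_period(time, track_height, period)


time = np.linspace(0.0, 60.0, 601)
by_speed = position_from_speed(time, track_height=180.0, running_speed=15.0)  # implied period: 24 s
by_period = position_from_period(time, track_height=180.0, period=30.0)
```

Under these assumptions the two forms agree only when the explicit period happens to equal `2 * track_height / running_speed`. Any other pairing of defaults yields different trajectories, which is the situation the rename avoids disturbing.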

The remaining utilities (simulate_time, simulate_place_field_firing_rate,
simulate_neuron_with_place_field) are bit-identical; hoist into
simulate/_common.py and re-export from simulate/__init__.py so
callers can `from non_local_detector.simulate import …`.

Update the weighted_place_fields notebook's import path. CHANGELOG
documents the new public-surface conventions, the deletions in
Groups 3c and 3d, and the deferral of the clusterless_kde
consolidation pending the log-KDE Local NaN fix (#32).

21 simulator-contract tests pass; 8 snapshot + 4 golden regression
tests pass with no diffs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

edeno force-pushed the duplicate-path-cleanup branch from 3262556 to 23c1c5a on June 11, 2026 20:31

- clusterless_gmm.fit_clusterless_gmm_encoding_model: correct the Returns
  docstring to the nine keys the function actually returns. It previously
  listed occupancy_bins, log_occupancy_bins (real key is log_occupancy),
  position_time, and eight gmm_* config arguments that are never returned.
  Fix the CHANGELOG entry that incorrectly claimed the annotation matched
  the "documented and actual" return shape (the docstring was wrong).

- test_discrete_transitions_estimation: add a magnitude guard
  (learned_transition[0, 2] < 0.5) to
  test_simulated_likelihoods_shift_learned_transition_to_matching_source.
  This restores the exact-vs-source-blind discrimination that the deleted
  aggregate-path negative control provided, without resurrecting v1 code.

- test_simulator_contract: add test_simulate_poisson_spikes_returns_counts,
  a direct guard pinning the unified counts (not binary 0/1) semantics that
  the package now advertises; previously exercised only transitively.
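
A counts-semantics guard of the kind described in the last item might look like the sketch below. The simulator here is a hypothetical stand-in with an assumed signature, not the package's `simulate_poisson_spikes`, and the test body is an illustration rather than the test added in this commit.

```python
import numpy as np


def fake_simulate_poisson_spikes(rate, sampling_frequency, rng):
    """Hypothetical stand-in: draw one Poisson count per time bin."""
    return rng.poisson(np.asarray(rate) / sampling_frequency)


def test_returns_counts_not_indicators():
    rng = np.random.default_rng(42)
    # Mean of 2 spikes per bin, so bins holding 2+ spikes are effectively certain.
    rate = np.full(10_000, 1_000.0)  # spikes per second
    spikes = fake_simulate_poisson_spikes(rate, sampling_frequency=500.0, rng=rng)

    assert np.issubdtype(spikes.dtype, np.integer)
    assert spikes.min() >= 0
    # A binary 0/1 implementation could never exceed 1.
    assert spikes.max() > 1


test_returns_counts_not_indicators()
```

Choosing a high rate makes the guard fail loudly if the function ever regresses to binary indicators, instead of passing by chance on sparse data.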

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

edeno merged commit c7d43ec into main on Jun 11, 2026
9 checks passed
edeno deleted the duplicate-path-cleanup branch on June 11, 2026 21:38