docs: meta-scrub + research-tooling relocation + doc gaps (perfectionist pass PR-3)#614
docs: meta-scrub + research-tooling relocation + doc gaps (perfectionist pass PR-3)#614KedoKudo wants to merge 9 commits into
Conversation
The 14 scripts/perf/* profiling drivers and the two scripts/fixtures/extract_venus_*.py extractors exist only to read the private 38 GB VENUS measurement cube under the gitignored research archive — they cannot run for anyone without that data, and shipping them in the repository points users at paths that do not exist for them. They now live with the data they consume (outside the repo), with their repo-root depth, internal invocation paths, and the SAMMY EMIN/EMAX comment wording updated for the new location; all verified with py_compile / bash -n. No executable wiring referenced them (pixi tasks, CI, Cargo all checked). Public docs that pointed at those scripts or at research-archive paths now describe the provenance without the paths: the tests/data/README regeneration note, the nexus.rs producer attributions, the resolution.rs bit-exact-baseline comment, and the PR template's audit-reference instructions. Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
House rule: a comment says why the code is shaped this way; when it
changed, in which PR, and who reviewed it live in the commit message
and PR body, where git log and blame preserve them. Production
comments had accumulated ~140 reviewer/round/PR decorations
('Codex round-3 P3 on PR #480', 'Copilot review finding on PR #470',
'Independent review P1 reproduction', 'epic #472, PR #475', …).
Every decoration is now gone; every carried fact stays inline
('Codex PR #475 round 2 measured 73 % rel err' → 'measured: 73 % rel
err'; bench-off outcomes, error magnitudes, and defect descriptions
all preserved). Issue numbers that anchor the defect or contract
being guarded (#458, #608, #514, #517, #483 A1, …) are kept — issues
are the durable what-record. Algorithm terms that merely contain
'round' (round-trip, round-off, round-robin) and SAMMY manual
revision citations are untouched.
Meta-only test docstrings ('Regression for Round-2 review fix #4'),
the codex04 study citations, and rustdoc .research/ paths are
rewritten separately — this commit is the mechanical tail strip.
Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
…te results inline
Rendered rustdoc and production comments cited artifacts a reader
cannot open: gitignored .research/ paths (the algorithm-design
round-robin judgments, the counts-path governing-equations note, the
real-VENUS partial-GAL validation), private memo sections ('memo 35
§P2.1', 'EG5'), agent-session memory files
(feedback_synthetic_resolution_test_design,
feedback_bit_exact_oracle_verbatim), the codex04 contestant name, and
a 'byte-identical to main' branch reference. Every one is rewritten
to carry its content inline: the surrogate module now describes the
design study and keeps its measured compression table; joint_poisson
states the profiled-flux derivation and benchmark regimes in place;
the partial-GAL test explains why the real measurement cannot ship
as a fixture (ORNL release policy on the USR/FTS kernel) without
naming a directory nobody else has.
Meta-only test docstrings ('Regression for Round-2 review fix #4',
'Codex review: …', 'Round-1 audit flagged …') now state the pinned
invariant directly; issue anchors (#514, #486) stay. No behavioral
change — comments and docs only.
Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
- spatial_map_typed — the pipeline's flagship spatial entry point — had zero rustdoc. It now documents the three input modes, the full up-front validation contract (shape checks, live-pixel value domains, the three rejected degenerate configurations and why each would otherwise yield a silently all-NaN map), the rayon per-pixel parallelism, cancel/progress semantics, per-pixel-failure-as-NaN behavior, and the error variants. - The pipeline / io / fitting crate-level module lists had drifted: calibration, error, export, nexus, project, rebin, spectrum, active_mask, and forward_model were missing. Every one-line description is taken from the module's own header doc. physics already matched. - The bare unreachable!() in penetrability's old_phase test helper now states its l = 1..=4 coverage invariant. - The 13 in-test println!s are gone: 11 were value dumps whose assertions already carry the checks; the Hf-177 fixture-missing skip notice becomes eprintln! (notices go to stderr, matching the suite's measurement-output convention); the parser's two summary prints become a real contract assertion (total_resonance_count must cover the resolved range's per-L-group sum). The deliberate measurement eprintln!s (samtry, VENUS, Doppler) are untouched. Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
…comments
The acceptance grep for the meta scrub caught what the original
inventory's pattern missed: ~65 more references to private research
memos ('memo 35 §P1.2', 'memo 38 §6') and their benchmark shorthand
('EG2 S2 C_An', 'EG5', 'EG1') across the Python API docstrings
(__init__.pyi + the PyO3 rustdoc — user-visible on docs renders),
GUI hover text and comments, user-facing error message strings, and
the joint-Poisson/pipeline doc comments.
Same treatment as the rest of the scrub: every pointer to an
artifact a reader cannot open is replaced by the content it pointed
at ('memo 38 §6 — 17 min/pixel polish cost' → 'measured ~17
min/pixel polish cost'; 'EG2 S2 C_An → −23% density bias' →
'benchmarked at −23% density bias'), deferral labels become plain
deferrals, and two test assertions that matched on memo-section
substrings in error messages now match on the parameter names. One
test identifier carrying memo-section shorthand (p2_2) is renamed to
describe the rule it pins.
Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
…ce corrections The scrub's own rewrites introduced two claim-accuracy regressions, both caught by cross-family review: - nelder_mead.rs attributed the 1/20→10/20 polish rescue to 'real VENUS data'; that regime is the synthetic counts benchmark — on real VENUS counts D saturates at 10^4-10^5 and polish cannot even self-terminate (the reason it defaults off). The doc now states both regimes explicitly. - joint_poisson/pipeline/.pyi claimed the counts-KL path was 'validated experimentally on synthetic and real VENUS counts data' while this branch deleted the only real-data KL driver, leaving nothing in-tree to substantiate the real-data half. The proper fix: a new real-data regression gate (TestVenusMlbwRegression::test_counts_kl_fit_matches_baseline) runs the joint-Poisson fit on the committed aggregated VENUS Hf fixture and anchors density (2.911e-5) and deviance/dof (3.14e4 — the documented real-data saturation regime, asserted >> 1). The wording now points at that test. Python suite: 127 → 128 (+~40 s, the cost of a real-data fit). P2s: dangling §P2.2 and 'memo 38' references resolved; state.rs sentence break restored; spatial_map_typed doc gains the reachable PipelineError::Transmission variant and the live-pixel progress target; stale absolute pipeline.rs line citations replaced with function-name anchors; 'duck-tape' → 'duct-tape' (×2). Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
…ture precompute The new spatial_map_typed rustdoc promises every cancellation path returns PipelineError::Cancelled, but the fit_temperature base-XS precompute wrapped its error with .map_err(PipelineError::Transmission), bypassing the From<TransmissionError> impl whose whole purpose is the Cancelled→Cancelled mapping — a user cancelling during the expensive Reich-Moore precompute got an error toast instead of a clean cancel. Bare ? now routes through the conversion (the pre-existing mis-mapping predates this branch; the new doc made it a contract). Regression test with engineered windows: caller-supplied precomputed broadened XS removes the earlier (already correctly mapped) precompute window, and a 100k-point grid stretches the base-XS window to tens of ms while the watcher flips cancel at 5 ms — mutation-checked 3/3 (restoring the map_err fails every run with Transmission(Cancelled)); the assertion itself is timing-independent, so the test cannot flake. Doc P2s: the npz regression fixture gets its provenance section in tests/data/README.md (the deleted extraction script was its only record); the validation bullet now states the deliberate active-bin scoping for transmission cubes; the counts-KL gate docstring carries the LM gate's cross-backend tolerance rationale. Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
…recision - The new nereids-io module list linked the hdf5-gated nexus/project modules, which breaks 'cargo doc -p nereids-io' under default features (-D broken_intra_doc_links) — the workspace doc job only passed via feature unification from apps/gui. The two entries are plain code spans now; standalone doc build verified clean. - The committed VENUS npz fixture embedded a description string pointing at the deleted extraction script and a research-archive path — the exact reference class this PR scrubs, baked into a shipped binary. Regenerated with provenance wording matching tests/data/README.md; all six data arrays verified bitwise identical, and both real-data regression gates pass on the regenerated file (anchors untouched). - spatial_map_typed's validation bullet now states each cube's actual domain (uncertainty is strictly positive, not merely non-negative). - The counts-KL gate gains a coarse physical bracket (1e-5 < density < 1e-4) independent of the machine-generated anchor, so a future re-anchoring commit cannot silently absorb an order-of-magnitude solver regression. Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #614 +/- ##
==========================================
+ Coverage 91.59% 91.63% +0.03%
==========================================
Files 46 46
Lines 32227 32283 +56
==========================================
+ Hits 29518 29582 +64
+ Misses 2709 2701 -8 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
…t gate Windows CI lost the watcher race in test_spatial_map_mid_run_cancellation_returns_err: on the saturated 4-core runner the busy-spinning watcher thread was starved long enough for all 64 pixels AND the post-loop cancel check to finish before its store became visible — a COMPLETE Ok map, which the test treated as failure even though that outcome carries no information about the regression under test (a PARTIAL Ok map). The test now distinguishes the three outcomes: Err(Cancelled) passes, a complete map retries (bounded, 5 attempts), a partial map — the actual regression — fails immediately on any attempt. The watcher yields instead of spinning so it gets scheduled on saturated runners. Codecov flagged the reworded 'B_det wiring is deferred' error string as the one uncovered patch line in pipeline.rs — its gate (non-zero detector_background nuisance arm under the joint-Poisson solver) had no test at all. Added the contract test alongside the sibling gate tests. Assisted-With: Claude Fable 5 (1M context) <noreply@anthropic.com>
|
Windows CI failure + codecov disposition (addressed in 47e97fb):
|
There was a problem hiding this comment.
Pull request overview
This PR performs a broad “meta scrub” and documentation hardening pass aimed at making the repository readable and self-contained for external reviewers, while relocating research-only tooling out of the tracked tree and adding/strengthening a couple of regression gates tied to committed VENUS fixtures.
Changes:
- Remove review/round/PR/memo-style metadata from in-tree prose (rustdoc, tests, GUI strings, templates) while preserving the underlying technical facts.
- Relocate/remove research-only VENUS extraction + performance profiling scripts from the repository; update fixture provenance docs accordingly.
- Close key doc/testing gaps: add substantial rustdoc for
spatial_map_typed, ensure cancellation maps toPipelineError::Cancelled, and add a real-VENUS counts-KL regression gate in Python tests.
Reviewed changes
Copilot reviewed 47 out of 48 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_nereids.py | Scrubs review metadata in docstrings/comments and adds a real-VENUS counts-KL baseline regression test using committed fixtures. |
| tests/test_mcp.py | Removes review-metadata phrasing from docstrings/comments in MCP-facing tests. |
| tests/data/README.md | Updates VENUS fixture provenance/regeneration guidance without referencing private repo paths; documents the aggregated NPZ fixture. |
| scripts/python_api_allowlist.txt | Removes private memo reference from API allowlist commentary. |
| scripts/perf/summarize_sample.py | Removes research-only profiling helper script from the tracked tree. |
| scripts/perf/run_sample.sh | Removes research-only macOS sampling wrapper script. |
| scripts/perf/run_sample_kl_periso_tzero.sh | Removes research-only macOS sampling wrapper script. |
| scripts/perf/run_sample_b3.sh | Removes research-only macOS sampling wrapper script. |
| scripts/perf/run_sample_b2.sh | Removes research-only macOS sampling wrapper script. |
| scripts/perf/run_sample_b2_at_scale.sh | Removes research-only macOS sampling wrapper script. |
| scripts/perf/run_sample_b1.sh | Removes research-only macOS sampling wrapper script. |
| scripts/perf/profile_kl_periso_tzero.py | Removes research-only profiling driver referencing private data paths. |
| scripts/perf/profile_b3_lm_periso_tzero.py | Removes research-only profiling driver referencing private data paths. |
| scripts/perf/profile_b2_kl_grouped.py | Removes research-only profiling driver referencing private data paths. |
| scripts/perf/profile_b2_kl_grouped_at_scale.py | Removes research-only profiling driver referencing private data paths. |
| scripts/perf/profile_b1_lm_tzero_grouped.py | Removes research-only profiling driver referencing private data paths. |
| scripts/perf/profile_a1_lm_grouped.py | Removes research-only profiling driver referencing private data paths. |
| scripts/perf/baseline_dump.py | Removes research-only baseline dump/verify harness from the tracked tree. |
| scripts/fixtures/extract_venus_hf_nexus.py | Removes fixture extractor that depended on private VENUS cube; tooling moved out-of-repo. |
| scripts/fixtures/extract_venus_aggregated.py | Removes aggregated VENUS extractor that depended on private VENUS cube; tooling moved out-of-repo. |
| crates/nereids-pipeline/src/spatial.rs | Adds comprehensive rustdoc for spatial_map_typed, fixes cancellation error mapping in the temperature precompute path, and adds/updates cancellation regression tests. |
| crates/nereids-pipeline/src/pipeline.rs | Updates joint-Poisson / counts-KL documentation and error messages to remove private memo anchors; adds a deferred-feature rejection test for non-zero detector background. |
| crates/nereids-pipeline/src/lib.rs | Completes module list in crate-level rustdoc for pipeline crate. |
| crates/nereids-physics/tests/venus_usr_surrogate.rs | Rewords design-study provenance text to remove contestant naming while keeping the technical claim. |
| crates/nereids-physics/src/transmission.rs | Scrubs review metadata from doc comments in the transmission module. |
| crates/nereids-physics/src/surrogate.rs | Generalizes design-study provenance wording and removes .research/ references while preserving math/measurement content. |
| crates/nereids-physics/src/slbw.rs | Removes debug println! noise from tests. |
| crates/nereids-physics/src/resolution.rs | Scrubs review-metadata phrasing from safety/docs/tests while preserving invariants and rationale. |
| crates/nereids-physics/src/reich_moore.rs | Removes debug println!s from tests and changes a skip notice from stdout to stderr. |
| crates/nereids-physics/src/penetrability.rs | Replaces bare unreachable!() with an invariant message in a test helper. |
| crates/nereids-io/src/project.rs | Removes memo references from project snapshot field docs. |
| crates/nereids-io/src/nexus.rs | Updates rustdoc to avoid referencing removed in-repo extractors while preserving the units/back-compat contract description. |
| crates/nereids-io/src/lib.rs | Expands module list documentation (including non-link entries for hdf5-gated modules to keep standalone doc builds warning-free). |
| crates/nereids-fitting/src/transmission_model.rs | Scrubs PR/review references and tightens explanatory comments around surrogate dispatch guards. |
| crates/nereids-fitting/src/poisson.rs | Updates module docs to remove memo references while retaining the distinction between legacy helpers and production counts-KL path. |
| crates/nereids-fitting/src/nelder_mead.rs | Rewrites motivation docs to remove private benchmark shorthand while preserving the performance/correctness rationale. |
| crates/nereids-fitting/src/lm.rs | Cleans up regression-test doc comments to remove review-round metadata. |
| crates/nereids-fitting/src/lib.rs | Completes crate-level module list documentation. |
| crates/nereids-fitting/src/joint_poisson.rs | Removes .research//memo references from module/docs and clarifies the validation/regression-gate narrative. |
| crates/nereids-endf/src/parser.rs | Replaces test println! output with an assertion that checks an invariant (total resonance count covers resolved sum). |
| bindings/python/src/lib.rs | Scrubs memo references in PyO3-facing doc comments and keeps the binding-side validation narrative intact. |
| bindings/python/python/nereids/init.pyi | Updates stub docstrings to remove memo anchors and reflect the real-VENUS regression gate claim. |
| apps/gui/src/widgets/design.rs | Updates GUI overlay rationale comment to avoid memo references while pointing to the underlying pipeline behavior. |
| apps/gui/src/state.rs | Removes memo references from persisted GUI state field docs. |
| apps/gui/src/guided/result_widgets.rs | Rewords GOF labeling comment to remove memo anchor. |
| apps/gui/src/guided/analyze.rs | Cleans hover/help text and result-label comment to remove memo references while preserving the user-facing meaning. |
| .github/pull_request_template.md | Removes .research/ path mention and rephrases process guidance without review-metadata jargon. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
PR-3 — the final queue item of the pre-paper perfectionist pass (after #610 claim accuracy, #611 exact Doppler kernel, #612 MIT/metadata). The repository's prose now reads for a cold reviewer: no review-process metadata in production comments, no citations to artifacts a reader cannot open, no research-only tooling shipped in-tree, and the flagship API documented.
Net: −1592 lines (+707/−2299 across 48 files).
What changed
Meta scrub (~205 sites)
House rule: a comment says why the code is shaped this way; when / which PR / which reviewer lives in commit messages and PR bodies, where git preserves them.
codex04design-study contestant name neutralized; the surrogate module doc keeps its full math + measured-compression table under neutral study provenance..research/paths removed from every tracked file (rustdoc, tests/data README, the PR template) — including the VENUS npz fixture's embeddeddescriptionfield (regenerated; all six data arrays verified bitwise identical).Research tooling out of the repo
The 14
scripts/perf/*profiling drivers and 2scripts/fixtures/extract_venus_*.pyextractors exist only to read the private 38 GB VENUS cube; they now live with that data outside the repository (path depth, internal invocation paths, and SAMMY EMIN/EMAX wording updated;py_compile/bash -nverified; zero executable wiring referenced them). Public docs describe the provenance without naming paths nobody else has.Doc gaps closed
spatial_map_typed— the pipeline's flagship spatial entry point had zero rustdoc; now documents input modes, the full up-front validation contract (per-domain value checks, active-bin scoping, the three rejected degenerate configs and why each would otherwise produce a silently all-NaN map), rayon parallelism, cancel/progress semantics, and the error variants.cargo doc -p nereids-iostays warning-free (verified).println!s gone (11 pure dumps removed; the fixture-missing skip notice →eprintln!; the parser's summary prints → a real contract assertion). The deliberate measurementeprintln!s (samtry, VENUS, Doppler) untouched. Bareunreachable!()in the penetrability test helper now states its invariant.Two real fixes the review process forced
PipelineError::Cancelledon every cancellation path, but thefit_temperaturebase-XS precompute wrapped its error with.map_err(PipelineError::Transmission), bypassing the purpose-builtFromconversion — a user cancelling during the expensive Reich-Moore precompute got an error toast instead of a clean cancel. Fixed with bare?+ a mutation-checked (3/3) regression test with engineered timing windows (caller-supplied precomputed XS removes the earlier window; a 100k-point grid stretches the target window to tens of ms; the assertion is timing-independent so the test cannot flake).test_counts_kl_fit_matches_baselineruns the joint-Poisson fit on the committed aggregated VENUS Hf fixture: anchored density (2.911e-5) + deviance/dof (3.14e4 — the documented real-data saturation regime, asserted ≫ 1) + an anchor-independent physical bracket. Python suite 127 → 128.Review
/review-pipeline, 3 cross-family rounds (Claude + Codex):Acceptance grep gates (all clean)
.researchappears in no tracked file outside.gitignore, CHANGELOG history, and agent-process docs — including inside binary fixtures;Codex|Copilot|codex04|memo 35|memo 38|EG[0-9]|round-N P[0-9]-class tokens: zero hits across crates/bindings/apps/tests;scripts/perf|extract_venus: zero hits.Gates
fmt · clippy
-D warnings·cargo test --workspace(22 binaries, 0 failures; pipeline suite +2 tests) ·cargo doc -D warnings(workspace and standalone nereids-io) · mdBook · maturin build ·pixi run test-python(128 passed).🤖 Generated with Claude Code