Draft
35 commits
fd7b10c
Add benchmark/autoresearch harness for the flattening stage (Phase A)
mvdoc Jun 10, 2026
1b10aed
Add Tutte/LSCM flip-free init probe (Phase B primary experiment)
mvdoc Jun 10, 2026
b9da0b7
Add unit tests for Tutte flip-free init and area scaling
mvdoc Jun 10, 2026
9ec06cb
Add fast flatmap plotter for visual verification of experiments
mvdoc Jun 10, 2026
9d5982c
Add general autoresearch experiment runner
mvdoc Jun 10, 2026
0c9428c
Document experiment.py runner and plot.py in benchmark README
mvdoc Jun 10, 2026
03d5cd7
Add curated findings: Tutte init verified (n=4), phase-removal ablations
mvdoc Jun 10, 2026
9a93b44
Add optimizer-swap probe (L-BFGS/CG/Adam on the same energy)
mvdoc Jun 10, 2026
b366403
Findings: optimizer-swap is a dead end on this energy; multiscale is …
mvdoc Jun 10, 2026
1345622
Add spectral (manifold-harmonic) multigrid probe
mvdoc Jun 10, 2026
8b40b9e
Findings: multigrid + Adam optimizer experiments
mvdoc Jun 10, 2026
e8ee2f6
Add speed levers to experiment runner (line-search points, iters, smo…
mvdoc Jun 10, 2026
64dfbe1
Findings: how to speed up the FreeSurfer optimizer (~2.5x, near-free)
mvdoc Jun 10, 2026
025b302
Findings: validated 3.6x speedup (fast_ultimate, n=4)
mvdoc Jun 10, 2026
691f280
Findings: energy/quality angle - true-geodesic metric + correction re…
mvdoc Jun 10, 2026
620eff6
Findings: true-geodesic metric, local-vs-global, distance-optimal out…
mvdoc Jun 10, 2026
c52aaa8
Findings: ground objective in Fischl 1999 (metric distortion, no area…
mvdoc Jun 10, 2026
c7488d8
Bench: global true metric + target-scale sweep; correction ~1.10 wins…
mvdoc Jun 10, 2026
9f15f0f
Findings: target correction re-judged on global metric (cross-hemi sw…
mvdoc Jun 10, 2026
122db8d
Findings: broadened correction validation (8 hemis/7 subjects) - not …
mvdoc Jun 10, 2026
d717653
Findings: executive summary + refreshed next-ideas (autoresearch wrap…
mvdoc Jun 10, 2026
58457de
Validate fast config on 9 hemispheres: not overfit, ~3.4x at equal-or…
mvdoc Jun 10, 2026
b032017
Parallelize angular k-ring computation with a Numba prange kernel (~26x)
mvdoc Jun 11, 2026
e3ad610
Findings: record k-ring parallelization (~26x, bit-identical)
mvdoc Jun 11, 2026
df3d7e5
Findings: authoritative end-to-end timing (4 hemis, ~205s/hemi, ~4.3x…
mvdoc Jun 11, 2026
337337d
Projection Phase 1: FreeSurfer-free cut mapping, validated bit-identical
mvdoc Jun 11, 2026
5fdf8b2
Phase 2: refinement ablation probe + continuity toggle
mvdoc Jun 11, 2026
e19b11d
Phase 2: refinement ablation verdict — drop geodesic thinning, keep c…
mvdoc Jun 11, 2026
61038ab
Phase 2 (2): thick-but-smoothed cut — cuts flips, not distortion
mvdoc Jun 11, 2026
80e7053
Fix _get_k_rings_numba: parallelize via per-chunk scratch (was serial…
mvdoc Jun 11, 2026
a770571
Parallelize k-ring distance loop (_kring_distances_kernel)
mvdoc Jun 11, 2026
a074a8c
Vectorize per-vertex distortion loop in viz.compute_kring_distortion
mvdoc Jun 12, 2026
f53c77b
Add optimal-scale normalization + signed distortion to flatmap overlay
mvdoc Jun 12, 2026
2acdd60
Distance-optimal output scale: make saved flatmaps metrically faithfu…
mvdoc Jun 12, 2026
2988813
Findings §16: distance-optimal output scale shipped (multi-hemi valid…
mvdoc Jun 12, 2026
82 changes: 82 additions & 0 deletions .claude/skills/autoflatten-autoresearch/SKILL.md
@@ -0,0 +1,82 @@
---
name: autoflatten-autoresearch
description: >-
Resume the AutoFlatten flattening-optimization autoresearch effort. Use when the user
says things like "resume the autoflatten autoresearch", "continue optimizing the
flattening", "what have we tried so far", "run the flatten benchmark", or asks to
propose/test a new flattening method or config. Orients a fresh session to the plan,
the provenance ledger, the data, and the conventions, and drives the experiment loop.
---

# AutoFlatten autoresearch

A benchmark-driven effort to improve the **pyflatten flattening stage** (and ultimately
replace its FreeSurfer-clone optimizer with a principled method), with every experiment
logged so the *process* is shareable in a paper.

## 0. Recover state FIRST (before proposing anything)

1. Read the design + status: `benchmark/PLAN.md` and `benchmark/README.md` (in this repo).
2. Read the ledger to see everything tried so far, the current baseline, and the last
   planned step:
   - Ledger: `/data2/projects/autoflatten/ledger/experiments.jsonl` (append-only JSONL).
   - Rendered notebook: `/data2/projects/autoflatten/NOTEBOOK.md`
     (regenerate with `python -m benchmark.report`).
   - Look at the most recent records' `metrics` (the multi-objective vector) and any
     `decision.next_step`. **Do not re-run experiments already in the ledger.**
3. Confirm the manifest exists: `/data2/projects/autoflatten/manifest.json`
   (else `python -m benchmark.build_dataset`).

## 1. Environment & constraints

- Use the repo's **uv** venv: `uv sync --extra bench`, then `.venv/bin/python`.
- **CPU only.** GPUs are blocked by driver 440 / CUDA 10.2 — do **not** install/attempt
CUDA jax unless the user says the driver was upgraded (≥525).
- Dev on a **tiny subset** (`--dev`, 2 subjects / a few hemispheres); scale to the full
82 only once a change looks good. A single hemisphere takes ~8–15 min on CPU.
- **Output locations are pinned:** code + docs in the repo; *all generated artifacts* go
to `/data2/projects/autoflatten/` (see `benchmark/paths.py`,
override `AUTOFLATTEN_BENCH_ROOT`). Never write generated files into the repo.
- Data is the public **Narratives** set (OpenNeuro `ds002345`); never the private
`all-subjects/` lab data.

## 2. Core objects

- `benchmark/harness.py` — `evaluate(entries, config, method=...)`; geometry +
k-ring caching; `register_flatten_fn(name, fn)` to add a method. A `flatten_fn` has
signature `flatten_fn(flattener) -> uv` (shape `(V, 2)`), scored uniformly.
- `benchmark/metrics.py` — `per_patch_metrics(uv, flattener)` and `aggregate(...)`.
Objective stays **geodesic distance distortion** (+ flips, robustness, runtime) — never
switch to pure conformality.
- `benchmark/ledger.py` — `new_record(kind, label, ...)` + `Ledger().append(record)`.

## 3. The experiment loop

For each idea:

1. **State a hypothesis** (what you expect to improve and why).
2. **Implement** a `flatten_fn` (or a `FlattenConfig` change). The primary open lead is a
   **Tutte/LSCM flip-free init** that removes the ~4-min initial NAR phase; see
   `benchmark/probe_tutte_init.py` (Phase B in the plan).
3. **Evaluate on the dev subset**, then promote to the full train split if promising:
   ```bash
   python -m benchmark.run_baseline --dev # reference
   # ... your probe/optimize script, same evaluate() harness ...
   ```
4. **Append a ledger record** with `kind`, `label`, `method`, `metrics`, `per_subject`,
   `repro_command`, and a **decision trace** in `record.decision`:
   `{hypothesis, rationale, conclusion, next_step}`.
5. **Commit** the code change in this repo (the ledger pins the commit SHA), then
   `python -m benchmark.report` to refresh `NOTEBOOK.md`.
6. If it **Pareto-beats** the baseline (lower distortion and/or fewer flips and/or faster,
   none worse), promote to the **full 82-subject** run and validate on the **holdout**
   split before claiming a win.
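
The Pareto-beats rule in step 6 can be sketched as a small dominance check. This is an illustration only, not the benchmark's code: the `pareto_beats` helper and the metric keys (`distortion`, `flips`, `runtime_s`) are hypothetical, the numbers are made up, and every metric is assumed to be lower-is-better.

```python
def pareto_beats(candidate: dict, baseline: dict,
                 keys=("distortion", "flips", "runtime_s")) -> bool:
    """True if candidate is no worse on every key and strictly better on at least one."""
    no_worse = all(candidate[k] <= baseline[k] for k in keys)
    strictly_better = any(candidate[k] < baseline[k] for k in keys)
    return no_worse and strictly_better


# Made-up metric vectors, for illustration only.
baseline = {"distortion": 0.20, "flips": 3, "runtime_s": 600.0}
faster = {"distortion": 0.20, "flips": 3, "runtime_s": 200.0}    # same quality, faster
tradeoff = {"distortion": 0.18, "flips": 5, "runtime_s": 200.0}  # better distortion, more flips

print(pareto_beats(faster, baseline))
print(pareto_beats(tradeoff, baseline))
```

A candidate that improves one metric while worsening another (like `tradeoff`) does not qualify; under the rule above it stays a ledger entry rather than a promoted run.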

## 4. Guardrails

- Every experiment is pinned to a git commit; commit before/after running.
- Determinism is assumed (one run/experiment) but **re-assert it** when you change the
optimizer (`run_baseline.py --check-determinism`).
- Never silently drop subjects — `evaluate` records per-subject `status="error"`; surface
failures in the conclusion.
- Keep the ledger append-only; to revise a result, append a new record.
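
As a rough sketch of the append-only convention, the snippet below writes one JSON object per line and revises a result by appending a second record rather than editing the first. It does not use `benchmark/ledger.py`, whose signatures are not shown here: `append_record`, `read_records`, the temporary path, and all field values are hypothetical stand-ins. Only the field names come from the loop described above.

```python
import json
import tempfile
from pathlib import Path


def append_record(path: Path, record: dict) -> None:
    """Append one JSON object as a single line; earlier lines are never rewritten."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def read_records(path: Path) -> list:
    """Read every non-empty line back as a dict, oldest first."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Placeholder values; a real record would carry measured metrics.
record = {
    "kind": "probe",
    "label": "example-probe",
    "method": "example_method",
    "metrics": {},
    "per_subject": [],
    "repro_command": "python -m benchmark.run_baseline --dev",
    "decision": {
        "hypothesis": "placeholder",
        "rationale": "placeholder",
        "conclusion": "placeholder",
        "next_step": "placeholder",
    },
}

ledger_path = Path(tempfile.mkdtemp()) / "experiments.jsonl"
append_record(ledger_path, record)

# To revise a result, append a new record instead of editing the old line.
revised = {**record, "label": "example-probe-revised"}
append_record(ledger_path, revised)

records = read_records(ledger_path)
```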
23 changes: 23 additions & 0 deletions autoflatten/flatten/algorithm.py
@@ -22,6 +22,7 @@
from .distance import (
    compute_kring_geodesic_distances,
    compute_kring_geodesic_distances_angular,
    distance_optimal_scale,
)
from .energy import (
    compute_2d_areas,
@@ -1837,6 +1838,28 @@ def run(self, snapshot_callback: Callable | None = None) -> np.ndarray:
            snapshot_callback=_wrap_callback(snapshot_callback, "smoothing"),
        )

        # Distance-optimal output scale: replace the area-matched display scale with the
        # single global scale that minimizes true-geodesic distance distortion, so the saved
        # flat map is metrically faithful (surface and flat map are directly comparable).
        dos = config.distance_optimal_scale
        if dos.enabled:
            ref_vertices = (
                self.fiducial_vertices
                if self.fiducial_vertices is not None
                else self.vertices
            )
            s_opt = distance_optimal_scale(
                ref_vertices,
                self.faces,
                uv,
                n_sources=dos.n_sources,
                seed=dos.seed,
            )
            centroid = uv.mean(axis=0)
            uv = (uv - centroid) * s_opt + centroid
            if verbose:
                print(f"Distance-optimal output scale: x{s_opt:.4f}")

        # Final stats
        uv_jax = jnp.asarray(uv)
        n_flipped_final = int(count_flipped_triangles(uv_jax, self.faces_jax))
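
The hunk above applies a single global scale about the centroid. The body of `distance_optimal_scale` is not part of this diff, so the sketch below is only one plausible reading. It assumes a least-squares objective over sampled vertex pairs, for which the minimizer of `sum((s*e - d)**2)` is `s = sum(e*d) / sum(e*e)`, where `e` are flat-map distances and `d` are geodesic distances. The helper names and the toy data are hypothetical, and the real function may use a different distortion measure.

```python
import math


def least_squares_scale(flat_dists, geo_dists):
    """Scale s minimizing sum((s*e - d)^2) over pairs: s = sum(e*d) / sum(e*e)."""
    num = sum(e * d for e, d in zip(flat_dists, geo_dists))
    den = sum(e * e for e in flat_dists)
    return num / den


def rescale_about_centroid(uv, s):
    """Scale 2D points by s about their centroid, mirroring (uv - centroid) * s + centroid."""
    n = len(uv)
    cx = sum(p[0] for p in uv) / n
    cy = sum(p[1] for p in uv) / n
    return [((x - cx) * s + cx, (y - cy) * s + cy) for x, y in uv]


# Toy flat map: a 2x2 square, with "geodesic" distances set exactly 6% longer
# than the flat distances (an assumption chosen to echo the ~1.06 figure).
uv = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
pairs = [(0, 1), (1, 2), (0, 2)]
flat = [math.dist(uv[i], uv[j]) for i, j in pairs]
geo = [1.06 * e for e in flat]

s_opt = least_squares_scale(flat, geo)
uv_scaled = rescale_about_centroid(uv, s_opt)
```

Scaling about the centroid leaves the map's position unchanged and multiplies every pairwise flat distance by `s_opt`, so a linear factor near 1.06 corresponds to an area factor near 1.06² ≈ 1.12.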
40 changes: 40 additions & 0 deletions autoflatten/flatten/config.py
@@ -218,6 +218,34 @@ class FinalNegativeAreaRemovalConfig:
    iters_per_level: int = 30


@dataclass
class DistanceOptimalScaleConfig:
    """Configuration for the distance-optimal output scale.

    The flattening's area-matching final scale (``s = sqrt(orig_area/total_area)``) is a
    display convention, not part of the objective; it leaves the map ~6% too small versus
    true geodesic distances. When enabled (default), after optimization the output is
    rescaled by the single global scale that minimizes true-geodesic distance distortion,
    computed from a heat-method geodesic sample on the patch, so the saved flatmap is
    metrically faithful (surface and flat map are directly comparable). The optimum is tight
    across subjects (~1.06, std ~0.008) and reduces global distance distortion on every
    benchmark hemisphere.

    Attributes
    ----------
    enabled : bool
        Whether to apply the distance-optimal rescale (default: True).
    n_sources : int
        Number of heat-geodesic source vertices sampled (deterministic).
    seed : int
        RNG seed for source sampling (keeps the result deterministic).
    """

    enabled: bool = True
    n_sources: int = 200
    seed: int = 0


def _default_phases() -> list[PhaseConfig]:
    """Return default optimization phases matching FreeSurfer's 3 epochs.

@@ -294,6 +322,9 @@ class FlattenConfig:
    spring_smoothing: SpringSmoothingConfig = field(
        default_factory=SpringSmoothingConfig
    )
    distance_optimal_scale: DistanceOptimalScaleConfig = field(
        default_factory=DistanceOptimalScaleConfig
    )
    phases: list[PhaseConfig] = field(default_factory=_default_phases)
    print_every: int = 100
    verbose: bool = True
@@ -341,6 +372,11 @@ def to_dict(self) -> dict:
                "max_step_mm": self.spring_smoothing.max_step_mm,
                "enabled": self.spring_smoothing.enabled,
            },
            "distance_optimal_scale": {
                "enabled": self.distance_optimal_scale.enabled,
                "n_sources": self.distance_optimal_scale.n_sources,
                "seed": self.distance_optimal_scale.seed,
            },
            "phases": [
                {
                    "name": p.name,
@@ -376,6 +412,9 @@ def from_dict(cls, data: dict) -> "FlattenConfig":
            **data.get("final_negative_area_removal", {})
        )
        spring_smoothing = SpringSmoothingConfig(**data.get("spring_smoothing", {}))
        distance_optimal_scale = DistanceOptimalScaleConfig(
            **data.get("distance_optimal_scale", {})
        )
        phases_data = data.get("phases", _default_phases())
        phases = [
            p if isinstance(p, PhaseConfig) else PhaseConfig(**p) for p in phases_data
@@ -387,6 +426,7 @@ def from_dict(cls, data: dict) -> "FlattenConfig":
            negative_area_removal=negative_area_removal,
            final_negative_area_removal=final_negative_area_removal,
            spring_smoothing=spring_smoothing,
            distance_optimal_scale=distance_optimal_scale,
            phases=phases,
            print_every=data.get("print_every", 100),
            verbose=data.get("verbose", True),
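
The serialization hunks above follow a nested-dataclass round-trip pattern. The reduced sketch below illustrates it. `MiniFlattenConfig` is a hypothetical stand-in that keeps only two fields, and `asdict` is used for brevity where the real `to_dict` spells out each key.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class DistanceOptimalScaleConfig:
    enabled: bool = True
    n_sources: int = 200
    seed: int = 0


@dataclass
class MiniFlattenConfig:  # hypothetical, reduced stand-in for FlattenConfig
    distance_optimal_scale: DistanceOptimalScaleConfig = field(
        default_factory=DistanceOptimalScaleConfig
    )
    print_every: int = 100

    def to_dict(self) -> dict:
        return {
            "distance_optimal_scale": asdict(self.distance_optimal_scale),
            "print_every": self.print_every,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "MiniFlattenConfig":
        return cls(
            # A missing key falls back to {} and therefore to the dataclass defaults.
            distance_optimal_scale=DistanceOptimalScaleConfig(
                **data.get("distance_optimal_scale", {})
            ),
            print_every=data.get("print_every", 100),
        )


# A dict saved before the new section existed still loads, with the rescale enabled.
legacy = MiniFlattenConfig.from_dict({"print_every": 50})

# An explicit opt-out survives a to_dict/from_dict round trip.
custom = MiniFlattenConfig(
    distance_optimal_scale=DistanceOptimalScaleConfig(enabled=False, n_sources=50)
)
restored = MiniFlattenConfig.from_dict(custom.to_dict())
```

One consequence worth checking in review: because `enabled` defaults to `True` and `from_dict` falls back to an empty dict, configs serialized before this change will pick up the rescale when reloaded unless they are edited to opt out.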