gallantlab · mvdoc · Jun 10, 2026
diff --git a/.claude/skills/autoflatten-autoresearch/SKILL.md b/.claude/skills/autoflatten-autoresearch/SKILL.md
@@ -0,0 +1,82 @@
+---
+name: autoflatten-autoresearch
+description: >-
+  Resume the AutoFlatten flattening-optimization autoresearch effort. Use when the user
+  says things like "resume the autoflatten autoresearch", "continue optimizing the
+  flattening", "what have we tried so far", "run the flatten benchmark", or asks to
+  propose/test a new flattening method or config. Orients a fresh session to the plan,
+  the provenance ledger, the data, and the conventions, and drives the experiment loop.
+---
+
+# AutoFlatten autoresearch
+
+A benchmark-driven effort to improve the **pyflatten flattening stage** (and ultimately
+replace its FreeSurfer-clone optimizer with a principled method), with every experiment
+logged so the *process* is shareable in a paper.
+
+## 0. Recover state FIRST (before proposing anything)
+
+1. Read the design + status: `benchmark/PLAN.md` and `benchmark/README.md` (in this repo).
+2. Read the ledger to see everything tried so far, the current baseline, and the last
+   planned step:
+   - Ledger: `/data2/projects/autoflatten/ledger/experiments.jsonl` (append-only JSONL).
+   - Rendered notebook: `/data2/projects/autoflatten/NOTEBOOK.md`
+     (regenerate with `python -m benchmark.report`).
+   - Look at the most recent records' `metrics` (the multi-objective vector) and any
+     `decision.next_step`. **Do not re-run experiments already in the ledger.**
+3. Confirm the manifest exists: `/data2/projects/autoflatten/manifest.json`
+   (else `python -m benchmark.build_dataset`).
+
+## 1. Environment & constraints
+
+- Use the repo's **uv** venv: `uv sync --extra bench`, then `.venv/bin/python`.
+- **CPU only.** GPUs are blocked by driver 440 / CUDA 10.2 — do **not** install/attempt
+  CUDA jax unless the user says the driver was upgraded (≥525).
+- Develop on a **tiny subset** (`--dev`, 2 subjects / a few hemispheres); scale to the full
+  82 subjects only once a change looks good. A single hemisphere takes ~8–15 min on CPU.
+- **Output locations are pinned:** code + docs in the repo; *all generated artifacts* go
+  to `/data2/projects/autoflatten/` (see `benchmark/paths.py`;
+  override with `AUTOFLATTEN_BENCH_ROOT`). Never write generated files into the repo.
+- Data is the public **Narratives** set (OpenNeuro `ds002345`); never the private
+  `all-subjects/` lab data.
+
+## 2. Core objects
+
+- `benchmark/harness.py` — `evaluate(entries, config, method=...)`; geometry +
+  k-ring caching; `register_flatten_fn(name, fn)` to add a method. A `flatten_fn` has
+  signature `flatten_fn(flattener) -> uv` (shape `(V, 2)`), scored uniformly.
+- `benchmark/metrics.py` — `per_patch_metrics(uv, flattener)` and `aggregate(...)`.
+  Objective stays **geodesic distance distortion** (+ flips, robustness, runtime) — never
+  switch to pure conformality.
+- `benchmark/ledger.py` — `new_record(kind, label, ...)` + `Ledger().append(record)`.
+
+## 3. The experiment loop
+
+For each idea:
+
+1. **State a hypothesis** (what you expect to improve and why).
+2. **Implement** a `flatten_fn` (or a `FlattenConfig` change). The primary open lead is a
+   **Tutte/LSCM flip-free init** that removes the ~4-min initial NAR phase — see
+   `benchmark/probe_tutte_init.py` (Phase B in the plan).
+3. **Evaluate on the dev subset**, then promote to the full train split if promising:
+   ```bash
+   python -m benchmark.run_baseline --dev          # reference
+   # ... your probe/optimize script, same evaluate() harness ...
+   ```
+4. **Append a ledger record** with `kind`, `label`, `method`, `metrics`, `per_subject`,
+   `repro_command`, and a **decision trace** in `record.decision`:
+   `{hypothesis, rationale, conclusion, next_step}`.
+5. **Commit** the code change in this repo (the ledger pins the commit SHA), then
+   `python -m benchmark.report` to refresh `NOTEBOOK.md`.
+6. If it **Pareto-beats** the baseline (lower distortion and/or fewer flips and/or faster,
+   none worse), promote to the **full 82-subject** run and validate on the **holdout**
+   split before claiming a win.
+
+## 4. Guardrails
+
+- Every experiment is pinned to a git commit; commit before/after running.
+- Determinism is assumed (one run/experiment) but **re-assert it** when you change the
+  optimizer (`run_baseline.py --check-determinism`).
+- Never silently drop subjects — `evaluate` records per-subject `status="error"`; surface
+  failures in the conclusion.
+- Keep the ledger append-only; to revise a result, append a new record.
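The Pareto criterion in step 6 of the skill file ("lower distortion and/or fewer flips and/or faster, none worse") can be sketched as a small dominance check. This is only an illustration: the metric names and values below are hypothetical and are not the keys produced by `benchmark/metrics.py`, and every metric is assumed to be lower-is-better.

```python
def pareto_beats(candidate, baseline, keys):
    """Return True if candidate is no worse on every key and strictly
    better on at least one. All metrics are assumed lower-is-better."""
    no_worse = all(candidate[k] <= baseline[k] for k in keys)
    strictly_better = any(candidate[k] < baseline[k] for k in keys)
    return no_worse and strictly_better


# Hypothetical metric names and made-up values, for illustration only.
KEYS = ("distortion", "n_flipped", "runtime_s")
baseline = {"distortion": 0.20, "n_flipped": 12, "runtime_s": 600.0}
faster_same_quality = {"distortion": 0.20, "n_flipped": 12, "runtime_s": 400.0}
trade_off = {"distortion": 0.18, "n_flipped": 15, "runtime_s": 600.0}

print(pareto_beats(faster_same_quality, baseline, KEYS))  # True
print(pareto_beats(trade_off, baseline, KEYS))            # False
```

The second case improves distortion but adds flips, so under this reading it is a trade-off rather than a win and would not be promoted to the full run.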
diff --git a/autoflatten/flatten/algorithm.py b/autoflatten/flatten/algorithm.py
@@ -22,6 +22,7 @@
 from .distance import (
     compute_kring_geodesic_distances,
     compute_kring_geodesic_distances_angular,
+    distance_optimal_scale,
 )
 from .energy import (
     compute_2d_areas,
@@ -1837,6 +1838,28 @@ def run(self, snapshot_callback: Callable | None = None) -> np.ndarray:
                 snapshot_callback=_wrap_callback(snapshot_callback, "smoothing"),
             )
 
+        # Distance-optimal output scale: replace the area-matched display scale with the
+        # single global scale that minimizes true-geodesic distance distortion, so the saved
+        # flat map is metrically faithful (surface and flat map are directly comparable).
+        dos = config.distance_optimal_scale
+        if dos.enabled:
+            ref_vertices = (
+                self.fiducial_vertices
+                if self.fiducial_vertices is not None
+                else self.vertices
+            )
+            s_opt = distance_optimal_scale(
+                ref_vertices,
+                self.faces,
+                uv,
+                n_sources=dos.n_sources,
+                seed=dos.seed,
+            )
+            centroid = uv.mean(axis=0)
+            uv = (uv - centroid) * s_opt + centroid
+            if verbose:
+                print(f"Distance-optimal output scale: x{s_opt:.4f}")
+
         # Final stats
         uv_jax = jnp.asarray(uv)
         n_flipped_final = int(count_flipped_triangles(uv_jax, self.faces_jax))
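The hunk above computes a single global scale and applies it about the centroid. The body of `distance_optimal_scale` is not part of this diff, so the following is a sketch under a stated assumption: that "minimizes distance distortion" means least squares over sampled vertex pairs, in which case the optimum has the closed form `s = sum(d_flat * d_geo) / sum(d_flat ** 2)`. The function names here are hypothetical, and synthetic distances stand in for heat-method geodesics.

```python
import numpy as np


def least_squares_scale(d_flat, d_geo):
    """Scale s minimizing sum((s * d_flat - d_geo) ** 2).
    Assumed objective; the real distance_optimal_scale may differ."""
    d_flat = np.asarray(d_flat, dtype=float)
    d_geo = np.asarray(d_geo, dtype=float)
    return float(np.dot(d_flat, d_geo) / np.dot(d_flat, d_flat))


def rescale_about_centroid(uv, s):
    """Uniformly scale 2D coordinates about their centroid (as in the hunk)."""
    centroid = uv.mean(axis=0)
    return (uv - centroid) * s + centroid


# Synthetic check: a flat map that is uniformly too small by a factor 1.06.
rng = np.random.default_rng(0)
uv = rng.normal(size=(50, 2))
i, j = np.triu_indices(len(uv), k=1)            # all unordered vertex pairs
d_flat = np.linalg.norm(uv[i] - uv[j], axis=1)  # distances in the flat map
d_geo = 1.06 * d_flat                           # stand-in for geodesic distances

s = least_squares_scale(d_flat, d_geo)
uv_scaled = rescale_about_centroid(uv, s)
d_scaled = np.linalg.norm(uv_scaled[i] - uv_scaled[j], axis=1)
```

Scaling about the centroid leaves the map's position unchanged and multiplies every pairwise distance by `s`, so a uniform scale cannot change the sign of any triangle's area and the flipped-triangle count computed afterwards is unaffected.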

diff --git a/autoflatten/flatten/config.py b/autoflatten/flatten/config.py
@@ -218,6 +218,34 @@ class FinalNegativeAreaRemovalConfig:
     iters_per_level: int = 30
 
 
+@dataclass
+class DistanceOptimalScaleConfig:
+    """Configuration for the distance-optimal output scale.
+
+    The flattening's area-matching final scale (``s = sqrt(orig_area/total_area)``) is a
+    display convention, not part of the objective; it leaves the map ~6% too small versus
+    true geodesic distances. When enabled (default), after optimization the output is
+    rescaled by the single global scale that minimizes true-geodesic distance distortion,
+    computed from a heat-method geodesic sample on the patch, so the saved flatmap is
+    metrically faithful (surface and flat map are directly comparable). The optimum is tight
+    across subjects (~1.06, std ~0.008) and reduces global distance distortion on every
+    benchmark hemisphere.
+
+    Attributes
+    ----------
+    enabled : bool
+        Whether to apply the distance-optimal rescale (default: True).
+    n_sources : int
+        Number of heat-geodesic source vertices to sample (default: 200).
+    seed : int
+        RNG seed for source sampling; keeps the result deterministic (default: 0).
+    """
+
+    enabled: bool = True
+    n_sources: int = 200
+    seed: int = 0
+
+
 def _default_phases() -> list[PhaseConfig]:
     """Return default optimization phases matching FreeSurfer's 3 epochs.
 
@@ -294,6 +322,9 @@ class FlattenConfig:
     spring_smoothing: SpringSmoothingConfig = field(
         default_factory=SpringSmoothingConfig
     )
+    distance_optimal_scale: DistanceOptimalScaleConfig = field(
+        default_factory=DistanceOptimalScaleConfig
+    )
     phases: list[PhaseConfig] = field(default_factory=_default_phases)
     print_every: int = 100
     verbose: bool = True
@@ -341,6 +372,11 @@ def to_dict(self) -> dict:
                 "max_step_mm": self.spring_smoothing.max_step_mm,
                 "enabled": self.spring_smoothing.enabled,
             },
+            "distance_optimal_scale": {
+                "enabled": self.distance_optimal_scale.enabled,
+                "n_sources": self.distance_optimal_scale.n_sources,
+                "seed": self.distance_optimal_scale.seed,
+            },
             "phases": [
                 {
                     "name": p.name,
@@ -376,6 +412,9 @@ def from_dict(cls, data: dict) -> "FlattenConfig":
             **data.get("final_negative_area_removal", {})
         )
         spring_smoothing = SpringSmoothingConfig(**data.get("spring_smoothing", {}))
+        distance_optimal_scale = DistanceOptimalScaleConfig(
+            **data.get("distance_optimal_scale", {})
+        )
         phases_data = data.get("phases", _default_phases())
         phases = [
             p if isinstance(p, PhaseConfig) else PhaseConfig(**p) for p in phases_data
@@ -387,6 +426,7 @@ def from_dict(cls, data: dict) -> "FlattenConfig":
             negative_area_removal=negative_area_removal,
             final_negative_area_removal=final_negative_area_removal,
             spring_smoothing=spring_smoothing,
+            distance_optimal_scale=distance_optimal_scale,
             phases=phases,
             print_every=data.get("print_every", 100),
             verbose=data.get("verbose", True),
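Because `from_dict` reads the new section with `data.get("distance_optimal_scale", {})`, a config dict that lacks the key falls back to the dataclass defaults. The sketch below reproduces that behavior with a stand-in copy of the dataclass rather than importing `autoflatten`; `load_scale_config` is a hypothetical helper that mirrors the one relevant line of `from_dict`.

```python
from dataclasses import asdict, dataclass


@dataclass
class DistanceOptimalScaleConfig:
    """Stand-in copy of the dataclass added in this diff (fields only)."""
    enabled: bool = True
    n_sources: int = 200
    seed: int = 0


def load_scale_config(data: dict) -> DistanceOptimalScaleConfig:
    # Hypothetical helper mirroring the from_dict line in the diff:
    # a missing section yields an empty dict, hence the defaults.
    return DistanceOptimalScaleConfig(**data.get("distance_optimal_scale", {}))


# A dict without the new section: the rescale comes out enabled.
legacy = load_scale_config({"print_every": 100})

# Explicit opt-out; unspecified fields keep their defaults.
opted_out = load_scale_config({"distance_optimal_scale": {"enabled": False}})

# Round trip through a plain dict, analogous to to_dict()/from_dict().
round_trip = load_scale_config({"distance_optimal_scale": asdict(opted_out)})
```

One consequence worth noting: a config dict serialized before this change would, on reload, have the rescale enabled. Passing `{"enabled": False}` for this section keeps the earlier area-matched output scale.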