
feat(gsdc2023): divergence-gated 2-stage residual re-solve (FGO VD path) #97

Open
rsasaki0109 wants to merge 2 commits into main from feat/fgo-two-stage-residual-resolve


@rsasaki0109 rsasaki0109 commented Jun 14, 2026


Status. The unconditional re-solve is a net wash and carries three regressions
of about +10 cm (see the 26-trip A/B below). A divergence-gated acceptance
criterion (--fgo-two-stage-divergence-p95-m, commit 0ad5894) turns it into a
clean zero-regression lever: across all 26 train trips it fires on exactly one
trip (lax-o) and leaves the other 25 bit-identical to baseline. The gain is
small but safe, and the flag is off by default. Shippable as an opt-in
divergence-rescue safety net.

What

Adds an opt-in 2-stage residual re-solve to the FGO VD solve path
(--fgo-two-stage-residual-resolve, default off), mirroring taroz's MATLAB FGO
outlier handling. After the FGO converges (pass 1), it recomputes the fixed-
linearization pseudorange residual at the converged state, masks rows whose
|residual| exceeds an L1/L5 threshold (20 m / 15 m, with a MIN_KEEP floor),
and re-solves warm-started (pass 2); on failure the pass-1 state is restored.
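
A minimal sketch of the masking step described above, not the code in this PR. The function name, the argument layout, the value of MIN_KEEP, and the choice to apply the floor over one flat set of rows are all assumptions made for illustration; only the 20 m / 15 m thresholds and the worst-first order come from the description.

```python
import numpy as np

L1_THRESHOLD_M = 20.0  # from the PR description
L5_THRESHOLD_M = 15.0  # from the PR description
MIN_KEEP = 4           # assumed value; the PR does not state the real floor


def build_keep_mask(residual_m, is_l5, active, min_keep=MIN_KEEP):
    """Return a boolean keep-mask over pseudorange rows (hypothetical helper).

    residual_m: pass-1 fixed-linearization residuals in metres
    is_l5:      True for L5 rows, False for L1 rows
    active:     rows that took part in pass 1
    """
    abs_res = np.abs(np.asarray(residual_m, dtype=float))
    is_l5 = np.asarray(is_l5, dtype=bool)
    keep = np.asarray(active, dtype=bool).copy()

    # Per-row threshold depends on the signal band.
    threshold = np.where(is_l5, L5_THRESHOLD_M, L1_THRESHOLD_M)
    candidates = np.flatnonzero(keep & (abs_res > threshold))

    # Worst-first: drop the largest |residual| first and stop once only
    # min_keep active rows would remain.
    order = candidates[np.argsort(-abs_res[candidates], kind="stable")]
    n_droppable = max(int(keep.sum()) - min_keep, 0)
    keep[order[:n_droppable]] = False
    return keep
```

With five active rows and a floor of four, only the single worst outlier is masked even if several rows exceed their threshold; lowering the floor lets the remaining outliers go as well.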

The problem: unconditional re-solve is net wash

Full 26-trip production-config A/B (taroz preset + --fgo-extra-constellations
+ --fgo-residual-mask-propagation), guard-off vs baseline, FGO standalone:

| metric | baseline | unconditional 2-stage |
| --- | --- | --- |
| aggregate score | 1.6386 m | 1.6283 m (-1.0 cm) |
| wins / reg / wash | | 6 / 6 / 14 |
| excluding lax-o | | +28.0 cm (regression) |

A single win (lax-o, -54.6 cm) carries the whole aggregate; the rest is wash
plus three regressions of roughly +10 cm (mtv-b +11.5, mtv-pe1 +10.2,
lax-t +10.0). The Huber guard (--fgo-two-stage-residual-guard) washes
everything, including lax-o, back to baseline, because the dense-urban win
raises the robust cost.

The discriminator: divergence gating

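A per-chunk diagnostic probe across win / wash / regression trips found that
the lax-o win consists of two separable parts: a divergence rescue on two
chunks, and marginal masking on the healthy chunks. The pass-1 residual p95
isolates the first part:

| pass-1 active residual p95 (per-chunk max) | value |
| --- | --- |
| lax-o (the win) | 38 840 m (two chunks where pass-1 diverged) |
| every other trip (win, wash, and regression) | ≤ 53 m |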

The divergence chunks stand alone by three orders of magnitude (position move
16 908 m vs ≤ 3 m elsewhere). Every regression trip masks only marginal
outliers on a healthy pass-1 fit. So gating the re-solve on pass-1 divergence
(p95 > 500 m) keeps the rescue and skips the marginal masking that causes the
regressions.
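
The gate above can be sketched as a single per-chunk predicate. This is an illustration under assumptions, not the PR's implementation: the function name and arguments are hypothetical, and taking the p95 of the absolute residual over active rows is one reading of "pass-1 active residual p95".

```python
import numpy as np


def should_run_pass2(residual_m, active, divergence_p95_m):
    """Decide whether the re-solve fires for one chunk (hypothetical helper).

    divergence_p95_m <= 0 disables the gate, so the re-solve stays
    unconditional, matching the "default 0 = disabled" behaviour.
    """
    if divergence_p95_m <= 0.0:
        return True
    active = np.asarray(active, dtype=bool)
    abs_res = np.abs(np.asarray(residual_m, dtype=float)[active])
    if abs_res.size == 0:
        return False  # assumed: nothing to judge, leave pass 1 untouched
    return float(np.percentile(abs_res, 95)) > divergence_p95_m
```

At a 500 m threshold, a chunk whose residuals stay within tens of metres is skipped, while a chunk with multi-kilometre residuals triggers the re-solve.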

--fgo-two-stage-divergence-p95-m 500 (default 0 = disabled), 26-trip A/B:

| metric | divergence-gated (div500) |
| --- | --- |
| wins / reg / wash | 1 / 0 / 25 |
| gate fired on | lax-o only (other 25 trips bit-identical to baseline) |
| lax-o | 2.2724 → 2.1797 (-9.3 cm, the divergence-rescue portion) |
| aggregate | -0.36 cm/trip |

Honest limitation

The gate recovers only -9.3 cm of lax-o's -54.6 cm unconditional win. The other
~-45 cm is pervasive marginal NLOS masking on healthy chunks — the same
operation that regresses lax-t/mtv-b by +10 cm — so it is structurally
inseparable from the regressions. The gated lever is therefore a small,
zero-regression divergence-rescue safety net, not a large score lever.

Tests

15 unit tests (tests/test_two_stage_residual_resolve.py): residual / threshold
/ mask / guard-cost helpers, the guard accept/reject/disabled paths, no-op
cases, state restore on pass-2 failure, and the divergence gate (skips a healthy
pass-1, fires on a diverged pass-1).
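
For readers unfamiliar with the restore-on-failure path the tests exercise, here is a rough sketch with a mock solver. The callable signature, the SolveFailed exception, and the outcome labels are hypothetical and exist only for this example; the real orchestration in the PR may differ.

```python
import copy


class SolveFailed(RuntimeError):
    """Hypothetical error raised when a solve does not converge."""


def two_stage_resolve(solve, pass1_state, keep_mask):
    """Run pass 2 on the kept rows; fall back to pass 1 on failure.

    solve(initial_state, keep_mask) returns a new state or raises SolveFailed.
    Returns (state, outcome), outcome in {"noop", "accepted", "restored"}.
    """
    if all(keep_mask):
        # Nothing was masked, so a second solve would change nothing.
        return pass1_state, "noop"
    # Snapshot first: a solver may mutate its input before failing.
    snapshot = copy.deepcopy(pass1_state)
    try:
        return solve(pass1_state, keep_mask), "accepted"
    except SolveFailed:
        return snapshot, "restored"


def mock_ok(state, keep_mask):
    return {"x": state["x"] + 1.0}


def mock_fail(state, keep_mask):
    state["x"] = 999.0  # corrupt the warm start, then fail
    raise SolveFailed("pass 2 diverged")
```

Calling two_stage_resolve(mock_fail, {"x": 1.0}, [True, False]) returns the untouched pass-1 state with the outcome "restored", even though the mock solver overwrote its input.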

After the FGO converges, recompute the fixed-linearization pseudorange
residual at the converged state, mask rows whose |residual| exceeds an
L1/L5 threshold (20 m / 15 m, MIN_KEEP safety, worst-first), and re-solve
warm-started. Opt-in via --fgo-two-stage-residual-resolve, default off.

This is a selective dense-urban outlier killer. 15-trip pixel5 train A/B
(fgo standalone): lax-o (dense urban) -54.6cm / p95 -79cm, mtv-h -2.1cm,
four trips at most +0.5cm regression, the rest wash. Aggregate -3.8cm,
96% of it from lax-o. The win comes from dropping systematically biased
NLOS measurements so the position snaps toward truth.

A Huber trust-region guard is implemented behind --fgo-two-stage-residual-guard
but is OFF by default: it gates pass-2 on the full-set Huber residual cost,
which the dense-urban win *raises* (dropped outliers are re-added at their
original weight, Huber-capped), so the guard rejects exactly the re-solves
we want (validated: lax-o guard-on 2.268 ~= baseline 2.272, vs guard-off
1.727). Retained for experimentation only.
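
To make the guard's failure mode concrete, here is a minimal sketch of a full-set Huber cost comparison. It assumes unit weights, a Huber delta of 10 m, and an acceptance rule of cost ratio <= 1; none of these values are stated in this PR, and the function names are hypothetical.

```python
import numpy as np


def huber_cost(residual_m, delta_m):
    """Sum of Huber losses: quadratic inside delta_m, linear outside."""
    a = np.abs(np.asarray(residual_m, dtype=float))
    quadratic = 0.5 * a**2
    linear = delta_m * (a - 0.5 * delta_m)
    return float(np.sum(np.where(a <= delta_m, quadratic, linear)))


def guard_accepts(pass1_residual_m, pass2_residual_m,
                  delta_m=10.0, max_cost_ratio=1.0):
    """Accept pass 2 only if the full-set Huber cost did not grow.

    Both residual vectors cover ALL rows, including rows masked in pass 2,
    which is why moving away from biased outliers can raise the cost.
    """
    cost1 = huber_cost(pass1_residual_m, delta_m)
    cost2 = huber_cost(pass2_residual_m, delta_m)
    if cost1 <= 0.0:
        return cost2 <= 0.0
    return cost2 / cost1 <= max_cost_ratio
```

Under this rule, any re-solve whose full-set cost ratio exceeds 1 is rejected, regardless of whether the position actually improved.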

13 unit tests cover the residual/mask/guard-cost helpers and the
orchestration with a mock solver (accept/reject/keep-on-guard-off,
no-op without fixed linearization, restore on pass-2 failure).
@rsasaki0109 rsasaki0109 marked this pull request as draft June 14, 2026 01:10
@rsasaki0109 rsasaki0109 changed the title feat(gsdc2023): taroz-style 2-stage residual re-solve (FGO VD path, opt-in) feat(gsdc2023): taroz-style 2-stage residual re-solve (FGO VD path) — DRAFT Jun 14, 2026
…re-solve

Adds fgo_two_stage_divergence_p95_m (BridgeConfig, default 0 = disabled) and
CLI --fgo-two-stage-divergence-p95-m. When > 0, the 2-stage re-solve only fires
on chunks whose pass-1 fixed-linearization residual p95 (over active rows)
exceeds the threshold — i.e. only chunks where pass-1 diverged. Healthy chunks
are left at the pass-1 result, so the re-solve introduces no perturbation there.

Motivation: a per-chunk diagnostic probe across win / wash / regression trips
showed the unconditional re-solve's only net win (lax-o) is split into two parts:
a divergence rescue on a couple of chunks where pass-1 blew up to multi-km
residuals (pass-1 p95 ~39 km), and pervasive marginal NLOS masking on healthy
chunks (p95 ~30 m). Every regression trip (lax-t/mtv-b/mtv-pe1 +10..11 cm) does
only the marginal masking on a healthy fit (p95 <= ~50 m). The two are
indistinguishable by the marginal residuals themselves, but the divergence
chunks stand alone by orders of magnitude (pass-1 p95 38840 m vs <= 53 m
everywhere else, position move 16908 m vs <= 3 m).

Gating on pass-1 divergence (p95 > 500 m) cleanly captures the rescue and skips
the marginal masking. An 8-trip A/B (div500 vs baseline) confirms: every +10 cm
regression goes bit-identical to baseline, and lax-o keeps -9.3 cm (the
divergence-rescue portion of its -54.6 cm unconditional win). The marginal-mask
bulk of lax-o's win is inseparable from the regressions, so the gated lever is a
clean zero-regression rescue (1 win / 0 reg / 7 wash) rather than the larger but
net-wash unconditional lever.

The Huber guard cannot do this (the rescue raises the robust cost: lax-o
cost_ratio 1.32), so divergence gating is the correct discriminator. Default off
keeps legacy behaviour; independent of the guard flag. 2 new unit tests.
@rsasaki0109 rsasaki0109 marked this pull request as ready for review June 15, 2026 20:13
@rsasaki0109 rsasaki0109 changed the title feat(gsdc2023): taroz-style 2-stage residual re-solve (FGO VD path) — DRAFT feat(gsdc2023): divergence-gated 2-stage residual re-solve (FGO VD path) Jun 15, 2026
@rsasaki0109 rsasaki0109 force-pushed the feat/fgo-two-stage-residual-resolve branch 2 times, most recently from 9539356 to 61e00a0 June 17, 2026 15:16