Delete __run__ branches on every terminal state (MR-674) #43

Merged
aaltshuler merged 1 commit into main from fix/mr-674-ephemeral-run-branches on Apr 21, 2026
@aaltshuler

Summary

Run branches are transactional scaffolding, not content — the durable audit lives on RunRecord. This PR enforces the invariant that every terminal state (Published, Aborted, Failed) deletes the __run__ branch, replacing the ad-hoc per-path cleanup with a single terminate_run helper.

Before this PR, all three terminal states leaked the run branch. (The MR-670 fix only added a defense-in-depth filter in schema_apply; nothing actually deleted branches.) Cleanup only happened opportunistically via branch_delete of the target, and even then Failed runs were skipped by the Published | Aborted filter.

Changes

crates/omnigraph/src/db/omnigraph.rs:

  • New terminate_run helper: appends terminal RunRecord, then deletes the run branch. Delete errors are swallowed — record is authoritative; cleanup_terminal_run_branches_for_target retries on later branch_delete of the target.
  • publish_run_as, abort_run, fail_run all go through terminate_run.
  • cleanup_terminal_run_branches_for_target filter now includes Failed (legacy-repo GC), and checks coordinator.all_branches() to skip branches already deleted by a concurrent handle — avoids Lance NotFound when two handles operate independently.
  • ensure_branch_delete_safe drops Failed from the rejection set — post-fix, Failed means the branch is gone, so blocking target deletion is unnecessary.
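The record-then-delete ordering described above can be sketched against a toy in-memory model. This is an illustration only, not the code in omnigraph.rs: `ToyRepo`, `begin_run`, `delete_branch`, and the `__run__/<id>` naming scheme are hypothetical stand-ins, and the real helper's signature is not shown in this PR.

```rust
use std::collections::BTreeSet;

/// The three terminal states named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunStatus {
    Published,
    Aborted,
    Failed,
}

/// Hypothetical stand-in for the durable audit record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunRecord {
    pub run_id: String,
    pub status: RunStatus,
}

/// Toy repo: a set of branch names plus an append-only run log.
#[derive(Debug, Default)]
pub struct ToyRepo {
    pub branches: BTreeSet<String>,
    pub records: Vec<RunRecord>,
}

impl ToyRepo {
    pub fn new() -> Self {
        let mut repo = ToyRepo::default();
        repo.branches.insert("main".to_string());
        repo
    }

    /// Assumed naming scheme; the real branch name format is not shown in the PR.
    pub fn run_branch_name(run_id: &str) -> String {
        format!("__run__/{}", run_id)
    }

    /// Create the scaffolding branch for a run.
    pub fn begin_run(&mut self, run_id: &str) {
        self.branches.insert(Self::run_branch_name(run_id));
    }

    /// Fails if the branch is already gone, like a storage-level NotFound.
    fn delete_branch(&mut self, name: &str) -> Result<(), String> {
        if self.branches.remove(name) {
            Ok(())
        } else {
            Err(format!("branch not found: {}", name))
        }
    }

    /// Append the terminal record first, then delete the run branch best-effort.
    pub fn terminate_run(&mut self, run_id: &str, status: RunStatus) {
        self.records.push(RunRecord {
            run_id: run_id.to_string(),
            status,
        });
        // The delete error is deliberately ignored: the record is authoritative.
        let _ = self.delete_branch(&Self::run_branch_name(run_id));
    }
}
```

Writing the record before attempting the delete means a failed delete leaves an orphaned branch rather than a missing audit entry, which is the failure mode the later cleanup pass can repair.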

crates/omnigraph/tests/runs.rs:

  • New run_branches_do_not_accumulate_across_repeated_loads, the invariant test: 10 loads + 1 abort → branch_list() == ["main"].
  • New failed_load_deletes_run_branch — asserts the Failed path cleans up.
  • Rename abort_run_keeps_target_unchanged_and_preserves_hidden_branch_for_inspection → ..._and_deletes_run_branch; invert the hidden-branch assertion.
  • Rewrite public_{load,mutation}_preserves_staged_edge_ids_on_publish to capture staged IDs before publish (branch is gone after). Invariant is still covered — just measured differently.
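The shape of the invariant test and of the "capture before publish" rewrite might look like the following sketch. It runs against a hypothetical `ToyGraph` rather than the real Omnigraph API, so every method here (`begin_run`, `publish_run`, `abort_run`, `edge_ids`) and the `__run__/<n>` branch naming are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Toy model: each branch maps to the edge IDs visible on it.
pub struct ToyGraph {
    branches: BTreeMap<String, Vec<u64>>,
    next_run: u32,
}

impl ToyGraph {
    pub fn new() -> Self {
        let mut branches = BTreeMap::new();
        branches.insert("main".to_string(), Vec::new());
        ToyGraph { branches, next_run: 0 }
    }

    pub fn branch_list(&self) -> Vec<String> {
        self.branches.keys().cloned().collect()
    }

    /// Stage edges on a fresh run branch and return the branch name.
    pub fn begin_run(&mut self, edge_ids: &[u64]) -> String {
        self.next_run += 1;
        let name = format!("__run__/{}", self.next_run);
        self.branches.insert(name.clone(), edge_ids.to_vec());
        name
    }

    /// Edge IDs on a branch, or None if the branch does not exist.
    pub fn edge_ids(&self, branch: &str) -> Option<Vec<u64>> {
        self.branches.get(branch).cloned()
    }

    /// Publish: move staged edges onto main, dropping the run branch.
    pub fn publish_run(&mut self, run: &str) {
        if let Some(staged) = self.branches.remove(run) {
            self.branches
                .get_mut("main")
                .expect("main always exists in this toy model")
                .extend(staged);
        }
    }

    /// Abort: drop the run branch without touching main.
    pub fn abort_run(&mut self, run: &str) {
        self.branches.remove(run);
    }
}

/// Mirrors the invariant test: N published loads plus one aborted run.
pub fn branches_after_loads_and_abort(loads: u64) -> Vec<String> {
    let mut g = ToyGraph::new();
    for i in 0..loads {
        let run = g.begin_run(&[i]);
        g.publish_run(&run);
    }
    let run = g.begin_run(&[999]);
    g.abort_run(&run);
    g.branch_list()
}

/// Capture staged IDs before publish, because the run branch is gone afterwards.
pub fn staged_ids_survive_publish(ids: &[u64]) -> bool {
    let mut g = ToyGraph::new();
    let run = g.begin_run(ids);
    let staged = g.edge_ids(&run).expect("run branch exists before publish");
    g.publish_run(&run);
    g.edge_ids(&run).is_none() && g.edge_ids("main") == Some(staged)
}
```

The second helper shows why the rewritten tests read staged IDs up front: once publish deletes the run branch, the only place left to compare against is the target.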

omnigraph.rs inline MR-670 test: renamed to test_apply_schema_succeeds_after_load, inverted to assert the run branch is absent after publish.

Deferred to follow-up

  • --keep-run-branch debug flag on loader and run abort CLI (per MR-674 item 3).
  • omnigraph run gc one-shot for legacy repos. The cleanup filter now handles legacy Failed runs lazily during branch_delete, so a dedicated GC is a nice-to-have, not a requirement.

Test plan

  • cargo test --workspace --no-fail-fast — all green (229 compiler + 61 engine lib + all integration suites + 33 CLI + 65 server + 41 openapi)
  • The 5 tests that previously asserted branch preservation all pass with inverted/rewritten assertions.
  • Cross-handle scenario (two Omnigraph handles, one publishes, the other deletes the target) verified green — the live_branches membership check prevents re-delete racing.
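The membership check in the cross-handle scenario can be illustrated with a small selection function. This is a sketch under assumptions, not the body of cleanup_terminal_run_branches_for_target: the `RunRecord` fields, the non-terminal `Running` variant, and the function name `run_branches_to_clean` are hypothetical, and `live_branches` stands in for whatever coordinator.all_branches() returns.

```rust
use std::collections::BTreeSet;

/// `Running` is an assumed non-terminal state, added only for contrast.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunStatus {
    Running,
    Published,
    Aborted,
    Failed,
}

/// Hypothetical record shape: which run branch staged work for which target.
pub struct RunRecord {
    pub run_branch: String,
    pub target: String,
    pub status: RunStatus,
}

/// Pick the run branches to delete when `target` is being deleted.
/// Only terminal runs qualify, and only if their branch is still live,
/// so a branch already removed by another handle is never deleted twice.
pub fn run_branches_to_clean(
    records: &[RunRecord],
    target: &str,
    live_branches: &BTreeSet<String>,
) -> Vec<String> {
    records
        .iter()
        .filter(|r| r.target == target)
        .filter(|r| {
            matches!(
                r.status,
                RunStatus::Published | RunStatus::Aborted | RunStatus::Failed
            )
        })
        .filter(|r| live_branches.contains(&r.run_branch))
        .map(|r| r.run_branch.clone())
        .collect()
}
```

In this model a Published run whose branch was already removed by the other handle is filtered out before any delete is attempted, while a legacy Failed run whose branch still exists is picked up.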

Closes MR-674.

🤖 Generated with Claude Code

Run branches are transactional scaffolding — the durable audit lives
on RunRecord. Invariant: every terminal state (Published, Aborted,
Failed) deletes the __run__ branch.

- Add `terminate_run` helper: appends terminal RunRecord, then
  deletes the run branch. Delete errors are swallowed — the record
  is authoritative; `cleanup_terminal_run_branches_for_target`
  retries on later `branch_delete` of the target.
- Wire into `publish_run_as`, `abort_run`, `fail_run`.
- Include `Failed` in the cleanup filter (was `Published | Aborted`
  only) for legacy-repo GC during branch_delete.
- Cleanup now checks `coordinator.all_branches()` first to skip
  branches already deleted by a concurrent handle — avoids Lance
  NotFound when two handles publish/clean up independently.
- Drop `Failed` from `ensure_branch_delete_safe` — post-fix, Failed
  means the branch is already gone, so there's no reason to block
  target deletion (MR-674 "Downstream effects").

Tests:
- New regression: `run_branches_do_not_accumulate_across_repeated_loads`
  — 10 loads + 1 abort → `branch_list() == ["main"]`.
- New `failed_load_deletes_run_branch` asserts Failed path cleans up.
- Rename `abort_run_keeps_target_unchanged_and_preserves_hidden_branch_for_inspection`
  → `abort_run_leaves_target_unchanged_and_deletes_run_branch`, invert
  the hidden-branch assertion.
- Rewrite `public_{load,mutation}_preserves_staged_edge_ids_on_publish`
  to capture staged IDs before publish instead of inspecting the run
  branch after (branch is gone now).
- Update MR-670 regression test to assert the run branch is *absent*
  after publish.

Deferred to follow-up: `--keep-run-branch` debug flag, `omnigraph run gc`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aaltshuler aaltshuler merged commit 102ccc0 into main Apr 21, 2026
4 checks passed
@aaltshuler aaltshuler deleted the fix/mr-674-ephemeral-run-branches branch April 21, 2026 11:35