Delete __run__ branches on every terminal state (MR-674)#43
Merged
Conversation
Run branches are transactional scaffolding — the durable audit lives
on RunRecord. Invariant: every terminal state (Published, Aborted,
Failed) deletes the __run__ branch.
- Add `terminate_run` helper: appends terminal RunRecord, then
deletes the run branch. Delete errors are swallowed — the record
is authoritative; `cleanup_terminal_run_branches_for_target`
retries on later `branch_delete` of the target.
- Wire into `publish_run_as`, `abort_run`, `fail_run`.
- Include `Failed` in the cleanup filter (was `Published | Aborted`
only) for legacy-repo GC during branch_delete.
- Cleanup now checks `coordinator.all_branches()` first to skip
branches already deleted by a concurrent handle — avoids Lance
NotFound when two handles publish/clean up independently.
- Drop `Failed` from `ensure_branch_delete_safe` — post-fix, Failed
means the branch is already gone, so there's no reason to block
target deletion (MR-674 "Downstream effects").
Tests:
- New regression: `run_branches_do_not_accumulate_across_repeated_loads`
— 10 loads + 1 abort → `branch_list() == ["main"]`.
- New `failed_load_deletes_run_branch` asserts Failed path cleans up.
- Rename `abort_run_keeps_target_unchanged_and_preserves_hidden_branch_for_inspection`
→ `abort_run_leaves_target_unchanged_and_deletes_run_branch`, invert
the hidden-branch assertion.
- Rewrite `public_{load,mutation}_preserves_staged_edge_ids_on_publish`
to capture staged IDs before publish instead of inspecting the run
branch after (branch is gone now).
- Update MR-670 regression test to assert the run branch is *absent*
after publish.
Deferred to follow-up: `--keep-run-branch` debug flag, `omnigraph run gc`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Run branches are transactional scaffolding, not content — the durable audit lives on
RunRecord. This PR enforces the invariant that every terminal state (Published, Aborted, Failed) deletes the__run__branch, replacing the ad-hoc per-path cleanup with a singleterminate_runhelper.Before this PR, all three terminal states leaked the run branch. (The MR-670 fix only added a defense-in-depth filter in
schema_apply; nothing actually deleted branches.) Cleanup only happened opportunistically viabranch_deleteof the target, and even then Failed runs were skipped by thePublished | Abortedfilter.Changes
crates/omnigraph/src/db/omnigraph.rs:terminate_runhelper: appends terminalRunRecord, then deletes the run branch. Delete errors are swallowed — record is authoritative;cleanup_terminal_run_branches_for_targetretries on laterbranch_deleteof the target.publish_run_as,abort_run,fail_runall go throughterminate_run.cleanup_terminal_run_branches_for_targetfilter now includesFailed(legacy-repo GC), and checkscoordinator.all_branches()to skip branches already deleted by a concurrent handle — avoids LanceNotFoundwhen two handles operate independently.ensure_branch_delete_safedropsFailedfrom the rejection set — post-fix, Failed means the branch is gone, so blocking target deletion is unnecessary.crates/omnigraph/tests/runs.rs:run_branches_do_not_accumulate_across_repeated_loads— the invariant test: 10 loads + 1 abort →branch_list() == [\"main\"].failed_load_deletes_run_branch— asserts the Failed path cleans up.abort_run_keeps_target_unchanged_and_preserves_hidden_branch_for_inspection→..._and_deletes_run_branch, invert the assertion.public_{load,mutation}_preserves_staged_edge_ids_on_publishto capture staged IDs before publish (branch is gone after). Invariant is still covered — just measured differently.omnigraph.rsinline MR-670 test: renamed totest_apply_schema_succeeds_after_load, inverted to assert the run branch is absent after publish.Deferred to follow-up
--keep-run-branchdebug flag on loader andrun abortCLI (per MR-674 item 3).omnigraph run gcone-shot for legacy repos — the cleanup filter now handles legacy Failed runs lazily duringbranch_delete, so a dedicated GC is nice-to-have not required.Test plan
cargo test --workspace --no-fail-fast— all green (229 compiler + 61 engine lib + all integration suites + 33 CLI + 65 server + 41 openapi)Omnigraphhandles, one publishes, the other deletes the target) verified green — thelive_branchesmembership check prevents re-delete racing.Closes MR-674.
🤖 Generated with Claude Code