feat(engine): Stage the delete path; retire the inline-delete residual#308
Conversation
…e (red) A delete <Node> cascades a delete_where into every incident edge type. The inline delete_where (Dataset::delete) advances Lance HEAD even when zero edges match, but the cascade records the new version only if deleted_rows > 0 — so a node with no incident edges leaves edge:Knows HEAD>manifest drift, which trips the next strict write's ExpectedVersionMismatch and repair refuses it. Red today: edge:Knows manifest=v5, Lance HEAD=v6. Goes green when delete moves to the staged two-phase path (iss-950, Lance 7.0 DeleteBuilder::execute_uncommitted), where a 0-row delete commits no Lance version and the deleted_rows>0 gate becomes correct by construction.
Lance's Dataset::delete commits a new version even when the predicate matches nothing (build_transaction always emits Operation::Delete), so a node delete that cascades a delete_where into an incident edge type with no matching edges advanced that edge table's Lance HEAD while the cascade skipped record_inline (gated on deleted_rows > 0) — leaving HEAD>manifest drift that wedged the next strict write and that repair refused as suspicious/unverifiable. Use Lance 7.0's two-phase DeleteBuilder::execute_uncommitted to read num_deleted_rows before committing: a no-match delete now advances nothing (no version, no drift) and the existing deleted_rows>0 gate is correct by construction. Non-zero deletes commit the staged transaction with skip_auto_cleanup + affected_rows (parity with the prior inline path). First step of the staged-delete migration (iss-950); turns the node_delete_with_no_incident_edges_leaves_no_edge_table_drift regression green.
Add TableStore::stage_delete (Lance 7.0 DeleteBuilder::execute_uncommitted), the two-phase analogue of stage_merge_insert: writes deletion files without advancing Lance HEAD, returns Option<StagedWrite> (None on 0 rows = true no-op), carrying the deletion-vector updated_fragments as new_fragments and the superseded originals as removed_fragment_ids so combine_committed_with_staged makes the deletion visible to in-query reads. No affected_rows is threaded: like stage_merge_insert's Operation::Update commit, the staged delete relies on OmniGraph's per-table write queue + manifest CAS, not Lance's per-dataset conflict resolver (commit_staged is a single attempt). Flip the two residual guards to the staged path: staged_writes.rs now asserts stage_delete does NOT advance HEAD and that a staged delete is read-your-writes visible (the deletion-vector RYW proof D2 retirement depends on); the lance_surface_guards delete guard pins execute_uncommitted's UncommittedDelete. No behavior change yet (callers still use delete_where); Step 1 wires them.
…(MR-A step 1a) Add stage_delete/Option<StagedHandle> to the TableStorage trait (delegates to TableStore::stage_delete). Migrate the two branch_merge delete sites (three-way RewriteMerged + adopt delta) from the inline delete_where residual to stage_delete + commit_staged — identical in shape to the stage_merge_insert + commit_staged pair above each. HEAD still advances within the merge sequence (via commit_staged), under the unchanged SidecarKind::BranchMerge Phase-B confirmation; the _pre_delete/_pre_index failpoints fire by position, unchanged. merge_truth_table, branching, composite_flow green.
…delete (MR-A step 1b/1c) Routes every delete through the staged write path so delete never advances Lance HEAD inline — the last inline-commit residual on the mutation path is gone. `MutationStaging` now accumulates delete predicates (`record_delete`) alongside pending write batches; at end-of-query `stage_all` combines a table's predicates into one `(p1) OR (p2) …` `stage_delete` (a deletion-vector transaction, no HEAD advance) and `commit_all` commits it through the same `commit_staged` path as inserts/updates. Deletes are now ordinary staged entries: one sidecar pin at `expected + 1`, no inline special-casing. Migrated callers (all 5): the 3 mutation.rs sites (delete-node, cascade, delete-edge) and the 2 merge.rs sites (already on stage_delete in step 1a). `affected_edges`/`affected` move from post-inline-commit `deleted_rows` to a committed `count_rows` at record time — exact under D₂, bounded by the cascade working set. A predicate matching zero rows stages nothing (the staged equivalent of the old "skip record_inline on 0 deleted rows"), so the zero-row edge-table drift class stays closed by construction. Retired scaffolding now that no caller remains: - `MutationStaging.inline_committed` + `record_inline` → `delete_predicates` + `record_delete`; `StagedMutation.inline_committed`/`paths` fields and all the `commit_all` inline handling (queue keys, sidecar pins with the `record_inline` table_version special-case, the inline recheck loop). - `open_table_for_mutation`'s post-inline-commit reopen branch (deletes no longer advance HEAD mid-query, so a second touch reopens at the pinned version like any write). - `InlineCommitResidual::delete_where` + its `TableStore` impl, the orphaned `TableStore::delete_where`, and `DeleteState`. `InlineCommitResidual` now carries only `create_vector_index` (Lance #6666 still open). D₂ stays for now: staged-delete read-your-writes doesn't yet compose into the pending accumulator (insert-then-delete on one table), so mixed insert/update/delete in one query is still rejected at parse time. Retiring D₂ is step 2. Doc comments updated to match across exec/, storage_layer, db/. Tests (all green): writes, consistency, validators, end_to_end, composite_flow, merge_truth_table, maintenance, recovery, staged_writes, forbidden_apis, lance_surface_guards, changes, point_in_time (286), plus failpoints (63).
…tep 1) Update the docs that described `delete` as the inline-commit residual now that MR-A routes it through `stage_delete`. Always-loaded surfaces (AGENTS.md rule 4 / capability matrix, invariants.md Invariant 4 / truth matrix / known gaps) plus the dev write-path docs (writes.md, execution.md incl. its mutation sequence diagram, architecture.md) now state: deletes accumulate as predicates and stage like inserts/updates, no inline HEAD advance; `InlineCommitResidual` carries only `create_vector_index` (Lance #6666). The parse-time D₂ rule is documented as retained — not because delete inline-commits, but because staged-delete read-your-writes is not yet wired into the pending accumulator (MR-A step 2). lance.md's 7.0 audit note marked MR-A as landed.
…ose-out) After MR-A staged the delete path, D₂ (a mutation query is insert/update-only OR delete-only) was left framed as temporary — "until Lance ships two-phase delete" / "retire in step 2". Lance shipped that and we used it for the inline-commit fix; D₂'s original justification is gone. It now stands for a different, permanent reason: keeping a query to one kind keeps its read-your-writes unambiguous and each table to one version per query. Retiring it would buy single-commit mixed atomicity (cheap workaround: split, or a branch) at the cost of an in-query delete view, pending pruning, edge id-resolution, and two-commit-per-table ordering in the hot mutation path — complexity not worth earning. Decision: keep D₂ as a deliberate boundary. Reframes the now-stale wording everywhere, no logic change: - The D₂ parse-time error message no longer promises "this restriction lifts when Lance exposes a two-phase delete API"; it states the boundary and points to a branch+merge for one atomic commit. - `enforce_no_mixed_destructive_constructive` doc, AGENTS.md, invariants.md (Invariant 4 / truth matrix / removed from the known-gaps), writes.md, architecture.md, lance.md, and the user mutations doc (which wrongly said deletes "commit through a different path" — both stage now). - Swept remaining stale `delete_where` mentions left from the Step-1 migration: the merge.rs "swap when upstream ships" comments (already swapped), the forbidden_apis / table_ops residual notes, the staged_writes vector-index guard doc (was "same as stage_delete's absence" — stage_delete now exists), and test comments/assert messages in recovery/maintenance/writes/failpoints. Genuinely-historical records (dated Lance audit, rfc-013, bug-case-fix) left. Verified: engine builds warning-free; check-agents-md OK; writes/maintenance/ recovery/staged_writes/forbidden_apis all green. Closes MR-A.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bfe679f416
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| row_count: delete_state.row_count, | ||
| version_metadata: delete_state.version_metadata, | ||
| }); | ||
| staging.record_delete(&table_key, pred_sql.clone()); |
There was a problem hiding this comment.
Avoid double-counting staged deletes
When a destructive query has two delete statements whose predicates overlap on the same table, this now records the first predicate without making it visible to the second statement, so the later committed-snapshot scan/count sees rows that are already scheduled for deletion. For example delete Person where name = $x followed by the same delete reports affected_nodes = 2 even though the combined staged delete removes one row; a node cascade followed by an explicit edge delete for the same edges similarly over-reports affected_edges. The old inline delete path counted after each committed delete, so overlapping statements did not double-count; the staged path needs to count from the combined delete view or de-duplicate affected IDs before returning the mutation result.
Useful? React with 👍 / 👎.
…ected_* (red) Reproduces a reporting regression from the staged-delete migration flagged in PR #308 review. Because deletes now stage (instead of inline-committing), two delete statements in one query both scan the same unchanged committed snapshot; counting each predicate independently over-reports `affected_*` when they overlap. The old inline path committed each delete before the next ran, so it counted distinct. `delete Person where name = "Alice"` then `delete Person where age > 29` over the standard fixture (Alice 30, Charlie 35) removes 2 distinct nodes and 3 distinct edges, but the buggy per-statement counting returns 3 nodes / 6 edges. RED at this commit (asserts left=3, right=2).
…ed_* Count each delete statement against the committed snapshot MINUS the predicates a prior delete statement on the same table already recorded: `(pred) AND NOT ((prior1) OR (prior2) …)`. Summed over statements this is inclusion-exclusion — `Σ |pₙ \ (p₁ ∪ …)| = |p₁ ∪ p₂ ∪ …|` — exactly the distinct count the combined `(p1) OR (p2)` staged delete removes. Works for nodes and edges alike with no edge identity needed; the node ID scan uses the same exclusion so a later statement also doesn't re-cascade already-deleted nodes. The ORIGINAL predicate is still what gets recorded (the staged delete removes the union); only the count uses the exclusion. The common single-delete path is unchanged (`prior` empty → filter is just the base predicate). New helper `dedup_delete_filter` + `MutationStaging::recorded_delete_predicates`. Turns the red regression test green (2 nodes / 3 edges); writes (33), end_to_end, validators, maintenance, recovery, composite_flow, merge_truth_table, consistency, changes, and failpoints (63) all stay green.
|
Both review bots flagged the same real issue — thanks. Valid and fixed. Because deletes now stage (instead of inline-committing), two delete statements in one query both scan the same unchanged committed snapshot, so counting each predicate independently over-reported Fix (212d5d1 red test → 92b33c0): count each statement against the committed snapshot minus the predicates a prior statement on the same table already recorded — Regression test |
Follow-up to the overlapping-delete fix flagged in PR #308 review (Greptile P1): the `(base) AND NOT (prior)` exclusion breaks under SQL three-valued logic. If a prior delete predicate references a NULLable column, a later statement's matching row whose column is NULL makes `prior` evaluate to UNKNOWN, `NOT UNKNOWN` is UNKNOWN, and the row is filtered out of the scan — even though the prior delete never matched it. That drops it from `deleted_ids`, skipping its cascade (orphaned edges) or, if it is the only match, leaving the node undeleted. A data bug, not just a miscount. Data: Charlie(age 35), Zoe(age NULL); Knows Zoe→Charlie. `delete Person where age > 30` then `delete Person where name = "Zoe"`. Under the buggy `NOT`, Zoe's scan `(name='Zoe') AND NOT (age>30)` is UNKNOWN → Zoe survives. RED at this commit (Person count left=1, right=0).
… prior rows Change `dedup_delete_filter` from `(base) AND NOT (prior)` to `(base) AND ((prior) IS NOT TRUE)`. `IS NOT TRUE` keeps both FALSE and UNKNOWN rows, so a prior predicate that evaluates to SQL UNKNOWN (a NULL in a referenced column) no longer drops a row this statement legitimately matches — only rows a prior predicate matched as definitely TRUE are excluded from the count/scan. The distinct-count semantics are unchanged for non-NULL data. Turns the red NULL-dedup test green (Zoe deleted, her edge cascaded), and the overlapping-dedup + writes/end_to_end/validators/maintenance/recovery/ composite_flow/consistency suites stay green.
|
Re: Greptile's "Nullable Deletes Can Skip" (the Root cause: SQL three-valued logic. Fix (598bec5 red test → 8e6d8cc): Regression test The earlier Codex/Greptile double-count comments (now re-anchored to lines 1419/1529) are the original issue, already addressed in 92b33c0. |
Self-review follow-up: the overlapping-delete dedup assumes the committed snapshot is invariant across a query's statements, which holds only because D₂ forbids mixing writes with deletes (so a delete-touched table has no pending writes). Make that dependency explicit at the function so a future D₂ relaxation is forced to revisit the dedup. Comment-only.
| ds: Arc<Dataset>, | ||
| transaction: Transaction, | ||
| ) -> Result<Dataset> { | ||
| pub async fn commit_staged(&self, ds: Arc<Dataset>, staged: StagedWrite) -> Result<Dataset> { |
There was a problem hiding this comment.
Public API Break
TableStore::commit_staged is exported through omnigraph::table_store, and this change makes it require a StagedWrite instead of the Transaction that callers could previously pass from staged.transaction. The same change also makes StagedWrite.transaction private without adding a replacement accessor, so downstream Rust code that stages a write and then commits it with the old call shape can no longer compile. If this module remains public, keep a compatibility path for the old transaction-based commit or provide an explicit migration surface.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
Migrates
row deletefrom omnigraph's last inline-commit residual to the staged two-phase write path, so delete never advances Lance HEAD before the manifest publish. Closes MR-A (iss-950).This fixes a deterministic, every-version drift bug: a node delete cascades into every incident edge type, and a zero-row cascade still committed a Lance version (Lance's
Dataset::deletealways commits) while the cascade skipped recording it — leavingedge:<T>HEAD ahead of the manifest, which wedged the next strict write and thatrepairrefused. Reproduced down toinit → load → delete → delete.What changed
TableStore::stage_deletetwo-phase primitive (Lance 7.0DeleteBuilder::execute_uncommitted, #6658); a 0-row delete is a true no-op (None, no version).exec/mutation.rs: delete-node, cascade, delete-edge; 2 inexec/merge.rs).affected_*now counted viacount_rowsat record time.record_inline,inline_committed, thecommit_allinline special-casing,open_table_for_mutation's post-inline-commit reopen branch,InlineCommitResidual::delete_where+ its impl, the orphanedTableStore::delete_where, andDeleteState.InlineCommitResidualnow carries onlycreate_vector_index(Lance #6666 still open).D₂ stays — as a deliberate boundary
D₂ (a mutation query is insert/update-only or delete-only) was framed as temporary "until Lance ships two-phase delete." Lance shipped it and we used it for the inline-commit fix, so D₂'s original justification is gone. It's kept for a different, permanent reason: a single query staying one-kind keeps its read-your-writes unambiguous and each table to one version per query. Full retirement (mixed insert+delete in one atomic mutation) was declined — it would add an in-query delete view, pending pruning, edge id-resolution, and two-commit-per-table ordering to the hot mutation path, for a capability with a cheap workaround (split into two mutations, or a branch for one atomic commit). The error message and docs now state this boundary rather than promising it lifts.
Test plan
staged_writes::stage_delete_does_not_advance_head_and_reads_through_staged(deletion-vector read-your-writes proof), thelance_surface_guardsdelete guard, and the red→green regressionwrites::node_delete_with_no_incident_edges_leaves_no_edge_table_drift.cargo test --workspace(writes, consistency, validators, end_to_end, composite_flow, merge_truth_table, maintenance, recovery, staged_writes, forbidden_apis, lance_surface_guards, changes, point_in_time) +--features failpoints. Engine builds warning-free;check-agents-mdOK.🤖 Generated with Claude Code
Note
High Risk
Changes the core mutation and merge commit protocol and manifest–Lance HEAD invariants; also breaks the exported
TableStore::commit_stagedcall shape for downstream Rust users.Overview
Deletes no longer advance Lance HEAD mid-query. Mutations and branch merges record delete predicates (with dedup for overlapping statements and SQL
IS NOT TRUEfor NULL-safe counts), thenstage_deletevia Lance 7.0DeleteBuilder::execute_uncommittedcommits with inserts/updates at the end-of-query boundary. The inline path (InlineCommitResidual::delete_where,inline_committed, post-delete reopen logic) is removed.StagedWrite/commit_stagednow carry Lanceaffected_rowsthrough staged deletes and merge-inserts so commits can rebase correctly;commit_stagedtakes a fullStagedWriteinstead of a bareTransaction(publictable_storesurface change).D₂ (no mixed insert/update and delete in one query) stays, documented as a permanent semantic boundary rather than a Lance workaround. Docs and guards are updated;
create_vector_indexremains the sole inline-commit residual.Reviewed by Cursor Bugbot for commit 457f09d. Bugbot is set up for automated code reviews on this repo. Configure here.
Greptile Summary
This PR moves row deletes onto the staged write path. The main changes are:
stage_deleteandcommit_staged.Confidence Score: 5/5
This looks safe to merge.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[Delete statement] --> B[Open strict mutation table] B --> C[Count committed rows with dedup filter] C -->|new matches| D[Record original delete predicate] C -->|zero new matches| E[Skip predicate] D --> F[MutationStaging stage_all] F --> G[Combine table predicates with OR] G --> H[TableStorage stage_delete] H -->|zero rows| I[No staged write and no HEAD advance] H -->|rows deleted| J[Staged write with deletion fragments] J --> K[commit_staged advances Lance HEAD] K --> L[Manifest publish records table version]%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% flowchart TD A[Delete statement] --> B[Open strict mutation table] B --> C[Count committed rows with dedup filter] C -->|new matches| D[Record original delete predicate] C -->|zero new matches| E[Skip predicate] D --> F[MutationStaging stage_all] F --> G[Combine table predicates with OR] G --> H[TableStorage stage_delete] H -->|zero rows| I[No staged write and no HEAD advance] H -->|rows deleted| J[Staged write with deletion fragments] J --> K[commit_staged advances Lance HEAD] K --> L[Manifest publish records table version]Reviews (6): Last reviewed commit: "Merge origin/main into fix/staged-delete..." | Re-trigger Greptile
Context used: