Ensure that pruning slabs/sectors in parallel doesn't leave orphaned slabs/sectors #1010
chris124567 wants to merge 5 commits into
Pull request overview
This PR addresses #1009 by hardening Postgres slab/sector pruning against concurrency races that could leave orphaned slabs or sectors, and adds a regression test to reproduce/guard against those scenarios.
Changes:
- Add row-level locking in `unpinSlabs` to serialize concurrent unpin/prune operations on the same slabs and their sectors.
- Make slab stat decrements rely on the actual number of deleted slab rows (via `RowsAffected`) to keep counters accurate under concurrency.
- Add `TestSlabPruningConcurrent` to exercise concurrent pruning across shared slabs, shared sectors, and account-pruning interactions.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| persist/postgres/slabs_test.go | Adds a multi-scenario concurrency test to detect orphaned slabs/sectors and verify stats return to zero after concurrent pruning. |
| persist/postgres/sectors.go | Adds explicit FOR UPDATE locking around slab and sector deletion checks and adjusts slab stat decrementing to use actual deletions. |
3b00a57 to 2092763
peterjan left a comment
Verified test fails on master and there's no noticeable regression in BenchmarkUnpinSlab
    // slabs. The deletion checks below rely on committed account_slabs rows, so
    // two transactions each removing one of the final references must not run
    // their checks at the same time.
    if _, err := tx.Exec(ctx, `
As discussed in the meeting today, we want to move unpinning slabs and sectors out of the hot path. When a user prunes their account, we limit the deletion to the account_slabs table and prune slabs and sectors in a background loop, which should avoid the concurrency issue.
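A minimal in-memory sketch of that split, for illustration only: the `store` type and its methods are hypothetical stand-ins for the account_slabs and slabs tables, not the project's Postgres code. Account pruning only drops references, and a separate pass (which a background loop would call periodically) is the only place that deletes unreferenced slabs.

```go
package main

import (
	"fmt"
	"sync"
)

// store is an in-memory stand-in for the account_slabs and slabs tables.
type store struct {
	mu    sync.Mutex
	refs  map[string]map[string]bool // account -> slab IDs (models account_slabs)
	slabs map[string]bool            // slab IDs (models slabs)
}

func newStore() *store {
	return &store{refs: map[string]map[string]bool{}, slabs: map[string]bool{}}
}

// pin records that an account references a slab, creating the slab if needed.
func (s *store) pin(account, slab string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.refs[account] == nil {
		s.refs[account] = map[string]bool{}
	}
	s.refs[account][slab] = true
	s.slabs[slab] = true
}

// pruneAccount is the hot path: it only drops the account's references and
// never decides whether a slab should be deleted.
func (s *store) pruneAccount(account string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.refs, account)
}

// pruneOrphans is the background step: it deletes every slab that no account
// references and returns how many slabs it removed.
func (s *store) pruneOrphans() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	referenced := map[string]bool{}
	for _, slabs := range s.refs {
		for id := range slabs {
			referenced[id] = true
		}
	}
	deleted := 0
	for id := range s.slabs {
		if !referenced[id] {
			delete(s.slabs, id)
			deleted++
		}
	}
	return deleted
}

// numSlabs reports how many slabs are currently stored.
func (s *store) numSlabs() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.slabs)
}

func main() {
	s := newStore()
	s.pin("alice", "shared")
	s.pin("bob", "shared")
	s.pin("alice", "a-only")

	// Both accounts are pruned concurrently; neither touches the slabs.
	var wg sync.WaitGroup
	for _, acc := range []string{"alice", "bob"} {
		wg.Add(1)
		go func(a string) {
			defer wg.Done()
			s.pruneAccount(a)
		}(acc)
	}
	wg.Wait()

	fmt.Println("slabs before background pass:", s.numSlabs())
	fmt.Println("deleted by background pass:", s.pruneOrphans())
	fmt.Println("slabs after background pass:", s.numSlabs())
}
```

Because the orphan check runs in one place only, two account prunes that each remove one of the final references can no longer both conclude that the slab is still in use.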
I missed that meeting but is what I just pushed roughly what we wanted to do?
@chris124567 reminder to update this
OK, I will do this on Wednesday; I got tied up with s3d stuff.
2092763 to a2f5708
Close #1009
Test fails on master and passes on this branch