sim: return removed chunk hashes from remove_shard_dedup_entries#878
Open
sirahd wants to merge 1 commit into
The DeletionControlableClient::remove_shard_dedup_entries operation discarded which dedup keys it reclaimed. Return the removed chunk hashes so callers (GC Stage 4) can audit them, mirroring the production CAS gc_delete_shard_dedup endpoint that already returns this list.

- trait + LocalClient (collect deleted GLOBAL_DEDUP_TABLE keys) + MemoryClient (collect global_dedup keys before clear) now return Vec<MerkleHash>
- local-server HTTP route serializes the list as JSON (Vec<HexMerkleHash>); SimulationControlClient parses it back
- shared deletion tests assert the returned list reports the target shard's chunks and excludes others; the server suite covers the HTTP round-trip

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
assafvayner
approved these changes
Jun 17, 2026
Summary
DeletionControlableClient::remove_shard_dedup_entries previously returned Result<()>, discarding which global-dedup keys it reclaimed. The production CAS gc_delete_shard_dedup endpoint already returns the removed chunk hashes; this brings the simulation clients in line so callers (GC Stage 4's dedup-removal audit) can observe the same data through the simulation harness.
Changes
- Trait (deletion_controls.rs): remove_shard_dedup_entries(&self, shard_hash) -> Result<Vec<MerkleHash>>.
- LocalClient: accumulates the GLOBAL_DEDUP_TABLE keys it deletes (across the retry passes) and returns them.
- MemoryClient: collects global_dedup keys before clear() (the in-memory client holds a single shard, so all entries belong to it).
- Local-server HTTP route: serializes the list as Json(Vec<HexMerkleHash>); SimulationControlClient parses it back into Vec<MerkleHash>.
- Tests: the shared test_remove_shard_dedup_entries_* cases now assert the returned list reports the target shard's chunks and excludes other shards'; the no-op case asserts an empty list. The SimulationControlClient suite + the builder-wired server test cover the full HTTP serialize→parse round-trip.
Motivation / downstream
Consumed by xet-garbage-collection Stage 4, which collects removed chunk hashes into an audit artifact (GC PR #102). That PR's SimulationConnector currently returns an empty list because this trait didn't surface it; once this lands and GC bumps its pinned xet-core rev, the simulation path can carry the real chunk list end-to-end.
Testing
- cargo build -p xet-client --features simulation: clean.
- cargo test -p xet-client --features simulation deletion: 7 passed (LocalClient, memory backend, and SimulationControlClient HTTP suite).
- cargo clippy -r -p xet-client --features simulation -- -D warnings: clean. (Pre-existing --all-targets test-code lints in hub_client/client.rs and chunk_cache/disk.rs are untouched here.)
🤖 Generated with Claude Code
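The signature change listed under Changes can be illustrated with a small stand-alone sketch. It is not the code in this PR: MerkleHash, DeletionControl, ToyDedupTable, and the helper h are hypothetical stand-ins, the trait is synchronous, and the error type is a plain String. The sketch only shows the shape of the new contract: collect the dedup keys that belong to the target shard, delete them, and hand the collected list back to the caller.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the real MerkleHash type (32 raw bytes here).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct MerkleHash(pub [u8; 32]);

/// Simplified, synchronous stand-in for the deletion-control trait.
pub trait DeletionControl {
    /// Removes every global-dedup entry owned by `shard_hash` and returns
    /// the chunk hashes that were removed (empty if the shard is unknown).
    fn remove_shard_dedup_entries(
        &mut self,
        shard_hash: &MerkleHash,
    ) -> Result<Vec<MerkleHash>, String>;
}

/// Toy multi-shard table (an assumption): chunk hash -> owning shard hash.
#[derive(Default)]
pub struct ToyDedupTable {
    global_dedup: BTreeMap<MerkleHash, MerkleHash>,
}

impl ToyDedupTable {
    pub fn insert(&mut self, chunk: MerkleHash, shard: MerkleHash) {
        self.global_dedup.insert(chunk, shard);
    }
}

impl DeletionControl for ToyDedupTable {
    fn remove_shard_dedup_entries(
        &mut self,
        shard_hash: &MerkleHash,
    ) -> Result<Vec<MerkleHash>, String> {
        // Collect the matching keys first so they can be reported for audit.
        let removed: Vec<MerkleHash> = self
            .global_dedup
            .iter()
            .filter(|(_, shard)| *shard == shard_hash)
            .map(|(chunk, _)| *chunk)
            .collect();
        // Then delete exactly the keys that were collected.
        for chunk in &removed {
            self.global_dedup.remove(chunk);
        }
        Ok(removed)
    }
}

/// Test helper: a hash whose 32 bytes all equal `b`.
pub fn h(b: u8) -> MerkleHash {
    MerkleHash([b; 32])
}

/// Usage example mirroring the assertions described in the PR text.
pub fn example() {
    let mut table = ToyDedupTable::default();
    table.insert(h(1), h(0xA));
    table.insert(h(2), h(0xB));
    // The target shard's chunk is reported; the other shard's is not.
    assert_eq!(table.remove_shard_dedup_entries(&h(0xA)).unwrap(), vec![h(1)]);
    // An unknown shard is a no-op that reports an empty list.
    assert!(table.remove_shard_dedup_entries(&h(0xC)).unwrap().is_empty());
}
```

Collecting before deleting keeps the returned list identical to what was actually removed, which is the property an audit consumer would rely on.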
Note
Low Risk
Changes are confined to the simulation feature and deletion-control APIs; behavior for unknown shards remains a no-op with an empty list, with expanded test coverage.
Overview
remove_shard_dedup_entries now returns a Vec<MerkleHash> of reclaimed global-dedup chunk keys, aligned with production gc_delete_shard_dedup, so GC simulation can audit deregistered dedup entries.
The DeletionControlableClient trait and all simulation backends (LocalClient, MemoryClient) collect and return the keys they delete. The local-server DELETE /simulation/shards/{hash}/dedup_entries route responds with Json(Vec<HexMerkleHash>) instead of an empty body; SimulationControlClient deserializes that list. Shared deletion tests and the builder-wired HTTP test assert the returned set matches the target shard and is empty for unknown shards.
Reviewed by Cursor Bugbot for commit 552fd56.
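The serialize→parse round-trip over the HTTP route can be sketched with the standard library alone. Everything here is an assumption made for illustration: the real route uses Json(Vec<HexMerkleHash>) and a real JSON parser, while this sketch hand-writes a JSON array of strings, and the byte-by-byte hex encoding below is not claimed to match the actual HexMerkleHash format. The names MerkleHash, to_hex, from_hex, serialize_body, and parse_body are hypothetical.

```rust
/// Hypothetical stand-in for the real MerkleHash type (32 raw bytes here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MerkleHash(pub [u8; 32]);

/// Encodes a hash as 64 lowercase hex characters (plain byte order; an assumption).
pub fn to_hex(hash: &MerkleHash) -> String {
    hash.0.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Decodes 64 hex characters back into a hash, rejecting anything else.
pub fn from_hex(s: &str) -> Result<MerkleHash, String> {
    if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("expected 64 hex chars, got {:?}", s));
    }
    let mut bytes = [0u8; 32];
    for (i, byte) in bytes.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).map_err(|e| e.to_string())?;
    }
    Ok(MerkleHash(bytes))
}

/// Server side of the sketch: render the removed hashes as a JSON array of strings.
pub fn serialize_body(hashes: &[MerkleHash]) -> String {
    let items: Vec<String> = hashes.iter().map(|h| format!("\"{}\"", to_hex(h))).collect();
    format!("[{}]", items.join(","))
}

/// Client side of the sketch: parse the array back into hashes.
/// Only handles the exact shape produced by `serialize_body`.
pub fn parse_body(body: &str) -> Result<Vec<MerkleHash>, String> {
    let inner = body
        .trim()
        .strip_prefix('[')
        .and_then(|s| s.strip_suffix(']'))
        .ok_or_else(|| "expected a JSON array".to_string())?;
    if inner.trim().is_empty() {
        // Unknown shard: the route reports an empty list.
        return Ok(Vec::new());
    }
    inner
        .split(',')
        .map(|item| -> Result<MerkleHash, String> {
            let item = item.trim();
            let hex = item
                .strip_prefix('"')
                .and_then(|s| s.strip_suffix('"'))
                .ok_or_else(|| format!("expected a quoted string, got {:?}", item))?;
            from_hex(hex)
        })
        .collect()
}
```

A round-trip check along the lines of the builder-wired HTTP test would serialize a list on the server side, parse it on the client side, and compare the result with the original list, including the empty-list case for an unknown shard.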