[release/6.x] Cherry pick: Recovery and Join snapshot ledger offset fix (#7901) #7917

Draft
cjen1-msft wants to merge 9 commits into microsoft:release/6.x from cjen1-msft:backport-6-7901

Conversation

@cjen1-msft

Contributor

Backport of #7901.

Co-authored-by: Amaury Chamayou <amaury@xargs.fr>
cjen1-msft requested a review from a team as a code owner on June 3, 2026 at 10:37
@achamayou

Member

@cjen1-msft I think this needs something like #7918, as well as a version bump in pyproject.

Copilot AI left a comment

Contributor

Pull request overview

Backport of #7901 to the 6.x release branch. It fixes an edge case in recovery and join: when a node starts from a snapshot whose seqno is beyond the end of the available ledger, ledger writing should still resume correctly from the snapshot boundary. The fix is accompanied by new regression coverage in both the Python E2E tests and the C++ host ledger unit tests.

Changes:

  • Update host ledger truncation logic to support a recovery-mode “forward truncate” past current ledger end, and tighten file selection when reading by index.
  • Add new recovery/join regression tests which construct ledger variants with snapshot/ledger offsets and verify recovery/join succeeds.
  • Bump release version to 6.0.28 and document the fix in the changelog.
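The "forward truncate" behaviour described in the first bullet can be sketched with a toy in-memory model. This is not the code in src/host/ledger.h: the `ToyLedger` class, its fields, and the `recovery` flag are hypothetical names chosen for illustration, and the choice to treat a non-recovery truncate past the end as a no-op is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLedger:
    """Toy in-memory stand-in for a ledger (hypothetical, not CCF's API)."""

    last_idx: int = 0
    entries: dict = field(default_factory=dict)

    def append(self, payload: bytes) -> int:
        # New entries always land directly after the current end.
        self.last_idx += 1
        self.entries[self.last_idx] = payload
        return self.last_idx

    def truncate(self, idx: int, recovery: bool = False) -> None:
        if idx < self.last_idx:
            # Ordinary truncation: drop everything after idx.
            for i in range(idx + 1, self.last_idx + 1):
                self.entries.pop(i, None)
            self.last_idx = idx
        elif idx > self.last_idx and recovery:
            # Assumed recovery-mode behaviour: move the end forward to the
            # snapshot boundary, leaving a gap the snapshot accounts for.
            self.last_idx = idx
        # Otherwise (idx == last_idx, or forward without recovery): no-op.


ledger = ToyLedger()
for n in range(10):
    ledger.append(b"entry-%d" % n)

# Snapshot taken at seqno 15, but the ledger only reaches seqno 10.
ledger.truncate(15, recovery=True)
next_idx = ledger.append(b"first-entry-after-snapshot")
```

In this sketch the first entry written after the forward truncate gets index 16, so writing resumes at the snapshot boundary instead of overlapping the range the snapshot already covers.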

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/recovery.py | Adds an E2E recovery test that recovers from snapshot/ledger offset variants. |
| tests/reconfiguration.py | Adds an E2E join-node test that joins from snapshot/ledger offset variants. |
| tests/infra/utils.py | Adds helper to synthesize ledger chunk files for the new tests. |
| src/host/test/ledger.cpp | Adds unit tests covering recovery-mode truncation behavior at/beyond ledger end. |
| src/host/ledger.h | Implements recovery-mode forward truncation and improves file lookup bounds-checking. |
| python/pyproject.toml | Bumps Python package version to 6.0.28. |
| CHANGELOG.md | Adds 6.0.28 entry describing the fix. |
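The file lookup bounds-checking mentioned for src/host/ledger.h matters once a forward truncate can leave a gap between ledger files. A minimal sketch of the idea, in Python rather than the C++ of the actual change: `find_file` and the `(start, end)` range tuples are hypothetical, and the real lookup may be structured differently.

```python
from bisect import bisect_right


def find_file(ranges, idx):
    """Return the (start, end) range containing idx, or None.

    `ranges` is a list of inclusive (start, end) index ranges, sorted by
    start and non-overlapping. Hypothetical helper for illustration only.
    """
    starts = [start for start, _ in ranges]
    # Last range whose start is <= idx, if there is one.
    pos = bisect_right(starts, idx) - 1
    if pos < 0:
        return None
    start, end = ranges[pos]
    # Check the upper bound too: idx may fall in a gap or past the end.
    return (start, end) if start <= idx <= end else None


# Two files with a gap at 11..15, as could follow a snapshot at seqno 15.
files = [(1, 10), (16, 20)]
in_first = find_file(files, 5)
in_gap = find_file(files, 12)
in_second = find_file(files, 16)
```

Checking only the lower bound would return the first file for index 12 here; the upper-bound check is what makes the lookup report that no file holds that index.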

Comment thread tests/infra/utils.py
Comment thread CHANGELOG.md
Comment thread src/host/ledger.h
cjen1-msft added the run-long-test (Run Long Test job) label on Jun 4, 2026
cjen1-msft marked this pull request as draft on June 7, 2026 at 17:47

Labels

run-long-test Run Long Test job

3 participants