fix(replay): dispatch source_timeline_id so handler reads edited steps#167
Conversation
When a user edits a step on a fork timeline and clicks "Run replay", the webhook payload's replay_context_timeline_id was set to the newly created replay fork's ID. The handler then fetched the step from that fork, which inherited the original unedited content from the parent chain — ignoring the user's edit entirely. Fix: send the source_timeline_id (the timeline the user was viewing when they clicked replay) so the handler reads the edited step content. Co-authored-by: Cursor <cursoragent@cursor.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
risjai
left a comment
There was a problem hiding this comment.
Review: fix(replay): dispatch source_timeline_id so handler reads edited steps
Overview
One-line change: the third tuple element in the CreateAndDispatch arm flips from fork.id (the freshly-forked timeline created by engine.fork() 15 lines above) to a.source_timeline_id.clone(). That value flows into:
timeline_for_payload→ thereplay_context_timeline_idfield of the outbound webhook payload.- The
fork_timeline_idfield of the HTTP response back to the dashboard.
The diff is small but it shifts a contract that has had several rounds of careful reasoning attached to it. A few things to confirm before merge.
Concerns
1. Semantic correctness — does the runner write the right place?
The runner SDK uses replay_context_timeline_id for ExplicitClient.attach_replay_context() so live cache misses get recorded into the fork (per the docstring on DispatchPayload.replay_context_timeline_id and PR #154 F2). If we now hand the runner source_timeline_id, doesn't that mean live recordings during replay land on the source timeline — i.e. the recording the user is trying to preserve? That would corrupt the source rather than the fork.
The replay context itself was created at fork.id (line ~180: store.create_replay_context(&ctx_id, &session.id, &fork.id, 0)), so the cursor advances on the fork — but the timeline_id passed to the runner is now the source. These two pointers used to agree; now they disagree.
If this is intentional (edits on the source need to be visible to multi-turn-history reads, and the runner doesn't actually write back to replay_context_timeline_id), the docstring on DispatchPayload.replay_context_timeline_id in python/rewind_agent/runner.py needs updating to match the new contract.
2. Inconsistency between dispatch shapes
CreateAndDispatchnow returnsSome(a.source_timeline_id.clone())— the parent timeline of the new fork.ReuseContext(line ~232) still returnsSome(ctx.timeline_id)— the fork the context is bound to.
The two branches now produce semantically different values for the same field. Either fix both (if source_timeline_id is correct, ReuseContext should resolve ctx.timeline_id's parent_timeline_id), or rename to make the branch difference explicit.
3. The variable name fork_timeline_id is now a lie
It's still bound from the local destructuring (replay_context_id, fork_timeline_id, at_step) = match req { ... } and the same name is used as the response field. After this PR it holds the source timeline in CreateAndDispatch and the fork in ReuseContext. A future reader (or a dashboard consumer) who trusts the name will draw wrong conclusions. Suggest renaming to replay_target_timeline_id or similar across both the match arm and the response struct.
4. The freshly-created fork is now orphaned from the response
engine.fork(...) returns a fork whose .id is no longer surfaced anywhere — the dashboard sees source_timeline_id echoed back as fork_timeline_id. If a UI (existing or future) wants to navigate to the fork that was just created, it can't from this response; it'd have to refetch all timelines and find the newest one with parent_timeline_id == source. Worth keeping the actual fork id in the response under a separate field (new_fork_timeline_id?) and only changing the dispatch payload field.
5. Test coverage
The PR description says "cargo test -p rewind-web passes" — but no new test reproduces the bug. A regression test would be cheap here: seed a fork with an edited step, POST /api/sessions/{sid}/replay-jobs shape A, and assert the dispatched payload's replay_context_timeline_id equals source_timeline_id. The deploy checkbox is still unchecked, which is the only proof of correctness right now.
Suggested path forward
If the fix is right, ship it with:
- A regression test that pins the new contract.
- A docstring update on the Python
DispatchPayload.replay_context_timeline_idfield reflecting that it's now the source (where reads come from), not the fork. - Either fix the ReuseContext branch to match, or split the field semantics deliberately.
- Either rename the local
fork_timeline_idbinding or keep the actual fork id surfaced in the response for the dashboard.
If there's any doubt about the runner-write semantics (concern #1 above), please walk through the runner SDK once more — the cost of getting it wrong here is silent corruption of the source recording, which won't manifest until someone replays a session and notices the original is now divergent.
| .map_err(internal)?; | ||
| } | ||
| (ctx_id, Some(fork.id), a.at_step) | ||
| (ctx_id, Some(a.source_timeline_id.clone()), a.at_step) |
There was a problem hiding this comment.
Did you intend to also flip the response's fork_timeline_id? The same tuple value is used twice: as timeline_for_payload for the webhook (the bug you're fixing), and as the fork_timeline_id field of CreateReplayJobResponse returned to the dashboard. After this change, the response's fork_timeline_id is now the source timeline, not the freshly-created fork. The actual fork.id (returned by engine.fork() two statements above) is no longer surfaced anywhere.
A dashboard that uses the response to navigate to or display the newly-created fork now reads the wrong id. Suggest splitting the two concerns — keep fork.id in the response, send source_timeline_id only in the dispatch payload — by destructuring two separate locals here:
(ctx_id, fork.id.clone(), Some(a.source_timeline_id.clone()), a.at_step)and threading both through the response + payload separately.
| .map_err(internal)?; | ||
| } | ||
| (ctx_id, Some(fork.id), a.at_step) | ||
| (ctx_id, Some(a.source_timeline_id.clone()), a.at_step) |
There was a problem hiding this comment.
Inconsistency between dispatch shapes. With this change, CreateAndDispatch returns the source timeline as the second tuple element, but ReuseContext (a few lines below, ~232) still returns ctx.timeline_id — the fork the replay context is bound to. The two paths now mean different things by the same field.
If the bug is "the runner needs to read from the timeline that has the edits", ReuseContext likely has the same bug for users who edit a step on the fork-the-context-targets and then re-dispatch via shape B. Worth resolving ctx.timeline_id's parent_timeline_id (or whichever timeline owns the edit) and using that here too — or document why the two paths legitimately differ.
| .map_err(internal)?; | ||
| } | ||
| (ctx_id, Some(fork.id), a.at_step) | ||
| (ctx_id, Some(a.source_timeline_id.clone()), a.at_step) |
There was a problem hiding this comment.
Worth confirming the runner-write semantics. The Python SDK's DispatchPayload.replay_context_timeline_id docstring says runners pass this to ExplicitClient.attach_replay_context() so "live cache misses record into the fork". After this change, the runner is told the timeline is source_timeline_id — so wouldn't a live cache miss during replay record onto the source recording the user is trying to preserve, rather than the new fork?
The replay context itself was created at fork.id (the create_replay_context call ~5 lines up), so the cursor advances on the fork — but the timeline id handed to the runner is now the source. These two pointers used to agree.
If the runner only reads from replay_context_timeline_id (e.g. to reconstruct multi-turn history at at_step-1) and writes go through the replay context, this is fine — but please update python/rewind_agent/runner.py's docstring on the field to reflect the new contract. If writes also go to this timeline, this is a regression that corrupts the source on replay.
| .map_err(internal)?; | ||
| } | ||
| (ctx_id, Some(fork.id), a.at_step) | ||
| (ctx_id, Some(a.source_timeline_id.clone()), a.at_step) |
There was a problem hiding this comment.
No regression test for the bug being fixed. A short test would lock in the contract: seed a fork timeline with an edited step, POST /api/sessions/{sid}/replay-jobs (shape A) with source_timeline_id set to that fork, and capture the dispatched webhook body — assert replay_context_timeline_id == source_timeline_id. Without it, the next refactor in this area can silently re-introduce the bug, and the deploy checkbox is the only proof this works.
Address review comments on #167: - Response `fork_timeline_id` now correctly returns the new fork ID (for dashboard navigation), not the source timeline. - Webhook payload sends both `replay_context_timeline_id` (write target = the fork, for attach_replay_context) and `source_timeline_id` (read target = timeline with user edits, for fetching step content). - Python SDK adds `source_timeline_id` field and `read_timeline_id` property (falls back to replay_context_timeline_id for back-compat). - ray-agent reads from `payload.read_timeline_id` while writing via `replay_context_timeline_id`. - Regression test asserts the two IDs differ for shape-A dispatches. - Documents why ReuseContext legitimately has both IDs pointing to the same timeline (edits already live on the context's bound fork). Co-authored-by: Cursor <cursoragent@cursor.com>
risjai
left a comment
There was a problem hiding this comment.
All four review comments addressed cleanly in d2fe59b:
- ✅ Response
fork_timeline_idpreserved — split into separate locals (fork_timeline_idfor response,dispatch_timeline_idfor the read pointer in payload). The actual fork id fromengine.fork()is back in the dashboard response. - ✅
ReuseContextsemantics documented — both branches now produce a (write, read) pair. CreateAndDispatch returns(fork.id, source_timeline_id); ReuseContext returns(ctx.timeline_id, ctx.timeline_id)because edits live on the fork the context targets. The block comment at line ~158 makes this explicit. - ✅ Python SDK updated —
DispatchPayloadgainssource_timeline_idplus aread_timeline_idproperty with documented fallback toreplay_context_timeline_idfor pre-0.15.1 servers. Docstring rewritten to spell out the read/write split. - ✅ Regression tests added on both sides —
dispatch_payload_separates_source_and_fork_timeline_idsin Rust pins the contract (write != read for shape A, write == response fork id, read == source); two Python tests cover the new field + fallback path.
The write_timeline / read_timeline rename is a nice readability bonus over the old timeline_for_payload.
One note for follow-up (not blocking): the pre-0.15.1 fallback (source_timeline_id absent → read_timeline_id returns the fork) preserves the OLD buggy behavior for OLD servers. Worth a CHANGELOG line so users running an SDK 0.16.x against a server <0.15.1 know edits-on-fork won't be honored until they upgrade the server.
LGTM. (Self-approve blocked by GitHub.)
Address review comments on #167: - Response `fork_timeline_id` now correctly returns the new fork ID (for dashboard navigation), not the source timeline. - Webhook payload sends both `replay_context_timeline_id` (write target = the fork, for attach_replay_context) and `source_timeline_id` (read target = timeline with user edits, for fetching step content). - Python SDK adds `source_timeline_id` field and `read_timeline_id` property (falls back to replay_context_timeline_id for back-compat). - ray-agent reads from `payload.read_timeline_id` while writing via `replay_context_timeline_id`. - Regression test asserts the two IDs differ for shape-A dispatches. - Documents why ReuseContext legitimately has both IDs pointing to the same timeline (edits already live on the context's bound fork). Co-authored-by: Cursor <cursoragent@cursor.com>
d2fe59b to
707ca42
Compare
risjai
left a comment
There was a problem hiding this comment.
Confirmed in 707ca42 — fallback removed entirely. DispatchPayload now treats source_timeline_id, at_step, replay_context_timeline_id, and dispatch_token as required fields (from_json does body["…"] not body.get(…), and the dataclass annotations dropped Optional[…] = None). read_timeline_id property + the two back-compat tests are gone, and _canonical_body() makes the test surface a single source of truth.
A pre-0.15.1 server would now produce KeyError rather than silently doing the wrong thing — which is the right answer when versions are pinned together. LGTM.
Summary
replay_context_timeline_id. The handler then fetched the step from that fork, which inherited the original unedited content — ignoring the user's edit.source_timeline_id(the timeline with the edit) instead of the replay fork's ID.Test plan
cargo test -p rewind-webpasses (all tests green)Made with Cursor