fix(platform-wallet): spv error propagation #3810
📝 Walkthrough
This PR refactors SPV startup to separate client initialization from background loop spawning. It removes the [...]
Changes
SPV Startup and Run Loop Lifecycle
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ✅ 5 passed
f339f1e to 00bc228
00bc228 to db78106
✅ Review complete (commit db78106)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/rs-platform-wallet/src/spv/runtime.rs (1)
196-215: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Make the task slot reusable after a finished run loop and guard it atomically.
`existing.is_some()` treats a completed `JoinHandle` as “still running”, so after `run()` exits the next successful start can return through FFI without ever scheduling sync again. The unlocked check/store split also lets two callers race past the guard and spawn duplicate run loops on the same client.
Suggested fix
 pub fn spawn_run_loop(self: &Arc<Self>) {
-    {
-        let existing = self.task.lock().expect("spv task mutex poisoned");
-        if existing.is_some() {
-            tracing::warn!(
-                "spawn_in_background called while a task is already running; ignoring"
-            );
-            return;
-        }
-    }
-
     let this = Arc::clone(self);
+    let mut task = self.task.lock().expect("spv task mutex poisoned");
+    if matches!(task.as_ref(), Some(handle) if !handle.is_finished()) {
+        tracing::warn!(
+            "spawn_run_loop called while a task is already running; ignoring"
+        );
+        return;
+    }
+    task.take();
     let handle = tokio::spawn(async move {
         if let Err(e) = this.run().await {
             tracing::warn!("SpvRuntime background run loop exited with error: {}", e);
         }
     });
-    *self.task.lock().expect("spv task mutex poisoned") = Some(handle);
+    *task = Some(handle);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/rs-platform-wallet/src/spv/runtime.rs` around lines 196 - 215, spawn_run_loop currently treats any Some(JoinHandle) as "running" and performs the check/store outside a single mutex hold, allowing races and preventing reuse of finished handles; change it so you hold the task mutex across the check-and-set: lock self.task, if Some(h) && !h.is_finished() then warn+return, otherwise create the tokio::spawn handle while still holding the lock and store it into *self.task; additionally wrap the spawned future so when this.run().await finishes you re-lock self.task and set it to None (so finished JoinHandles are removable); reference symbols: spawn_run_loop, task, run, JoinHandle::is_finished.
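The check-and-set pattern from this comment can be tried in isolation. What follows is a minimal sketch, not the crate's code: `TaskSlot`, `spawn_if_idle`, and `is_running` are hypothetical names, and `std::thread` stands in for `tokio::spawn` so the snippet compiles without dependencies. `std::thread::JoinHandle::is_finished` plays the role that `JoinHandle::is_finished` plays on the tokio handle.

```rust
use std::sync::Mutex;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the runtime's `task` slot (not the real `SpvRuntime`).
pub struct TaskSlot {
    task: Mutex<Option<JoinHandle<()>>>,
}

impl TaskSlot {
    pub fn new() -> Self {
        Self { task: Mutex::new(None) }
    }

    /// True only while a stored worker has not finished yet.
    pub fn is_running(&self) -> bool {
        let task = self.task.lock().expect("task mutex poisoned");
        matches!(task.as_ref(), Some(handle) if !handle.is_finished())
    }

    /// Spawns `work` unless a previous worker is still running.
    /// Returns `true` if a new worker was started.
    pub fn spawn_if_idle<F>(&self, work: F) -> bool
    where
        F: FnOnce() + Send + 'static,
    {
        // A single lock acquisition covers both the check and the store,
        // so two callers cannot both pass the guard.
        let mut task = self.task.lock().expect("task mutex poisoned");
        if matches!(task.as_ref(), Some(handle) if !handle.is_finished()) {
            return false;
        }
        // Any handle still in the slot belongs to a finished worker;
        // overwriting it drops (detaches) that stale handle.
        *task = Some(thread::spawn(work));
        true
    }
}
```

A second `spawn_if_idle` call is refused while the first worker is alive and accepted once it has finished, which is the reuse behaviour the comment asks for.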
📒 Files selected for processing (4)
- packages/rs-platform-wallet-ffi/src/spv.rs
- packages/rs-platform-wallet/src/error.rs
- packages/rs-platform-wallet/src/manager/accessors.rs
- packages/rs-platform-wallet/src/spv/runtime.rs
💤 Files with no reviewable changes (1)
- packages/rs-platform-wallet/src/error.rs
thepastaclaw left a comment
Code Review
The PR correctly propagates SPV start errors to the FFI caller, but it breaks the SPV integration test, which still uses the now-private `run(config)` signature, so `cargo test -p platform-wallet` will fail to compile. The new split lifecycle also opens a half-success window where `spawn_run_loop` silently no-ops on a stale `JoinHandle`, and the synchronous `start()` is now polled on the (small-stack) FFI caller thread instead of via `block_on_worker`. Several stale references to the old `spawn_in_background` name remain.
🔴 1 blocking | 🟡 2 suggestion(s) | 💬 3 nitpick(s)
4 additional finding(s) omitted (not in diff).
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `packages/rs-platform-wallet/tests/spv_sync.rs`:
- [BLOCKING] packages/rs-platform-wallet/tests/spv_sync.rs:223-227: Integration test breaks compilation: `run(config)` no longer exists
`SpvRuntime::run` is now a private, zero-argument helper, but this `#[ignore]`d test still calls `manager_for_spv.spv().run(config).await`. `#[ignore]` only skips execution — cargo still typechecks the file, so `cargo test -p platform-wallet --test spv_sync` will fail to compile after this PR. Update the test to drive the new lifecycle: `start(config).await` followed by `spv_arc().spawn_run_loop()` (and await `stop()` plus the join handle on teardown).
In `packages/rs-platform-wallet/src/spv/runtime.rs`:
- [SUGGESTION] packages/rs-platform-wallet/src/spv/runtime.rs:196-216: `spv_start` can return Ok with the client started but no run loop driving it
The new split lifecycle reintroduces the silent-success failure mode the PR is trying to eliminate. `run()` clears `self.client` when it exits naturally, but never clears `self.task`, so the `JoinHandle` of a previously error-exited run loop stays parked in `self.task`. On the next FFI call:
1. `start()` succeeds and stores a fresh client.
2. `spawn_run_loop()` observes `self.task.is_some()`, emits a `warn!`, and returns.
3. The FFI returns `PlatformWalletFFIResult::ok()` — but no future is driving `client.run()`.
Additionally, the check-and-set on `self.task` releases the mutex between the `is_some()` check and the `*self.task = Some(handle)` write, so two concurrent spawners can both observe `None` and both spawn run loops on the same client (one handle is then dropped and orphaned).
Fix both at once by: (a) clearing `self.task` (or replacing with `JoinHandle::is_finished()` + take) at the end of `run()`, and (b) holding the `self.task` mutex across the `tokio::spawn` and store so the check-and-set is atomic. Returning a `Result` from `spawn_run_loop` and surfacing it through the FFI would also let the caller observe the failure instead of relying on logs.
In `packages/rs-platform-wallet-ffi/src/spv.rs`:
- [SUGGESTION] packages/rs-platform-wallet-ffi/src/spv.rs:361-366: Poll `spv().start(config)` on the worker runtime, not the FFI caller thread
`runtime().block_on(manager.spv().start(config))` polls the new (substantial) startup future — `PeerNetworkManager::new`, `DiskStorageManager::new`, `DashSpvClient::new` — directly on the foreign caller's thread. `packages/rs-platform-wallet-ffi/src/runtime.rs` explicitly documents that iOS dispatch/concurrency threads have ~512 KB stacks and that async FFI work should use `block_on_worker`, which parks the caller on a oneshot and runs the future on the 8 MB-stack worker. The rest of this crate consistently follows that pattern. Previously this work happened inside `tokio::spawn` on the runtime; the PR's new ordering reintroduces the exact stack-sensitive pattern the runtime module was built to avoid. Switch to `block_on_worker` using the `Arc<SpvRuntime>` so the future is `Send + 'static`.
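For the `runtime.rs` suggestion in the prompt above, the alternative of clearing the slot when the run loop exits and reporting a refused spawn to the caller might look roughly like this. It is a sketch under assumptions: `LoopRuntime` and `SpawnError` are hypothetical names, `std::thread` replaces `tokio::spawn` to keep the snippet dependency-free, and the real `spawn_run_loop` takes no closure argument.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical error surfaced to the caller instead of a log-only warning.
#[derive(Debug, PartialEq, Eq)]
pub enum SpawnError {
    AlreadyRunning,
}

/// Hypothetical stand-in for the runtime that owns the run-loop slot.
pub struct LoopRuntime {
    task: Mutex<Option<JoinHandle<()>>>,
}

impl LoopRuntime {
    pub fn new() -> Arc<Self> {
        Arc::new(Self { task: Mutex::new(None) })
    }

    /// True while a run loop occupies the slot.
    pub fn is_running(&self) -> bool {
        self.task.lock().expect("task mutex poisoned").is_some()
    }

    /// Spawns `run` and reports a refusal as an error the caller can see.
    pub fn spawn_run_loop<F>(self: &Arc<Self>, run: F) -> Result<(), SpawnError>
    where
        F: FnOnce() + Send + 'static,
    {
        // The lock is held across the check, the spawn, and the store.
        let mut task = self.task.lock().expect("task mutex poisoned");
        if task.is_some() {
            return Err(SpawnError::AlreadyRunning);
        }
        let this = Arc::clone(self);
        *task = Some(thread::spawn(move || {
            run();
            // Clear the slot on exit. This waits until the spawner has
            // stored the handle and released the lock. If `run` panics,
            // the slot is not cleared in this simplified version.
            let _finished = this.task.lock().expect("task mutex poisoned").take();
        }));
        Ok(())
    }
}
```

An FFI wrapper could then map `Err(SpawnError::AlreadyRunning)` to a non-OK result code rather than returning success with no loop driving the client.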
let start_result = runtime().block_on(manager.spv().start(config));

if start_result.is_ok() {
    let _guard = runtime().enter();
    manager.spv_arc().spawn_run_loop();
}
🟡 Suggestion: Poll `spv().start(config)` on the worker runtime, not the FFI caller thread
`runtime().block_on(manager.spv().start(config))` polls the new (substantial) startup future (`PeerNetworkManager::new`, `DiskStorageManager::new`, `DashSpvClient::new`) directly on the foreign caller's thread. `packages/rs-platform-wallet-ffi/src/runtime.rs` explicitly documents that iOS dispatch/concurrency threads have ~512 KB stacks and that async FFI work should use `block_on_worker`, which parks the caller on a oneshot and runs the future on the 8 MB-stack worker. The rest of this crate consistently follows that pattern. Previously this work happened inside `tokio::spawn` on the runtime; the PR's new ordering reintroduces the exact stack-sensitive pattern the runtime module was built to avoid. Switch to `block_on_worker` using the `Arc<SpvRuntime>` so the future is `Send + 'static`.
- let start_result = runtime().block_on(manager.spv().start(config));
- if start_result.is_ok() {
-     let _guard = runtime().enter();
-     manager.spv_arc().spawn_run_loop();
- }
+ let spv = manager.spv_arc();
+ let start_result = crate::runtime::block_on_worker(async move { spv.start(config).await });
+ if start_result.is_ok() {
+     let _guard = runtime().enter();
+     manager.spv_arc().spawn_run_loop();
+ }
+ start_result
source: ['codex']
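To illustrate the idea behind this suggestion (do the heavy work on a thread with a large stack while the small-stack caller only waits), here is a simplified, synchronous sketch. It is not the crate's `block_on_worker`, whose signature is not shown in this thread: `run_on_big_stack` is a hypothetical helper, it spawns one thread per call instead of using a long-lived worker, and an `mpsc` channel stands in for the oneshot.

```rust
use std::sync::mpsc;
use std::thread;

/// 8 MB, mirroring the worker stack size quoted in the review comment.
const WORKER_STACK_BYTES: usize = 8 * 1024 * 1024;

/// Hypothetical helper: runs `work` on a dedicated big-stack thread and
/// parks the calling thread until the result arrives.
pub fn run_on_big_stack<T, F>(work: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let _worker = thread::Builder::new()
        .name("big-stack-worker".to_string())
        .stack_size(WORKER_STACK_BYTES)
        .spawn(move || {
            // Deep call chains inside `work` consume the worker's stack,
            // not the caller's.
            let _ = tx.send(work());
        })
        .map_err(|e| format!("failed to spawn worker: {e}"))?;
    // The caller only blocks on the channel. If the worker panics, the
    // sender is dropped and `recv` returns an error.
    rx.recv()
        .map_err(|_| "worker exited without a result".to_string())
}
```

Because the closure must be `Send + 'static`, the caller has to move owned data into it, which is the same reason the suggestion clones the `Arc<SpvRuntime>` before building the future.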
/// Drive the sync loop of an already-[`start`]ed client until [`stop`]
/// is called.
/// [`spawn_run_loop`](Self::spawn_run_loop).
async fn run(&self) -> Result<(), PlatformWalletError> {
💬 Nitpick: Malformed rustdoc on `run()` leaves an orphan link line
The doc comment ends with a standalone ``/// [`spawn_run_loop`](Self::spawn_run_loop).`` line that has no leading sentence, an editing residue from the rename. Rustdoc will render this as an orphan paragraph.
- /// Drive the sync loop of an already-[`start`]ed client until [`stop`]
- /// is called.
- /// [`spawn_run_loop`](Self::spawn_run_loop).
- async fn run(&self) -> Result<(), PlatformWalletError> {
+ /// Drive the sync loop of an already-[`start`]ed client until [`stop`]
+ /// is called. Typically invoked via
+ /// [`spawn_run_loop`](Self::spawn_run_loop).
+ async fn run(&self) -> Result<(), PlatformWalletError> {
source: ['claude']
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## v3.1-dev #3810 +/- ##
============================================
+ Coverage 70.73% 71.20% +0.47%
============================================
Files 20 20
Lines 2788 2837 +49
============================================
+ Hits 1972 2020 +48
- Misses 816 817 +1
Because `start` was called inside the `run` method, and `spawn_in_background` called `run` without returning any start error, the SPV client could fail to start and nobody received the error. With this PR I force `start` to be called independently, so start errors are no longer discarded.
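The difference between the two shapes can be shown with a small self-contained sketch. Everything in it is illustrative rather than taken from the crate: `Client`, `StartError`, `start_then_spawn`, and the "no peers" failure condition are hypothetical, and `std::thread` replaces the tokio task.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical error type standing in for the wallet's start error.
#[derive(Debug, PartialEq, Eq)]
pub struct StartError(pub String);

/// Hypothetical client handle produced by a successful start.
pub struct Client {
    peers: usize,
}

/// Stand-in for client initialization: fails when no peers are configured.
pub fn start(peers: usize) -> Result<Client, StartError> {
    if peers == 0 {
        return Err(StartError("no peers configured".to_string()));
    }
    Ok(Client { peers })
}

/// Stand-in for the sync loop; a real loop would run until stopped.
fn run(client: Client) {
    let _ = client.peers;
}

/// Old shape: `start` runs inside the background task, so a start failure
/// is only logged and the caller still gets a handle back.
pub fn spawn_in_background(peers: usize) -> JoinHandle<()> {
    thread::spawn(move || match start(peers) {
        Ok(client) => run(client),
        Err(e) => eprintln!("background run loop exited with error: {:?}", e),
    })
}

/// New shape: `start` runs first and its error is returned; the loop is
/// spawned only after a successful start.
pub fn start_then_spawn(peers: usize) -> Result<JoinHandle<()>, StartError> {
    let client = start(peers)?;
    Ok(thread::spawn(move || run(client)))
}
```

With zero peers, `spawn_in_background` still hands back a handle that joins cleanly, while `start_then_spawn` returns the `StartError` directly to the caller.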
Regression tests in PR #3712