fix(platform-wallet): spv error propagation #3810

Open
ZocoLini wants to merge 1 commit into v3.1-dev from fix/spv-error-propagation

Conversation

ZocoLini (Collaborator) commented Jun 8, 2026

Because start was called inside the run method, and spawn_in_background called run without returning any start error, the SPV client could fail to start without anyone receiving the error. With this PR I force the start method to be called independently, so start errors are no longer discarded.
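The difference between the two shapes can be sketched with plain threads. This is a minimal illustration, not the crate's code: `Runtime`, `fail_on_start`, `spawn_in_background_old`, and `start_then_spawn` are hypothetical names, and `std::thread` stands in for the async runtime the real code uses.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the SPV runtime; not the real `SpvRuntime`.
struct Runtime {
    fail_on_start: bool,
}

impl Runtime {
    /// Fallible initialization.
    fn start(&self) -> Result<(), String> {
        if self.fail_on_start {
            Err("peer network unavailable".to_string())
        } else {
            Ok(())
        }
    }

    /// Sync loop body; assumes `start` already succeeded.
    fn run(&self) -> Result<(), String> {
        Ok(())
    }
}

/// Old shape: `start` runs inside the background task, so its error is only
/// logged there and the caller always gets a handle back.
fn spawn_in_background_old(rt: Runtime) -> JoinHandle<()> {
    thread::spawn(move || {
        if let Err(e) = rt.start().and_then(|_| rt.run()) {
            eprintln!("background task failed: {e}");
        }
    })
}

/// New shape: `start` is called first and its error is returned to the
/// caller; the loop is spawned only after a successful start.
fn start_then_spawn(rt: Runtime) -> Result<JoinHandle<()>, String> {
    rt.start()?;
    Ok(thread::spawn(move || {
        if let Err(e) = rt.run() {
            eprintln!("run loop exited with error: {e}");
        }
    }))
}
```

With a failing start, the old shape still hands back a handle and the error only reaches the log, while the new shape returns `Err` before anything is spawned.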

Regression tests in PR #3712

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

  • Bug Fixes

    • Improved SPV client startup validation to ensure proper initialization before entering the run loop.
    • Consolidated error handling for SPV client state, providing clearer error messages when the client is unavailable.
  • Refactor

    • Restructured SPV background initialization logic for better separation of concerns.

coderabbitai Bot (Contributor) commented Jun 8, 2026

📝 Walkthrough

This PR refactors SPV startup to separate client initialization from background loop spawning. It removes the SpvNotRunning error variant, replacing all occurrences with a consolidated SpvError, renames the background spawn API to expect pre-initialized clients, and updates the FFI layer to explicitly propagate startup results.

Changes

SPV Startup and Run Loop Lifecycle

| Layer / File(s) | Summary |
| --- | --- |
| Error variant consolidation: packages/rs-platform-wallet/src/error.rs | SpvNotRunning variant is removed; all SPV client state failures now use SpvError(String) with message "SPV Client not started". |
| SpvRuntime startup/loop separation: packages/rs-platform-wallet/src/spv/runtime.rs | run() becomes private and assumes the client is already initialized; spawn_in_background(config) is replaced with spawn_run_loop(), which spawns the loop without accepting configuration; background task implementation updated to call the new run() signature. |
| Error mapping throughout SpvRuntime: packages/rs-platform-wallet/src/spv/runtime.rs | broadcast_transaction, get_quorum_public_key, clear_storage, and update_config are all updated to use the new consolidated error message when the SPV client is absent. |
| FFI layer startup propagation: packages/rs-platform-wallet-ffi/src/spv.rs | platform_wallet_manager_spv_start now calls start(config), checks the result, conditionally spawns the run loop, and propagates the startup result to the caller instead of masking errors. |
| Documentation reference update: packages/rs-platform-wallet/src/manager/accessors.rs | PlatformWalletManager::spv documentation updated to reference the new spawn_run_loop method. |
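The error consolidation described in the walkthrough might look roughly like the sketch below. Only the `SpvError(String)` variant and the "SPV Client not started" message come from the walkthrough; `WalletError`, `Client`, `require_client`, and the `broadcast_transaction` signature here are simplified assumptions, not the crate's actual definitions.

```rust
use std::fmt;

/// Simplified stand-in for the wallet error type; only the variant named in
/// the walkthrough is modelled.
#[derive(Debug, PartialEq)]
enum WalletError {
    SpvError(String),
}

impl fmt::Display for WalletError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WalletError::SpvError(msg) => write!(f, "SPV error: {msg}"),
        }
    }
}

/// Hypothetical client handle.
struct Client;

impl Client {
    /// Pretends to broadcast and returns the payload length.
    fn broadcast(&self, tx: &str) -> usize {
        tx.len()
    }
}

/// Hypothetical helper: every accessor maps a missing client to the same
/// consolidated error instead of a dedicated "not running" variant.
fn require_client(client: Option<&Client>) -> Result<&Client, WalletError> {
    client.ok_or_else(|| WalletError::SpvError("SPV Client not started".to_string()))
}

/// Example accessor built on the helper.
fn broadcast_transaction(client: Option<&Client>, tx: &str) -> Result<usize, WalletError> {
    Ok(require_client(client)?.broadcast(tx))
}
```

One trade-off of a string-carrying variant is that callers can no longer match on a distinct "not running" case and have to inspect the message instead.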

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • dashpay/platform#3729: Modifies SPV runtime lifecycle and run(config) method as part of cancellation token refactoring.
  • dashpay/platform#3730: Changes SpvRuntime lifecycle and client state handling in the same runtime file.
  • dashpay/platform#3763: Modifies the same FFI entrypoint platform_wallet_manager_spv_start with additional devnet_name validation.

Suggested labels

ready for final review

Suggested reviewers

  • QuantumExplorer
  • shumkov
  • thepastaclaw

🐰 A hop, a skip, in startup's dance,
The client starts now, given its chance,
Then spawn the loop, no config in hand,
Error messages clear, just as we planned,
SPV runs smooth 'cross the digital land!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(platform-wallet): spv error propagation' directly and specifically summarizes the main change: improving error propagation for SPV client startup failures. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


ZocoLini changed the title from "Fix/spv error propagation" to "fix(swift-sdk): spv error propagation" (Jun 8, 2026)
ZocoLini changed the title from "fix(swift-sdk): spv error propagation" to "fix(platform-wallet): spv error propagation" (Jun 8, 2026)
ZocoLini force-pushed the fix/spv-error-propagation branch 2 times, most recently from f339f1e to 00bc228 (June 12, 2026 13:29)
ZocoLini force-pushed the fix/spv-error-propagation branch from 00bc228 to db78106 (June 12, 2026 13:36)
ZocoLini marked this pull request as ready for review (June 12, 2026 13:39)
thepastaclaw (Collaborator) commented Jun 12, 2026

✅ Review complete (commit db78106)

coderabbitai Bot (Contributor) left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/rs-platform-wallet/src/spv/runtime.rs (1)

196-215: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make the task slot reusable after a finished run loop and guard it atomically.

existing.is_some() treats a completed JoinHandle as “still running”, so after run() exits the next successful start can return through FFI without ever scheduling sync again. The unlocked check/store split also lets two callers race past the guard and spawn duplicate run loops on the same client.

Suggested fix
     pub fn spawn_run_loop(self: &Arc<Self>) {
-        {
-            let existing = self.task.lock().expect("spv task mutex poisoned");
-            if existing.is_some() {
-                tracing::warn!(
-                    "spawn_in_background called while a task is already running; ignoring"
-                );
-                return;
-            }
-        }
-
         let this = Arc::clone(self);
+        let mut task = self.task.lock().expect("spv task mutex poisoned");
+        if matches!(task.as_ref(), Some(handle) if !handle.is_finished()) {
+            tracing::warn!(
+                "spawn_run_loop called while a task is already running; ignoring"
+            );
+            return;
+        }
+        task.take();
 
         let handle = tokio::spawn(async move {
             if let Err(e) = this.run().await {
                 tracing::warn!("SpvRuntime background run loop exited with error: {}", e);
             }
         });
 
-        *self.task.lock().expect("spv task mutex poisoned") = Some(handle);
+        *task = Some(handle);
     }
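The guard in the suggested fix can be sketched in isolation. This is a std-only illustration under stated assumptions: `std::thread::JoinHandle::is_finished` stands in for the Tokio handle's method of the same name, `Runtime` and `wait` are hypothetical, and returning a `Result` instead of only logging is a choice made for this sketch.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical runtime with a single reusable task slot.
struct Runtime {
    task: Mutex<Option<JoinHandle<()>>>,
}

impl Runtime {
    fn new() -> Arc<Self> {
        Arc::new(Runtime { task: Mutex::new(None) })
    }

    /// Spawns `work` unless a previous task is still running.
    /// The mutex is held across the check and the store, so two callers
    /// cannot both pass the guard.
    fn spawn_run_loop<F>(self: &Arc<Self>, work: F) -> Result<(), &'static str>
    where
        F: FnOnce() + Send + 'static,
    {
        let mut task = self.task.lock().expect("task mutex poisoned");
        let still_running = matches!(&*task, Some(handle) if !handle.is_finished());
        if still_running {
            return Err("run loop already running");
        }
        // The slot is empty or holds a finished handle: overwrite it.
        *task = Some(thread::spawn(work));
        Ok(())
    }

    /// Takes the current handle, if any, and waits for it to finish.
    fn wait(&self) {
        let handle = self.task.lock().expect("task mutex poisoned").take();
        if let Some(handle) = handle {
            let _ = handle.join();
        }
    }
}
```

Because a finished handle no longer counts as "running", a second start after the loop has exited gets a fresh task instead of a silent no-op.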
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/rs-platform-wallet/src/spv/runtime.rs` around lines 196 - 215,
spawn_run_loop currently treats any Some(JoinHandle) as "running" and performs
the check/store outside a single mutex hold, allowing races and preventing reuse
of finished handles; change it so you hold the task mutex across the
check-and-set: lock self.task, if Some(h) && !h.is_finished() then warn+return,
otherwise create the tokio::spawn handle while still holding the lock and store
it into *self.task; additionally wrap the spawned future so when
this.run().await finishes you re-lock self.task and set it to None (so finished
JoinHandles are removable); reference symbols: spawn_run_loop, task, run,
JoinHandle::is_finished.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2407b569-6c40-458a-a34d-74e5f4f4aac6

📥 Commits

Reviewing files that changed from the base of the PR and between 678d1f4 and db78106.

📒 Files selected for processing (4)
  • packages/rs-platform-wallet-ffi/src/spv.rs
  • packages/rs-platform-wallet/src/error.rs
  • packages/rs-platform-wallet/src/manager/accessors.rs
  • packages/rs-platform-wallet/src/spv/runtime.rs
💤 Files with no reviewable changes (1)
  • packages/rs-platform-wallet/src/error.rs

thepastaclaw (Collaborator) left a comment

Code Review

PR correctly propagates SPV start errors to the FFI caller, but breaks the SPV integration test which still uses the now-private run(config) signature — cargo test -p platform-wallet will fail to compile. The new split-lifecycle also opens a half-success window where spawn_run_loop silently no-ops on a stale JoinHandle, and the synchronous start() is now polled on the (small-stack) FFI caller thread instead of via block_on_worker. Several stale references to the old spawn_in_background name remain.

🔴 1 blocking | 🟡 2 suggestion(s) | 💬 3 nitpick(s)

4 additional finding(s) omitted (not in diff).

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-platform-wallet/tests/spv_sync.rs`:
- [BLOCKING] packages/rs-platform-wallet/tests/spv_sync.rs:223-227: Integration test breaks compilation: `run(config)` no longer exists
  `SpvRuntime::run` is now a private, zero-argument helper, but this `#[ignore]`d test still calls `manager_for_spv.spv().run(config).await`. `#[ignore]` only skips execution — cargo still typechecks the file, so `cargo test -p platform-wallet --test spv_sync` will fail to compile after this PR. Update the test to drive the new lifecycle: `start(config).await` followed by `spv_arc().spawn_run_loop()` (and await `stop()` plus the join handle on teardown).

In `packages/rs-platform-wallet/src/spv/runtime.rs`:
- [SUGGESTION] packages/rs-platform-wallet/src/spv/runtime.rs:196-216: `spv_start` can return Ok with the client started but no run loop driving it
  The new split lifecycle reintroduces the silent-success failure mode the PR is trying to eliminate. `run()` clears `self.client` when it exits naturally, but never clears `self.task`, so the `JoinHandle` of a previously error-exited run loop stays parked in `self.task`. On the next FFI call:

  1. `start()` succeeds and stores a fresh client.
  2. `spawn_run_loop()` observes `self.task.is_some()`, emits a `warn!`, and returns.
  3. The FFI returns `PlatformWalletFFIResult::ok()` — but no future is driving `client.run()`.

  Additionally, the check-and-set on `self.task` releases the mutex between the `is_some()` check and the `*self.task = Some(handle)` write, so two concurrent spawners can both observe `None` and both spawn run loops on the same client (one handle is then dropped and orphaned).

  Fix both at once by: (a) clearing `self.task` (or replacing with `JoinHandle::is_finished()` + take) at the end of `run()`, and (b) holding the `self.task` mutex across the `tokio::spawn` and store so the check-and-set is atomic. Returning a `Result` from `spawn_run_loop` and surfacing it through the FFI would also let the caller observe the failure instead of relying on logs.

In `packages/rs-platform-wallet-ffi/src/spv.rs`:
- [SUGGESTION] packages/rs-platform-wallet-ffi/src/spv.rs:361-366: Poll `spv().start(config)` on the worker runtime, not the FFI caller thread
  `runtime().block_on(manager.spv().start(config))` polls the new (substantial) startup future — `PeerNetworkManager::new`, `DiskStorageManager::new`, `DashSpvClient::new` — directly on the foreign caller's thread. `packages/rs-platform-wallet-ffi/src/runtime.rs` explicitly documents that iOS dispatch/concurrency threads have ~512 KB stacks and that async FFI work should use `block_on_worker`, which parks the caller on a oneshot and runs the future on the 8 MB-stack worker. The rest of this crate consistently follows that pattern. Previously this work happened inside `tokio::spawn` on the runtime; the PR's new ordering reintroduces the exact stack-sensitive pattern the runtime module was built to avoid. Switch to `block_on_worker` using the `Arc<SpvRuntime>` so the future is `Send + 'static`.

Comment on lines +361 to +366
let start_result = runtime().block_on(manager.spv().start(config));

if start_result.is_ok() {
    let _guard = runtime().enter();
    manager.spv_arc().spawn_run_loop();
}


🟡 Suggestion: Poll spv().start(config) on the worker runtime, not the FFI caller thread

runtime().block_on(manager.spv().start(config)) polls the new (substantial) startup future — PeerNetworkManager::new, DiskStorageManager::new, DashSpvClient::new — directly on the foreign caller's thread. packages/rs-platform-wallet-ffi/src/runtime.rs explicitly documents that iOS dispatch/concurrency threads have ~512 KB stacks and that async FFI work should use block_on_worker, which parks the caller on a oneshot and runs the future on the 8 MB-stack worker. The rest of this crate consistently follows that pattern. Previously this work happened inside tokio::spawn on the runtime; the PR's new ordering reintroduces the exact stack-sensitive pattern the runtime module was built to avoid. Switch to block_on_worker using the Arc<SpvRuntime> so the future is Send + 'static.

Suggested change
- let start_result = runtime().block_on(manager.spv().start(config));
- if start_result.is_ok() {
-     let _guard = runtime().enter();
-     manager.spv_arc().spawn_run_loop();
- }
+ let spv = manager.spv_arc();
+ let start_result = crate::runtime::block_on_worker(async move { spv.start(config).await });
+ if start_result.is_ok() {
+     let _guard = runtime().enter();
+     manager.spv_arc().spawn_run_loop();
+ }
+ start_result

source: ['codex']
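The idea behind running startup on a large-stack worker while the caller waits can be sketched with the standard library alone. This is not the crate's `block_on_worker`, whose real signature is not shown in this thread: `run_on_big_stack` and the thread name are hypothetical, a blocking closure stands in for the async future, and the 8 MB figure is taken from the review comment.

```rust
use std::io;
use std::sync::mpsc;
use std::thread;

/// Assumed worker stack size, taken from the 8 MB figure in the review.
const WORKER_STACK_BYTES: usize = 8 * 1024 * 1024;

/// Runs `job` on a dedicated large-stack thread and blocks the caller until
/// the result arrives, so the deep work never runs on the caller's stack.
fn run_on_big_stack<T, F>(job: F) -> io::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let worker = thread::Builder::new()
        .name("spv-start-worker".to_string())
        .stack_size(WORKER_STACK_BYTES)
        .spawn(move || {
            // The receiver may be gone if the caller gave up; ignore that.
            let _ = tx.send(job());
        })?;
    // The caller only parks here; a panic in `job` surfaces as a recv error.
    let result = rx
        .recv()
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e));
    let _ = worker.join();
    result
}
```

The caller thread only blocks on the channel, so its own stack size stops mattering for the startup work.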

Comment on lines +135 to +138
/// Drive the sync loop of an already-[`start`]ed client until [`stop`]
/// is called.
/// [`spawn_run_loop`](Self::spawn_run_loop).
async fn run(&self) -> Result<(), PlatformWalletError> {


💬 Nitpick: Malformed rustdoc on run() leaves an orphan link line

The doc comment ends with a standalone /// [spawn_run_loop](Self::spawn_run_loop). line that has no leading sentence — an editing residue from the rename. Rustdoc will render this as an orphan paragraph.

Suggested change
- /// Drive the sync loop of an already-[`start`]ed client until [`stop`]
- /// is called.
- /// [`spawn_run_loop`](Self::spawn_run_loop).
- async fn run(&self) -> Result<(), PlatformWalletError> {
+ /// Drive the sync loop of an already-[`start`]ed client until [`stop`]
+ /// is called. Typically invoked via
+ /// [`spawn_run_loop`](Self::spawn_run_loop).
+ async fn run(&self) -> Result<(), PlatformWalletError> {

source: ['claude']

codecov Bot commented Jun 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.20%. Comparing base (f4ed60f) to head (db78106).
⚠️ Report is 15 commits behind head on v3.1-dev.

Additional details and impacted files
@@             Coverage Diff              @@
##           v3.1-dev    #3810      +/-   ##
============================================
+ Coverage     70.73%   71.20%   +0.47%     
============================================
  Files            20       20              
  Lines          2788     2837      +49     
============================================
+ Hits           1972     2020      +48     
- Misses          816      817       +1     
| Components | Coverage Δ |
| --- | --- |
| dpp | ∅ <ø> (∅) |
| drive | ∅ <ø> (∅) |
| drive-abci | ∅ <ø> (∅) |
| sdk | ∅ <ø> (∅) |
| dapi-client | ∅ <ø> (∅) |
| platform-version | ∅ <ø> (∅) |
| platform-value | ∅ <ø> (∅) |
| platform-wallet | ∅ <ø> (∅) |
| drive-proof-verifier | ∅ <ø> (∅) |

Labels

None yet

Projects

None yet


2 participants