Fix CLIENT-mode runlevel/sync and rec ring buffer wrap #183

Merged: cboulay merged 2 commits into master from cboulay/fix_client_runlevel on May 8, 2026

@cboulay commented May 7, 2026


Summary

This PR fixes three independent bugs that combined to make CLIENT-mode sessions unreliable on long-running streams. Each is fixable in isolation; together, the fixes make pycbsdk and simple_device report the correct runlevel, run a working sync(), and deliver clean packet streams across rec ring buffer wraps.

Bugs and fixes

1. getRunLevel() returned 0 in CLIENT mode

Symptom: pycbsdk/examples/session_info.py NPLAY (and client_session.runlevel in user code) reported 0 even though the device was running. simple_device NPLAY attached as CLIENT showed Runlevel: 0 (UNKNOWN).

Cause: SdkSession::getRunLevel() only read the per-session device_runlevel atomic, which is updated by the receive thread when a SYSREP packet flows through that session's receive ring. Devices only emit SYSREP in response to runlevel-set commands, so once the STANDALONE owner finished its handshake, no further SYSREPs flowed and CLIENT-side device_runlevel stayed at 0 indefinitely.

Fix: getRunLevel() falls back to getSysInfo()->runlevel (the SYSINFO mirror in shmem that the STANDALONE owner writes via setSysInfo() on every received SYSREP).
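
The fallback can be sketched as follows. This is a minimal illustration, not the actual SdkSession code: `SessionSketch`, `SysInfoMirror`, and `shmem_sysinfo` are hypothetical names, and treating a local value of 0 as "no SYSREP seen yet" is an assumption about when the fallback triggers.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the SYSINFO mirror kept in shared memory.
struct SysInfoMirror {
    uint32_t runlevel = 0;
};

// Hypothetical stand-in for a session; not the real SdkSession layout.
struct SessionSketch {
    // Updated by this session's receive thread when it sees a SYSREP.
    std::atomic<uint32_t> device_runlevel{0};
    // Written by the STANDALONE owner; may be absent in this sketch.
    const SysInfoMirror* shmem_sysinfo = nullptr;

    uint32_t getRunLevel() const {
        const uint32_t local = device_runlevel.load(std::memory_order_acquire);
        if (local != 0) {
            return local;                      // a SYSREP reached this session
        }
        if (shmem_sysinfo != nullptr) {
            return shmem_sysinfo->runlevel;    // fall back to the shmem mirror
        }
        return 0;                              // still unknown
    }
};
```

With this shape, a CLIENT session that never receives a SYSREP of its own still reports whatever the owner last mirrored into shared memory.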

2. CLIENT-mode sync() raced with SYSREP heartbeats

Symptom: sync() could return before the device had actually processed prior config packets, so subsequent read-back saw stale state. Worse: it sent SYSSETRUNLEV with runlevel=0 (stale atomic), which can perturb device state.

Cause:

  • sync() read device_runlevel.load() directly for the "no-op" runlevel set. In CLIENT mode that's still 0.
  • sync() waited on any 0x10..0x1F SYSREP via the boolean received_sysrep flag. Periodic SYSREP (0x10) heartbeats from nPlayServer satisfied the wait before the actual SYSREPRUNLEV (0x12) reply arrived.

Fix:

  • sync() uses getRunLevel() (with the new shmem fallback) for current.
  • New sticky received_sysrepRunlev flag in Impl, set only when the receive thread sees a 0x12 packet. setSystemRunLevel resets it before send. waitForSysrep accepts an optional expected_type argument; CLIENT-path sync() passes cbPKTTYPE_SYSREPRUNLEV. Sticky semantics are race-free against later 0x10 heartbeats arriving before the waiter wakes.
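
A rough sketch of the sticky-flag wait, assuming a mutex and condition variable guard the flags (the PR does not say how Impl synchronizes them). `SysrepWaiterSketch`, `reset`, and `onSysrep` are hypothetical names; only the 0x10 and 0x12 type values come from the description above.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

constexpr uint8_t kSysrepHeartbeat = 0x10;  // periodic SYSREP
constexpr uint8_t kSysrepRunlev = 0x12;     // SYSREPRUNLEV reply

class SysrepWaiterSketch {
public:
    // Called before the runlevel-set request is sent.
    void reset() {
        std::lock_guard<std::mutex> lock(m_);
        received_sysrep_ = false;
        received_sysrep_runlev_ = false;
    }

    // Called by the receive thread for every SYSREP-family packet.
    void onSysrep(uint8_t type) {
        {
            std::lock_guard<std::mutex> lock(m_);
            received_sysrep_ = true;
            if (type == kSysrepRunlev) {
                received_sysrep_runlev_ = true;  // sticky: later heartbeats never clear it
            }
        }
        cv_.notify_all();
    }

    // expected_type == 0 means "any SYSREP" in this sketch.
    bool waitForSysrep(std::chrono::milliseconds timeout, uint8_t expected_type = 0) {
        std::unique_lock<std::mutex> lock(m_);
        return cv_.wait_for(lock, timeout, [&] {
            return expected_type == kSysrepRunlev ? received_sysrep_runlev_
                                                  : received_sysrep_;
        });
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool received_sysrep_ = false;
    bool received_sysrep_runlev_ = false;
};
```

Because the 0x12 flag is only ever set by a 0x12 packet and only cleared by `reset()`, a heartbeat that arrives between the reply and the waiter waking up cannot change the outcome of the wait.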

3. Rec ring buffer wrap left an unmarked gap

Symptom: ~6–17 s into any CLIENT session, the receive thread started decoding garbage as fake SYSREP-family packets, firing a flood of [runlevel change] events with nonsense runlevels and inflating delivered-packet counts. Once misaligned, the reader stayed misaligned for the rest of the session.

Cause: When the next packet would not fit at head, the writer wrapped to offset 0 and incremented head_wrap, but left a gap of one or more dwords between the end of the previous packet and buflen with no marker. The reader had no way to know about this gap, so it tried to read a "packet" from inside it, got a garbage dlen, advanced tail by an incorrect amount, and stayed misaligned thereafter.

Fix:

  • writeToReceiveBuffer pads the gap with a synthetic wrap-marker packet (chid=0, type=0, dlen != 0 — a combination no real packet uses). The marker fills the gap exactly so the reader can step over it cleanly. Wrap policy also avoids leaving 1..3-dword gaps that can't fit a marker header.
  • readReceiveBuffer recognizes the marker and silently advances tailindex past it without delivering the marker to user callbacks.
  • Adds __atomic_store_n(..., __ATOMIC_RELEASE) on head_index and __atomic_load_n(..., __ATOMIC_ACQUIRE) on the reader side so packet bytes are guaranteed visible before the consumer sees the new head index. Required for correctness on ARM/Apple Silicon weak memory.
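
The marker scheme can be modelled with a small single-producer, single-consumer ring. This is a simplified sketch under several assumptions, not the real writeToReceiveBuffer/readReceiveBuffer: the 4-dword header layout is invented, head index and wrap count are packed into one `std::atomic` rather than kept as separate fields, overrun protection is omitted, and the wrap policy (never leave a tail gap smaller than header + 1 dword, so a marker with dlen != 0 always fits) is one possible choice that the PR does not spell out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kHeaderDwords = 4;               // assumed header size, in dwords
constexpr uint32_t kMinMarker = kHeaderDwords + 1;  // a marker needs dlen != 0

struct PacketSketch {
    uint16_t chid = 0;
    uint16_t type = 0;
    std::vector<uint32_t> payload;                  // dlen == payload.size()
};

class RecRingSketch {
public:
    explicit RecRingSketch(uint32_t buflen) : buf_(buflen, 0) {
        assert(buflen >= 2 * kMinMarker);
    }

    // Single producer. Protection against lapping the reader is omitted.
    void write(const PacketSketch& p) {
        const uint32_t buflen = static_cast<uint32_t>(buf_.size());
        const uint32_t dlen = static_cast<uint32_t>(p.payload.size());
        const uint32_t n = kHeaderDwords + dlen;
        assert(!(p.chid == 0 && p.type == 0 && dlen != 0));  // reserved for the marker
        assert(n + kMinMarker <= buflen);
        const uint32_t room = buflen - head_;
        // The packet stays here only if it ends exactly at buflen or leaves
        // enough room for a future marker; otherwise pad the gap and wrap.
        if (n != room && n + kMinMarker > room) {
            put(head_, 0, 0, room - kHeaderDwords, nullptr);  // marker fills the gap exactly
            head_ = 0;
            ++wrap_;
        }
        put(head_, p.chid, p.type, dlen, p.payload.data());
        head_ += n;
        if (head_ == buflen) { head_ = 0; ++wrap_; }
        // Release: header and payload become visible before the new head does.
        state_.store((static_cast<uint64_t>(wrap_) << 32) | head_, std::memory_order_release);
    }

    // Single consumer. Copies the next real packet into out, stepping over markers.
    bool read(PacketSketch& out) {
        const uint64_t s = state_.load(std::memory_order_acquire);
        const uint32_t head = static_cast<uint32_t>(s);
        const uint32_t wrap = static_cast<uint32_t>(s >> 32);
        while (tail_ != head || tail_wrap_ != wrap) {
            const uint32_t at = tail_;
            const uint16_t chid = static_cast<uint16_t>(buf_[at + 2] & 0xFFFFu);
            const uint16_t type = static_cast<uint16_t>(buf_[at + 2] >> 16);
            const uint32_t dlen = buf_[at + 3];
            tail_ += kHeaderDwords + dlen;
            if (tail_ == buf_.size()) { tail_ = 0; ++tail_wrap_; }
            if (chid == 0 && type == 0 && dlen != 0) { ++markers_skipped; continue; }
            out.chid = chid;
            out.type = type;
            out.payload.assign(buf_.begin() + at + kHeaderDwords,
                               buf_.begin() + at + kHeaderDwords + dlen);
            return true;
        }
        return false;
    }

    uint32_t markers_skipped = 0;  // diagnostic counter, reader side only

private:
    void put(uint32_t at, uint16_t chid, uint16_t type, uint32_t dlen, const uint32_t* data) {
        buf_[at] = buf_[at + 1] = 0;  // timestamp dwords, unused in this sketch
        buf_[at + 2] = static_cast<uint32_t>(chid) | (static_cast<uint32_t>(type) << 16);
        buf_[at + 3] = dlen;
        for (uint32_t i = 0; data != nullptr && i < dlen; ++i) {
            buf_[at + kHeaderDwords + i] = data[i];
        }
    }

    std::vector<uint32_t> buf_;
    std::atomic<uint64_t> state_{0};     // (wrap << 32) | head index
    uint32_t head_ = 0, wrap_ = 0;       // writer-private
    uint32_t tail_ = 0, tail_wrap_ = 0;  // reader-private
};
```

With a 32-dword buffer and 10-dword packets, the third write finds 12 dwords of room, pads them with a marker of dlen 8, and places the packet at offset 0. The reader then steps over the marker and resumes at offset 0 instead of landing inside the gap.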

Diagnostics path

The garbage runlevel symptom drove the investigation. A debug print added to readReceiveBuffer caught the first misaligned read and showed tailindex=1 tailwrap=1 immediately after the first wrap — confirming the reader landed 1 dword off zero, not at offset 0. Tracing the writer revealed the unmarked gap.

Backward compatibility

  • The wrap-marker packet uses chid=0, type=0, dlen != 0. No real packet uses this combination (heartbeats have chid=cbPKTCHAN_CONFIGURATION=0x8000; group sample packets have chid=0 with type in 1..6).
  • Old readers running against a new writer will see the marker as a normal packet (chid=0, type=0) — they'll deliver it to user callbacks but otherwise function. Slight behavior change for old readers, not breaking.
  • New readers running against an old writer (no padding) get the same misalignment behavior as before — no regression, just no improvement.
  • Pad-marker can therefore be deployed incrementally; both processes need the new code for the bug to be fully fixed end-to-end.

cboulay and others added 2 commits May 7, 2026 19:41
* getRunLevel() falls back to the SYSINFO mirror in shmem so CLIENT
  sessions report the device runlevel even without a fresh SYSREP
  packet flowing through their receive ring (the STANDALONE owner
  only sees SYSREPs during its handshake).

* sync() CLIENT path uses getRunLevel() for the no-op runlevel set
  (previously sent runlevel=0 from the stale atomic) and waits
  specifically on cbPKTTYPE_SYSREPRUNLEV (0x12) via a sticky
  received_sysrepRunlev flag, so periodic 0x10 SYSREP heartbeats
  from nPlayServer can no longer falsely satisfy the wait.

* writeToReceiveBuffer pads end-of-buffer wrap gaps with a
  synthetic chid=0/type=0/dlen!=0 marker so CLIENT readers can
  advance through the gap cleanly. Previously the reader had no
  way to detect the gap and would land inside it after the first
  wrap, decoding garbage as fake SYSREP packets and emitting
  spurious runlevel-change events. Adds release/acquire ordering
  on head_index for ARM weak-memory correctness.

* pycbsdk regression tests in test_client_mode.py covering CLIENT
  runlevel, sync correctness, and packet integrity through a
  forced rec-buffer wrap.

* simple_device.cpp: optional duration argument and prints
  standalone/protocol/proc-ident/runlevel after session creation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The release/acquire ordering on rec ring buffer head_index/head_wrap
used GCC's __atomic_* builtins, which MSVC doesn't expose.  Wrap the
four call sites in small inline helpers that branch between
__atomic_* on GCC/Clang and std::atomic_thread_fence + volatile
load/store on MSVC.  Pre-existing __atomic_* uses in the xmt buffer
were already #ifdef _WIN32-guarded with InterlockedExchange, so this
just brings the rec-buffer additions to parity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
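
For illustration, helpers of the kind the second commit describes might look roughly like this. The names `storeRelease` and `loadAcquire` are hypothetical, since the commit message does not give the real ones. The non-GCC branch relies on the compiler treating aligned 32-bit volatile accesses as single loads and stores, which is a practical assumption rather than something the C++ memory model formally guarantees.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: publish a value with release ordering.
inline void storeRelease(uint32_t* addr, uint32_t value) {
#if defined(__GNUC__) || defined(__clang__)
    __atomic_store_n(addr, value, __ATOMIC_RELEASE);
#else
    // Fence first, so earlier writes are ordered before the index store.
    std::atomic_thread_fence(std::memory_order_release);
    *static_cast<volatile uint32_t*>(addr) = value;
#endif
}

// Hypothetical helper: read a value with acquire ordering.
inline uint32_t loadAcquire(const uint32_t* addr) {
#if defined(__GNUC__) || defined(__clang__)
    return __atomic_load_n(addr, __ATOMIC_ACQUIRE);
#else
    const uint32_t value = *static_cast<const volatile uint32_t*>(addr);
    // Fence after, so later reads are ordered after the index load.
    std::atomic_thread_fence(std::memory_order_acquire);
    return value;
#endif
}
```

A writer would fill in the packet bytes and then call `storeRelease` on the head index; a reader would call `loadAcquire` on the head index before touching the packet bytes.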
@cboulay cboulay merged commit 47a0ae2 into master May 8, 2026
16 checks passed
@cboulay cboulay deleted the cboulay/fix_client_runlevel branch May 8, 2026 02:02