Fix CLIENT-mode runlevel/sync and rec ring buffer wrap by cboulay · Pull Request #183 · CerebusOSS/CereLink

cboulay · 2026-05-07T23:47:30Z

Summary

Three independent bugs combined to make CLIENT-mode sessions unreliable on long-running streams. Each is fixable in isolation; with all three fixed, pycbsdk and simple_device report the correct runlevel, sync() works, and packet streams stay clean across rec ring buffer wraps.

Bugs and fixes

1. `getRunLevel()` returned 0 in CLIENT mode

Symptom: pycbsdk/examples/session_info.py NPLAY (and client_session.runlevel in user code) reported 0 even though the device was running. simple_device NPLAY attached as CLIENT showed Runlevel: 0 (UNKNOWN).

Cause: SdkSession::getRunLevel() only read the per-session device_runlevel atomic, which is updated by the receive thread when a SYSREP packet flows through that session's receive ring. Devices only emit SYSREP in response to runlevel-set commands, so once the STANDALONE owner finished its handshake, no further SYSREPs flowed and CLIENT-side device_runlevel stayed at 0 indefinitely.

Fix: getRunLevel() falls back to getSysInfo()->runlevel (the SYSINFO mirror in shmem that the STANDALONE owner writes via setSysInfo() on every received SYSREP).
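
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual SdkSession code: `SessionSketch` and `SysInfoMirror` are hypothetical stand-ins, and treating a per-session value of 0 as "nothing seen yet" is an assumption about when the fallback triggers.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the SYSINFO mirror kept in shared memory.
struct SysInfoMirror {
    uint32_t runlevel = 0;
};

// Hypothetical, simplified session: only the pieces needed for the fallback.
struct SessionSketch {
    std::atomic<uint32_t> device_runlevel{0};  // updated by this session's receive thread
    const SysInfoMirror* sysinfo = nullptr;    // mirror written by the STANDALONE owner

    uint32_t getRunLevel() const {
        // Prefer the per-session value once a SYSREP has been seen locally.
        const uint32_t local = device_runlevel.load(std::memory_order_acquire);
        if (local != 0) return local;
        // Assumption: 0 means "unknown", so fall back to the shared mirror.
        return sysinfo != nullptr ? sysinfo->runlevel : 0;
    }
};
```

With this shape, a CLIENT session that never sees a SYSREP still reports whatever the owner last wrote to the mirror, and a locally observed value takes precedence once one arrives.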

2. CLIENT-mode `sync()` raced with SYSREP heartbeats

Symptom: sync() could return before the device had actually processed prior config packets, so subsequent read-back saw stale state. Worse: it sent SYSSETRUNLEV with runlevel=0 (stale atomic), which can perturb device state.

Cause:

* sync() read device_runlevel.load() directly for the "no-op" runlevel set. In CLIENT mode that's still 0.
* sync() waited on any 0x10..0x1F SYSREP via the boolean received_sysrep flag. Periodic SYSREP (0x10) heartbeats from nPlayServer satisfied the wait before the actual SYSREPRUNLEV (0x12) reply arrived.

Fix:

* sync() uses getRunLevel() (with the new shmem fallback) to obtain the current runlevel for the no-op set.
* A new sticky received_sysrepRunlev flag in Impl is set only when the receive thread sees a 0x12 packet. setSystemRunLevel resets it before sending. waitForSysrep accepts an optional expected_type argument; CLIENT-path sync() passes cbPKTTYPE_SYSREPRUNLEV. Because the flag is sticky, 0x10 heartbeats that arrive after the reply but before the waiter wakes cannot hide it, so the wait is race-free.
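
A sticky, type-specific wait of this kind might look like the sketch below. It is an illustration under assumptions: `SysrepWaiter`, `armForRunlevReply`, and `onPacket` are hypothetical names, the use of a mutex and condition variable is a guess at the mechanism, and only the 0x10 and 0x12 type values come from the description above.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Packet type values as quoted in the PR text.
constexpr uint8_t kSysrep       = 0x10;  // periodic heartbeat
constexpr uint8_t kSysrepRunlev = 0x12;  // reply to a runlevel set

// Hypothetical, simplified waiter; names are illustrative only.
class SysrepWaiter {
public:
    // Called by the sender just before transmitting the runlevel set.
    void armForRunlevReply() {
        std::lock_guard<std::mutex> lock(m_);
        received_sysrep_runlev_ = false;
    }

    // Called by the receive thread for every SYSREP-family packet.
    void onPacket(uint8_t type) {
        {
            std::lock_guard<std::mutex> lock(m_);
            received_sysrep_ = true;
            // Sticky: once set, later heartbeats never clear it.
            if (type == kSysrepRunlev) received_sysrep_runlev_ = true;
        }
        cv_.notify_all();
    }

    // expected_type == 0 keeps the old "any SYSREP" behaviour.
    bool waitForSysrep(std::chrono::milliseconds timeout, uint8_t expected_type = 0) {
        std::unique_lock<std::mutex> lock(m_);
        return cv_.wait_for(lock, timeout, [&] {
            return expected_type == kSysrepRunlev ? received_sysrep_runlev_
                                                  : received_sysrep_;
        });
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool received_sysrep_ = false;
    bool received_sysrep_runlev_ = false;
};
```

The point of the design is that a heartbeat alone never satisfies a wait for the 0x12 reply, while a reply that arrives early is still observed by a waiter that wakes later.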

3. Rec ring buffer wrap left an unmarked gap

Symptom: ~6–17 s into any CLIENT session, the receive thread started decoding garbage as fake SYSREP-family packets, firing a flood of [runlevel change] events with nonsense runlevels and inflating delivered-packet counts. Once misaligned, the reader stayed misaligned for the rest of the session.

Cause: When the next packet would not fit between head and the end of the buffer, the writer wrapped to offset 0 and incremented head_wrap, but left a gap of one or more dwords between the end of the previous packet and buflen with no marker. The reader had no way to know about this gap, so it tried to read a "packet" from inside it, got a garbage dlen, advanced tail by an incorrect amount, and stayed misaligned thereafter.

Fix:

* writeToReceiveBuffer pads the gap with a synthetic wrap-marker packet (chid=0, type=0, dlen != 0, a combination no real packet uses). The marker fills the gap exactly so the reader can step over it cleanly. Wrap policy also avoids leaving 1..3-dword gaps that can't fit a marker header.
* readReceiveBuffer recognizes the marker and silently advances tailindex past it without delivering the marker to user callbacks.
* Adds __atomic_store_n(..., __ATOMIC_RELEASE) on head_index and __atomic_load_n(..., __ATOMIC_ACQUIRE) on the reader side so packet bytes are guaranteed visible before the consumer sees the new head index. Required for correctness on ARM/Apple Silicon weak memory.
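
The writer-side padding and reader-side skip can be illustrated with a small single-writer, single-reader ring of dwords. Everything here is a sketch under assumptions: `RecRing`, `Header`, `writePacket`, and `readPacket` are hypothetical, the 4-dword header layout is simplified, the reader is assumed to keep up with the writer, and the wrap policy additionally reserves one payload dword so that the marker's dlen is non-zero (the real policy may differ).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr uint32_t kHeaderDwords = 4;                     // assumed header size
constexpr uint32_t kMinMarkerDwords = kHeaderDwords + 1;  // header + 1 dword so dlen != 0

struct Header {      // simplified stand-in for the real packet header
    uint64_t time;
    uint16_t chid;
    uint16_t type;
    uint16_t dlen;   // payload length in dwords
    uint16_t reserved;
};
static_assert(sizeof(Header) == kHeaderDwords * sizeof(uint32_t), "header is 4 dwords");

struct RecRing {
    std::vector<uint32_t> buf;
    std::atomic<uint32_t> head{0};  // published by the writer
    uint32_t tail = 0;              // private to the reader
    explicit RecRing(uint32_t dwords) : buf(dwords, 0) {}
};

inline bool isWrapMarker(const Header& h) {
    return h.chid == 0 && h.type == 0 && h.dlen != 0;
}

// Single writer. Assumes the reader keeps up (no overwrite detection here).
inline void writePacket(RecRing& r, const Header& h, const uint32_t* payload) {
    const uint32_t n = static_cast<uint32_t>(r.buf.size());
    const uint32_t len = kHeaderDwords + h.dlen;
    assert(!isWrapMarker(h) && len + kMinMarkerDwords <= n);
    uint32_t head = r.head.load(std::memory_order_relaxed);
    const uint32_t room = n - head;
    // Write in place only if the packet ends exactly at the end of the buffer
    // or leaves enough room for a later marker; otherwise pad and wrap.
    if (len != room && len + kMinMarkerDwords > room) {
        Header marker{};  // chid = 0, type = 0
        marker.dlen = static_cast<uint16_t>(room - kHeaderDwords);  // fills the gap exactly
        std::memcpy(&r.buf[head], &marker, sizeof marker);
        head = 0;
    }
    std::memcpy(&r.buf[head], &h, sizeof h);
    if (h.dlen != 0) {
        std::memcpy(&r.buf[head + kHeaderDwords], payload, h.dlen * sizeof(uint32_t));
    }
    head += len;
    if (head == n) head = 0;
    r.head.store(head, std::memory_order_release);  // packet bytes visible before new head
}

// Single reader. Returns true and copies header + payload dwords into `out`
// when a real packet was consumed; wrap markers are skipped silently.
inline bool readPacket(RecRing& r, std::vector<uint32_t>& out) {
    const uint32_t n = static_cast<uint32_t>(r.buf.size());
    while (r.tail != r.head.load(std::memory_order_acquire)) {
        Header h;
        std::memcpy(&h, &r.buf[r.tail], sizeof h);
        const uint32_t len = kHeaderDwords + h.dlen;
        const bool marker = isWrapMarker(h);
        if (!marker) out.assign(r.buf.begin() + r.tail, r.buf.begin() + r.tail + len);
        r.tail = (r.tail + len == n) ? 0 : r.tail + len;
        if (!marker) return true;
    }
    return false;
}
```

Because the marker's length covers the gap exactly, the reader's ordinary "advance by header plus dlen" step lands on offset 0 without any special knowledge of where the writer wrapped.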

Diagnostics path

The garbage runlevel symptom drove the investigation. A debug print added to readReceiveBuffer caught the first misaligned read and showed tailindex=1 tailwrap=1 immediately after the first wrap — confirming the reader landed 1 dword off zero, not at offset 0. Tracing the writer revealed the unmarked gap.

Backward compatibility

* The wrap-marker packet uses chid=0, type=0, dlen != 0. No real packet uses this combination (heartbeats have chid=cbPKTCHAN_CONFIGURATION=0x8000; group sample packets have chid=0 with type in 1..6).
* Old readers running against a new writer will see the marker as a normal packet (chid=0, type=0) and will deliver it to user callbacks, but otherwise function. This is a slight behavior change for old readers, not a breaking one.
* New readers running against an old writer (no padding) get the same misalignment behavior as before: no regression, just no improvement.
* The pad marker can therefore be deployed incrementally; both processes need the new code for the bug to be fully fixed end-to-end.

* getRunLevel() falls back to the SYSINFO mirror in shmem so CLIENT sessions report the device runlevel even without a fresh SYSREP packet flowing through their receive ring (the STANDALONE owner only sees SYSREPs during its handshake).
* sync() CLIENT path uses getRunLevel() for the no-op runlevel set (previously sent runlevel=0 from the stale atomic) and waits specifically on cbPKTTYPE_SYSREPRUNLEV (0x12) via a sticky received_sysrepRunlev flag, so periodic 0x10 SYSREP heartbeats from nPlayServer can no longer falsely satisfy the wait.
* writeToReceiveBuffer pads end-of-buffer wrap gaps with a synthetic chid=0/type=0/dlen!=0 marker so CLIENT readers can advance through the gap cleanly. Previously the reader had no way to detect the gap and would land inside it after the first wrap, decoding garbage as fake SYSREP packets and emitting spurious runlevel-change events. Adds release/acquire ordering on head_index for ARM weak-memory correctness.
* pycbsdk regression tests in test_client_mode.py covering CLIENT runlevel, sync correctness, and packet integrity through a forced rec-buffer wrap.
* simple_device.cpp: optional duration argument and prints standalone/protocol/proc-ident/runlevel after session creation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The release/acquire ordering on rec ring buffer head_index/head_wrap used GCC's __atomic_* builtins, which MSVC doesn't expose. Wrap the four call sites in small inline helpers that branch between __atomic_* on GCC/Clang and std::atomic_thread_fence + volatile load/store on MSVC. Pre-existing __atomic_* uses in the xmt buffer were already #ifdef _WIN32-guarded with InterlockedExchange, so this just brings the rec-buffer additions to parity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
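
Helpers of the kind this commit describes might look like the following sketch. The names `storeRelease` and `loadAcquire` are hypothetical (the commit message does not give the real ones), and the non-GCC branch simply mirrors the "fence plus volatile access" approach the message mentions.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: publish a value so that earlier writes are visible first.
inline void storeRelease(uint32_t* p, uint32_t v) {
#if defined(__GNUC__) || defined(__clang__)
    __atomic_store_n(p, v, __ATOMIC_RELEASE);
#else
    // MSVC path: release fence, then a volatile store.
    std::atomic_thread_fence(std::memory_order_release);
    *reinterpret_cast<volatile uint32_t*>(p) = v;
#endif
}

// Hypothetical helper: read a value so that later reads see the published data.
inline uint32_t loadAcquire(uint32_t* p) {
#if defined(__GNUC__) || defined(__clang__)
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
#else
    // MSVC path: volatile load, then an acquire fence.
    const uint32_t v = *reinterpret_cast<volatile uint32_t*>(p);
    std::atomic_thread_fence(std::memory_order_acquire);
    return v;
#endif
}
```

A writer would call `storeRelease` on the head index (and wrap counter) after copying packet bytes, and a reader would call `loadAcquire` on the same fields before touching those bytes.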

cboulay and others added 2 commits May 7, 2026 19:41

cboulay merged commit 47a0ae2 into master May 8, 2026
16 checks passed

cboulay deleted the cboulay/fix_client_runlevel branch May 8, 2026 02:02

cboulay merged 2 commits into master from cboulay/fix_client_runlevel
