Release 1.6.0: NVCM auto-detection, TCA bus fixes#43

Merged
boringethan merged 37 commits into main from next
Jun 10, 2026
@boringethan

Contributor

Promotes next to main for the 1.6.0 release.

Highlights since 1.5.4-dev:

  • NVCM auto-detection via SPI/USART pin sampling — NVCM-programmed cameras skip the SRAM bitstream load
  • TCA9548A bus-wedge root-cause fixes (no mux select at power-on, channel disconnect before power-off, I2C bus recovery)
  • USART receiver reset after positive NVCM detect (fixes power-of-two histogram corruption on NVCM-booted USART cameras)
  • OW_FACTORY_I2C_WRRD off-by-one fix (unblocked NVCM programming)
  • DFU deploy script, EFT watchdog work, and incident documentation

Validated on hardware across both sensor modules; see docs/nvcm-rowdrop-incident.md and PR #42 for the validation matrix.

🤖 Generated with Claude Code

georgevigelette and others added 30 commits May 18, 2026 21:40
Adds scripts/deploy.py + CMake flash-left/flash-right target spec and
implementation plan. Driven by enter_dfu (omotion) -> dfu-util with
:leave for auto-reset -> manual power-cycle hint on stuck device.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The emoji prints (❌/✅/⚠️) crashed under Windows default cp1252 stdout.
Reconfigure stdout/stderr to UTF-8 with replace-on-error at the start
of main() so the error path is actually reachable.
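
The stdout fix described above can be sketched as follows. This is a minimal illustration, not the actual contents of scripts/deploy.py; the helper name `force_utf8` is hypothetical.

```python
import io
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8, replacing anything it cannot encode."""
    # reconfigure() exists on io.TextIOWrapper (Python 3.7+). Other stream
    # objects, such as a captured StringIO, are returned untouched.
    if isinstance(stream, io.TextIOWrapper):
        stream.reconfigure(encoding="utf-8", errors="replace")
    return stream


def main():
    # Reconfigure before the first print so the error path can emit emoji too.
    force_utf8(sys.stdout)
    force_utf8(sys.stderr)
    print("✅ stdout is safe for emoji")
```

Doing this at the very start of `main()` matters because a cp1252 stream raises `UnicodeEncodeError` on the first emoji, which would otherwise mask the real error being reported.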

Add "Deploy Sensor Left" and "Deploy Sensor Right" tasks to the
generate_vscode_files.py template so a generator run materializes
clickable VS Code tasks that invoke scripts/deploy.py with --device.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The omotion package ships dfu-util binaries under
omotion/dfu-util/{win32,win64,darwin-x86_64}/. Prefer that bundled
binary so no separate dfu-util install is needed; PATH lookup is
still the fallback.
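
A rough sketch of the lookup order, assuming the caller already knows which platform subdirectory applies. The function name, its parameters, and the executable file name are illustrative, not taken from the script.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_dfu_util(bundle_root: Path, platform_dir: str,
                  exe_name: str = "dfu-util") -> Optional[str]:
    """Return the bundled dfu-util if present, otherwise whatever PATH offers.

    bundle_root is the omotion/dfu-util directory and platform_dir is one of
    its subdirectories (win32, win64, darwin-x86_64 in the commit message).
    Both are passed in because how the script picks them is not shown here.
    """
    bundled = Path(bundle_root) / platform_dir / exe_name
    if bundled.is_file():
        return str(bundled)
    # Fallback: a system-wide install, or None if there is none.
    return shutil.which(exe_name)
```

Returning `None` instead of raising leaves the choice of error message to the caller.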

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues mirrored from openmotion-console-fw after the first real
console flash exposed them:

1. interface.start(wait=True, wait_timeout=X) only blocks on handles
   already in CONNECTING. Poll the named sensor handle until it
   reports is_connected() or the timeout expires.

2. STM32 ROM DFU bootloaders jump to user firmware on :leave before
   dfu-util can read the final status, producing exit code 74
   despite a successful download. Treat a non-zero exit code as a
   warning and use the device's comeback poll as the source of truth.
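
Both fixes come down to polling for the device rather than trusting a single return value. A minimal sketch, with hypothetical helper names (`wait_until`, `flash_succeeded`) and a caller-supplied predicate standing in for the sensor handle's `is_connected()`:

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout_s: float,
               interval_s: float = 0.1) -> bool:
    """Poll predicate until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def flash_succeeded(dfu_exit_code: int, device_came_back: bool) -> bool:
    """The comeback poll decides success; the exit code only adds a warning."""
    if dfu_exit_code != 0:
        print(f"warning: dfu-util exited with {dfu_exit_code}; "
              "trusting the comeback poll instead")
    return device_came_back
```

With this shape, exit code 74 followed by a successful reconnect counts as a good flash, while a clean exit with no reconnect still fails.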

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add fpga_detect_nvcm() that detects NVCM-programmed CrossLink FPGAs by
releasing CRESETB without the activation key and checking whether 0x40
still responds.  If the FPGA boots from NVCM, its user design takes over
the I2C pins (I2C_PORT=DISABLE by default) and 0x40 disappears.

Detection is called only from program_fpga() and program_sram_fpga()
when force_update is false — NOT at init time.  Earlier attempts to run
detection during init_camera_sensors() upset the TCA9548A mux, so init
is restored to its original form with power_off_all_cameras() in its
original position.

Also adds engineering notebook at docs/fpga-nvcm-autodetect.md tracking
the CDONE approach (failed due to Feature Row CDONE_USER_IO default),
the I2C probe approach, and the TCA mux issues encountered.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pivot from boot-inference NVCM detection to a direct, authoritative read of
the CrossLink NVCM fuse state over I2C. This never boots the FPGA and never
touches camera power (the thing that upsets the TCA9548A) — it holds the
config port in slave mode (activation key around the CRESETB transition,
like fpga_configure does for SRAM), enters ISC access mode, and reads every
discriminator.

fpga_nvcm_probe() (crosslink.c) reads, best-effort with a step_status
bitmask: IDCODE, Status Register (0x3C), Feature Row (0xE7), Feature Bits
(0xFB), USERCODE (0xC0), and N NVCM array rows (0x73). Parameterized ISC
operand (0x08=NVCM, 0x00=SRAM) and row count so the host can vary them
without reflashing. Verbose printf at each step (gated by USB_PRINTF).
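
The best-effort pattern can be illustrated in a few lines of host-side Python. The bit positions and the `read_step` callback are assumptions made for this sketch; the firmware's actual step_status layout is not given in this message.

```python
from typing import Callable, Dict, Optional, Tuple

# Step order follows the commit message. Bit N of step_status is set when
# step N succeeded. This bit assignment is hypothetical.
STEPS = ("idcode", "status", "feature_row", "feature_bits",
         "usercode", "nvcm_rows")


def probe(read_step: Callable[[str], Optional[bytes]]
          ) -> Tuple[int, Dict[str, bytes]]:
    """Run every step, recording which ones produced data."""
    step_status = 0
    results: Dict[str, bytes] = {}
    for bit, name in enumerate(STEPS):
        try:
            data = read_step(name)
        except OSError:
            data = None  # a failed read must not abort the later steps
        if data is not None:
            step_status |= 1 << bit
            results[name] = data
    return step_status, results
```

Because no step aborts the probe, the host gets partial results even when one step fails, and the bitmask shows exactly which ones did.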

OW_FACTORY_NVCM_CHECK (0x6C, if_factory_prog.c) runs the probe on the active
camera and returns a fixed-layout blob. Mux channel is routed via
TCA9548A_SelectChannel (mux I2C only, no power).

Engineering notebook updated with the pivot rationale, the authoritative
discriminator table from the Lattice PDFs, the SDK i2c_write_read WRRD
off-by-one boundary bug (noted, not fixed), and the iteration log.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Iteration-1 hardware showed the NVCM content reads (feature row, NVCM array)
come back as floating 0xFF: a bare ISC_ENABLE 0x08 does not read-enable the
NVCM array (STATUS in NVCM mode = 0x208 lacks the Read-Enable bit, vs 0xE00 in
SRAM mode). Those reads are not trustworthy.

Add a behaviorally-definitive signal instead: after the config-mode read-back,
release CRESETB WITHOUT the activation key so the device auto-boots. A
programmed NVCM boots its user design and the config I2C port at 0x40
disappears (I2C_PORT=DISABLE); a blank part keeps ACKing at 0x40. The test
ends by re-asserting CRESETB low to halt the booted design and free the bus
(TCA-safe). Gated by a new payload byte data[2] (default on); two result bytes
(boot_probe_done, boot_0x40_responds) added to the response blob.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Boot test on cam8: config port at 0x40 is present in forced-config mode
(IDCODE ok) but GONE after auto-boot (CRESETB released without activation
key) -> the FPGA booted a valid user design from NVCM. Self-validating within
one run; rules out blank (0x40 would persist) and corrupt write (config would
fail CRC and not hand off to user mode). Records the recommended power-safe
production detector built on the boot test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Camera 1 negative-control attempt locked up the I2C bus (TCA timeout, err
0x20) on power-on — slot likely unpopulated/faulty; recovered via power cycle,
cam8 re-confirmed PROGRAMMED. Added a session resume checkpoint documenting
device state, repro steps, and the three open threads (logic2 MCP needs session
restart to register, blank-control needs a populated-blank slot, NVCM
read-enable still unsolved).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Self-contained handoff at the top of the notebook so the next session (with the
Logic 2 MCP registered) can pick up cleanly: established facts (cam8 PROGRAMMED;
content reads float 0xFF because NVCM isn't read-enabled), the immediate
analyzer task (capture cam8 SCL/SDA during OW_FACTORY_NVCM_CHECK to see whether
the FPGA drives the read data), exact repro commands, the firmware wire
sequence + response layout, the WRRD off-by-one fast-iteration tip, and power
safety rules.

Correct the camera-1 finding: cam1 IS populated (per user); the wedge was a
power-sequencing mistake (cam8 left powered while powering cam1), not an empty
slot. Documents the one-camera-at-a-time rule and how to retry the blank
control.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The previous approach released CRESETB without the activation key and
checked whether 0x40 responded. This always returned "NVCM programmed"
because the CrossLink I2C slave config port requires the activation key
to become active — without it, 0x40 never responds regardless of NVCM
state.

The new approach enters forced slave config mode (activation key +
CRESETB), reads the STATUS register, and checks the Done bit (bit 8).
Done=1 means NVCM programming completed (it's the last step burned
during NVCM programming and gates auto-boot). Done=0 means blank.
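
As a worked example of the check, assuming only the bit position stated above (the function name is illustrative):

```python
DONE_BIT = 8  # bit position quoted in this commit message


def nvcm_done(status: int) -> bool:
    """True when the Done bit of a STATUS register value is set."""
    return bool((status >> DONE_BIT) & 1)
```

Under this definition the two STATUS values recorded earlier in the investigation, 0x208 and 0xE00, both have Done clear, since neither includes 0x100.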

Verified on hardware: cam8 reads Done=0, all NVCM content zeros —
NVCM is not programmed on any camera on the left sensor module.

Also adds a handoff document summarizing the full NVCM investigation
findings, tools, and processes for the next investigator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix(nvcm): rewrite NVCM detection to use Done bit
The write-read handler read write_data from &cmd->data[5] but the SDK
packs it at index 4 (after write_len[2] + read_len[2]). This shifted
every I2C write-read command by one byte, breaking STATUS reads and
busy-wait loops during NVCM programming.
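
The layout can be illustrated in host-side Python. Little-endian length fields and the function name are assumptions for this sketch; only the field order and the data offset come from the description above.

```python
import struct
from typing import Tuple

# write_len[2] + read_len[2]; the little-endian byte order is an assumption.
HEADER = struct.Struct("<HH")


def parse_wrrd(payload: bytes,
               data_offset: int = HEADER.size) -> Tuple[bytes, int]:
    """Split a write-read payload into (write_data, read_len)."""
    write_len, read_len = HEADER.unpack_from(payload, 0)
    write_data = payload[data_offset:data_offset + write_len]
    return write_data, read_len


# A 4-byte write beginning with opcode 0x3C, followed by a 4-byte read.
payload = HEADER.pack(4, 4) + bytes([0x3C, 0x00, 0x00, 0x00])
good, _ = parse_wrrd(payload)                 # offset 4: opcode preserved
bad, _ = parse_wrrd(payload, data_offset=5)   # offset 5: opcode dropped
```

Reading from offset 5 skips the first written byte, so the device never sees the intended opcode. That matches the broken STATUS reads described above.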

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…NVCM burn

Camera 8 NVCM is fully programmed with Done=1 persisting across power
cycles. Root cause was a one-byte offset in the I2C write-read handler
that corrupted all STATUS reads during programming.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the I2C-based Done-bit check with GPIO pin sampling that avoids
the TCA9548A mux entirely. After toggling CRESETB, read the per-camera
CLK and DATA pins as floating GPIO inputs — a booted FPGA actively
drives both low, while a blank FPGA leaves them high-Z.

Key details:
- CameraDevice struct gains 6 fields for detect pin port/pin/AF
- SPI cameras use SCK + MOSI (FPGA master outputs); USART cameras
  use CK + RX — all are STM32 inputs driven by the FPGA
- GPIO_NOPULL avoids internal pull-up overpowering FPGA drive
- Pins restored to AF_PP after read so SPI/USART peripherals resume

Verified on hardware: cam 8 (NVCM programmed) → clk=0 data=0 → skip
SRAM load; cam 7 (blank) → clk=1 data=1 → fall through to SRAM prog.
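
The decision logic reduces to a small truth table. The sketch below models it in Python rather than firmware C; treating a mixed reading (one pin low, one high) as blank is an assumption made here so that ambiguous cases fall through to SRAM programming.

```python
from enum import Enum


class NvcmState(Enum):
    PROGRAMMED = "programmed"
    BLANK = "blank"


def classify(clk: int, data: int) -> NvcmState:
    """A booted FPGA drives both pins low; a blank one leaves them floating."""
    if clk == 0 and data == 0:
        return NvcmState.PROGRAMMED
    # Both high is the observed blank case. Mixed readings are treated as
    # blank too (assumption) so the SRAM load still runs.
    return NvcmState.BLANK


def needs_sram_load(clk: int, data: int, force_update: bool = False) -> bool:
    """force_update bypasses detection, as in program_fpga()."""
    return force_update or classify(clk, data) is NvcmState.BLANK
```

For the two hardware readings above, `needs_sram_load(0, 0)` is false for cam 8 and `needs_sram_load(1, 1)` is true for cam 7.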

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
boringethan and others added 7 commits June 9, 2026 13:03
The TCA mux wedges when fpga_detect_nvcm() toggles CRESETB while a mux
channel is still selected — the resetting FPGA glitches SDA/SCL and the
STM32 I2C peripheral gets stuck with BUSY set permanently.

Three changes:
- TCA9548A_DisableAll() called at the top of fpga_detect_nvcm() to
  close all mux channels before any CRESETB toggle
- I2C_BusRecovery() added: DeInits the peripheral, bit-bangs 9 SCL
  clocks as GPIO to free stuck slaves, generates STOP, re-inits
- TCA retry path upgraded: 1st fail = hardware reset only, 2nd fail =
  hardware reset + full bus recovery
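
The escalation and the recovery pulse train can be modeled as below. This is a Python model of the control flow, not the firmware code. The callables stand in for hardware access, and the final retry after the second failure is an assumption.

```python
from typing import Callable


def recover_bus(set_scl: Callable[[int], None],
                set_sda: Callable[[int], None]) -> None:
    """Bit-bang 9 SCL clocks, then a STOP (SDA rising while SCL is high)."""
    set_sda(1)                  # release SDA so a stuck slave can drive it
    for _ in range(9):
        set_scl(0)
        set_scl(1)
    set_scl(0)
    set_sda(0)
    set_scl(1)
    set_sda(1)                  # STOP condition


def select_with_recovery(select: Callable[[], bool],
                         hardware_reset: Callable[[], None],
                         bus_recovery: Callable[[], None]) -> bool:
    """Retry a mux select, escalating the recovery on each failure."""
    if select():
        return True
    hardware_reset()            # 1st fail: hardware reset only
    if select():
        return True
    hardware_reset()            # 2nd fail: hardware reset + bus recovery
    bus_recovery()
    return select()
```

Nine clocks are enough to let a slave that stalled mid-byte finish shifting out its data and the acknowledge slot, after which the STOP returns every device to idle.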

Also adds docs/tca9548a-bus-issues.md with symptoms, root cause
analysis, and remaining concerns for future investigation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Root cause of the recurring TCA9548A bus wedge, confirmed by stress
test: enable_camera_power() connected the camera's mux channel
immediately after raising the power rail. A freshly powered CrossLink
with blank/unloaded config drives its config pins during its boot
attempt — the same physical pins as the mux channel — clamping the
shared I2C bus. The next I2C transaction then timed out (err 32),
deterministically, on every power-on.

- enable_camera_power() no longer touches the mux; every I2C consumer
  already selects its channel right before transacting
- Remove TCA9548A_EnableChannel entirely: its additive select is how
  multiple channels ended up connected at once (current=0x88/0xC0
  in failure logs); header comment documents why it must not return
- Gate the NVCM detect printf behind DEBUG_FLAG_CMD_VERBOSE
- Update docs/tca9548a-bus-issues.md with confirmed root cause,
  fix layers, and verification results
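
The difference between additive and exclusive select is easiest to see on the TCA9548A control register, where bit N connects channel N. The function names below are illustrative, not the firmware's.

```python
from typing import List


def _check(channel: int) -> None:
    if not 0 <= channel <= 7:
        raise ValueError("TCA9548A has channels 0-7")


def additive_select(current: int, channel: int) -> int:
    """OR the channel into the current mask (the removed behavior)."""
    _check(channel)
    return (current | (1 << channel)) & 0xFF


def exclusive_select(channel: int) -> int:
    """Connect exactly one channel, disconnecting all others."""
    _check(channel)
    return 1 << channel


def connected_channels(mask: int) -> List[int]:
    """List the channels a control-register value connects."""
    return [ch for ch in range(8) if mask & (1 << ch)]
```

With channel 7 already connected (0x80), an additive select of channel 3 yields 0x88, one of the values seen in the failure logs. The exclusive form yields 0x08, so only one downstream bus is ever attached.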

Verified on hardware: 10x {power off, power on, program_fpga} on the
USART camera with zero TCA failures (previously failed attempt 1 on
every iteration); NVCM-programmed camera still detected and skips
SRAM load.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Cameras 1-7 were burned with a row-shifted image due to missing
readback verification in the SDK's I2C ISP parser (no busy-wait
between NVCM row programs, no-op verify). Permanently non-auto-booting;
still fully usable via SRAM programming. Sensor firmware factory path
verified correct. Fix: openmotion-sdk PR #66.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The SRAM programming path has always ended with a USART disable/
re-enable ("required for the USART to work properly after FPGA
programming"), but the NVCM-detect early return skipped it. The detect
also re-attaches the USART pins to the peripheral while the freshly
booted FPGA is driving the clock line, which can clock stray bits into
the receiver. Result: every histogram bin from an NVCM-booted USART
camera arrived multiplied by a power of two (observed: exactly 4x on
every frame, 332/332 corrupted in an 8 s scan).

Mirror the SRAM path's USART reset in the detect-positive path, which
covers both program_fpga() and program_sram_fpga() callers.

Verified with openmotion-sdk scripts/validate_scan_integrity.py:
- right cam 1 (USART2, NVCM): before 332/332 frames corrupted at 4.0x;
  after 327/327 valid, zero mismatches
- left cam 8 (SPI4, NVCM): 328/328 valid, zero mismatches — SPI path
  unaffected, no SPI-side reset needed
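
One way to flag the power-of-two signature is sketched below. It is not the logic of validate_scan_integrity.py, which is not shown here. It assumes each frame's bins should sum to a known total, and `expected_total` is a hypothetical parameter.

```python
from typing import Optional, Sequence


def scale_factor(bins: Sequence[int], expected_total: int) -> Optional[int]:
    """Return the power-of-two multiple by which a frame is scaled.

    1 means the frame is valid; 2, 4, 8, ... means every bin was multiplied
    by that factor; None means the total does not fit either pattern.
    """
    total = sum(bins)
    if expected_total <= 0 or total % expected_total != 0:
        return None
    factor = total // expected_total
    # A power of two has exactly one bit set.
    if factor >= 1 and factor & (factor - 1) == 0:
        return factor
    return None
```

Under that assumption, the corrupted frames described above would report a factor of 4 and the post-fix frames a factor of 1.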

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
feat(nvcm): NVCM auto-detection via pin sampling + TCA bus wedge root-cause fix
@boringethan boringethan merged commit 18570c7 into main Jun 10, 2026
1 check passed