Goal. Specify the wire format so that the Python plant and the C controller agree byte-for-byte. Once frozen, this spec is the contract: any change must update both sides AND bump the protocol version.
| Candidate | Verdict |
|---|---|
| JSON / msgpack | Too "PC-shaped": adds tokenisation and dynamic sizing, neither of which exists on a real ECU. |
| Protobuf | Heavy, needs a generator, hides the framing problem. |
| Custom packed struct | ✅ Closest to what runs on real CAN/LIN/SPI buses. Forces us to handle endianness, alignment, framing, CRC ourselves — which is exactly the skill we want to prove. |
The protocol below imitates the typical layout of an OEM-defined CAN-payload inside a tunnelling envelope (much like J1939 Transport Protocol or DoIP).
```
            sensor_frame                 actuator_frame
Plant ────────────────────────► ECU ────────────────────────► Plant
(Python, server)             (C, client)
```
- Plant is the TCP server (binds, listens). Rationale: the plant is the "world" — it exists before the ECU boots, and an ECU may restart against a live world. This also matches the SIL convention used in CarMaker / VSL.
- ECU is the TCP client (connects on startup).
- Loopback only: `127.0.0.1:9000`. `TCP_NODELAY=1` on both sockets: this disables Nagle's algorithm to avoid the roughly 40 ms delay that Nagle combined with delayed ACKs can add to small frames.
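The transport rules above could be set up on the plant side roughly as follows. This is a minimal sketch, not the project's actual code: the helper names `open_plant_listener` and `accept_ecu` are hypothetical, `SO_REUSEADDR` is an added convenience the spec does not mention, and the port is a parameter so that a test can bind an ephemeral port instead of 9000.

```python
import socket

HOST = "127.0.0.1"   # loopback only, per the spec
PORT = 9000

def open_plant_listener(host: str = HOST, port: int = PORT) -> socket.socket:
    """Bind and listen; the plant is the TCP server (hypothetical helper)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption, not part of the spec: lets the plant rebind quickly after a restart.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)    # a single ECU client is expected
    return srv

def accept_ecu(srv: socket.socket) -> socket.socket:
    """Accept the ECU connection and disable Nagle on the accepted socket."""
    conn, _addr = srv.accept()
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return conn
```

The C client would set the same option with `setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, ...)` after `connect()`.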
Every frame on the wire — regardless of direction — is wrapped in the same envelope so that a receiver can resynchronise after dropped bytes:
```
+--------+--------+------------------+=========+-----------------+
| MAGIC  | TYPE   | LEN              | PAYLOAD | CRC16           |
| u16    | u8     | u16 (little-end) | ...     | u16             |
| 0xAA55 | 0x01/2 | payload bytes    |         | CRC-CCITT-FALSE |
+--------+--------+------------------+=========+-----------------+
   2 B      1 B          2 B             N B          2 B
```
| Field | Size | Description |
|---|---|---|
| `magic` | 2 B | Constant 0xAA55 (little-endian), the sync marker for resync. |
| `type` | 1 B | 0x01 = sensor_frame (P→E). 0x02 = actuator_frame (E→P). |
| `length` | 2 B | Number of bytes in the payload (does NOT include header or CRC). |
| `payload` | N B | The packed C struct (see §4). |
| `crc16` | 2 B | CRC-16/CCITT-FALSE computed over type + length + payload. |
Total overhead per frame: 5 bytes (header: magic + type + length) + 2 bytes (CRC) = 7 bytes.
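To make the envelope concrete, here is a hedged Python sketch of an encoder. The function name `encode_frame` and the constant names are illustrative, not taken from the project. It assumes the magic is serialised little-endian like every other field, and it leans on the standard library's `binascii.crc_hqx`, which uses polynomial 0x1021 and, seeded with 0xFFFF, produces CRC-16/CCITT-FALSE.

```python
import binascii
import struct

MAGIC = 0xAA55
TYPE_SENSOR = 0x01
TYPE_ACTUATOR = 0x02

def encode_frame(frame_type: int, payload: bytes) -> bytes:
    """Wrap a payload in the envelope: magic, type, length, payload, crc16."""
    # '<' means little-endian with no padding, so type + length is exactly 3 bytes.
    body = struct.pack("<BH", frame_type, len(payload)) + payload
    # The CRC covers type + length + payload; the magic is excluded.
    crc = binascii.crc_hqx(body, 0xFFFF)
    return struct.pack("<H", MAGIC) + body + struct.pack("<H", crc)

# Example: a 3-byte dummy payload yields 2 + 1 + 2 + 3 + 2 bytes on the wire.
frame = encode_frame(TYPE_SENSOR, b"\x01\x02\x03")
```

Summing the field sizes from the table, the bytes added around the payload are 2 (magic) + 1 (type) + 2 (length) + 2 (CRC).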
All multi-byte fields are little-endian. Reasons:
- x86_64 and most ARM Cortex-M run little-endian natively → no swap on either side, faster.
- The packed struct can be `memcpy`'d directly into the payload with no per-field conversion.
- Python uses `struct.pack('<...')` consistently.
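As a quick illustration of what little-endian means for the bytes on the wire, the following snippet packs the magic and a length value with Python's `struct` module. The big-endian line is shown only for contrast and is not used by the protocol.

```python
import struct

# '<' selects little-endian with no padding; '>' would be big-endian.
magic_le = struct.pack("<H", 0xAA55)   # low byte first
magic_be = struct.pack(">H", 0xAA55)   # contrast only, not used on the wire
length_le = struct.pack("<H", 18)      # 18 = 0x0012

print(magic_le.hex(), magic_be.hex(), length_le.hex())  # 55aa aa55 1200
```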
A receiver MUST be able to recover from any number of dropped or corrupted bytes (e.g. after a deserialisation bug or a flaky link). The recovery algorithm:
```
// pseudo
while (true) {
    // 0xAA55 is sent little-endian, so the wire order is 0x55 then 0xAA
    read_until_magic_byte_pair(0x55, 0xAA);   // scan one byte at a time
    read_exact(header_remainder);             // type + length
    if (length > MAX_PAYLOAD) { drop; continue; }
    read_exact(payload, length);
    read_exact(crc, 2);
    if (crc16(type, length, payload) != crc) { drop; continue; }
    deliver(payload);
}
```
MAX_PAYLOAD is fixed at 64 bytes: generous compared to our current payload sizes (18 / 12 B), and it guards against an attacker or a faulty link sending length=0xFFFF.
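The recovery loop can be sketched in Python as a function that consumes a receive buffer. This is one possible shape under stated assumptions, not the reference implementation: `extract_frames` is a hypothetical name, the magic is assumed to appear on the wire as 0x55 0xAA (little-endian), and `binascii.crc_hqx` seeded with 0xFFFF stands in for the project's CRC routine. One deliberate difference from the pseudocode: after a bad length or CRC it rescans from the byte right after the failed magic, so a genuine frame that starts inside a false one is not thrown away.

```python
import binascii
import struct

MAGIC_BYTES = struct.pack("<H", 0xAA55)  # b"\x55\xaa" on the wire (assumed order)
MAX_PAYLOAD = 64

def extract_frames(buf: bytearray):
    """Pop every complete, valid frame from buf; return [(type, payload), ...].

    Bytes that cannot start a valid frame are discarded. An incomplete
    trailing frame stays in buf until more data is appended.
    """
    frames = []
    while True:
        start = buf.find(MAGIC_BYTES)
        if start < 0:
            # Keep a trailing 0x55: it may be the first half of the next magic.
            keep = 1 if buf.endswith(MAGIC_BYTES[:1]) else 0
            del buf[: len(buf) - keep]
            return frames
        del buf[:start]                      # drop garbage before the magic
        if len(buf) < 5:
            return frames                    # header not complete yet
        frame_type, length = struct.unpack_from("<BH", buf, 2)
        if length > MAX_PAYLOAD:
            del buf[:1]                      # false sync: rescan from next byte
            continue
        end = 5 + length + 2                 # header + payload + CRC
        if len(buf) < end:
            return frames                    # wait for payload + CRC
        (crc_rx,) = struct.unpack_from("<H", buf, end - 2)
        if binascii.crc_hqx(bytes(buf[2:end - 2]), 0xFFFF) != crc_rx:
            del buf[:1]                      # corrupted: rescan from next byte
            continue
        frames.append((frame_type, bytes(buf[5:end - 2])))
        del buf[:end]
```

A caller would append whatever `recv()` returns to the same `bytearray` and call `extract_frames` after each read.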
Mirrors what real wheel-speed sensors + a brake-pressure feedback line would deliver to an ECU through CAN.
```c
#include <stdint.h>

typedef struct __attribute__((packed)) {
    uint32_t timestamp_ms;    // 4 B: plant clock at sample time
    float    v_vehicle;       // 4 B: m/s (longitudinal speed)
    float    omega_wheel;     // 4 B: rad/s (wheel angular speed)
    float    brake_pressure;  // 4 B: bar (feedback from hydraulic line)
    uint16_t fault_flags;     // 2 B: bitfield set by fault injector
} sensor_payload_t;           // 18 bytes
```
Endianness: little-endian. float is IEEE-754 binary32. The Python side packs using `struct.pack('<I fff H', ...)`.
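On the Python side, the same layout could be wrapped in a pair of helpers. The sketch below is illustrative: `SensorPayload`, `pack_sensor` and `unpack_sensor` are hypothetical names, while the format string follows the struct definition field by field.

```python
import struct
from typing import NamedTuple

SENSOR_STRUCT = struct.Struct("<IfffH")   # same field order as sensor_payload_t

class SensorPayload(NamedTuple):
    timestamp_ms: int
    v_vehicle: float       # m/s
    omega_wheel: float     # rad/s
    brake_pressure: float  # bar
    fault_flags: int

def pack_sensor(p: SensorPayload) -> bytes:
    """Serialise to the 18-byte little-endian wire layout."""
    return SENSOR_STRUCT.pack(*p)

def unpack_sensor(data: bytes) -> SensorPayload:
    """Parse 18 bytes back into named fields; raises struct.error on a size mismatch."""
    return SensorPayload(*SENSOR_STRUCT.unpack(data))

# Values chosen to be exactly representable in binary32, so the round trip is lossless.
sample = SensorPayload(1000, 27.5, 91.25, 42.0, 0)
raw = pack_sensor(sample)
```

Note that arbitrary Python floats are rounded to binary32 when packed, so a round trip is only bit-exact for values that fit in 32 bits.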
fault_flags bitfield (set by the Python fault injector — informative,
the ECU does NOT trust it for safety, only logs it):
| Bit | Name | Meaning |
|---|---|---|
| 0 | `FF_INJECTED_STUCK` | Sensor value frozen on purpose |
| 1 | `FF_INJECTED_NOISE` | Gaussian noise injected on omega |
| 2 | `FF_INJECTED_RANGE` | Out-of-range value injected |
| 3 | `FF_INJECTED_ICE` | Surface was hot-swapped to ice mid-run |
| 4..15 | reserved | must be 0 |
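The bit assignments map naturally onto a Python `IntFlag`. The following is a sketch only: the class and function names are hypothetical, and rejecting frames with reserved bits set is an assumption about how strictly "must be 0" should be enforced.

```python
from enum import IntFlag

class FaultFlags(IntFlag):
    FF_INJECTED_STUCK = 1 << 0
    FF_INJECTED_NOISE = 1 << 1
    FF_INJECTED_RANGE = 1 << 2
    FF_INJECTED_ICE = 1 << 3

RESERVED_MASK = 0xFFF0  # bits 4..15

def decode_fault_flags(raw: int) -> FaultFlags:
    """Return the known flags; raise if any reserved bit is set (assumed policy)."""
    if raw & RESERVED_MASK:
        raise ValueError(f"reserved fault_flags bits set: {raw:#06x}")
    return FaultFlags(raw)

# Example: stuck sensor plus out-of-range injection in the same frame.
flags = decode_fault_flags(0b0101)
```

Membership tests then read as `FaultFlags.FF_INJECTED_STUCK in flags`.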
```c
#include <stdint.h>

typedef struct __attribute__((packed)) {
    uint32_t timestamp_ms;    // 4 B: ECU clock at decision time
    float    brake_command;   // 4 B: bar (target line pressure)
    uint16_t ecu_status;      // 2 B: ABS state machine (see §4.3)
    uint16_t dtc_code;        // 2 B: Diagnostic Trouble Code bitfield (FMEA)
} actuator_payload_t;         // 12 bytes
```

| Value | Name | Meaning |
|---|---|---|
| 0 | `ECU_INIT` | Just booted, awaiting first valid frame |
| 1 | `ECU_STANDBY` | Vehicle stopped or below cut-in speed |
| 2 | `ECU_MONITOR` | Brake applied, slip within healthy band |
| 3 | `ECU_ACTIVE` | ABS modulating brake pressure |
| 4 | `ECU_FAULT_DEGRADED` | Fault detected; passing driver pedal through |
| 5 | `ECU_FAULT_LATCHED` | Repeated faults; locked in fault until reset |
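The plant has to decode this payload and interpret `ecu_status`. A hedged sketch of that step follows; `EcuStatus` mirrors the table above, while `unpack_actuator` is a hypothetical helper name.

```python
import struct
from enum import IntEnum

class EcuStatus(IntEnum):
    ECU_INIT = 0
    ECU_STANDBY = 1
    ECU_MONITOR = 2
    ECU_ACTIVE = 3
    ECU_FAULT_DEGRADED = 4
    ECU_FAULT_LATCHED = 5

ACTUATOR_STRUCT = struct.Struct("<IfHH")  # same field order as actuator_payload_t

def unpack_actuator(data: bytes):
    """Return (timestamp_ms, brake_command, EcuStatus, dtc_code).

    EcuStatus(...) raises ValueError for a value outside 0..5, which
    surfaces an unknown state instead of silently accepting it.
    """
    ts, brake_cmd, status, dtc = ACTUATOR_STRUCT.unpack(data)
    return ts, brake_cmd, EcuStatus(status), dtc

# Example payload: ABS active, 35.5 bar commanded, no trouble codes.
raw = ACTUATOR_STRUCT.pack(2000, 35.5, EcuStatus.ECU_ACTIVE, 0)
decoded = unpack_actuator(raw)
```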
Aligned with docs/FMEA.md (week 3). Stable subset:
| Bit | Name | Source mode |
|---|---|---|
| 0 | `DTC_SENSOR_STUCK` | F01 |
| 1 | `DTC_SENSOR_NOISE` | F02 |
| 2 | `DTC_COMM_TIMEOUT` | F03 |
| 3 | `DTC_COMM_CRC` | F04 |
| 4 | `DTC_SENSOR_RANGE` | F05 |
| 5 | `DTC_PLAUSIBILITY` | F06 |
| 6..15 | reserved | |
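For logging, the plant may want to translate a raw `dtc_code` into the FMEA source modes listed above. A minimal sketch, with hypothetical names (`Dtc`, `FMEA_MODE`, `fmea_modes`) and the bit-to-mode mapping copied from the table:

```python
from enum import IntFlag

class Dtc(IntFlag):
    DTC_SENSOR_STUCK = 1 << 0
    DTC_SENSOR_NOISE = 1 << 1
    DTC_COMM_TIMEOUT = 1 << 2
    DTC_COMM_CRC = 1 << 3
    DTC_SENSOR_RANGE = 1 << 4
    DTC_PLAUSIBILITY = 1 << 5

FMEA_MODE = {
    Dtc.DTC_SENSOR_STUCK: "F01",
    Dtc.DTC_SENSOR_NOISE: "F02",
    Dtc.DTC_COMM_TIMEOUT: "F03",
    Dtc.DTC_COMM_CRC: "F04",
    Dtc.DTC_SENSOR_RANGE: "F05",
    Dtc.DTC_PLAUSIBILITY: "F06",
}

def fmea_modes(raw: int) -> list:
    """List the FMEA source modes for every DTC bit set in raw, lowest bit first."""
    return [mode for flag, mode in FMEA_MODE.items() if raw & flag]

print(fmea_modes(0b001100))  # ['F03', 'F04']
```

Reserved bits (6..15) are simply ignored by this helper; whether they should instead be reported is left open here.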
We use CRC-16/CCITT-FALSE (a.k.a. CRC-16/AUTOSAR, polynomial 0x1021)
because it is one of the CRCs AUTOSAR specifies for end-to-end protection of bus signals,
which is the closest real-world analogue to our scenario.
| Parameter | Value |
|---|---|
| Polynomial | 0x1021 |
| Initial value | 0xFFFF |
| RefIn | false |
| RefOut | false |
| XorOut | 0x0000 |
| Check ("123456789") | 0x29B1 |
The "check" value is the canonical known-answer test (KAT): both the C and Python
implementations MUST reproduce 0x29B1 on the ASCII byte string "123456789"
in their unit tests.
The CRC is computed over type || length || payload (the bytes between the
magic and the CRC field itself). Magic is excluded by design — if the magic
is corrupted, framing recovery handles it, not CRC.
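A straightforward bitwise implementation of these parameters is sketched below, mainly to show how the known-answer test can be wired up. The function name `crc16_ccitt_false` is illustrative; a table-driven variant would be faster but behaves identically.

```python
def crc16_ccitt_false(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection, no final XOR."""
    for byte in data:
        crc ^= byte << 8                  # feed the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

# Known-answer test from the parameter table.
print(hex(crc16_ccitt_false(b"123456789")))  # 0x29b1
```

Because there is no reflection and no final XOR, the running value can be passed back in as `crc` to checksum type, length and payload in separate calls.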
- ECU rate: 100 Hz (10 ms period). One `sensor_frame` consumed, one `actuator_frame` produced per cycle.
- Plant produces sensor frames at 100 Hz too, lock-stepped to the same 10 ms grid.
- Watchdog at the ECU: if no valid `sensor_frame` arrives within 50 ms, the ECU enters `ECU_FAULT_DEGRADED` and passes the last known safe driver pedal through.
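The watchdog rule is simple enough to model in a few lines. The sketch below is written in Python for easy testing even though the real watchdog lives in the C controller; the class name is hypothetical, and treating a frame that arrives at exactly 50 ms as still "within" the window is an assumption.

```python
WATCHDOG_TIMEOUT_MS = 50

class SensorWatchdog:
    """Tracks the time of the last valid sensor_frame (illustrative model)."""

    def __init__(self, now_ms: int, timeout_ms: int = WATCHDOG_TIMEOUT_MS):
        self.timeout_ms = timeout_ms
        self.last_valid_ms = now_ms

    def feed(self, now_ms: int) -> None:
        """Call only after a frame has passed the length and CRC checks."""
        self.last_valid_ms = now_ms

    def expired(self, now_ms: int) -> bool:
        """True once more than timeout_ms has passed without a valid frame."""
        return now_ms - self.last_valid_ms > self.timeout_ms

# Example on the 10 ms grid: five missed cycles are tolerated, the sixth trips.
wd = SensorWatchdog(now_ms=0)
```

When `expired()` returns true, the controller would switch to `ECU_FAULT_DEGRADED` as described above.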
Note: in pure SIL, both ends share the same wall clock, but each side stamps
its OWN clock in timestamp_ms so that we can measure round-trip latency.
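Since `timestamp_ms` is a `uint32_t`, it wraps after 2^32 ms (roughly 49.7 days), so any latency arithmetic should subtract modulo 2^32 and compare stamps taken from the same clock. A small sketch, with `elapsed_ms` as a hypothetical helper name:

```python
U32_MASK = 0xFFFFFFFF

def elapsed_ms(later: int, earlier: int) -> int:
    """Difference of two uint32 millisecond stamps, correct across a single wrap."""
    return (later - earlier) & U32_MASK

print(elapsed_ms(1010, 1000))        # 10
print(elapsed_ms(5, 0xFFFFFFFE))     # 7
```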
Both sides MUST static_assert (C) / assert (Python) these sizes:
| Frame | Total wire size |
|---|---|
| sensor frame (envelope + payload) | 7 + 18 = 25 B |
| actuator frame (envelope + payload) | 7 + 12 = 19 B |
At 100 Hz: ~2.5 KB/s plant→ECU (sensor frames), ~1.9 KB/s ECU→plant (actuator frames), i.e. roughly 20 and 15 kbit/s. Nowhere near saturation, even compared with a 125 kbit/s low-speed CAN bus.
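On the Python side the size checks can run at import time, derived from the same format strings used for packing. The constant names below are illustrative; the envelope size is computed from the header and CRC formats rather than hard-coded, so it always follows the field table.

```python
import struct

SENSOR_FMT = "<IfffH"     # sensor_payload_t
ACTUATOR_FMT = "<IfHH"    # actuator_payload_t
HEADER_FMT = "<HBH"       # magic, type, length
CRC_FMT = "<H"

ENVELOPE_SIZE = struct.calcsize(HEADER_FMT) + struct.calcsize(CRC_FMT)

# Import-time checks: fail fast if a format string drifts from the C structs.
assert struct.calcsize(SENSOR_FMT) == 18
assert struct.calcsize(ACTUATOR_FMT) == 12

SENSOR_FRAME_SIZE = ENVELOPE_SIZE + struct.calcsize(SENSOR_FMT)
ACTUATOR_FRAME_SIZE = ENVELOPE_SIZE + struct.calcsize(ACTUATOR_FMT)

def bytes_per_second(frame_size: int, rate_hz: int = 100) -> int:
    """Raw TCP payload rate for one direction at the given frame rate."""
    return frame_size * rate_hz
```

The C counterpart would be `static_assert(sizeof(sensor_payload_t) == 18, "...")` and the equivalent for `actuator_payload_t`.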
This document is PROTOCOL v1. Any breaking change (field added, type
renumbered, CRC algorithm swapped) bumps to v2 AND updates the magic to a
fresh constant (e.g. 0xAA56) so that an old receiver cannot mistake a new
frame for a valid v1 one.