
A State-of-the-Art Healthcare Companion Robot
Real-time biometric monitoring · Expressive face animation · Intelligent medical assessment
Powered by a dual-layer C++17 / Python 3.11 architecture on Raspberry Pi 5
Created by Gabriel Calderon
Requested by Elias Bautista
https://github.com/chele-s/BaymaxMini.git
- Overview
- System Architecture
- Hardware Layer
- Mathematical Foundations
- C++ Real-Time Core
- Python Brain
- IPC Protocol
- Build & Deploy
- Configuration
- Project Structure
BaymaxMini is a healthcare companion robot inspired by the fictional nurse-bot Baymax from Disney's Big Hero 6. It implements a hard real-time control loop in C++17 running at 50 Hz on bare Linux, coupled with a high-level Python brain that handles natural language understanding, computer vision, medical reasoning, and emotional intelligence.
The system runs on a Raspberry Pi 5. The two layers communicate over ZeroMQ PUB/SUB using checksummed binary telemetry frames, targeting IPC latency below 1 ms while keeping the deterministic hardware layer fully decoupled from the non-deterministic AI layer.
| Domain | Capability | Implementation |
|---|---|---|
| Biometrics | Heart rate, SpO₂, skin temperature | MAX30102 PPG + MLX90614 IR thermometry |
| Proximity | Time-of-Flight distance sensing | VL53L1X SPAD array (up to 4m) |
| Expression | 10 emotions, blinks, gaze, micro-movements | 12 servo channels on a PCA9685 (16-channel PWM driver), driven by spring dynamics |
| Power | Real-time energy monitoring & protection | INA219 current/voltage sensing with hysteresis FSM |
| Vision | Person detection, face analysis | YOLOv8n (FP16) + Single-Pass Geometric Deduction |
| Audio | Speech recognition, TTS, intent parsing | Whisper STT + edge TTS |
| Medical | Health assessment, medication reminders | Rule-based clinical heuristics + patient history DB |
| Intelligence | State machine, event bus, scheduling | Finite automaton with priority event dispatch |
┌─────────────────────────────────────────────────────────────┐
│                    Python Brain (non-RT)                    │
│ ┌─────────┐ ┌──────────┐ ┌────────┐ ┌───────────────────┐   │
│ │ Vision  │ │  Audio   │ │Medical │ │    Core Logic     │   │
│ │ YOLOv8  │ │STT / TTS │ │Assess. │ │   StateMachine    │   │
│ │ FaceAn. │ │IntentPrs │ │History │ │  EventBus/Sched.  │   │
│ └────┬────┘ └────┬─────┘ └───┬────┘ └─────────┬─────────┘   │
│      └───────────┴───────────┼────────────────┘             │
│                              │ ZMQ PUB/SUB                  │
│                tcp://127.0.0.1:5555 (CMD ↓)                 │
│                tcp://127.0.0.1:5556 (TELEM ↑)               │
├─────────────────────────────────────────────────────────────┤
│                 C++ Real-Time Core (50 Hz)                  │
│   ┌───────────────────────────────────────────────────┐     │
│   │                   Orchestrator                    │     │
│   │  tick() → readSensors → processCmd → logic → out  │     │
│   └──────┬──────────┬─────────┬───────────────────────┘     │
│   ┌──────┴───┐ ┌────┴─────┐ ┌─┴──────────┐                  │
│   │   Face   │ │  Vitals  │ │   Power    │  Modules         │
│   │Controller│ │ Monitor  │ │   System   │                  │
│   └──────┬───┘ └────┬─────┘ └─┬──────────┘                  │
│   ┌──────┴──────────┴─────────┴──────────────────────┐      │
│   │                I²C Bus (400 kHz)                 │      │
│   │ PCA9685 · VL53L1X · MAX30102 · MLX90614 · INA219 │      │
│   └──────────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────────┘
| Chip | Type | Address | Function | Key Specs |
|---|---|---|---|---|
| PCA9685 | PWM Driver | 0x40 | 12-channel servo control | 12-bit resolution, 50 Hz PWM |
| VL53L1X | ToF Sensor | 0x29 | Proximity detection | 4 m range, SPAD array, ±3% accuracy |
| MAX30102 | Pulse Oximeter | 0x57 | Heart rate + SpO₂ | Dual LED (660 nm red / 880 nm IR), 18-bit ADC |
| MLX90614 | IR Thermometer | 0x5A | Contactless body temp | ±0.5°C accuracy, 0.02°C resolution |
| INA219 | Power Monitor | 0x41 | Battery voltage/current | 12-bit ADC, ±1% current accuracy |
- Chemistry: 3S LiPo (Lithium Polymer)
- Nominal: 11.1 V · Full: 12.6 V · Empty: 9.0 V
- Capacity: 2200 mAh
- Protection: Overcurrent 3A trip / 2.5A clear, servo stall 2A / 1.2A, shutdown at 8.5V
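The core DSP primitive is a Direct Form I biquad filter, implementing the general second-order transfer function:

$$H(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{a_0 + a_1 z^{-1} + a_2 z^{-2}}$$

The difference equation computed each sample, with every coefficient already divided by $a_0$:

$$y[n] = b_0\,x[n] + b_1\,x[n-1] + b_2\,x[n-2] - a_1\,y[n-1] - a_2\,y[n-2]$$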
Coefficients are derived from the analog prototype via the bilinear transform. For a lowpass Butterworth with cutoff frequency
All coefficients are normalized by $a_0$.
For steeper roll-off, biquad sections are cascaded to form higher-order Butterworth filters. An $N$th-order filter uses
This ensures a maximally flat magnitude response in the passband: zero ripple by construction.
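As a minimal sketch of the Direct Form I biquad and a bilinear-transform Butterworth lowpass, the Python below mirrors the math rather than the project's C++ code. The names `Biquad` and `butterworth_lowpass`, the single 2nd-order section with $Q = 1/\sqrt{2}$, and the 50 Hz sample rate in the usage note are assumptions for illustration.

```python
import math


class Biquad:
    """Direct Form I second-order section; coefficients are pre-normalized (a0 = 1)."""

    def __init__(self, b0, b1, b2, a1, a2):
        self.b0, self.b1, self.b2, self.a1, self.a2 = b0, b1, b2, a1, a2
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0  # delay lines

    def process(self, x):
        y = (self.b0 * x + self.b1 * self.x1 + self.b2 * self.x2
             - self.a1 * self.y1 - self.a2 * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def butterworth_lowpass(fc, fs, q=1.0 / math.sqrt(2.0)):
    """2nd-order lowpass from the analog prototype 1 / (s^2 + s/Q + 1).

    Uses the bilinear transform with pre-warping: K = tan(pi * fc / fs).
    """
    k = math.tan(math.pi * fc / fs)
    norm = 1.0 / (1.0 + k / q + k * k)  # this is the 1/a0 normalization
    b0 = k * k * norm
    return Biquad(b0, 2.0 * b0, b0,
                  2.0 * (k * k - 1.0) * norm,
                  (1.0 - k / q + k * k) * norm)
```

For example, `butterworth_lowpass(5.0, 50.0)` gives a 5 Hz lowpass at an assumed 50 Hz sample rate; it passes DC with unity gain and attenuates by 3 dB at the cutoff.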
Used throughout for smoothing sensor data with minimal latency:
The smoothing constant
Removes DC offset from AC-coupled signals (critical for PPG processing):
The pole at
Non-linear filter for impulse noise rejection. For three samples
Implemented via a sorting network (3 conditional swaps) for branchless execution on ARM.
Constrains the rate of change to protect servo mechanisms:
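The four small primitives described above (exponential moving average, DC blocker, median-of-3, slew-rate limiter) can be sketched as follows. This is an illustrative Python rendering using the textbook forms of each filter; the function names, the DC-blocker pole `r = 0.995`, and the parameter names are assumptions, and the Python `if` swaps are ordinary branches rather than the branchless ARM code the text describes.

```python
def ema(prev, x, alpha):
    """Exponential moving average: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    return prev + alpha * (x - prev)


class DCBlocker:
    """First-order DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1], with r just below 1."""

    def __init__(self, r=0.995):  # r is an assumed value
        self.r = r
        self.x1 = 0.0
        self.y1 = 0.0

    def process(self, x):
        y = x - self.x1 + self.r * self.y1
        self.x1, self.y1 = x, y
        return y


def median3(a, b, c):
    """Median of three samples via a 3-step compare-exchange sorting network."""
    if a > b:
        a, b = b, a
    if b > c:
        b, c = c, b
    if a > b:
        a, b = b, a
    return b


def slew_limit(prev, target, max_rate, dt):
    """Move from prev toward target by at most max_rate * dt."""
    max_step = max_rate * dt
    delta = max(-max_step, min(max_step, target - prev))
    return prev + delta
```

A single impulse is rejected by the median (`median3(1.0, 50.0, 1.2)` returns `1.2`), while the slew limiter caps a step command at `max_rate * dt` per tick.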
To limit rounding drift during long-running real-time execution, trigonometric functions and constants come from boost::math. Phase calculations use boost::math::sin_pi(), which evaluates sin(πx) directly from x and so avoids the rounding error introduced by first multiplying x by an inexact π.
Real-time PPG signal quality tracking utilizes boost::accumulators::accumulator_set, extracting the rolling mean and variance of human vitals in a strictly
The MAX30102 outputs raw photon count data from red (660nm) and infrared (880nm) LEDs. The full signal processing chain:
Raw IR/Red → Median₃ → DC Removal → Butterworth LPF (5 Hz) → Peak Detection → BPM
A composite quality score determines reading validity:
Where:
The ratio score validates adequate perfusion (AC/DC > 0.02), while the stability score penalizes high coefficient of variation.
Heart beats are detected via zero-crossing of the signal slope with a refractory period and adaptive threshold:
- Refractory guard: Ignore peaks within 300ms of the last detection
- Slope analysis: detect the sign change $\text{slope}_{n-1} > 0 \land \text{slope}_n \leq 0$
- Adaptive threshold: $\tau[n] = 0.7 \cdot \tau[n-1] + 0.3 \cdot \text{peak amplitude}$
- Threshold decay: $\tau[n] \leftarrow 0.998 \cdot \tau[n]$ (adapts to signal loss)
- BPM computation: $\text{BPM} = \frac{60 \cdot f_s}{\Delta_{\text{samples}}}$, smoothed with EMA ($\alpha = 0.3$)
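The detection steps listed above can be sketched in Python as follows. This is not the project's `DigitalFilter.h` code: the class name `PeakDetector`, the rule that a peak must exceed the current threshold to count, and applying the decay once per sample are assumptions made to turn the list into something runnable.

```python
class PeakDetector:
    """Slope-sign-change beat detector with refractory guard and adaptive threshold."""

    def __init__(self, fs, refractory_s=0.3, bpm_alpha=0.3):
        self.fs = fs
        self.refractory = int(round(refractory_s * fs))  # 300 ms in samples
        self.bpm_alpha = bpm_alpha
        self.threshold = 0.0
        self.prev = 0.0
        self.prev_slope = 0.0
        self.n = 0
        self.last_peak = None
        self.bpm = None

    def process(self, x):
        """Feed one filtered PPG sample; return True when a beat is detected."""
        slope = x - self.prev
        beat = False
        # A local maximum sits at the previous sample when the slope
        # turns from positive to non-positive.
        if self.prev_slope > 0 and slope <= 0:
            peak_idx, peak_amp = self.n - 1, self.prev
            outside_refractory = (self.last_peak is None
                                  or peak_idx - self.last_peak >= self.refractory)
            if outside_refractory and peak_amp > self.threshold:
                if self.last_peak is not None:
                    inst = 60.0 * self.fs / (peak_idx - self.last_peak)
                    self.bpm = inst if self.bpm is None else (
                        self.bpm + self.bpm_alpha * (inst - self.bpm))
                self.last_peak = peak_idx
                self.threshold = 0.7 * self.threshold + 0.3 * peak_amp
                beat = True
        self.threshold *= 0.998  # slow decay so the detector recovers after signal loss
        self.prev, self.prev_slope = x, slope
        self.n += 1
        return beat
```

Feeding a clean 1.2 Hz sine sampled at 50 Hz should settle near 72 BPM; real PPG data would first pass through the median, DC-removal, and lowpass stages.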
All servo channels (eyelids, gaze, brows, cheeks) are driven by critically damped spring dynamics, ensuring smooth, natural motion without overshoot:
The exact closed-form solution for each timestep
This provides:
- Zero overshoot (critically damped: $\zeta = 1$)
- Exponential convergence at rate $\omega$ (default $\omega = 12$ rad/s)
- Exact integration: no numerical drift regardless of timestep
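For reference, the standard closed-form step of a critically damped spring ($\ddot{x} = -\omega^2 (x - x_{\text{target}}) - 2\omega\dot{x}$) can be sketched as below. The function name `spring_step` and its signature are assumptions; the project's C++ implementation may be organized differently.

```python
import math


def spring_step(x, v, target, omega, dt):
    """Advance a critically damped spring (zeta = 1) by dt using the exact solution.

    With d = x - target and c = v + omega * d:
        d(dt) = (d + c * dt) * exp(-omega * dt)
        v(dt) = (v - c * omega * dt) * exp(-omega * dt)
    """
    d = x - target
    c = v + omega * d
    decay = math.exp(-omega * dt)
    new_x = target + (d + c * dt) * decay
    new_v = (v - c * omega * dt) * decay
    return new_x, new_v
```

Because each step is the analytic solution rather than an Euler update, one 100 ms step and five 20 ms steps land on the same state up to floating-point rounding, which is what makes the motion independent of tick jitter.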
The SpringAnimator further extends this with configurable damping ratios for underdamped (
A procedural animation system providing frequency-controlled, stability-guaranteed smoothing with anticipation:
Where the constants derive from desired natural frequency
The
Cubic Hermite interpolation between two points
With basis functions:
A special case of Hermite interpolation with auto-computed tangents:
Hermite-basis polynomial interpolation with clamped input:
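The three interpolators named above share the same cubic Hermite basis. The sketch below uses the standard basis polynomials and the common Catmull-Rom tangent choice (half the difference of the neighbouring points); the function names and the scalar-only signatures are assumptions for illustration.

```python
def hermite(p0, m0, p1, m1, t):
    """Cubic Hermite interpolation between p0 and p1 with tangents m0 and m1, t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1
    h10 = t3 - 2 * t2 + t
    h01 = -2 * t3 + 3 * t2
    h11 = t3 - t2
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1


def catmull_rom(p_prev, p0, p1, p_next, t):
    """Hermite segment p0..p1 with tangents computed from the neighbouring points."""
    return hermite(p0, 0.5 * (p1 - p_prev), p1, 0.5 * (p_next - p0), t)


def smoothstep(edge0, edge1, x):
    """Clamped 3t^2 - 2t^3, i.e. a Hermite segment from 0 to 1 with zero tangents."""
    t = min(1.0, max(0.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)
```

`smoothstep(0, 1, t)` and `hermite(0, 0, 1, 0, t)` agree for `t` in `[0, 1]`, which is the sense in which smoothstep is a special case of the Hermite form.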
| Family | Mathematical Form |
|---|---|
| Sine | |
| Quadratic | |
| Cubic | |
| Quartic | |
| Quintic | |
| Exponential | |
| Circular | |
| Back | |
| Elastic | |
| Bounce | Piecewise parabolic with |
Each family provides in, out, and inOut variants (30 total), plus smoothstep and smootherstep.
CSS-style timing functions via parametric cubic Bézier
Given a target
Pre-built presets: easeInOut, snappy, organic, overshoot, rubberBand, anticipate, heavyLanding.
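A CSS-style cubic Bézier easing has to invert the x-polynomial to find the curve parameter before evaluating y. The sketch below does this with a few Newton-Raphson iterations and a bisection fallback; that solver choice, the function name `cubic_bezier_ease`, and the control points in the usage note (the CSS `ease-in-out` values, not necessarily the project's `easeInOut` preset) are assumptions.

```python
def cubic_bezier_ease(x1, y1, x2, y2, x, iterations=8):
    """Evaluate a CSS-style easing with control points (0,0), (x1,y1), (x2,y2), (1,1)."""

    def bez(a, b, t):
        # 1-D cubic Bezier with endpoints 0 and 1 and inner control values a, b
        return 3 * a * t * (1 - t) ** 2 + 3 * b * t * t * (1 - t) + t ** 3

    def dbez(a, b, t):
        # derivative of bez with respect to t
        return 3 * a * (1 - t) ** 2 + 6 * (b - a) * t * (1 - t) + 3 * (1 - b) * t * t

    x = min(1.0, max(0.0, x))
    t = x  # initial guess
    for _ in range(iterations):
        err = bez(x1, x2, t) - x
        slope = dbez(x1, x2, t)
        if abs(err) < 1e-7 or abs(slope) < 1e-9:
            break
        t = min(1.0, max(0.0, t - err / slope))

    # Bisection fallback if Newton-Raphson did not converge
    if abs(bez(x1, x2, t) - x) > 1e-6:
        lo, hi = 0.0, 1.0
        for _ in range(40):
            t = 0.5 * (lo + hi)
            if bez(x1, x2, t) < x:
                lo = t
            else:
                hi = t
    return bez(y1, y2, t)
```

For instance, `cubic_bezier_ease(0.42, 0.0, 0.58, 1.0, x)` starts slowly, passes through 0.5 at `x = 0.5`, and eases out symmetrically.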
Estimated via a piecewise-linear lookup table against cell voltage, derived from empirical Li-ion discharge curves:
| Cell V | 3.00 | 3.30 | 3.50 | 3.60 | 3.70 | 3.75 | 3.80 | 3.85 | 3.95 | 4.10 | 4.20 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SoC % | 0 | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 90 | 100 |
Cell voltage derived from bus voltage:
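A minimal sketch of the lookup, using the table above. It assumes the per-cell voltage is the bus voltage divided by the cell count (3 for the 3S pack) and that values outside the table are clamped; the names `CELL_V`, `SOC_PCT`, and `soc_from_bus_voltage` are illustrative.

```python
CELL_V = [3.00, 3.30, 3.50, 3.60, 3.70, 3.75, 3.80, 3.85, 3.95, 4.10, 4.20]
SOC_PCT = [0, 5, 10, 20, 30, 40, 50, 60, 70, 90, 100]


def soc_from_bus_voltage(bus_v, cells=3):
    """Piecewise-linear state of charge (%) from pack voltage."""
    v = bus_v / cells  # assumed: balanced cells, so per-cell voltage is the average
    if v <= CELL_V[0]:
        return 0.0
    if v >= CELL_V[-1]:
        return 100.0
    for i in range(1, len(CELL_V)):
        if v <= CELL_V[i]:
            frac = (v - CELL_V[i - 1]) / (CELL_V[i] - CELL_V[i - 1])
            return SOC_PCT[i - 1] + frac * (SOC_PCT[i] - SOC_PCT[i - 1])
    return 100.0
```

With this table the nominal pack voltage of 11.1 V (3.70 V per cell) maps to roughly 30%, and 11.4 V to roughly 50%.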
All protection thresholds use Schmitt-trigger hysteresis to prevent chattering:
┌──────────────┐    value > trip_threshold     ┌──────────────┐
│              │ ────────────────────────────► │              │
│   INACTIVE   │                               │    ACTIVE    │
│              │ ◄──────────────────────────── │              │
└──────────────┘    value < clear_threshold    └──────────────┘
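The two-threshold behaviour in the diagram can be sketched as a small class. The name `Hysteresis` is an assumption; the 3 A trip and 2.5 A clear values in the usage note are the overcurrent thresholds listed in the battery section.

```python
class Hysteresis:
    """Schmitt-trigger style comparator: trips above `trip`, clears below `clear`."""

    def __init__(self, trip, clear):
        if clear >= trip:
            raise ValueError("clear threshold must be below trip threshold")
        self.trip = trip
        self.clear = clear
        self.active = False

    def update(self, value):
        if not self.active and value > self.trip:
            self.active = True
        elif self.active and value < self.clear:
            self.active = False
        return self.active
```

With `Hysteresis(trip=3.0, clear=2.5)`, a current that rises to 3.1 A and then hovers at 2.8 A stays flagged until it drops below 2.5 A, so noise around a single threshold cannot toggle the protection on every tick.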
Blood oxygen saturation is derived from the ratio of ratios (R) of the red and infrared PPG signals:
SpO₂ is then estimated via the empirically calibrated linear approximation:
This derives from the Beer-Lambert law of optical absorption:
Where
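As an illustration of the ratio-of-ratios computation, the sketch below uses the usual definition $R = (AC_{red}/DC_{red}) / (AC_{ir}/DC_{ir})$ and a linear map of the form $a - b \cdot R$. The coefficients `a = 110` and `b = 25` are a commonly quoted textbook approximation and are an assumption here, not the project's calibration; the function names are also illustrative.

```python
def ratio_of_ratios(ac_red, dc_red, ac_ir, dc_ir):
    """R = (AC_red / DC_red) / (AC_ir / DC_ir)."""
    return (ac_red / dc_red) / (ac_ir / dc_ir)


def spo2_linear(r, a=110.0, b=25.0):
    """Linear SpO2 estimate a - b * R, clamped to 0..100 %.

    a and b are assumed placeholder coefficients; a real device needs
    its own empirical calibration.
    """
    return max(0.0, min(100.0, a - b * r))
```

With these placeholder coefficients, `R = 0.5` maps to 97.5% and `R = 1.0` to 85%.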
The INA219 current sense amplifier requires precise register calibration:
With the configured
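The INA219 datasheet defines the calibration register as `trunc(0.04096 / (Current_LSB × R_shunt))`, with `Current_LSB` chosen as the maximum expected current divided by 2^15 and `Power_LSB = 20 × Current_LSB`. The sketch below applies those formulas; the function name and the example values (3.2 A full scale, 0.1 Ω shunt) are assumptions, not the project's configured values.

```python
def ina219_calibration(max_current_a, r_shunt_ohm):
    """Return (calibration register value, current LSB in A, power LSB in W)."""
    current_lsb = max_current_a / 32768.0          # max expected current / 2^15
    cal = int(0.04096 / (current_lsb * r_shunt_ohm))  # truncated, per datasheet
    if not 0 < cal <= 0xFFFF:
        raise ValueError("calibration value does not fit the 16-bit register")
    power_lsb = 20.0 * current_lsb
    return cal, current_lsb, power_lsb
```

For the assumed 3.2 A range and 0.1 Ω shunt this yields a calibration value of 4194 and a current resolution of about 98 µA per bit.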
Servo angle
Default range:
At $f_{\text{PWM}} = 50$Hz:
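The mapping from angle to pulse width to 12-bit PCA9685 counts can be sketched as below. At 50 Hz the PWM period is 20 000 µs, so one count is 20 000 / 4096 ≈ 4.88 µs. The 500 to 2500 µs pulse range, the 180° travel, and the function names are assumptions for illustration rather than the project's configured defaults.

```python
def angle_to_pulse_us(angle_deg, min_us=500.0, max_us=2500.0, max_angle=180.0):
    """Linear map from servo angle to pulse width (range values are assumed)."""
    a = min(max_angle, max(0.0, angle_deg))
    return min_us + (max_us - min_us) * a / max_angle


def pulse_us_to_counts(pulse_us, freq_hz=50.0, resolution=4096):
    """Convert a pulse width to PCA9685 on-time counts within one PWM period."""
    period_us = 1_000_000.0 / freq_hz
    return int(round(pulse_us / period_us * resolution))
```

A centred servo (90°, 1500 µs under these assumptions) corresponds to `pulse_us_to_counts(1500)`, which is 307 counts out of 4096.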
The core runs a fixed-timestep loop at 50 Hz ($\Delta t = 20$ms) with overrun detection:
while (running) {
    nextTick += 20ms;
    readSensors();        // I²C reads from all 5 chips
    processCommands();    // ZMQ SUB (non-blocking)
    updateLogic();        // FSM + spring dynamics + animation
    writeOutputs();       // Telemetry PUB + servo PWM
    sleep_until(nextTick);
    if (overrun > 3 ticks) { resync; overrun_count++; }
}
BOOT ──(ZMQ connected)──► CONNECTED ──(brain timeout 2s)──► AUTONOMOUS
 ▲                            ▲                                 │
 │                            └─────────(CMD received)──────────┘
 │
 └──────────────── SHUTDOWN ◄──(SIGINT/SIGTERM or critical battery)
Manages 12 servo channels mapped to facial features via spring-damped dynamics:
| Channel | Servo | Feature | Spring ω |
|---|---|---|---|
| 0-1 | Left eyelids | Upper/Lower lid | 12 rad/s |
| 2-3 | Right eyelids | Upper/Lower lid | 12 rad/s |
| 4-5 | Left gaze | Horizontal/Vertical | 12 rad/s |
| 6-7 | Right gaze | Horizontal/Vertical | 12 rad/s |
| 8-9 | Cheeks | Left/Right raise | 12 rad/s |
| 10-11 | Brows | Left/Right position | 12 rad/s |
10 Expression Poses: Neutral, Happy, Sad, Surprised, Angry, Sleepy, Concerned, Curious, Love, Thinking. Each is defined as a target pose for all 12 channels with smooth spring-interpolated transitions.
Blink System: Asymmetric timing modeled after human physiology (close: 75 ms, hold: 35 ms, open: 130 ms, ratio 1.73:1 close:open speed). Auto-blinks at 2-6 s intervals with 18% variance.
Micro-Movements: Continuous subtle animation to avoid the uncanny valley:
- Gaze jitter: 0.02 amplitude, 0.3 Hz
- Gaze drift: 0.05 amplitude, 0.08 Hz
- Lid twitch: 0.015 amplitude, 0.15 Hz
- Brow drift: 0.01 amplitude, 0.06 Hz
State machine for clinical-grade vital sign acquisition:
IDLE → DETECTING_FINGER → STABILIZING (2s) → MEASURING (5-10s) → COMPLETE
                                  ↓
                 ERROR_NO_FINGER / ERROR_POOR_SIGNAL
Health assessment heuristics:
| Condition | Threshold | Assessment |
|---|---|---|
| Normal HR | 50-110 BPM | NORMAL |
| Tachycardia | > 110 BPM | ELEVATED_HR |
| Bradycardia | < 50 BPM | LOW_HR |
| Hypoxemia | SpO₂ < 95% | LOW_SPO2 |
| Fever | > 38.0°C | FEVER |
| Hypothermia | < 36.0°C | HYPOTHERMIA |
| Critical | HR < 40 or > 180, SpO₂ < 90% | CRITICAL |
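The threshold table above can be read as a small rule function. In this sketch the function name `assess`, the choice to report CRITICAL on its own ahead of the other flags, and returning a list of labels are assumptions; only the thresholds and label names come from the table.

```python
def assess(hr_bpm, spo2_pct, temp_c):
    """Apply the heuristic thresholds and return a list of assessment labels."""
    # Critical conditions take precedence over everything else (assumed ordering).
    if hr_bpm < 40 or hr_bpm > 180 or spo2_pct < 90:
        return ["CRITICAL"]

    flags = []
    if hr_bpm > 110:
        flags.append("ELEVATED_HR")
    elif hr_bpm < 50:
        flags.append("LOW_HR")
    if spo2_pct < 95:
        flags.append("LOW_SPO2")
    if temp_c > 38.0:
        flags.append("FEVER")
    elif temp_c < 36.0:
        flags.append("HYPOTHERMIA")
    return flags or ["NORMAL"]
```

For example, `assess(72, 98, 36.8)` returns `["NORMAL"]`, while `assess(120, 93, 38.5)` returns `["ELEVATED_HR", "LOW_SPO2", "FEVER"]`.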
Real-time energy management with multi-layered protection:
- EMA-filtered voltage, current, power readings
- Energy accumulation by integrating power over time (Coulomb-counting style): $E \mathrel{+}= P \cdot \Delta t$
- 7-state FSM: NOMINAL → LOW_BATTERY → CRITICAL → OVERCURRENT → SERVO_STALL → OVERTEMP → SHUTDOWN
- Callbacks for face alert mode and emergency shutdown
| Subsystem | Modules | Purpose |
|---|---|---|
| core | brain.py, event_bus.py, logger.py | Central coordinator, pub/sub events, structured logging |
| vision | camera.py, detector.py, face_analyzer.py, object_detector.py | Single YOLOv8n FP16 orchestrator, CPU-light geometric facial deduction |
| audio | stt_engine.py, tts_engine.py, intent_parser.py, sound_fx.py, audio_alerts.py | Speech-to-text, text-to-speech, NLU intent extraction, sound effects |
| medical | patient_history.py, pharmacist.py, scheduler.py | Patient records, medication database, reminder scheduling |
| communication | zmq_link.py, telemetry.py | ZMQ bridge to C++ core, telemetry deserialization |
| logic | state_machine.py, scheduler.py | Behavioral FSM, task scheduling |
| config | settings.py, db_migrate.py | YAML config loader, database migrations |
| utils | time_utils.py | Timezone-aware time helpers |
Binary packed struct with djb2 checksum verification:
| Field | Type | Offset | Description |
|---|---|---|---|
| magic | u32 | 0 | 0xBABE0001 |
| version | u32 | 4 | Protocol version |
| timestamp_us | u64 | 8 | Microsecond timestamp |
| sequence | u32 | 16 | Monotonic counter |
| distance_mm | f32 | 20 | VL53L1X range |
| heart_rate_bpm | f32 | 24 | Filtered heart rate |
| spo2_percent | f32 | 28 | Blood oxygen % |
| skin_temp_c | f32 | 32 | Body temperature |
| bus_voltage_v | f32 | 40 | Battery voltage |
| battery_pct | f32 | 52 | State of charge |
| eyelid_openness | f32 | 56 | Current lid state |
| gaze_x / gaze_y | f32 | 60, 64 | Eye direction |
| state | u8 | - | System state enum |
| expression | u8 | - | Current face expression |
| alert | u8 | - | Alert level |
| checksum | u32 | last | djb2 integrity check |
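On the Python side, verifying and decoding a frame might look like the sketch below. It implements the standard djb2 hash (seed 5381, multiply by 33, add each byte) truncated to 32 bits and unpacks only the leading fields whose offsets are listed in the table. Little-endian packing, the checksum covering every byte before the trailing u32, and the names `parse_frame` and `HEADER` are assumptions; the authoritative layout lives in `SharedData.h`.

```python
import struct

MAGIC = 0xBABE0001

# magic, version, timestamp_us, sequence, distance_mm, heart_rate_bpm,
# spo2_percent, skin_temp_c -> offsets 0, 4, 8, 16, 20, 24, 28, 32
HEADER = struct.Struct("<IIQIffff")


def djb2(data: bytes) -> int:
    """Classic djb2 hash, kept to 32 bits."""
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h


def parse_frame(frame: bytes) -> dict:
    """Validate checksum and magic, then decode the leading telemetry fields."""
    if len(frame) < HEADER.size + 4:
        raise ValueError("frame too short")
    body = frame[:-4]
    (stored,) = struct.unpack("<I", frame[-4:])  # assumed: checksum is the last 4 bytes
    if djb2(body) != stored:
        raise ValueError("checksum mismatch")
    magic, version, ts_us, seq, dist, hr, spo2, temp = HEADER.unpack_from(body, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    return {
        "version": version, "timestamp_us": ts_us, "sequence": seq,
        "distance_mm": dist, "heart_rate_bpm": hr,
        "spo2_percent": spo2, "skin_temp_c": temp,
    }
```

Rejecting a frame before touching its payload means a torn or stale message from the C++ core never reaches the medical logic.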
Supports: SET_EXPRESSION, SET_EYELID, TRIGGER_BLINK, SET_GAZE, SET_BREATH, SPEAK, SET_LED_COLOR, PLAY_ANIMATION, EMERGENCY_STOP, SHUTDOWN
sudo apt-get install build-essential cmake libzmq3-dev nlohmann-json3-dev
pip install pyzmq pyyaml numpy opencv-python-headless tflite-runtime

mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

./build/baymax_core &
python3 src/python_brain/main.py

| File | Purpose |
|---|---|
| configs/sensors_config.yaml | I²C addresses, sample rates, filter parameters |
| configs/vision_params.yaml | YOLO confidence thresholds, camera resolution |
| configs/reminders.json | Medication and appointment schedule |
BaymaxMini/
├── CMakeLists.txt
├── README.md
├── requirements.txt
├── configs/
│   ├── sensors_config.yaml
│   ├── vision_params.yaml
│   └── reminders.json
├── models/
│   └── yolo_nano_int8.tflite
├── scripts/
└── src/
    ├── cpp_core/
    │   ├── main.cpp                        Orchestrator + tick loop
    │   ├── drivers/
    │   │   ├── I2C_Bus.{cpp,h}             Linux I²C ioctl wrapper
    │   │   ├── PCA9685.{cpp,h}             16-ch PWM servo driver
    │   │   ├── VL53L1X.{cpp,h}             Time-of-Flight ranging
    │   │   ├── MAX30102.{cpp,h}            Pulse oximetry + HR
    │   │   ├── MLX90614.{cpp,h}            IR thermometry
    │   │   └── INA219.{cpp,h}              Power monitoring
    │   ├── modules/
    │   │   ├── FaceController.{cpp,h}      Expressive servo animation
    │   │   ├── VitalsMonitor.{cpp,h}       Clinical vitals pipeline
    │   │   └── PowerSystem.{cpp,h}         Energy management FSM
    │   ├── interfaces/
    │   │   └── I_Sensor.h                  Abstract sensor + registry
    │   ├── ipc/
    │   │   └── SharedData.h                Binary IPC protocol
    │   └── utils/
    │       ├── MathUtils.h                 DSP, filters, springs, Vec2
    │       ├── DigitalFilter.h             PPG chain, peak detection
    │       └── Easing.h                    Animation, Bézier, timelines
    └── python_brain/
        ├── main.py                         Brain entry point
        ├── core/                           Event bus, state, logging
        ├── vision/                         Camera, YOLO, face analysis
        ├── audio/                          STT, TTS, intent parsing
        ├── medical/                        Patient history, meds
        ├── communication/                  ZMQ link, telemetry decode
        ├── logic/                          Behavioral state machine
        ├── config/                         Settings, DB migration
        └── utils/                          Time utilities
"Hello. I am Baymax, your personal healthcare companion."