ML-powered analytics, visualization, and chatbot system for SailGP racing data. Predicts win probability per team, models dirty air effects, explains predictions with feature contributions, and visualizes races in a real-time F1-style dashboard.
Future work: We intend to integrate SailGP's existing VR visualization system for immersive 3D race replay.
/model_training/
/app/
backend/
frontend/
/data/Bermuda/
The project is split into two separate systems: training produces model.pkl, and the web app loads it for inference only. No training happens in the web app.
cd model_training
pip install -r requirements.txt
python train.py
# Output: model_training/output/model.pkl

cd app/backend
pip install -r requirements.txt
cp .env.example .env # Add GOOGLE_AI_API_KEY for chatbot
python -m uvicorn main:app --reload --port 8000

cd app/frontend
npm install
npm run dev
# Open http://localhost:5173

- Algorithm: XGBoost with probability calibration
- Features: 29 telemetry features across 4 tiers + dirty air
- Output: Win probability (0–1) per team, normalized across fleet
- Positioning - rank, distance to leader/boat ahead, VMG, leg progress
- Wind - TWS, TWA, TWD, AWS, AWA, wind alignment
- Foiling - ride heights, average ride height, variance
- Stability - heel, pitch, yaw/roll/pitch rates
- Dirty Air - upwind boat distance, relative angle, composite dirty air score
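
The "normalized across fleet" output listed above can be illustrated with a minimal sketch. It assumes the calibrated per-team probabilities are simply rescaled to sum to 1. The function name `normalize_fleet` and the team labels are hypothetical, and the actual normalization in train.py may differ.

```python
from typing import Dict


def normalize_fleet(raw_probs: Dict[str, float]) -> Dict[str, float]:
    """Rescale per-team probabilities so they sum to 1 across the fleet.

    Assumption: plain sum-normalization; the real pipeline may use another scheme.
    """
    if not raw_probs:
        return {}
    total = sum(raw_probs.values())
    if total <= 0:
        # Degenerate case: fall back to a uniform distribution over teams.
        n = len(raw_probs)
        return {team: 1.0 / n for team in raw_probs}
    return {team: p / total for team, p in raw_probs.items()}


# Hypothetical calibrated outputs for three teams (illustrative values only).
raw = {"AUS": 0.6, "GBR": 0.3, "NZL": 0.3}
fleet = normalize_fleet(raw)
```

After normalization the values in `fleet` sum to 1, so they can be plotted directly as shares of the fleet.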
For each boat, every boat ahead that lies 50–200 m upwind contributes to the score:
dirty_air_score = Σ(distance_weight × wind_alignment_weight)
The score feeds into speed prediction and win probability, and is visualized as a red overlay on the race map.
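
As a rough illustration of the formula above, the following sketch computes a dirty air score for one boat. The weight shapes are assumptions, not taken from dirty_air.py: a linear distance falloff over the 50–200 m range and a cosine alignment weight relative to the true wind direction. The `Boat` type and the flat x/y coordinates in metres are also hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable

MIN_DIST_M = 50.0   # lower bound of the range stated in the text
MAX_DIST_M = 200.0  # upper bound of the range stated in the text


@dataclass
class Boat:
    x: float  # east position in metres (assumed local flat coordinates)
    y: float  # north position in metres


def distance_weight(d: float) -> float:
    """Assumed linear falloff: 1.0 at 50 m, 0.0 at 200 m, 0.0 outside the range."""
    if d < MIN_DIST_M or d > MAX_DIST_M:
        return 0.0
    return (MAX_DIST_M - d) / (MAX_DIST_M - MIN_DIST_M)


def wind_alignment_weight(me: Boat, other: Boat, twd_deg: float) -> float:
    """Assumed cosine weight: 1.0 when `other` sits exactly upwind of `me`."""
    # TWD is the direction the wind comes from, in degrees clockwise from north.
    upwind = (math.sin(math.radians(twd_deg)), math.cos(math.radians(twd_deg)))
    dx, dy = other.x - me.x, other.y - me.y
    d = math.hypot(dx, dy)
    if d == 0.0:
        return 0.0
    cos_angle = (dx * upwind[0] + dy * upwind[1]) / d
    return max(0.0, cos_angle)  # boats abeam or downwind contribute nothing


def dirty_air_score(me: Boat, others: Iterable[Boat], twd_deg: float) -> float:
    """Sum of distance_weight * wind_alignment_weight over all other boats."""
    score = 0.0
    for other in others:
        d = math.hypot(other.x - me.x, other.y - me.y)
        score += distance_weight(d) * wind_alignment_weight(me, other, twd_deg)
    return score


# Wind from the north; one boat 100 m upwind, one 100 m downwind, one 300 m upwind.
me = Boat(0.0, 0.0)
others = [Boat(0.0, 100.0), Boat(0.0, -100.0), Boat(0.0, 300.0)]
score = dirty_air_score(me, others, twd_deg=0.0)
```

In this example only the boat 100 m directly upwind contributes. The downwind boat has zero alignment weight and the boat at 300 m is outside the distance range.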
- 2D Race Map - boats, trails, course marks, wind arrow, dirty air overlay
- Win Probability Chart - horizontal bar chart per team
- Feature Breakdown - explainability panel with positives/negatives
- Chatbot - floating popup (Google AI Studio + local fallback)
- Change Race - switch between Race_1 … Race_8
- Playback - scrub timeline or auto-play race replay
Source: SailGP Data Challenge / Hackathon 2026
Dataset: /data/Bermuda/ - 8 races, 1 Hz telemetry, marks, XML course definitions.
The app reads data_dictionary.md for column interpretation and builds feature mappings dynamically.
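
To make the dynamic mapping concrete, here is a minimal sketch of turning a data dictionary into a column lookup. The layout of data_dictionary.md is not shown here, so this assumes a Markdown table with a header row named "Column" and a description in the second cell. The function name `parse_data_dictionary` and the sample rows are hypothetical.

```python
from typing import Dict


def parse_data_dictionary(markdown_text: str) -> Dict[str, str]:
    """Map column names to descriptions from a Markdown table.

    Assumption: rows look like `| name | description |`; the real file may differ.
    """
    mapping: Dict[str, str] = {}
    for line in markdown_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore prose and blank lines around the table
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) < 2:
            continue
        name, description = cells[0], cells[1]
        # Skip the header row and the |---|---| separator row.
        if name.lower() == "column" or set(name) <= set("-: "):
            continue
        mapping[name.strip("`")] = description
    return mapping


# Hypothetical excerpt; the actual column names in the dataset may differ.
sample = """
| Column | Description |
|--------|-------------|
| `TWS`  | True wind speed |
| `TWA`  | True wind angle |
"""
columns = parse_data_dictionary(sample)
```

A lookup like `columns` could then drive which telemetry columns are fed into feature engineering, without hard-coding names in the app.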
model_training/
train.py # Main training script
features.py # Feature engineering
dirty_air.py # Dirty air model
data_loader.py # Dataset loading
output/model.pkl # Trained model artifact
app/backend/
main.py # FastAPI app
services/
model_service.py # Inference + explainability
data_service.py # Race state
chat_service.py # Google AI chatbot
dirty_air.py # Dirty air (inference)
features.py # Feature engineering (inference)
app/frontend/
src/
App.tsx # Main dashboard layout
components/
RaceMap.tsx # 2D canvas visualization
WinProbabilityChart.tsx
FeatureBreakdown.tsx
Chatbot.tsx
Built for SailGP Data Challenge / Hackathon 2026.
Data © SailGP. Used under the terms of the competition.