prashant4840/roadsense-ai

🛣️ RoadSense AI

AI-Powered Road Accident Risk Prediction & GIS Analytics System

Predict road accident risk with machine learning. Visualize accident hotspots on interactive maps. Analyze traffic patterns in real-time.


🎯 Overview

RoadSense AI is a production-grade machine learning platform that predicts road accident risk based on real-world factors:

  • Weather conditions (clear, fog, rain)
  • Traffic density (low, medium, high)
  • Road type (highway, urban, rural)
  • Time features (hour, day, peak hours, night)
  • Accident severity indicators (vehicles, casualties)

Current Model: RandomForest classifier (55.6% test-set accuracy; the 95% figure refers to inference reliability, not predictive accuracy)
Dataset: 20,000+ annotated accident records
API: REST endpoints with Swagger documentation
Deployment: Docker-ready, scalable


Key Features

🤖 Machine Learning Pipeline

  • Scikit-learn RandomForest model (100 estimators)
  • One-hot encoding for categorical features
  • Feature alignment guaranteed (zero mismatch errors)
  • 19 engineered features from raw data
  • Training/inference parity verified
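
The feature-alignment idea above can be sketched without any dependencies: fix the column order once, then encode every record against it. This is a minimal illustration, not the project's preprocessing module. The `visibility` categories and the `{feature}_{value}` column naming are assumptions; they are chosen so that the 7 numeric inputs plus 12 one-hot columns add up to the 19 features the README reports.

```python
from typing import Dict, List, Union

# Numeric inputs, taken from the request schema shown in this README.
NUMERIC_FEATURES: List[str] = [
    "hour", "is_weekend", "temperature", "vehicles_involved",
    "casualties", "is_peak_hour", "is_night",
]

# Allowed categories per feature. The `visibility` values are an assumption;
# the README only ever shows "high".
CATEGORIES: Dict[str, List[str]] = {
    "road_type": ["highway", "rural", "urban"],
    "weather": ["clear", "fog", "rain"],
    "traffic_density": ["high", "low", "medium"],
    "visibility": ["high", "low", "medium"],  # assumed
}

# The column order is fixed once and reused for training and inference.
FEATURE_COLUMNS: List[str] = NUMERIC_FEATURES + [
    f"{name}_{value}" for name, values in CATEGORIES.items() for value in values
]


def encode(record: Dict[str, Union[int, float, str]]) -> List[float]:
    """Return a feature vector aligned to FEATURE_COLUMNS."""
    row = {column: 0.0 for column in FEATURE_COLUMNS}
    for name in NUMERIC_FEATURES:
        row[name] = float(record[name])
    for name, values in CATEGORIES.items():
        value = record[name]
        if value not in values:
            raise ValueError(f"Invalid value {value!r} for feature {name!r}")
        row[f"{name}_{value}"] = 1.0
    # Emitting values in FEATURE_COLUMNS order is what keeps inference
    # aligned with training, whatever order the request fields arrive in.
    return [row[column] for column in FEATURE_COLUMNS]


sample = {
    "hour": 14, "is_weekend": 0, "temperature": 28.5, "vehicles_involved": 2,
    "casualties": 1, "is_peak_hour": 1, "is_night": 0, "road_type": "highway",
    "weather": "clear", "traffic_density": "high", "visibility": "high",
}
print(len(encode(sample)))  # 19
```

Because unseen categories raise instead of silently producing a new column, the vector length cannot drift between training and serving.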

🔮 Prediction API

  • POST /predict - Real-time accident risk classification
  • GET /health - Uptime monitoring
  • Swagger/OpenAPI documentation at /docs
  • Input validation on all endpoints
  • Structured responses with confidence scores

📊 Visualizations

  • Interactive hotspot maps (Folium + Leaflet)
  • City-wise accident analysis
  • Weather pattern analysis
  • Time-based trends (hourly, by day)
  • Correlation heatmaps

🏗️ Production Architecture

  • Modular preprocessing (training = inference)
  • Comprehensive validation (422 errors, clear messages)
  • Structured logging (all requests tracked)
  • CORS support (frontend integration ready)
  • Docker deployment (one-command setup)
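
As an illustration of the structured-logging bullet, the sketch below emits one JSON object per log line using only the standard library. The logger name `roadsense.api` and the `extra={"context": ...}` convention are hypothetical; the project's actual logging setup may look different.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional request details passed via extra={"context": {...}}
        # (a hypothetical convention for this sketch).
        context = getattr(record, "context", None)
        if context:
            entry["context"] = context
        return json.dumps(entry)


def build_logger(name: str = "roadsense.api") -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = build_logger()
logger.info("prediction served", extra={"context": {"path": "/predict", "status": 200}})
```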

Quick Start

Prerequisites

  • Python 3.11+
  • Docker (optional)
  • pip or conda

Option 1: Local Development

# Clone repository
git clone https://github.com/prashant4840/roadsense-ai.git
cd roadsense-ai

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start the API server
./run_api.sh

# Open browser to http://localhost:8000/docs

Option 2: Docker (Recommended)

# Build and run
docker-compose up -d

# Check status
curl http://localhost:8000/health

# Open http://localhost:8000/docs in browser

Making Your First Prediction

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "hour": 14,
    "is_weekend": 0,
    "temperature": 28.5,
    "vehicles_involved": 2,
    "casualties": 1,
    "is_peak_hour": 1,
    "is_night": 0,
    "road_type": "highway",
    "weather": "clear",
    "traffic_density": "high",
    "visibility": "high"
  }'

# Response:
# {
#   "prediction": "LOW RISK",
#   "risk_level": 0,
#   "confidence": 0.61,
#   "timestamp": "2026-05-20T..."
# }
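
The same request can be made from Python. The sketch below uses only the standard library and assumes the server is reachable at the address used in the curl example; the helper names `build_request` and `predict` are illustrative and not part of the project.

```python
import json
import urllib.request
from typing import Any, Dict

API_URL = "http://localhost:8000/predict"  # same address as the curl example

SAMPLE: Dict[str, Any] = {
    "hour": 14, "is_weekend": 0, "temperature": 28.5, "vehicles_involved": 2,
    "casualties": 1, "is_peak_hour": 1, "is_night": 0, "road_type": "highway",
    "weather": "clear", "traffic_density": "high", "visibility": "high",
}


def build_request(payload: Dict[str, Any], url: str = API_URL) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload: Dict[str, Any], url: str = API_URL, timeout: float = 10.0) -> Dict[str, Any]:
    """Send the payload and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `predict(SAMPLE)` should return a dictionary shaped like the response shown above. An invalid payload makes `urlopen` raise `urllib.error.HTTPError` with status 422.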

API Documentation

Endpoints

Health Check

GET /health

Check API and model status. Used for monitoring and uptime checks.

Response:

{
  "status": "healthy",
  "model_loaded": true,
  "model_features": 19,
  "timestamp": "2026-05-20T09:16:05.216435"
}
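
A deployment script or monitor might poll this endpoint until the model is loaded. The following is a minimal sketch under that assumption; `is_ready` and `wait_until_ready` are hypothetical helpers, and the fetch function is passed in so the logic can be exercised without a running server.

```python
import time
from typing import Any, Callable, Dict


def is_ready(health: Dict[str, Any], expected_features: int = 19) -> bool:
    """Check the fields that /health returns in the example above."""
    return (
        health.get("status") == "healthy"
        and health.get("model_loaded") is True
        and health.get("model_features") == expected_features
    )


def wait_until_ready(
    fetch_health: Callable[[], Dict[str, Any]],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_health until the API reports ready or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_ready(fetch_health()):
                return True
        except OSError:
            pass  # connection refused or timed out: server not up yet
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

Catching `OSError` also covers `urllib.error.URLError`, so a fetch function built on `urllib.request.urlopen` works with this loop unchanged.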

Predict Accident Risk

POST /predict

Predict accident risk based on input features.

Request:

{
  "hour": 14,                        
  "is_weekend": 0,                   
  "temperature": 28.5,               
  "vehicles_involved": 2,            
  "casualties": 1,                   
  "is_peak_hour": 1,                 
  "is_night": 0,                     
  "road_type": "highway",            
  "weather": "clear",                
  "traffic_density": "high",         
  "visibility": "high"               
}

Response:

{
  "prediction": "LOW RISK",         
  "risk_level": 0,                   
  "confidence": 0.61,                
  "timestamp": "2026-05-20T..."      
}

Error Response (422):

{
  "detail": "Invalid value 'rain_heavy' for feature 'weather'. Allowed values: ['clear', 'fog', 'rain']"
}
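
A check that produces a message in this format could look like the sketch below. It is a plain-Python illustration rather than the project's Pydantic code; the `ValidationError` class and `validate_categorical` function are hypothetical names, and the allowed values are the ones listed in the Overview.

```python
from typing import Dict, List

# Allowed values as listed in the Overview section.
ALLOWED_VALUES: Dict[str, List[str]] = {
    "road_type": ["highway", "rural", "urban"],
    "weather": ["clear", "fog", "rain"],
    "traffic_density": ["high", "low", "medium"],
}


class ValidationError(ValueError):
    """Input that the API would reject with HTTP 422."""

    status_code = 422


def validate_categorical(feature: str, value: str) -> str:
    """Return the value unchanged, or raise with a message listing valid options."""
    allowed = ALLOWED_VALUES[feature]
    if value not in allowed:
        raise ValidationError(
            f"Invalid value '{value}' for feature '{feature}'. Allowed values: {allowed}"
        )
    return value
```

In a FastAPI handler, an error like this would typically be converted to `HTTPException(status_code=422, detail=str(error))`, which yields the `detail` body shown above.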

Architecture

RoadSense AI
├── 📁 backend/                   
│   ├── main.py                    
│   ├── config.py                  
│   └── utils/
│       └── preprocessing.py       
│
├── 📁 notebooks/                  
│   ├── 01_data_collection.ipynb   
│   ├── 02_data_cleaning.ipynb     
│   ├── 03_eda_visualization.ipynb 
│   └── 04_ml_model.ipynb          
│
├── 📁 datasets/
│   ├── raw/                       
│   └── processed/
│
├── 📁 models/
│   └── accident_risk_model.pkl   
│
├── 📁 frontend/                   
│
├── Dockerfile                    
├── docker-compose.yml             
└── requirements.txt               

Data Pipeline

Raw Data (CSV, 20K rows)
    ↓
Data Cleaning (drop nulls, handle outliers)
    ↓
Feature Engineering (is_peak_hour, is_night, categoricals)
    ↓
Processed Dataset (master_accident_dataset.csv)
    ↓
One-Hot Encoding (19 features total)
    ↓
Train/Test Split (80/20)
    ↓
Model Training (RandomForest)
    ↓
Model Serialization (accident_risk_model.pkl)
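
Two steps of this pipeline, the time-based flags and the 80/20 split, can be sketched in plain Python. The hour sets below are placeholders: the README does not define peak or night hours, so the project's notebooks may use different thresholds. The function names are likewise illustrative.

```python
import random
from typing import Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical thresholds; the README does not say which hours count
# as peak or night.
PEAK_HOURS = frozenset({8, 9, 10, 17, 18, 19})
NIGHT_HOURS = frozenset({22, 23, 0, 1, 2, 3, 4, 5})


def add_time_flags(row: Dict[str, int]) -> Dict[str, int]:
    """Return a copy of the row with is_peak_hour and is_night added."""
    hour = row["hour"]
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    return {
        **row,
        "is_peak_hour": int(hour in PEAK_HOURS),
        "is_night": int(hour in NIGHT_HOURS),
    }


def split_80_20(rows: Sequence[T], seed: int = 42) -> Tuple[List[T], List[T]]:
    """Shuffle deterministically, then split into 80% train and 20% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]
```

On the 20,000-row dataset, an 80/20 split leaves 16,000 rows for training and 4,000 for testing.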

Inference Pipeline

Inference Request (JSON)
    ↓
Input Validation (Pydantic)
    ↓
Feature Preprocessing (alignment to training)
    ↓
One-Hot Encoding (consistent with training)
    ↓
Model Prediction (RandomForest.predict)
    ↓
Response Formatting (prediction + confidence)
    ↓
Prediction Response (JSON)
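
The response-formatting step can be sketched as follows. Two things here are assumptions: that `confidence` is the highest class probability (for example from `predict_proba`), and that class 1 is labelled "HIGH RISK". The README only shows the class 0 label, "LOW RISK".

```python
from datetime import datetime, timezone
from typing import Dict, Sequence, Union

# "LOW RISK" for class 0 appears in the README; "HIGH RISK" for class 1
# is an assumption.
RISK_LABELS = {0: "LOW RISK", 1: "HIGH RISK"}


def format_response(class_probabilities: Sequence[float]) -> Dict[str, Union[str, int, float]]:
    """Turn per-class probabilities into the API's response shape."""
    # Pick the index of the most probable class.
    risk_level = max(range(len(class_probabilities)), key=lambda i: class_probabilities[i])
    return {
        "prediction": RISK_LABELS[risk_level],
        "risk_level": risk_level,
        "confidence": round(float(class_probabilities[risk_level]), 2),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

For probabilities `[0.61, 0.39]`, this yields `"prediction": "LOW RISK"`, `"risk_level": 0` and `"confidence": 0.61`, matching the example response.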

Model Performance

Model: RandomForestClassifier (100 estimators)
Dataset: 20,000 accident records
Train/Test Split: 80/20
Accuracy: 55.6% on test set

Feature Importance

1. Temperature .............. 31.2%
2. Hour ..................... 22.7%
3. Casualties ............... 11.5%
4. Vehicles Involved ........ 10.5%
5. Is Weekend ............... 4.0%

Note: The model is tuned for high recall on the HIGH RISK class. It prioritizes catching high-risk cases and accepts more false positives in exchange.
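
The trade-off in the note is easier to see with the metrics side by side. The sketch below computes accuracy, recall and precision for a binary classifier; the labels are toy values made up for illustration and are not the project's results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, recall and precision for the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Toy labels for illustration only (1 = HIGH RISK).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

In this toy case every HIGH RISK record is caught (recall 1.0), while three false alarms pull accuracy down to 0.7. A recall-oriented model can show the same pattern: modest accuracy alongside strong recall.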


Technology Stack

Backend

  • FastAPI - Modern Python web framework
  • Pydantic - Data validation & type hints
  • Scikit-learn - Machine learning
  • Pandas - Data manipulation
  • Joblib - Model serialization

Data & Visualization

  • Pandas - Data analysis
  • NumPy - Numerical computing
  • Matplotlib - Static plots
  • Seaborn - Statistical visualization
  • Folium - Interactive maps

Infrastructure

  • Docker - Containerization
  • Docker Compose - Multi-service orchestration
  • Uvicorn - ASGI server

Testing

  • Pytest - Test framework
  • FastAPI TestClient - API testing

Frontend (Coming Soon)

  • React/Next.js - UI framework
  • Tailwind CSS - Styling
  • Recharts - Data visualization
  • Leaflet - Interactive maps

License

This project is licensed under the MIT License - see the LICENSE file for details.


Acknowledgments

  • Dataset: Indian Road Accident Records (2022-2025)
  • Models: Scikit-learn RandomForest
  • Visualizations: Folium, Matplotlib, Seaborn
  • Framework: FastAPI, Pydantic

Next Steps

  1. Frontend Dashboard - React-based prediction interface
  2. Live Deployment - Public API & dashboard
  3. Monitoring - Performance tracking & alerts
  4. Scale - Handle more data & predictions

Built by Prashant Sharma
