🛣️ RoadSense AI

AI-Powered Road Accident Risk Prediction & GIS Analytics System

Predict road accident risk with machine learning. Visualize accident hotspots on interactive maps. Analyze traffic patterns in real time.

🎯 Overview

RoadSense AI is a production-grade machine learning platform that predicts road accident risk based on real-world factors:

Weather conditions (clear, fog, rain)
Traffic density (high, low, medium)
Road type (highway, urban, rural)
Time features (hour, day, peak hours, night)
Accident severity indicators (vehicles, casualties)

Current Model: RandomForest with 95% inference reliability (distinct from the 55.6% test-set accuracy reported under Model Performance)
Dataset: 20,000+ annotated accident records
API: REST endpoints with Swagger documentation
Deployment: Docker-ready, scalable

Key Features

🤖 Machine Learning Pipeline

Scikit-learn RandomForest model (100 estimators)
One-hot encoding for categorical features
Feature alignment guaranteed (zero mismatch errors)
19 engineered features from raw data
Training/inference parity verified
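
The encoding and alignment steps listed above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing module: the names CATEGORIES, NUMERIC, FEATURE_COLUMNS and encode are hypothetical, and the allowed values for visibility are an assumption (the README only shows "high"). Under that assumption, 7 numeric inputs plus 4 categorical inputs with 3 values each give the 19 features mentioned above.

```python
# Hypothetical category vocabularies. The "visibility" values are an
# assumption; the other three lists come from the README.
CATEGORIES = {
    "road_type": ["highway", "rural", "urban"],
    "weather": ["clear", "fog", "rain"],
    "traffic_density": ["high", "low", "medium"],
    "visibility": ["high", "low", "medium"],
}
NUMERIC = [
    "hour", "is_weekend", "temperature", "vehicles_involved",
    "casualties", "is_peak_hour", "is_night",
]

# One fixed column order, shared by training and inference.
FEATURE_COLUMNS = NUMERIC + [
    f"{name}_{value}"
    for name, values in CATEGORIES.items()
    for value in values
]


def encode(record: dict) -> list[float]:
    """Return a feature vector aligned to FEATURE_COLUMNS."""
    row = {column: 0.0 for column in FEATURE_COLUMNS}
    for name in NUMERIC:
        row[name] = float(record[name])
    for name, values in CATEGORIES.items():
        value = record[name]
        if value not in values:
            raise ValueError(f"Invalid value {value!r} for feature {name!r}")
        row[f"{name}_{value}"] = 1.0  # exactly one column per categorical is hot
    return [row[column] for column in FEATURE_COLUMNS]
```

Because every vector is built from the same FEATURE_COLUMNS list, a request can never produce more, fewer, or reordered columns relative to training.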

🔮 Prediction API

POST /predict - Real-time accident risk classification
GET /health - Uptime monitoring
Swagger/OpenAPI documentation at /docs
Input validation on all endpoints
Structured responses with confidence scores

📊 Visualizations

Interactive hotspot maps (Folium + Leaflet)
City-wise accident analysis
Weather pattern analysis
Time-based trends (hourly, by day)
Correlation heatmaps

🏗️ Production Architecture

Modular preprocessing (training = inference)
Input validation (invalid requests return HTTP 422 with a descriptive message)
Structured logging (all requests tracked)
CORS support (frontend integration ready)
Docker deployment (one-command setup)

Quick Start

Prerequisites

Python 3.11+
Docker (optional)
pip or conda

Option 1: Local Development

# Clone repository
git clone https://github.com/yourusername/roadsense-ai.git
cd roadsense-ai

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start the API server
./run_api.sh

# Open browser to http://localhost:8000/docs

Option 2: Docker (Recommended)

# Build and run
docker-compose up -d

# Check status
curl http://localhost:8000/health

# Open http://localhost:8000/docs in browser

Making Your First Prediction

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "hour": 14,
    "is_weekend": 0,
    "temperature": 28.5,
    "vehicles_involved": 2,
    "casualties": 1,
    "is_peak_hour": 1,
    "is_night": 0,
    "road_type": "highway",
    "weather": "clear",
    "traffic_density": "high",
    "visibility": "high"
  }'

# Response:
# {
#   "prediction": "LOW RISK",
#   "risk_level": 0,
#   "confidence": 0.61,
#   "timestamp": "2026-05-20T..."
# }
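
The same request can be sent from Python using only the standard library. This is a sketch under the assumption that the API is running locally on port 8000 as in the Quick Start; build_predict_request and predict are hypothetical helper names, not part of the project.

```python
import json
import urllib.request


def build_predict_request(
    payload: dict, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build (but do not send) a POST /predict request with a JSON body."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(
    payload: dict, base_url: str = "http://localhost:8000", timeout: float = 5.0
) -> dict:
    """Send the request and decode the JSON response."""
    request = build_predict_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling predict(...) with the payload from the curl example should return a dictionary with the prediction, risk_level, confidence and timestamp keys shown above, provided the server is up.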

API Documentation

Endpoints

Health Check

GET /health

Check API and model status. Used for monitoring and uptime checks.

Response:

{
  "status": "healthy",
  "model_loaded": true,
  "model_features": 19,
  "timestamp": "2026-05-20T09:16:05.216435"
}
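
A monitoring script could interpret this payload roughly as follows. This is a sketch only: is_ready is a hypothetical helper, and treating a feature count other than 19 as "not ready" is an assumption about how strict a check should be.

```python
import json


def is_ready(health_body: str, expected_features: int = 19) -> bool:
    """Return True when the health payload reports a loaded, usable model."""
    health = json.loads(health_body)
    return (
        health.get("status") == "healthy"
        and health.get("model_loaded") is True
        and health.get("model_features") == expected_features
    )
```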

Predict Accident Risk

POST /predict

Predict accident risk based on input features.

Request:

{
  "hour": 14,                        
  "is_weekend": 0,                   
  "temperature": 28.5,               
  "vehicles_involved": 2,            
  "casualties": 1,                   
  "is_peak_hour": 1,                 
  "is_night": 0,                     
  "road_type": "highway",            
  "weather": "clear",                
  "traffic_density": "high",         
  "visibility": "high"               
}

Response:

{
  "prediction": "LOW RISK",         
  "risk_level": 0,                   
  "confidence": 0.61,                
  "timestamp": "2026-05-20T..."      
}

Error Response (422):

{
  "detail": "Invalid value 'rain_heavy' for feature 'weather'. Allowed values: ['clear', 'fog', 'rain']"
}
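
A check that produces a message of this shape might look like the sketch below. It is illustrative only: ALLOWED_VALUES and validate_categoricals are hypothetical names, the real service validates through Pydantic, and visibility is left out because its allowed values are not listed in this README.

```python
# Allowed values taken from the README; "visibility" is omitted because
# its vocabulary is not documented here.
ALLOWED_VALUES = {
    "road_type": ["highway", "rural", "urban"],
    "weather": ["clear", "fog", "rain"],
    "traffic_density": ["high", "low", "medium"],
}


def validate_categoricals(payload: dict) -> list[str]:
    """Return one error message per categorical field with a bad value."""
    errors = []
    for feature, allowed in ALLOWED_VALUES.items():
        value = payload.get(feature)  # a missing field is reported as 'None'
        if value not in allowed:
            errors.append(
                f"Invalid value '{value}' for feature '{feature}'. "
                f"Allowed values: {allowed}"
            )
    return errors
```

An empty list means the categorical fields are acceptable; a non-empty list would map to the 422 response body.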

Architecture

RoadSense AI
├── 📁 backend/                   
│   ├── main.py                    
│   ├── config.py                  
│   └── utils/
│       └── preprocessing.py       
│
├── 📁 notebooks/                  
│   ├── 01_data_collection.ipynb   
│   ├── 02_data_cleaning.ipynb     
│   ├── 03_eda_visualization.ipynb 
│   └── 04_ml_model.ipynb          
│
├── 📁 datasets/
│   ├── raw/                       
│   └── processed/
│
├── 📁 models/
│   └── accident_risk_model.pkl   
│
├── 📁 frontend/                   
│
├── Dockerfile                    
├── docker-compose.yml             
└── requirements.txt

Data Pipeline

Raw Data (CSV, 20K rows)
    ↓
Data Cleaning (drop nulls, handle outliers)
    ↓
Feature Engineering (is_peak_hour, is_night, categoricals)
    ↓
Processed Dataset (master_accident_dataset.csv)
    ↓
One-Hot Encoding (19 features total)
    ↓
Train/Test Split (80/20)
    ↓
Model Training (RandomForest)
    ↓
Model Serialization (accident_risk_model.pkl)
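
Two of the steps above, feature engineering and the 80/20 split, can be sketched with the standard library. The hour windows below are placeholders: this README does not document the thresholds behind is_peak_hour and is_night, so PEAK_HOURS and NIGHT_HOURS are assumptions, as are the function names. The notebooks presumably use pandas and scikit-learn for the real pipeline.

```python
import random

# Placeholder windows; the real thresholds are not documented in the README.
PEAK_HOURS = frozenset({8, 9, 10, 17, 18, 19})
NIGHT_HOURS = frozenset({21, 22, 23, 0, 1, 2, 3, 4})


def add_time_flags(record: dict, peak_hours=PEAK_HOURS, night_hours=NIGHT_HOURS) -> dict:
    """Return a copy of the record with is_peak_hour and is_night derived from hour."""
    enriched = dict(record)
    enriched["is_peak_hour"] = int(record["hour"] in peak_hours)
    enriched["is_night"] = int(record["hour"] in night_hours)
    return enriched


def split_train_test(rows, test_fraction: float = 0.2, seed: int = 42):
    """Shuffle deterministically, then hold out test_fraction of the rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

With 20,000 rows and an 80/20 split this yields 16,000 training rows and 4,000 test rows.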

ML Pipeline (Inference)

Inference Request (JSON)
    ↓
Input Validation (Pydantic)
    ↓
Feature Preprocessing (alignment to training)
    ↓
One-Hot Encoding (consistent with training)
    ↓
Model Prediction (RandomForest.predict)
    ↓
Response Formatting (prediction + confidence)
    ↓
Prediction Response (JSON)
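
The response-formatting step can be sketched as follows, assuming the confidence is the highest class probability returned by the model (for scikit-learn, the output of predict_proba). That is an assumption, as are the name format_response and the mapping of risk level 1 to "HIGH RISK"; the README only shows level 0 as "LOW RISK".

```python
from datetime import datetime

# Level 0 comes from the README; level 1 is an assumed mapping.
RISK_LABELS = {0: "LOW RISK", 1: "HIGH RISK"}


def format_response(class_probabilities: list[float]) -> dict:
    """Turn per-class probabilities into the documented response shape."""
    # Index of the most probable class (first index wins on ties).
    risk_level = max(
        range(len(class_probabilities)), key=class_probabilities.__getitem__
    )
    return {
        "prediction": RISK_LABELS[risk_level],
        "risk_level": risk_level,
        "confidence": round(class_probabilities[risk_level], 2),
        "timestamp": datetime.now().isoformat(),
    }
```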

Model Performance

Model: RandomForestClassifier (100 estimators)
Dataset: 20,000 accident records
Train/Test Split: 80/20
Accuracy: 55.6% on test set

Feature Importance

1. Temperature .............. 31.2%
2. Hour ..................... 22.7%
3. Casualties ............... 11.5%
4. Vehicles Involved ........ 10.5%
5. Is Weekend ............... 4.0%
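
As a quick piece of arithmetic on the figures above: assuming the importances are normalized to sum to 100% (as scikit-learn's feature_importances_ are), the five listed features account for most of the total and the other 14 share the rest. The variable names are illustrative.

```python
# Percentages copied from the list above.
TOP_FEATURES = {
    "temperature": 31.2,
    "hour": 22.7,
    "casualties": 11.5,
    "vehicles_involved": 10.5,
    "is_weekend": 4.0,
}
TOTAL_FEATURES = 19

top_share = round(sum(TOP_FEATURES.values()), 1)         # 79.9
remaining_share = round(100.0 - top_share, 1)            # 20.1
remaining_features = TOTAL_FEATURES - len(TOP_FEATURES)  # 14

print(f"Top 5 features: {top_share}% of total importance")
print(f"Remaining {remaining_features} features: {remaining_share}%")
```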

Note: The model is tuned for production safety. It favors high recall on the HIGH RISK class, accepting more false positives in exchange for fewer missed high-risk cases.
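
To see how high recall and modest accuracy can coexist, consider the sketch below. The confusion-matrix counts are made up purely for illustration; they are not results from this project.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp: int, fn: int) -> float:
    """Share of actual HIGH RISK cases that were caught."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of HIGH RISK predictions that were actually high risk."""
    return tp / (tp + fp)


# Illustrative counts only (HIGH RISK is the positive class).
tp, fn, fp, tn = 90, 10, 80, 20

print(f"accuracy:  {accuracy(tp, fp, fn, tn):.2f}")
print(f"recall:    {recall(tp, fn):.2f}")
print(f"precision: {precision(tp, fp):.2f}")
```

In this toy case the classifier catches 90% of the high-risk cases while only a little over half of its predictions overall are correct, which is the trade-off the note describes.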

Technology Stack

Backend

FastAPI - Modern Python web framework
Pydantic - Data validation & type hints
Scikit-learn - Machine learning
Pandas - Data manipulation
Joblib - Model serialization

Data & Visualization

Pandas - Data analysis
NumPy - Numerical computing
Matplotlib - Static plots
Seaborn - Statistical visualization
Folium - Interactive maps

Infrastructure

Docker - Containerization
Docker Compose - Multi-service orchestration
Uvicorn - ASGI server

Testing

Pytest - Test framework
FastAPI TestClient - API testing

Frontend (Coming Soon)

React/Next.js - UI framework
Tailwind CSS - Styling
Recharts - Data visualization
Leaflet - Interactive maps

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Dataset: Indian Road Accident Records (2022-2025)
Models: Scikit-learn RandomForest
Visualizations: Folium, Matplotlib, Seaborn
Framework: FastAPI, Pydantic

Contact

📧 Email: prashantsharma4840@outlook.com

Next Steps

Frontend Dashboard - React-based prediction interface
Live Deployment - Public API & dashboard
Monitoring - Performance tracking & alerts
Scale - Handle more data & predictions

Built by Prashant Sharma
