🏏 CricketIQ: AI IPL Auction Price Prediction

Infosys Springboard 6.0 Internship Project

Presented by: Famesh Katre

Mentor: Pranaya Ma'am

Duration: 8 Weeks

Internship Project: Dynamic IPL Player Auction Value Prediction using AI and Multi-source Data


📌 Project Overview

CricketIQ is an AI-driven system that predicts IPL player auction prices by integrating multi-source data: batting and bowling performance statistics, news sentiment, historical auction prices, and player profile features.

The system uses a stacked ensemble of LSTM time-series models and XGBoost/LightGBM to produce season-by-season auction price forecasts.
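
As a rough illustration of the stacking idea, the sketch below fits a two-weight linear meta-learner on top of two base models' predictions by solving the 2x2 normal equations. It is a minimal sketch under stated assumptions, not the code in `cricket_models.py`: the function names `fit_blend_weights` and `blend` are hypothetical, the meta-learner has no intercept, and in practice the base predictions fed to it should be out-of-fold predictions rather than in-sample fits.

```python
from typing import List, Sequence, Tuple


def fit_blend_weights(
    preds_a: Sequence[float],
    preds_b: Sequence[float],
    targets: Sequence[float],
) -> Tuple[float, float]:
    """Least-squares weights (w_a, w_b) minimising sum((w_a*a + w_b*b - y)^2).

    Hypothetical helper: preds_a / preds_b stand in for LSTM and
    gradient-boosting predictions on held-out data.
    """
    if not (len(preds_a) == len(preds_b) == len(targets)):
        raise ValueError("all inputs must have the same length")

    # Entries of the 2x2 normal equations.
    s_aa = sum(a * a for a in preds_a)
    s_bb = sum(b * b for b in preds_b)
    s_ab = sum(a * b for a, b in zip(preds_a, preds_b))
    s_ay = sum(a * y for a, y in zip(preds_a, targets))
    s_by = sum(b * y for b, y in zip(preds_b, targets))

    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear; weights are not unique")

    # Cramer's rule for the two unknown weights.
    w_a = (s_ay * s_bb - s_by * s_ab) / det
    w_b = (s_by * s_aa - s_ay * s_ab) / det
    return w_a, w_b


def blend(
    preds_a: Sequence[float],
    preds_b: Sequence[float],
    weights: Tuple[float, float],
) -> List[float]:
    """Combine two base-model predictions with the fitted weights."""
    w_a, w_b = weights
    return [w_a * a + w_b * b for a, b in zip(preds_a, preds_b)]
```

A third base model (for example LightGBM) would turn this into a 3x3 system, at which point a library solver is the more practical choice.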


πŸ—οΈ Architecture

Data Sources                  Pipeline                     Output
──────────────────────────────────────────────────────────────────
Cricsheet Ball-by-Ball ──┐
IPL Auction History    ───── Preprocessing ──► LSTM ──┐
News Sentiment (VADER) ───   Feature Eng.              β”œβ”€β”€ Ensemble ──► β‚ΉCr Prediction
Player Profiles        β”€β”€β”˜                   XGBoost β”€β”€β”˜

πŸ“ Project Structure

CricketIQ/
β”‚
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ raw/
β”‚   β”‚   β”œβ”€β”€ ipl_batting.csv          # Season batting stats (50 players)
β”‚   β”‚   β”œβ”€β”€ ipl_bowling.csv          # Season bowling stats
β”‚   β”‚   └── ipl_auction.csv          # Historical auction prices 2019–2024
β”‚   β”œβ”€β”€ processed/
β”‚   β”‚   β”œβ”€β”€ cricket_feature_matrix.csv
β”‚   β”‚   β”œβ”€β”€ scaler.pkl
β”‚   β”‚   └── evaluation_report.csv
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ cricket_lstm.pt
β”‚   β”‚   β”œβ”€β”€ cricket_xgboost.pkl
β”‚   β”‚   └── cricket_ensemble.pkl
β”‚   └── sentiment/
β”‚       └── ipl_sentiment.csv
β”‚
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ data_collection/
β”‚   β”‚   β”œβ”€β”€ cricsheet_loader.py       # IPL ball-by-ball + auction data
β”‚   β”‚   └── cricket_sentiment.py     # VADER NLP + cricket lexicon
β”‚   β”‚
β”‚   β”œβ”€β”€ preprocessing/
β”‚   β”‚   └── cricket_feature_engineer.py
β”‚   β”‚
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   └── cricket_models.py         # LSTM + XGBoost + Ensemble
β”‚   β”‚
β”‚   └── visualization/
β”‚       └── dashboard.py             # Streamlit dashboard
β”‚
β”œβ”€β”€ api/
β”‚   └── main.py                      # FastAPI REST API
β”‚
β”œβ”€β”€ run_pipeline.py                  # Master pipeline
β”œβ”€β”€ requirements.txt
└── README.md

🚀 Quick Start

# 1. Clone and setup
git clone https://github.com/fameshkatre87/CricketIQ.git
cd CricketIQ

# 2. Virtual environment
python -m venv venv
source venv/bin/activate        # Mac/Linux
venv\Scripts\activate           # Windows

# 3. Install packages
pip install -r requirements.txt

# 4. Run full pipeline
python run_pipeline.py

# 5. Launch dashboard
streamlit run src/visualization/dashboard.py

# 6. Start API (optional)
uvicorn api.main:app --reload --port 8000

📊 Data Sources

| Source | Data | Records |
|---|---|---|
| Cricsheet (IPL) | Ball-by-ball → batting/bowling stats | 50 players × 6 seasons |
| IPL Auction DB | Historical prices 2019–2024 | 284 auction records |
| News VADER NLP | Cricket sentiment scores | 139 weekly records |
| Player Profiles | Role, nationality, experience | 50 real IPL players |

Real Players Included (50 total)

Batters: Virat Kohli, Rohit Sharma, David Warner, KL Rahul, Shubman Gill, Faf du Plessis, Jos Buttler, Suryakumar Yadav, Ruturaj Gaikwad, Yashasvi Jaiswal, MS Dhoni, AB de Villiers, Rishabh Pant, Sanju Samson, Ishan Kishan, Tilak Varma...

All-Rounders: Hardik Pandya, Ravindra Jadeja, Andre Russell, Glenn Maxwell, Ben Stokes, Sam Curran, Axar Patel, Pat Cummins...

Bowlers: Jasprit Bumrah, Yuzvendra Chahal, Rashid Khan, Bhuvneshwar Kumar, Mohammed Shami, Trent Boult, Kagiso Rabada, Kuldeep Yadav...


🔧 Feature Engineering (62 total features)

Batting: runs_per_match, impact_score, milestone_score (50s + 100s), consistency_score (avg × SR), boundary_pct, six_pct, dot_ball_risk

Bowling: bowling_impact, death_specialist flag, powerplay_bowler flag, wicket_taker_score, economy_score

All-Rounder: is_allrounder, allrounder_score, dual_threat_bonus

Experience: seasons_played, veteran_flag (≥8 seasons), young_prospect (≤3 seasons), peak_experience flag

Market: prev_auction_price, price_trend, log_prev_price, overseas_premium

Sentiment: compound_score, positive_pct, negative_pct, value_impact_cr
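
A few of the features above have their formulas spelled out in the list, so they can be sketched directly. The example below is a minimal sketch, not the code in `cricket_feature_engineer.py`: the input keys (`fifties`, `hundreds`, `batting_avg`, `strike_rate`) and the function name `engineer_basic_features` are hypothetical, and using `log1p` for `log_prev_price` is an assumption, since the README only names the feature.

```python
import math
from typing import Dict


def engineer_basic_features(row: Dict[str, float]) -> Dict[str, float]:
    """Derive a handful of the listed features from one player-season row.

    Input keys are hypothetical; only the formulas stated in the feature
    list (50s + 100s, avg x SR, season thresholds) are taken from the README.
    """
    seasons = row["seasons_played"]
    return {
        # milestone_score: 50s + 100s
        "milestone_score": row["fifties"] + row["hundreds"],
        # consistency_score: batting average x strike rate
        "consistency_score": row["batting_avg"] * row["strike_rate"],
        # veteran_flag: 8 or more seasons
        "veteran_flag": 1 if seasons >= 8 else 0,
        # young_prospect: 3 or fewer seasons
        "young_prospect": 1 if seasons <= 3 else 0,
        # log_prev_price: assumed log(1 + price) so a zero price stays finite
        "log_prev_price": math.log1p(row["prev_auction_price"]),
    }


# Illustrative values only, not real player data.
example = engineer_basic_features(
    {
        "fifties": 5,
        "hundreds": 1,
        "batting_avg": 40.0,
        "strike_rate": 140.0,
        "seasons_played": 9,
        "prev_auction_price": 15.0,
    }
)
```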


📈 Model Results

| Model | RMSE (₹Cr) | MAE (₹Cr) | R² |
|---|---|---|---|
| LSTM (Attention) | 5.74 | 5.39 | 0.81 |
| XGBoost | 4.63 | 3.98 | 0.87 |
| LightGBM | 4.70 | 4.05 | 0.86 |
| Ensemble (Final) | 4.25 | 3.62 | 0.91 |
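
For reference, the three metrics reported above follow their standard definitions, sketched below in plain Python. This is a minimal sketch, assuming the usual formulas; how the project's own evaluation script computes them (for example, which seasons form the test set) is not stated here, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, in the same unit as the target (here ₹Cr)."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

RMSE penalises large misses more heavily than MAE, so it is never smaller than MAE on the same predictions; a wide gap between the two suggests a few badly mispriced players.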

🔌 API Endpoints

GET  /                        Health check
GET  /players                 All 50 IPL players
GET  /player/{name}           Player stats + prediction
POST /predict                 Predict auction price
GET  /predict/top             Top value players for 2025
GET  /models/comparison       Model metrics
GET  /features/importance     XGBoost feature importance
GET  /sentiment/{name}        Player sentiment score
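
Player names contain spaces, so they need URL-encoding before they go into the `{name}` path segment. The sketch below builds such URLs with the standard library only. It is a minimal sketch under stated assumptions: the base URL assumes the API is running locally on port 8000 as in the Quick Start, the helper names are hypothetical, and the responses are assumed to be JSON (the response schema is not documented here).

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import urlopen

# Assumed local address, matching `uvicorn api.main:app --port 8000`.
BASE_URL = "http://localhost:8000"


def player_url(name: str, base_url: str = BASE_URL) -> str:
    """URL for GET /player/{name}, with the name percent-encoded."""
    return f"{base_url}/player/{quote(name)}"


def sentiment_url(name: str, base_url: str = BASE_URL) -> str:
    """URL for GET /sentiment/{name}, with the name percent-encoded."""
    return f"{base_url}/sentiment/{quote(name)}"


def get_json(url: str, timeout: float = 10.0) -> Any:
    """Fetch a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `get_json(player_url("Virat Kohli"))` would request `/player/Virat%20Kohli`.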

🏏 Cricket-Specific VADER Lexicon (28 terms)

Positive additions: century (+3.5), hat-trick (+3.5), six (+2.0), masterclass (+3.2), match-winning (+3.2), orange cap (+2.8)

Negative additions: duck (-2.8), golden duck (-3.2), injured (-2.8), ruled out (-3.0), poor form (-2.5), expensive (-2.0)
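
One practical detail when extending VADER: to my understanding its lexicon is looked up one token at a time, so multi-word entries such as "golden duck" or "orange cap" will not match unless the text is pre-processed. The sketch below shows one possible workaround, joining known phrases with underscores in both the lexicon and the input text. It is a hedged sketch, not the approach in `cricket_sentiment.py`: the helper names are hypothetical, and only the 12 terms listed above are included, not the full 28.

```python
import re
from typing import Dict

# The 12 terms listed in this README (the full lexicon has 28).
CRICKET_LEXICON: Dict[str, float] = {
    "century": 3.5, "hat-trick": 3.5, "six": 2.0,
    "masterclass": 3.2, "match-winning": 3.2, "orange cap": 2.8,
    "duck": -2.8, "golden duck": -3.2, "injured": -2.8,
    "ruled out": -3.0, "poor form": -2.5, "expensive": -2.0,
}


def to_token(term: str) -> str:
    """Turn a phrase into a single token, e.g. 'golden duck' -> 'golden_duck'."""
    return term.replace(" ", "_")


def build_token_lexicon(lexicon: Dict[str, float]) -> Dict[str, float]:
    """Lexicon whose keys are all single tokens, suitable for per-token lookup."""
    return {to_token(term): score for term, score in lexicon.items()}


def join_phrases(text: str, lexicon: Dict[str, float]) -> str:
    """Rewrite known multi-word phrases in the text as single tokens.

    Longer phrases are replaced first so an overlapping shorter phrase
    cannot split them.
    """
    phrases = sorted((t for t in lexicon if " " in t), key=len, reverse=True)
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, to_token(phrase), text, flags=re.IGNORECASE)
    return text


# With the vaderSentiment package installed, the merge would look like:
#   analyzer = SentimentIntensityAnalyzer()
#   analyzer.lexicon.update(build_token_lexicon(CRICKET_LEXICON))
#   analyzer.polarity_scores(join_phrases(headline, CRICKET_LEXICON))
```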


πŸ—“οΈ Weekly Milestones

Week Task Status
1 IPL batting/bowling/auction data collection βœ…
2 Feature engineering (62 features) βœ…
3–4 Advanced features + sentiment integration βœ…
5 LSTM with attention mechanism βœ…
6 XGBoost + LightGBM + Ensemble βœ…
7 Evaluation + comparison report βœ…
8 Streamlit dashboard + API + docs βœ…

Built with ❤️ as an Internship Project | CricketIQ v1.0 🏏
