CrediSense: A Credit Risk Prediction Application Using Machine Learning

CrediSense is an intelligent credit risk prediction system that leverages machine learning algorithms to assess the creditworthiness of loan applicants. Built with Streamlit, the application provides an intuitive interface for financial institutions to make data-driven lending decisions while minimizing default risks.
🔗 Live Demo: https://credisense.streamlit.app

Key Features

Multi-Model Comparison: Trains and compares four tree-based machine learning algorithms
Interactive Dashboard: Real-time predictions through an intuitive Streamlit interface
Comprehensive Evaluation: Multiple performance metrics for robust model assessment
Data Visualization: Clear insights into model performance and feature importance
Deployment-Oriented: Packaged as a Streamlit application intended for use in real-world lending workflows

Models Implemented

Decision Trees: Baseline tree-based classifier with interpretable decision rules.
Random Forest: Ensemble method using bootstrap aggregation and random feature selection to reduce overfitting.
Extra Trees: Variant of Random Forest with additional randomization in split selection for better generalization.
XGBoost: Gradient boosting framework with built-in regularization; often a strong performer on structured data, although it was not the top model in this project.
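
As a rough illustration of how the four classifiers listed above could be set up side by side, here is a minimal sketch. It is not the project's training code: the data is a synthetic stand-in generated with make_classification, and the hyperparameters (n_estimators, random_state, the 80/20 split) are assumptions chosen for the example. XGBoost is imported optionally so the sketch still runs if it is not installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 1,000 rows, 20 features, binary target.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hyperparameters here are illustrative, not the project's settings.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Extra Trees": ExtraTreesClassifier(n_estimators=100, random_state=42),
}

# XGBoost is a separate package, so it is added only when available.
try:
    from xgboost import XGBClassifier

    models["XGBoost"] = XGBClassifier(n_estimators=100, random_state=42)
except ImportError:
    pass

# Fit every model on the same split and record its held-out accuracy.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

for name, acc in test_accuracy.items():
    print(f"{name}: {acc:.4f}")
```

Scores on this synthetic data say nothing about the German Credit results; the point is only that all four models share the same fit/score interface and can be compared on one split.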

The performance of all implemented models was evaluated using Accuracy, F1-Score, and AUC-ROC to ensure a balanced and reliable assessment.

Model	Accuracy	F1-Score	AUC-ROC
Extra Trees	0.6476	0.6838	0.6912
Random Forest	0.6190	0.6667	0.6735
XGBoost	0.6286	0.6549	0.6455
Decision Tree	0.5810	0.6207	0.5693
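
To make the three metrics concrete, the sketch below computes each of them from scratch on a tiny hand-made example. The labels, scores, the 0.5 threshold, and the choice of 1 as the positive class are all assumptions for illustration; in practice scikit-learn's accuracy_score, f1_score, and roc_auc_score would normally be used instead of these helper functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_roc(y_true, y_score, positive=1):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hand-made example (illustrative values only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))  # 6 of 8 correct -> 0.75
print(f1(y_true, y_pred))        # precision 0.75, recall 0.75 -> 0.75
print(auc_roc(y_true, y_score))  # 14 of 16 pairs ranked correctly -> 0.875
```

Accuracy and F1 depend on the chosen threshold, while AUC-ROC is computed from the raw scores and is threshold-independent, which is one reason it is commonly used to compare classifiers.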

Best Model Selected: Extra Trees Classifier
The Extra Trees model achieved the highest Accuracy (0.6476), F1-Score (0.6838), and AUC-ROC (0.6912) of the four models, indicating the best class separation among the candidates for credit risk prediction.
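
The selection step can be expressed in a few lines. The sketch below copies the scores from the results table into a dictionary and ranks the models by AUC-ROC; the dictionary layout and key names are assumptions made for this example, not taken from the project's notebook.

```python
# Scores copied from the results table above.
results = {
    "Extra Trees": {"accuracy": 0.6476, "f1": 0.6838, "auc_roc": 0.6912},
    "Random Forest": {"accuracy": 0.6190, "f1": 0.6667, "auc_roc": 0.6735},
    "XGBoost": {"accuracy": 0.6286, "f1": 0.6549, "auc_roc": 0.6455},
    "Decision Tree": {"accuracy": 0.5810, "f1": 0.6207, "auc_roc": 0.5693},
}

# Rank models from best to worst AUC-ROC and pick the top one.
ranking = sorted(results, key=lambda name: results[name]["auc_roc"], reverse=True)
best_model = ranking[0]

# Check whether the winner also leads on the other two metrics.
leads_everywhere = all(
    best_model == max(results, key=lambda name: results[name][metric])
    for metric in ("accuracy", "f1", "auc_roc")
)

print(ranking)
print(best_model, leads_everywhere)
```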

Tech Stack

Python 3.8+
Data Processing: Pandas, NumPy
Machine Learning: Scikit-learn, XGBoost
Visualization: Matplotlib, Seaborn
Deployment: Streamlit

Dataset Used

German Credit Dataset - 1,000 credit applications with 20 features including demographic attributes, financial indicators, and loan characteristics. Binary target variable indicates creditworthiness. Link to the dataset

Note: The dataset is not included in this repository due to Kaggle’s licensing restrictions. Please ensure the dataset is downloaded manually and placed in the correct directory before running the project.

Pipeline Workflow

Data Ingestion: Load German Credit Dataset
EDA: Feature distributions, correlations, class imbalance analysis
Preprocessing: Handle missing values, encode categorical variables, scale numerical features
Model Training: Train Decision Trees, Random Forest, Extra Trees, and XGBoost
Evaluation: Compare models using Accuracy, F1-Score, AUC-ROC, and Confusion Matrices
Deployment: Integrate best model into Streamlit application
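
The preprocessing and training steps above might look roughly like the following sketch. It is not the project's notebook code: the DataFrame holds a handful of made-up rows, and the column names (Age, Sex, Housing, Saving accounts, Credit amount, Duration, Risk), the "unknown" fill value, and the split and model settings are assumptions for illustration.

```python
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Made-up rows standing in for the real dataset (column names are assumptions).
df = pd.DataFrame({
    "Age": [22.0, 45.0, 35.0, 53.0, 28.0, 61.0, 33.0, 40.0],
    "Sex": ["male", "female", "male", "male", "female", "male", "female", "male"],
    "Housing": ["own", "rent", "free", "own", "rent", "own", "own", "free"],
    "Saving accounts": ["little", None, "rich", "little", "moderate", None, "little", "rich"],
    "Credit amount": [1200.0, 6000.0, 2100.0, 7900.0, 4900.0, 9100.0, 2800.0, 7000.0],
    "Duration": [6.0, 48.0, 12.0, 42.0, 24.0, 36.0, 24.0, 36.0],
    "Risk": ["good", "bad", "good", "good", "bad", "good", "bad", "bad"],
})
categorical = ["Sex", "Housing", "Saving accounts"]
numerical = ["Age", "Credit amount", "Duration"]

# 1. Handle missing values: treat a missing category as its own level.
df[categorical] = df[categorical].fillna("unknown")

# 2. Encode categorical variables, keeping one encoder per column for reuse.
encoders = {}
for col in categorical:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])
target_encoder = LabelEncoder()
y = target_encoder.fit_transform(df["Risk"])

# 3. Split first, then fit the scaler on the training rows only.
X = df[categorical + numerical]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
X_train, X_test = X_train.copy(), X_test.copy()
scaler = StandardScaler()
X_train[numerical] = scaler.fit_transform(X_train[numerical])
X_test[numerical] = scaler.transform(X_test[numerical])

# 4. Train one model and produce class probabilities for the held-out rows.
model = ExtraTreesClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]
print(proba)
```

Keeping the fitted encoders and scaler alongside the model matters for deployment, because the application has to transform user input exactly as the training data was transformed. Tree-based models are largely insensitive to feature scaling, so the scaling step mainly keeps the pipeline usable if a scale-sensitive model is added later.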

Installation

# Clone repository
git clone https://github.com/Madhuri36/CrediSense.git
cd CrediSense

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
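
After installing, a quick check like the one below can confirm that the main libraries are importable. This is a small sketch using only the standard library; the package list is taken from the Tech Stack section and is an assumption about what requirements.txt contains, and missing_packages is a hypothetical helper name.

```python
from importlib.util import find_spec

# pip package name -> import name (list assumed from the Tech Stack section).
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "xgboost": "xgboost",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "streamlit": "streamlit",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages that cannot be found for import."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```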

Dataset Setup

Visit the link above and download the dataset (german_credit_data.csv).
If the download is a ZIP archive, extract it.
Create a folder named data in the project root (if not already present).
Place the downloaded CSV file inside the data/ directory.
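
To verify the setup steps above before launching the app, a short standard-library check can confirm that the file is where it is expected and show its column headers. The helper name dataset_status is hypothetical, and the sketch assumes the CSV has a header row.

```python
import csv
from pathlib import Path

# Expected location, following the setup steps above.
DATA_FILE = Path("data") / "german_credit_data.csv"


def dataset_status(project_root="."):
    """Return (found, column_names) for the expected dataset file."""
    path = Path(project_root) / DATA_FILE
    if not path.is_file():
        return False, []
    # Read only the first row, assumed to be the header.
    with path.open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return True, header


found, columns = dataset_status()
if found:
    print(f"Found {DATA_FILE} with {len(columns)} columns: {columns}")
else:
    print(f"{DATA_FILE} not found; download it and place it in data/ first.")
```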

Running the Web Application

streamlit run app.py

Access the application at http://localhost:8501

