waruiM/Loan-Default-prediction

🏦 Loan Default Prediction — Streamlit Web App

An interactive data analytics and machine-learning application built with Streamlit for exploring a loan-default dataset and predicting borrower risk in real time.


Overview

This project provides an end-to-end workflow for Loan Default Prediction:

  1. Explore — Rich exploratory data analysis (EDA) with 10+ interactive chart types.
  2. Model — Train and compare three ML algorithms (Random Forest, Gradient Boosting, Logistic Regression).
  3. Predict — Enter applicant details and receive an instant default-risk prediction with a probability gauge.

The UI uses gradient themes, animated cards, and smooth interactions.


Features

🏠 Home Page

  • Hero section with a gradient banner and animated feature cards.
  • At-a-glance dataset snapshot (total records, features, default rate).

📊 Exploratory Data Analysis (EDA)

| Visualization | Description |
|---|---|
| Data Overview | Raw data preview, column types, missing-value audit, descriptive statistics |
| Histograms + Box Plots | Distribution of any numerical feature with marginals |
| Violin Plots | Density + box plot overlay for numerical features |
| Small Multiples | All numerical distributions side-by-side |
| Bar & Donut Charts | Counts and proportions for categorical features |
| Stacked Default Bars | Default rate breakdown per category |
| Correlation Heatmap | Pairwise Pearson correlation matrix |
| Scatter Plot Studio | Interactive X/Y/Color selectors |
| Box Plots vs Target | Feature distributions split by default status |
| Pair Scatter Matrix | Pairwise scatter plots for top features |
| Grouped Mean Bars | Average feature values by default status |
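
As an illustration of the "Stacked Default Bars" idea, the per-category default rate can be computed with a pandas group-by. This is a minimal sketch, not the code in eda.py: the `default_rate_by` helper and the six-row sample are hypothetical, and only the column names come from the dataset description.

```python
import pandas as pd

# Tiny made-up sample for illustration; the app itself works on Loan_default.csv.
sample = pd.DataFrame(
    {
        "EmploymentType": [
            "Full-time", "Full-time",
            "Part-time", "Part-time",
            "Unemployed", "Unemployed",
        ],
        "Default": [0, 0, 0, 1, 1, 1],
    }
)


def default_rate_by(df: pd.DataFrame, column: str, target: str = "Default") -> pd.Series:
    """Return the share of defaulted loans (0..1) for each category in `column`."""
    # The target is coded 0/1, so the mean of each group is its default rate.
    return df.groupby(column)[target].mean().sort_values(ascending=False)


rates = default_rate_by(sample, "EmploymentType")
print(rates)
```

The resulting Series can be passed to any bar-chart function; sorting by rate puts the riskiest categories first.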

🤖 Machine Learning Modeling

| Feature | Description |
|---|---|
| 3 Algorithms | Random Forest · Gradient Boosting · Logistic Regression |
| Configurable Split | Adjustable test-size slider (10%–40%) |
| Evaluation Metrics | Accuracy · Precision · Recall · F1 Score |
| Confusion Matrix | Annotated heatmap |
| ROC Curve | Area under the curve visualization |
| Classification Report | Full precision / recall / F1 per class |
| Feature Importance | Top-15 bar chart (tree-based models) |
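
The comparison described above could look roughly like the following scikit-learn sketch. It is an assumption-laden illustration rather than the contents of model.py: the data is synthetic (`make_classification`), the `compare_models` helper and the hyperparameters are hypothetical, and `test_size` stands in for the value chosen with the slider.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the preprocessed loan features (not the real data).
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.85, 0.15], random_state=42
)


def compare_models(X, y, test_size=0.2, random_state=42):
    """Train the three algorithms on one split and return their test metrics."""
    # Stratify so the train and test sets keep the same default rate.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=random_state),
        "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=random_state),
        "Logistic Regression": LogisticRegression(max_iter=1000),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, zero_division=0),
            "recall": recall_score(y_test, y_pred, zero_division=0),
            "f1": f1_score(y_test, y_pred, zero_division=0),
        }
    return results


results = compare_models(X, y, test_size=0.2)
for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

On an imbalanced target such as loan default, accuracy alone can look high even when few defaults are caught, which is why precision, recall, and F1 are reported alongside it.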

🔮 Prediction Playground

  • Dynamic input form with sliders and dropdowns.
  • Real-time default probability with a visual gauge indicator.
  • Clear risk verdict (✅ Low Risk / ⚠️ High Risk).
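
The step from a predicted probability to a risk verdict can be sketched as a simple threshold rule. The `risk_verdict` function and the 0.5 cutoff are assumptions for illustration; the README does not state which threshold the app uses.

```python
def risk_verdict(default_probability: float, threshold: float = 0.5) -> str:
    """Map a default probability to a verdict label (the threshold is an assumed value)."""
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    return "High Risk" if default_probability >= threshold else "Low Risk"


print(risk_verdict(0.12))
print(risk_verdict(0.73))
```

With a fitted scikit-learn classifier, the input would typically be the positive-class column of `predict_proba`, for example `model.predict_proba(row)[0, 1]`.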

Dataset

The dataset (Loan_default.csv) contains 255,348 records and 18 columns:

| Column | Type | Description |
|---|---|---|
| LoanID | String | Unique loan identifier (dropped during preprocessing) |
| Age | Integer | Borrower's age |
| Income | Integer | Annual income |
| LoanAmount | Integer | Requested loan amount |
| CreditScore | Integer | Credit score (300–850) |
| MonthsEmployed | Integer | Months at current job |
| NumCreditLines | Integer | Number of open credit lines |
| InterestRate | Float | Loan interest rate (%) |
| LoanTerm | Integer | Loan term in months (12, 24, 36, 48, 60) |
| DTIRatio | Float | Debt-to-income ratio |
| Education | Category | High School · Bachelor's · Master's · PhD |
| EmploymentType | Category | Full-time · Part-time · Self-employed · Unemployed |
| MaritalStatus | Category | Single · Married · Divorced |
| HasMortgage | Category | Yes / No |
| HasDependents | Category | Yes / No |
| LoanPurpose | Category | Home · Auto · Education · Business · Other |
| HasCoSigner | Category | Yes / No |
| Default | Binary | Target: 0 = No Default, 1 = Default |
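
Given this schema, a preprocessing step might drop the identifier, separate the target, and one-hot encode the categorical columns. The sketch below is one plausible approach, not necessarily what utils.py does: the `preprocess` helper, the choice of one-hot encoding, and the two sample rows (with made-up values) are all assumptions.

```python
import pandas as pd

# Categorical columns taken from the schema above.
CATEGORICAL = [
    "Education", "EmploymentType", "MaritalStatus", "HasMortgage",
    "HasDependents", "LoanPurpose", "HasCoSigner",
]


def preprocess(df: pd.DataFrame):
    """Return (X, y): encoded features and the binary Default target."""
    df = df.drop(columns=["LoanID"])  # identifier carries no predictive signal
    y = df.pop("Default")             # separate the target from the features
    X = pd.get_dummies(df, columns=CATEGORICAL)  # one-hot encode categoricals
    return X, y


# Two made-up rows that follow the schema; values are illustrative only.
sample = pd.DataFrame(
    [
        {"LoanID": "A1", "Age": 34, "Income": 52000, "LoanAmount": 15000,
         "CreditScore": 690, "MonthsEmployed": 48, "NumCreditLines": 3,
         "InterestRate": 7.5, "LoanTerm": 36, "DTIRatio": 0.31,
         "Education": "Bachelor's", "EmploymentType": "Full-time",
         "MaritalStatus": "Married", "HasMortgage": "Yes",
         "HasDependents": "No", "LoanPurpose": "Auto",
         "HasCoSigner": "No", "Default": 0},
        {"LoanID": "B2", "Age": 51, "Income": 31000, "LoanAmount": 40000,
         "CreditScore": 540, "MonthsEmployed": 6, "NumCreditLines": 1,
         "InterestRate": 19.2, "LoanTerm": 60, "DTIRatio": 0.72,
         "Education": "High School", "EmploymentType": "Unemployed",
         "MaritalStatus": "Single", "HasMortgage": "No",
         "HasDependents": "Yes", "LoanPurpose": "Business",
         "HasCoSigner": "Yes", "Default": 1},
    ]
)

X, y = preprocess(sample)
print(X.shape, list(y))
```

Because `drop` returns a new DataFrame, the original `sample` is left unchanged, so the raw data can still be shown on the EDA page.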

Project Structure

Loan_default.csv/
├── app.py              # Main Streamlit entry point & navigation
├── eda.py              # EDA visualizations (10+ chart types)
├── model.py            # ML training, evaluation & prediction
├── utils.py            # Data loading & preprocessing utilities
├── Loan_default.csv    # Source dataset (255k rows)
├── requirements.txt    # Python dependencies
└── README.md           # This file

Installation

Prerequisites

  • Python 3.10+ installed on your system.
  • (Optional) A virtual environment tool (venv, conda, etc.).

Steps

# 1. Clone or download the project
cd "path/to/Loan_default.csv"

# 2. (Recommended) Create a virtual environment
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # macOS / Linux

# 3. Install dependencies
pip install -r requirements.txt

Usage

# Launch the Streamlit app
streamlit run app.py

The app will open automatically at http://localhost:8501.

Use the sidebar to navigate between pages:

| Page | What to do |
|---|---|
| 🏠 Home | Read the overview and check the dataset snapshot |
| 📊 Exploratory Analysis | Select features, switch tabs, explore visualizations |
| 🤖 ML Modeling | Pick an algorithm → Train → Evaluate → Predict |

Screenshots

Launch the app and explore each page to see the premium UI in action!

  • Home — Gradient hero banner, feature cards, KPI metrics
  • EDA — Interactive Plotly charts with tabs for Overview / Univariate / Bivariate
  • ML Modeling — Model training, confusion matrix, ROC curve, feature importance, prediction gauge

Technologies Used

| Technology | Purpose |
|---|---|
| Streamlit | Web application framework |
| Pandas | Data manipulation and analysis |
| NumPy | Numerical computing |
| Plotly | Interactive visualizations |
| Matplotlib | Static plotting |
| Seaborn | Statistical visualizations |
| scikit-learn | Machine learning pipeline |

Contributing

Contributions are welcome! Feel free to:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.


Built with ❤️ using Streamlit • 2026
