Bankruptcy Prediction using Machine Learning

Open in Google Colab: https://colab.research.google.com/github/JTerZeus/classification-bankruptcy-ml/blob/main/notebooks/classification_project.ipynb

This repository contains the complete implementation and analysis for a university assignment on classification problems, focusing on corporate bankruptcy prediction using financial indicators.

The project evaluates and compares multiple machine learning classifiers under class imbalance conditions and answers specific performance-related questions defined in the assignment.

Assignment Context

The dataset consists of financial ratios, binary activity indicators, company status (healthy or bankrupt), and the corresponding year for each company. Each row represents a different company.

The implementation follows the full assignment specification, including:

data loading from Excel,
exploratory data analysis and visualization,
missing value checks,
Min–Max normalization,
Stratified K-Fold cross-validation (k=4),
class imbalance handling with undersampling (3:1 ratio),
training and evaluation of multiple classification models,
generation of confusion matrices,
storage of experimental results in CSV/Excel format,
additional analysis and visualization using pivot tables in Excel.

Repository Structure

notebooks/
Executable Jupyter notebook developed in Google Colab.
This is the main implementation and contains all experiments, figures and outputs.
data/
Input dataset provided by the assignment.
results/
Output files generated from the Python code and further analyzed in Excel (e.g. balancedDataOutcomes.csv / .xlsx).
report/
Final project report (PDF), written in Greek, answering all assignment questions.

Notebook Execution

The project was originally developed in Google Colab. The Jupyter notebook can be executed cell-by-cell in a notebook environment.

An exported .py version of the notebook is also included for reference, but the notebook is the recommended way to run the code.

Models Implemented

The following eight (8) classification models were trained and evaluated:

Linear Discriminant Analysis (LDA)
Logistic Regression
Decision Tree
Random Forest
k-Nearest Neighbors (k-NN)
Naive Bayes
Support Vector Machine (SVM)
XGBoost (additional model)
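One way to set up the models listed above is a dictionary that maps a display name to an estimator, so the same training loop can be reused for every model. The sketch below is an assumption about how this could look, not the notebook's configuration. The hyperparameters are library defaults, apart from max_iter and probability, which are set here for convenience. GaussianNB is assumed as the Naive Bayes variant, and build_models is a hypothetical helper. XGBoost is imported lazily so the sketch still runs when the package is not installed.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """Return a name -> unfitted estimator mapping for the listed classifiers."""
    models = {
        "LDA": LinearDiscriminantAnalysis(),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(random_state=seed),
        "k-NN": KNeighborsClassifier(),
        "Naive Bayes": GaussianNB(),  # assumption: Gaussian variant
        "SVM": SVC(probability=True, random_state=seed),  # probabilities for ROC-AUC
    }
    try:
        from xgboost import XGBClassifier
        models["XGBoost"] = XGBClassifier(random_state=seed)
    except ImportError:
        pass  # xgboost not installed; the other seven models still run
    return models


# Toy data only, to show that every model shares the same fit/score interface.
X_toy, y_toy = make_classification(n_samples=120, n_features=6, random_state=0)
models = build_models()
train_scores = {name: model.fit(X_toy, y_toy).score(X_toy, y_toy)
                for name, model in models.items()}
for name, score in train_scores.items():
    print(f"{name}: training accuracy = {score:.3f}")
```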

Evaluation Metrics

Model performance was evaluated on both training and test sets using:

Accuracy
Precision
Recall (Sensitivity)
F1 Score
ROC-AUC
Specificity (computed during Excel analysis)

Due to class imbalance, F1 Score was selected as the primary metric for model comparison.
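To make the choice of F1 concrete, the sketch below derives the threshold-based metrics from the four confusion-matrix counts. It assumes that "bankrupt" is the positive class. The function name and the example counts are illustrative only and are not results from this project. ROC-AUC is omitted because it needs predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, specificity and F1 from counts."""
    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Illustrative counts: 10 bankrupt and 90 healthy companies in a test fold.
useful = classification_metrics(tp=8, fp=4, tn=86, fn=2)

# A classifier that labels every company as healthy.
trivial = classification_metrics(tp=0, fp=0, tn=90, fn=10)

print(f"useful:  accuracy={useful['accuracy']:.2f}, f1={useful['f1']:.2f}")
print(f"trivial: accuracy={trivial['accuracy']:.2f}, f1={trivial['f1']:.2f}")
```

With these counts the trivial classifier still reaches 0.90 accuracy while its F1 is 0, which is why accuracy alone is misleading on imbalanced data.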

Results

The Python code generates a CSV file (balancedDataOutcomes.csv) containing detailed results for all folds, models and datasets. This file was later converted to Excel and used to compute additional metrics and create comparison plots using pivot tables.
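The pivot-table step was done in Excel, but a comparable summary could be produced with pandas. The sketch below is only an illustration: the column names (model, dataset, fold, f1) are assumptions about the CSV layout, and the rows are made-up placeholder values, not results from this project.

```python
import pandas as pd

# Placeholder rows standing in for the real file. With the actual output one
# would load it instead, e.g. pd.read_csv("results/balancedDataOutcomes.csv"),
# and adapt the (assumed) column names below.
rows = [
    {"model": "Model A", "dataset": "train", "fold": 1, "f1": 0.80},
    {"model": "Model A", "dataset": "train", "fold": 2, "f1": 0.90},
    {"model": "Model A", "dataset": "test", "fold": 1, "f1": 0.60},
    {"model": "Model A", "dataset": "test", "fold": 2, "f1": 0.70},
    {"model": "Model B", "dataset": "train", "fold": 1, "f1": 0.75},
    {"model": "Model B", "dataset": "train", "fold": 2, "f1": 0.85},
    {"model": "Model B", "dataset": "test", "fold": 1, "f1": 0.70},
    {"model": "Model B", "dataset": "test", "fold": 2, "f1": 0.80},
]
results = pd.DataFrame(rows)

# Mean F1 per model, averaged over folds, with train and test side by side.
summary = results.pivot_table(index="model", columns="dataset",
                              values="f1", aggfunc="mean")

# Model with the highest mean F1 on the test folds.
best_model = summary["test"].idxmax()

print(summary)
print("best on test:", best_model)
```

Comparing the train and test columns side by side also gives a quick view of how far each model overfits.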

The final report summarizes the results and answers:

which model performs best overall, and
whether the required performance constraints are satisfied.

References

[1] Scikit-learn Developers. Model Evaluation: Classification Metrics.
https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics

[2] XGBoost Developers. XGBoost Documentation.
https://xgboost.readthedocs.io/en/stable/

[3] Wikipedia contributors. Confusion matrix.
https://en.wikipedia.org/wiki/Confusion_matrix
