Customer-Segmentation-using-K-Means-Clustering

📌 Overview

This project performs customer segmentation using unsupervised learning on large-scale retail transaction data. The objective is to group customers based on purchasing behavior to support targeted marketing, retention strategies, and business decision-making.

📂 Dataset

Source: UCI Machine Learning Repository
Dataset: Online Retail
Records: 500,000+ transactions
Customers: ~4,000 unique customers

Key Columns

CustomerID
InvoiceNo
InvoiceDate
Quantity
UnitPrice
Country

The raw transactional data was cleaned and aggregated to customer-level features before clustering.

⚙️ Tech Stack

Python
Pandas, NumPy
Matplotlib, Seaborn
Scikit-learn

🔄 Project Workflow

Problem Framing
- Segmentation of customers using unsupervised learning (no labels available)
Data Cleaning
- Removed missing CustomerID
- Removed cancelled/returned transactions
- Filtered invalid quantities and prices
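
The cleaning rules above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `clean_transactions` and the toy rows are hypothetical, and it assumes cancelled invoices are marked with a `C` prefix in `InvoiceNo` and that "invalid" means non-positive `Quantity` or `UnitPrice`.

```python
import pandas as pd


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three cleaning rules to raw transaction rows (hypothetical helper)."""
    # 1. Drop rows without a customer identifier.
    out = df.dropna(subset=["CustomerID"])
    # 2. Drop cancellations (assumption: InvoiceNo starts with "C").
    out = out[~out["InvoiceNo"].astype(str).str.startswith("C")]
    # 3. Keep only strictly positive quantities and prices (assumed definition of "valid").
    out = out[(out["Quantity"] > 0) & (out["UnitPrice"] > 0)]
    return out.reset_index(drop=True)


# Toy rows for illustration only; not taken from the dataset.
raw = pd.DataFrame({
    "CustomerID": [17850.0, None, 13047.0, 13047.0, 12583.0],
    "InvoiceNo": ["536365", "536366", "C536379", "536367", "536370"],
    "Quantity": [6, 2, -1, 0, 24],
    "UnitPrice": [2.55, 3.39, 27.50, 1.69, 0.0],
})
cleaned = clean_transactions(raw)
print(cleaned)
```

Applying the filters in this order keeps each rule independent, so any one of them can be relaxed without touching the others.
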
Feature Engineering (RFM Analysis)
- Recency: Days since last purchase
- Frequency: Number of transactions
- Monetary: Total spending
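
One possible way to derive the three RFM features with pandas is sketched below. The function name `build_rfm`, the snapshot date, and the toy transactions are assumptions made for illustration; Frequency is counted here as distinct invoices and Monetary as the sum of `Quantity * UnitPrice`, which may differ from the notebook's definitions.

```python
import pandas as pd


def build_rfm(df: pd.DataFrame, snapshot_date: pd.Timestamp) -> pd.DataFrame:
    """Aggregate transaction rows into one RFM row per customer (hypothetical helper)."""
    df = df.assign(Amount=df["Quantity"] * df["UnitPrice"])
    return df.groupby("CustomerID").agg(
        # Days between the snapshot date and the customer's most recent invoice.
        Recency=("InvoiceDate", lambda s: (snapshot_date - s.max()).days),
        # Number of distinct invoices (assumed meaning of "transactions").
        Frequency=("InvoiceNo", "nunique"),
        # Total spend across all line items.
        Monetary=("Amount", "sum"),
    )


# Toy transactions for illustration only.
tx = pd.DataFrame({
    "CustomerID": [1, 1, 1, 2],
    "InvoiceNo": ["A1", "A1", "A2", "B1"],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-01", "2011-01-01", "2011-01-10", "2011-01-05"]
    ),
    "Quantity": [2, 1, 3, 10],
    "UnitPrice": [5.0, 10.0, 1.0, 2.0],
})
snapshot = pd.Timestamp("2011-01-11")  # assumed reference date
rfm = build_rfm(tx, snapshot)
print(rfm)
```

A common choice for the snapshot date is one day after the last invoice in the data, so that the most recent customer has a Recency of 1 rather than 0.
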
Exploratory Data Analysis (EDA)
- Analyzed feature distributions and skewness
- Identified need for feature scaling
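
The skewness check and scaling step might look like the sketch below. The data is synthetic and only mimics right-skewed RFM features; the `log1p` transform is an assumption added here as a common remedy for skew and is not stated in the workflow above.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic, right-skewed stand-in for an RFM table (not real results).
rng = np.random.default_rng(0)
rfm = pd.DataFrame({
    "Recency": rng.exponential(60, 500),
    "Frequency": rng.poisson(3, 500) + 1,
    "Monetary": rng.lognormal(5, 1.2, 500),
})

# Sample skewness per feature; values far above 0 indicate a long right tail.
skew_before = rfm.skew()

# Assumed step: compress the tails with log(1 + x) before scaling.
rfm_log = np.log1p(rfm)
skew_after = rfm_log.skew()

# Standardize to zero mean and unit variance so that no single feature
# dominates the Euclidean distances used by K-Means.
scaled = StandardScaler().fit_transform(rfm_log)

print(pd.DataFrame({"before": skew_before, "after": skew_after}))
```

Scaling matters here because Monetary typically spans a much wider numeric range than Frequency, and K-Means treats all dimensions equally.
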
Clustering
- Applied K-Means clustering
- Optimized number of clusters using Elbow Method and Silhouette Score
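
A minimal sketch of selecting the number of clusters with both criteria, assuming scikit-learn's `KMeans` and `silhouette_score`. The blob data, the candidate range of `k`, and the random seeds are illustrative choices, not the project's settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for scaled RFM features: four well-separated groups.
centers = [[0, 0, 0], [6, 6, 0], [-6, 6, 3], [0, -6, -3]]
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=0.6, random_state=42)

inertias, silhouettes = {}, {}
for k in range(2, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    # Elbow Method: within-cluster sum of squares, inspected for a bend.
    inertias[k] = model.inertia_
    # Silhouette Score: mean cohesion vs. separation, higher is better.
    silhouettes[k] = silhouette_score(X, model.labels_)

# Pick the k with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)
for k in inertias:
    print(k, round(inertias[k], 1), round(silhouettes[k], 3))
print("best k by silhouette:", best_k)
```

Inertia always decreases as `k` grows, so the elbow is read off visually, while the silhouette score gives a single value to maximize. Using both guards against over-reading either one.
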
Dimensionality Reduction
- Used PCA for 2D visualization of customer segments
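
The projection step could be sketched as follows, assuming scikit-learn's `PCA`. The input array is synthetic, and the names `plot_df`, `PC1`, and `PC2` are hypothetical. The resulting table can be passed to any scatter-plot function, colored by cluster.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic stand-in for scaled RFM features: three groups in 3D.
rng = np.random.default_rng(1)
group_centers = ([0, 0, 0], [4, 4, 0], [-4, 4, 2])
scaled = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(100, 3)) for c in group_centers]
)

# Cluster in the full feature space, then project only for plotting.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)

pca = PCA(n_components=2)
coords = pca.fit_transform(scaled)

# One row per customer: 2D coordinates plus the assigned cluster.
plot_df = pd.DataFrame(coords, columns=["PC1", "PC2"]).assign(cluster=labels)

# Share of total variance retained by the two components.
explained = pca.explained_variance_ratio_.sum()
print(plot_df.head())
print("variance explained by 2 components:", round(explained, 3))
```

Clustering before projecting means the segments are defined on all three features, and PCA is used purely as a viewing aid. The explained-variance ratio indicates how faithful the 2D picture is.
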

📊 Results

Successfully identified distinct customer segments
Clear separation between:
- High-value loyal customers
- Frequent low-spend customers
- At-risk or churn-prone customers
- Occasional buyers

💡 Business Impact

Enables targeted marketing campaigns
Helps prioritize high-value customers
Identifies churn-risk customers for retention strategies

⚠️ Limitations

Clusters are sensitive to outliers
No ground-truth labels for validation
Segmentation quality depends on feature engineering

🚀 Future Improvements

Try alternative clustering methods (DBSCAN, Hierarchical)
Add temporal features
Evaluate cluster stability over time
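
As a starting point for the first item, the sketch below runs hierarchical clustering and DBSCAN next to K-Means on the same synthetic data and compares the label assignments. The data, `eps`, `min_samples`, and the choice of Ward linkage are assumptions for illustration; real RFM data would need its own tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for scaled RFM features: three compact groups.
rng = np.random.default_rng(7)
group_centers = ([0, 0, 0], [5, 5, 0], [-5, 5, 3])
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(80, 3)) for c in group_centers])

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)
# DBSCAN infers the number of clusters from density; label -1 marks noise points.
dbscan_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)

# Adjusted Rand index: 1.0 means identical partitions, about 0 means chance agreement.
ari_hier = adjusted_rand_score(kmeans_labels, hier_labels)
n_dbscan_clusters = len(set(dbscan_labels) - {-1})
n_noise = int((dbscan_labels == -1).sum())

print("ARI K-Means vs hierarchical:", round(ari_hier, 3))
print("DBSCAN clusters:", n_dbscan_clusters, "noise points:", n_noise)
```

Because DBSCAN labels sparse points as noise instead of forcing them into a cluster, it also speaks to the outlier sensitivity noted under Limitations.
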
