Skip to content

Mahajan-Sachin/Customer_Behavior

Repository files navigation

🛒 Customer Shopping Behavior Analysis

End-to-End Data Analytics Project (Python + SQL + PostgreSQL + Power BI)


📌 Project Overview

This project analyzes customer shopping behavior to extract actionable business insights using a complete analytics workflow — from raw data to an interactive dashboard.

The goal was not just visualization, but to understand how real-world data analytics works across:

data cleaning

SQL-based analysis

database integration

business-focused insights

dashboard storytelling


🧠 Business Questions Answered

The project answers 10 real business questions, including:

What is the total revenue generated by male vs female customers?

Which customers used discounts but still spent more than the average purchase amount?

What are the top 5 products with the highest average review ratings?

How does average purchase amount differ between Standard and Express shipping?

Do subscribed customers spend more? (average spend + total revenue)

Which products rely most on discounts for sales?

How many customers are New, Returning, and Loyal?

What are the top 3 most purchased products within each category?

Are repeat buyers more likely to subscribe?

How does revenue vary across different age groups?


🛠️ Tech Stack

Python (Pandas, NumPy) — data cleaning & feature engineering

SQL (PostgreSQL) — business query analysis

PostgreSQL — database storage & querying

Power BI — interactive dashboard & KPIs

VS Code + Jupyter Notebook


📂 Project Structure

alt text

🔄 Data Workflow

Raw CSV loaded into Pandas

Data cleaning & feature engineering:

handled missing values

created age groups

standardized categorical fields

Final dataset loaded into PostgreSQL

SQL queries executed using SQLAlchemy

Power BI connected directly to PostgreSQL

Dashboard built using DAX measures & slicers


📊 Dashboard Highlights

KPI Cards:

Average Purchase Amount

Average Review Rating

Number of Customers

Donut Chart:

Subscription Status Distribution

Bar Charts:

Sales by Category

Revenue by Age Group

Interactive Slicers:

Gender

Category

Shipping Type

Subscription Status

The dashboard is fully interactive and reflects live SQL-backed data.

alt text

⚠️ Challenges Faced (Real Learning)

Transitioning from MySQL-style SQL (LeetCode) to PostgreSQL

strict GROUP BY rules

WHERE vs HAVING

type casting (::numeric)

Connecting Jupyter Notebook → PostgreSQL → Power BI

Understanding database schemas (public.customer)

Learning DAX and filter context from scratch

Debugging SQL errors that initially felt confusing

These challenges significantly strengthened my understanding of real-world analytics, beyond practice platforms.


📈 Key Insights

Non-subscribers currently contribute more total revenue

Some products are highly dependent on discounts for sales

Express shipping users spend slightly more on average

Loyal customers do not necessarily subscribe more — indicating a product/marketing gap

Revenue contribution varies significantly across age groups


📌 What This Project Demonstrates

End-to-end data analytics workflow

SQL applied to business questions (not toy problems)

Database-backed dashboards

Ability to translate data → insight → story


🚀 Next Steps

Add more advanced SQL (CTEs, window functions)

Improve dashboard storytelling

Apply similar analysis to a larger dataset

Prepare this project for interviews & case discussions


🙏 Acknowledgements

LeetCode — for building SQL fundamentals

Amlan Mohanty — for project inspiration & structure

ChatGPT — for debugging guidance and conceptual clarity

📬 If you’re a recruiter or learner and want to discuss this project, feel free to connect.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors