End-to-End Data Analytics Project (Python + SQL + PostgreSQL + Power BI)
This project analyzes customer shopping behavior to extract actionable business insights using a complete analytics workflow — from raw data to an interactive dashboard.
The goal was not just visualization, but to understand how real-world data analytics works across:
data cleaning
SQL-based analysis
database integration
business-focused insights
dashboard storytelling
The project answers 10 real business questions, including:
What is the total revenue generated by male vs female customers?
Which customers used discounts but still spent more than the average purchase amount?
What are the top 5 products with the highest average review ratings?
How does average purchase amount differ between Standard and Express shipping?
Do subscribed customers spend more? (average spend + total revenue)
Which products rely most on discounts for sales?
How many customers are New, Returning, and Loyal?
What are the top 3 most purchased products within each category?
Are repeat buyers more likely to subscribe?
How does revenue vary across different age groups?
Python (Pandas, NumPy) — data cleaning & feature engineering
SQL (PostgreSQL) — business query analysis
PostgreSQL — database storage & querying
Power BI — interactive dashboard & KPIs
VS Code + Jupyter Notebook
Raw CSV loaded into Pandas
Data cleaning & feature engineering:
handled missing values
created age groups
standardized categorical fields
Final dataset loaded into PostgreSQL
SQL queries executed using SQLAlchemy
Power BI connected directly to PostgreSQL
Dashboard built using DAX measures & slicers
KPI Cards:
Average Purchase Amount
Average Review Rating
Number of Customers
Donut Chart:
Subscription Status Distribution
Bar Charts:
Sales by Category
Revenue by Age Group
Interactive Slicers:
Gender
Category
Shipping Type
Subscription Status
The dashboard is fully interactive and reflects live SQL-backed data.
Transitioning from MySQL-style SQL (LeetCode) to PostgreSQL
strict GROUP BY rules
WHERE vs HAVING
type casting (::numeric)
Connecting Jupyter Notebook → PostgreSQL → Power BI
Understanding database schemas (public.customer)
Learning DAX and filter context from scratch
Debugging SQL errors that initially felt confusing
These challenges significantly strengthened my understanding of real-world analytics, beyond practice platforms.
Non-subscribers currently contribute more total revenue
Some products are highly dependent on discounts for sales
Express shipping users spend slightly more on average
Loyal customers do not necessarily subscribe more — indicating a product/marketing gap
Revenue contribution varies significantly across age groups
End-to-end data analytics workflow
SQL applied to business questions (not toy problems)
Database-backed dashboards
Ability to translate data → insight → story
Add more advanced SQL (CTEs, window functions)
Improve dashboard storytelling
Apply similar analysis to a larger dataset
Prepare this project for interviews & case discussions
LeetCode — for building SQL fundamentals
Amlan Mohanty — for project inspiration & structure
ChatGPT — for debugging guidance and conceptual clarity

