This project provides a comprehensive analysis of customer shopping behavior using a multi-tool approach. The goal is to extract actionable insights from retail data to help businesses understand customer demographics, purchasing patterns, and the effectiveness of loyalty programs. The project covers the entire data lifecycle: from raw data cleaning in Python to advanced querying in SQL and interactive visualization in Power BI.
The analysis is based on the Customer Shopping Behavior Dataset, which contains 3,900 records of customer transactions.
- Key Features: Customer ID, Age, Gender, Item Purchased, Category, Purchase Amount (USD), Location, Size, Color, Season, Review Rating, Subscription Status, Shipping Type, Discount Applied, Previous Purchases, Payment Method, and Frequency of Purchases.
- Source:
customer_shopping_behavior.csv
- Python: Data cleaning, preprocessing, and exploratory data analysis (EDA) using
pandas. - SQL (PostgreSQL/MySQL): Advanced data querying and business logic implementation.
- Power BI: Interactive dashboarding for stakeholder reporting.
- Gamma: AI-powered presentation generation for executive summaries.
- Microsoft Word/PDF: Detailed project reporting.
- Data Loading & Cleaning (Python):
- Handled missing values (e.g., Review Rating).
- Standardized column names and data types.
- Feature engineering: Created
age_groupand convertedfrequency_of_purchasesinto numericalpurchase_frequency_days.
- Exploratory Data Analysis (EDA):
- Analyzed distributions and correlations using Python.
- Database Ingestion & SQL Analysis:
- Migrated cleaned data from Python to a PostgreSQL database using
SQLAlchemy. - Executed complex queries to segment customers and analyze revenue drivers.
- Migrated cleaned data from Python to a PostgreSQL database using
- Data Visualization:
- Developed an interactive dashboard in Power BI to track KPIs.
- Reporting:
- Compiled findings into a professional PDF report and a Gamma-powered PowerPoint presentation.

customer_behavior_dashboard.pbix in Power BI Desktop to interact with the full report.
- Customer Segmentation: Identified key segments (New, Returning, Loyal) based on purchase history to target marketing efforts.
- Revenue Drivers: Pinpointed high-performing product categories and locations.
- Discount Impact: Analyzed how promotional offers correlate with purchase frequency and total spend.
- Subscription Analysis: Evaluated the behavior of subscribed vs. non-subscribed customers to optimize loyalty programs.
- Python 3.x
- PostgreSQL or MySQL
- Power BI Desktop
Install the required libraries:
pip install pandas sqlalchemy psycopg2Run the Jupyter Notebook to clean the data and export it to your database:
jupyter notebook customer_behavior.ipynbNote: Update the database connection string in the notebook with your credentials.
Import the customer_shopping_behavior.sql file into your SQL editor (pgAdmin, MySQL Workbench, etc.) to run the analytical queries.
Open customer_behavior_dashboard.pbix using Power BI Desktop.
Deliverables: