A retail store needed better insight from its sales data to understand overall trends, customer spending, and high-performing products.
The goal was to analyse sales data, uploaded and stored in Databricks Catalog, to answer three key questions:
- How are sales changing over time?
- Who are our most valuable customers?
- Which product categories make us the most money?
- Data source: CSV files
- Tools used: Databricks Free Edition (Notebook and Dashboard) to analyse 60,000+ transactions from 2010 to 2013. Queries track monthly sales performance, classify customers into segments (New, Regular, VIP), and calculate product category contributions using aggregate functions, CTEs, and window functions.
- Data model
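
As a rough illustration of the query techniques named under "Tools used" (aggregate functions, CTEs, window functions), the sketch below runs three queries against an in-memory SQLite database from Python. It is not the project's Databricks code: the `sales` table, its columns, the sample rows, and the rule for Regular customers (lifespan > 12 months without reaching the VIP spend level) are assumptions made for illustration, and SQLite's date functions stand in for their Databricks SQL equivalents. Only the VIP rule (lifespan > 12 months and spend > €5K) is taken from this project.

```python
import sqlite3

# Hypothetical sample rows: (order_date, customer_id, category, amount in EUR).
ROWS = [
    ("2012-01-15", 1, "Bikes", 3000.0),
    ("2013-03-10", 1, "Bikes", 2500.0),
    ("2013-03-20", 1, "Accessories", 100.0),
    ("2012-01-20", 2, "Bikes", 1200.0),
    ("2013-06-05", 2, "Clothing", 50.0),
    ("2013-06-18", 3, "Bikes", 900.0),
]

# Monthly revenue (aggregate inside a CTE) plus a running total (window function).
MONTHLY_SQL = """
WITH monthly AS (
    SELECT strftime('%Y-%m', order_date) AS month, SUM(amount) AS revenue
    FROM sales
    GROUP BY strftime('%Y-%m', order_date)
)
SELECT month, revenue, SUM(revenue) OVER (ORDER BY month) AS running_total
FROM monthly
ORDER BY month
"""

# Customer segments. The VIP rule follows the project; the Regular rule is assumed.
# Lifespan is the number of calendar months between first and last order.
SEGMENT_SQL = """
WITH per_customer AS (
    SELECT customer_id,
           SUM(amount) AS total_spend,
           (CAST(strftime('%Y', MAX(order_date)) AS INTEGER)
            - CAST(strftime('%Y', MIN(order_date)) AS INTEGER)) * 12
           + (CAST(strftime('%m', MAX(order_date)) AS INTEGER)
              - CAST(strftime('%m', MIN(order_date)) AS INTEGER)) AS lifespan_months
    FROM sales
    GROUP BY customer_id
)
SELECT customer_id,
       CASE WHEN lifespan_months > 12 AND total_spend > 5000 THEN 'VIP'
            WHEN lifespan_months > 12 THEN 'Regular'
            ELSE 'New'
       END AS segment
FROM per_customer
ORDER BY customer_id
"""

# Each category's share of total revenue, using a window over the whole result.
CATEGORY_SQL = """
WITH by_category AS (
    SELECT category, SUM(amount) AS revenue
    FROM sales
    GROUP BY category
)
SELECT category, revenue,
       ROUND(100.0 * revenue / SUM(revenue) OVER (), 1) AS pct_of_total
FROM by_category
ORDER BY revenue DESC
"""


def run_queries(rows):
    """Load rows into an in-memory table and return the three query results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "order_date TEXT, customer_id INTEGER, category TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    monthly = conn.execute(MONTHLY_SQL).fetchall()
    segments = dict(conn.execute(SEGMENT_SQL).fetchall())
    categories = conn.execute(CATEGORY_SQL).fetchall()
    conn.close()
    return monthly, segments, categories


if __name__ == "__main__":
    monthly, segments, categories = run_queries(ROWS)
    print(monthly)
    print(segments)
    print(categories)
```

Window functions need SQLite 3.25 or newer, which recent Python builds bundle. In a Databricks notebook the same CTE and `OVER` clauses should carry over, with the date arithmetic swapped for the platform's own date functions.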
All charts were generated in Databricks.
- Sales trends: Revenue grew from 2010 to 2013, peaking at €1.87M in December 2013
- Customer base: 79% are New customers; only 9% are VIPs (lifespan > 12 months and spend > €5K)
- Product mix: Bikes generate 96% of revenue; Accessories and Clothing combined contribute less than 4%
- Improve customer retention to convert New customers into Regular and VIP segments
- Reduce revenue dependency on Bikes by growing Accessories and Clothing sales
- Investigate seasonal sales patterns to optimise inventory and marketing
- Build automated dashboards for ongoing performance monitoring
Create predictive models for sales forecasting and connect sales data with marketing campaigns to identify what drives purchases.