Exploratory Data Analysis (EDA) on Retail Chain Sales Data

This project conducts an Exploratory Data Analysis (EDA) of GlobalMart's retail chain sales data.

GlobalMart is a large international retail chain. The company wants to understand its sales performance across different regions, product categories, and time periods.

Notebook Link


Overview

The task is to perform an exploratory data analysis on the company's sales data to uncover insights that can drive business decisions.


Dataset

  • The dataset is 'Global_Superstore.csv', sourced from ITOnlinelearning/Kaggle
  • The dataset has 1000 rows and 24 columns/features
  • Key features/columns are Product Name, Sales, Quantity, Category, Region, Segment
  • The Postal Code feature was dropped because 80% of its values were missing
  • The data was checked for duplicate rows and for valid feature types, such as dates
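
The cleaning steps listed above could be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the tiny DataFrame stands in for 'Global_Superstore.csv', and the column names, the `clean` helper, and the 50% missing-value threshold are all assumptions.

```python
import pandas as pd

# Tiny hypothetical stand-in for Global_Superstore.csv (column names assumed).
raw = pd.DataFrame({
    "Order Date": ["2012-01-05", "2012-01-05", "2013-07-19", "2014-10-02", "2015-08-23"],
    "Postal Code": [None, None, None, None, "10024"],
    "Sales": [100.0, 100.0, 250.5, 80.0, 40.0],
})

def clean(df, max_missing=0.5):
    """Drop mostly-empty columns and duplicate rows, then parse the order date."""
    missing_share = df.isna().mean()  # fraction of missing values per column
    keep = missing_share[missing_share <= max_missing].index
    out = df[keep].drop_duplicates().copy()
    out["Order Date"] = pd.to_datetime(out["Order Date"])
    return out

cleaned = clean(raw)
print(cleaned.dtypes)
```

In this toy data the Postal Code column is 80% empty, so it falls above the assumed threshold and is removed before duplicates are dropped.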

Technologies Used

  • Languages & Libraries: Python, Pandas, NumPy, Matplotlib, Seaborn, Statsmodels
  • Tools: Jupyter Notebook, VS Code, Git, GitHub



Installation

Step-by-step instructions to set up the project locally:

# Clone the repository
git clone https://github.com/Kurodataio/EDA-on-Sales-Data.git

# Navigate to the project folder
cd EDA-on-Sales-Data

# Launch Jupyter Notebook
jupyter notebook

Usage

Instructions for using the project:

  1. Open the main notebook (EDA-on-Sales-Data.ipynb)
  2. Run each cell sequentially to reproduce the analysis
  3. Visualizations and results will be generated automatically

Analysis & Visualizations

Total sales were $1,710,971 and total profit was $288,920.

The top global sales categories were Office Supplies and Technology.

The GBC Ibimaster 500 Manual ProClick Binding System had the highest product sales, at $9,892.74.

Western Europe had the highest sales of any region, at $259,576.

Sales by Region

The Consumer segment was the top-selling segment, with $873,512 in sales.

Sales by Segment

Surprisingly, the data shows that the Asia market had the highest sales, whilst the US/Canada market had the lowest.

Sales by Market

Sales by Category
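
Breakdowns like the ones above (sales by region, segment, market, or category) typically come from a group-and-sum. The following is a hedged sketch on made-up rows, not the notebook's code: the `sales_by` helper and the sample values are hypothetical, and the column names are assumed to match the dataset.

```python
import pandas as pd

# Hypothetical rows; the real notebook would load Global_Superstore.csv instead.
df = pd.DataFrame({
    "Region": ["Western Europe", "Western Europe", "Oceania", "Canada"],
    "Segment": ["Consumer", "Corporate", "Consumer", "Home Office"],
    "Sales": [120.0, 80.0, 150.0, 30.0],
})

def sales_by(frame, column):
    """Total Sales per value of `column`, largest first."""
    return frame.groupby(column)["Sales"].sum().sort_values(ascending=False)

by_region = sales_by(df, "Region")
by_segment = sales_by(df, "Segment")
print(by_region)
print(by_segment)
```

Calling `by_region.plot(kind="bar")` on the result is one way to produce a bar chart similar to the figures in this section.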

Relationship between Sales and Profit

There is a positive but moderate relationship between Sales and Profit, as indicated by a correlation coefficient of 0.534.
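
A correlation coefficient like this one can be computed with pandas' `Series.corr`, which defaults to the Pearson method. The numbers below are invented for illustration (including one loss-making order) and will not reproduce the 0.534 figure; the column names are assumed.

```python
import pandas as pd

# Made-up orders, including one that loses money despite a high sales value.
df = pd.DataFrame({
    "Sales": [10.0, 20.0, 30.0, 40.0, 50.0],
    "Profit": [2.0, 1.0, 6.0, -3.0, 9.0],
})

# Pearson correlation between the two columns
r = df["Sales"].corr(df["Profit"])
print(round(r, 3))
```

A single unprofitable high-value order is enough to pull the coefficient well below 1, which is one reason a moderate value can coexist with a clear upward tendency.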

Sales vs Profit

Unsurprisingly, the plot confirms that as sales increase, profit tends to increase. However, it also shows that some sales remain unprofitable even as sales volume increases.

The plot of monthly sales over time shows increased sales around the months of January, July, August and October during the period 2012 to 2016.

Monthly Sales Over Time
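
A monthly sales series of this kind could be built by grouping orders on the month of their order date. This is a sketch under assumptions: the data is invented, the "Order Date" column name is assumed, and averaging by calendar month is just one simple way to look for seasonality, not necessarily the approach used in the notebook.

```python
import pandas as pd

# Invented orders spanning two years.
df = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2012-01-05", "2012-01-20", "2012-02-11", "2012-07-03", "2013-01-15"]
    ),
    "Sales": [100.0, 50.0, 30.0, 200.0, 120.0],
})

# Total sales per year-month (the series plotted over time)
monthly = df.groupby(df["Order Date"].dt.to_period("M"))["Sales"].sum()

# Average of those monthly totals per calendar month, across years
seasonal = monthly.groupby(monthly.index.month).mean()

print(monthly)
print(seasonal)
```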


Conclusion

  • The data shows a long-term upward growth trend
  • There are seasonal patterns in the sales data
    • Holiday spikes (Nov–Dec)
    • Back‑to‑school bumps (Aug–Sep)
    • Slow periods in Jan–Mar (Q1)
  • There is scope for further work to identify loss-making products and regions for remedial action
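
As a possible starting point for the further work mentioned above, loss-making products or regions could be found by summing profit per group and keeping the negative totals. This is only a sketch: the `loss_makers` helper and the sample rows are hypothetical, and the column names are assumed to match the dataset.

```python
import pandas as pd

# Hypothetical order lines with a mix of profitable and loss-making sales.
df = pd.DataFrame({
    "Product Name": ["Binder A", "Binder A", "Phone B", "Table C", "Table C"],
    "Region": ["Oceania", "Canada", "Oceania", "Canada", "Canada"],
    "Profit": [5.0, 3.0, 40.0, -25.0, 10.0],
})

def loss_makers(frame, column):
    """Groups of `column` whose total Profit is negative, worst first."""
    totals = frame.groupby(column)["Profit"].sum()
    return totals[totals < 0].sort_values()

print(loss_makers(df, "Product Name"))
print(loss_makers(df, "Region"))
```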

Credits

  • Tutorials / References: ITOnlinelearning.com
  • Dataset Source: ITOnlinelearning.com

License

This project is licensed under the MIT License. GlobalMart is a fictional company.


Thanks for visiting! 🚀
