This project analyzed sales data to forecast future trends and identify factors influencing sales volatility. Time series models were developed and evaluated, providing insights into sales patterns and the challenges of accurate prediction in a dynamic business environment. The findings can inform inventory management, marketing strategies, and financial planning.
Accurate sales forecasting is crucial for businesses to optimize inventory, allocate resources effectively, and make informed strategic decisions. However, sales data can be complex and influenced by various factors, making accurate forecasting challenging.
As the data analyst, I was responsible for the entire analytical process, including data cleaning, exploratory data analysis, model selection, implementation, evaluation, and interpretation of results.
This project demonstrated my skills in:
- Time series analysis
- Data visualization
- Statistical modeling (ARIMA, Prophet)
- Data cleaning and preprocessing
- Feature engineering (TotalSales calculation)
- Communication of technical findings
The dataset consists of sales transactions, including information on:
- `Date`: Date of the transaction
- `SKU`: Stock Keeping Unit (product ID)
- `Price`: Price of the product
- `Quantity`: Quantity sold
- Other relevant columns (if any)
The analysis followed these key steps:
- Data Loading and Cleaning:
  - The data was loaded from a CSV file.
  - Irrelevant columns were removed.
  - The `Date` column was converted to datetime format.
  - `TotalSales` was calculated (`Price` * `Quantity`).
  - Sales data was aggregated by date to create a daily sales time series.
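The loading and cleaning steps above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the in-memory sample, the `Store` column, and the helper name `build_daily_sales` are assumptions, and filling days without transactions with zero is one possible choice for getting a regular daily index.

```python
import io

import pandas as pd

# Hypothetical sample standing in for data/sales_data.csv.
# "Store" is an assumed example of an irrelevant column.
SAMPLE_CSV = """Date,SKU,Price,Quantity,Store
2023-01-01,A100,10.0,2,North
2023-01-01,B200,5.5,4,South
2023-01-02,A100,10.0,1,North
2023-01-04,C300,20.0,3,North
"""


def build_daily_sales(source) -> pd.Series:
    """Load transactions and return total sales per calendar day."""
    df = pd.read_csv(source)
    # Keep only the columns the analysis uses.
    df = df[["Date", "SKU", "Price", "Quantity"]].copy()
    # Convert to datetime; unparseable dates become NaT and are dropped.
    df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
    df = df.dropna(subset=["Date", "Price", "Quantity"])
    # Revenue per transaction line.
    df["TotalSales"] = df["Price"] * df["Quantity"]
    # Sum per day; days with no transactions get 0 (an assumption).
    return df.set_index("Date")["TotalSales"].resample("D").sum()


daily_sales = build_daily_sales(io.StringIO(SAMPLE_CSV))
print(daily_sales)
```

Passing a file path instead of the `io.StringIO` object would read the real dataset, provided its column names match those listed earlier.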
- Exploratory Data Analysis (EDA):
  - Time Series Visualization: A plot of the daily sales series reveals an overall trend with significant spikes and increased volatility, particularly in the later period.
  - Stationarity Check: The Augmented Dickey-Fuller (ADF) test rejected the unit-root null hypothesis (p-value = 0.0027), indicating that the time series is stationary. This matters for ARIMA modeling because a stationary series typically needs little or no differencing.
  - ACF and PACF Analysis: ACF and PACF plots were used to inform the initial selection of ARIMA parameters.
- Time Series Modeling:
  - ARIMA:
  - Prophet:
- Results and Discussion:
  - Both ARIMA and Prophet models struggled to capture the sales spikes, resulting in high RMSE values.
  - This suggests that external factors (e.g., promotions, marketing campaigns) likely influence sales and are not captured by the time series alone.
  - The models' limitations highlight the challenges of forecasting volatile sales data.
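To make the link between missed spikes and high RMSE concrete, here is a small sketch with made-up numbers (not results from this project). The `rmse` helper is written for the example; it shows how a single unpredicted spike can dominate the metric.

```python
import numpy as np


def rmse(actual, predicted) -> float:
    """Root mean squared error between two equally sized sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Hypothetical daily sales with one spike, and a flat forecast that misses it.
actual = [100, 110, 95, 400, 105]
flat_forecast = [100, 100, 100, 100, 100]

rmse_with_spike = rmse(actual, flat_forecast)

# Same comparison with the spike day removed.
actual_no_spike = actual[:3] + actual[4:]
rmse_without_spike = rmse(actual_no_spike, flat_forecast[:4])

print(rmse_with_spike, rmse_without_spike)
```

Because errors are squared before averaging, the one spike day accounts for nearly all of the error in this toy case, which is consistent with the behavior described above.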
This project demonstrated a systematic approach to time series analysis. Future work could explore:
- Incorporating external data to improve forecast accuracy.
- Advanced time series models for volatile data.
- Outlier treatment techniques.
- `README.md`: This file (project documentation).
- `data/sales_data.csv`: The sales dataset.
- `images/`: Folder containing visualizations.
- `Sales_Forecasting_Notebook.ipynb`: Google Colab notebook with the analysis code.
Aduragbemi Abe | abeaduragbemi@gmail.com

