Author: Jesna Menezes Date: April 30, 2026
- Project Overview
- Architecture Diagram
- Step 1: Identity and Access Management (IAM) User Creation
- Step 2: Simple Storage Service (S3) Bucket
- Step 3: AWS Glue and Data Catalog
- Step 4: Amazon Athena
- Step 5: Amazon QuickSight
This document outlines the step-by-step process for building an end-to-end analysis of the Superstore dataset using several AWS services.
- Navigate to IAM and click Create user.
- Specify the user details and enable "Provide user access to the AWS Management Console - optional".
- Under Set permissions, attach the AdministratorAccess policy directly to the user.
- The user is created.
- Log in using the IAM user credentials.
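The IAM steps above can also be scripted. The following is a minimal sketch using boto3, not the method used in this walkthrough: it lists the three IAM API calls that roughly correspond to the console steps and then runs them in order. The user name and password are hypothetical placeholders, and `apply_plan` assumes boto3 is installed and that credentials with IAM permissions are already configured.

```python
from __future__ import annotations

# AWS managed policy that grants full administrator access.
ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def iam_user_plan(user_name: str, password: str) -> list[tuple[str, dict]]:
    """Return the IAM API calls, in order, that mirror the console steps."""
    return [
        # Create the user.
        ("create_user", {"UserName": user_name}),
        # Enable console access with a password that must be changed at first login.
        ("create_login_profile", {
            "UserName": user_name,
            "Password": password,
            "PasswordResetRequired": True,
        }),
        # Attach the administrator policy directly to the user.
        ("attach_user_policy", {
            "UserName": user_name,
            "PolicyArn": ADMIN_POLICY_ARN,
        }),
    ]


def apply_plan(plan: list[tuple[str, dict]], client=None) -> None:
    """Execute each planned call against IAM (requires boto3 and AWS credentials)."""
    if client is None:
        import boto3  # imported lazily so the plan can be inspected without boto3
        client = boto3.client("iam")
    for operation, kwargs in plan:
        getattr(client, operation)(**kwargs)


# Hypothetical usage (the values are placeholders, not from this walkthrough):
# apply_plan(iam_user_plan("superstore-analyst", "ChangeMe-123!"))
```

Administrator access is convenient for a demo account; a narrower policy is usually preferable outside of one.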
- Navigate to S3 and click Create bucket.
- Create a general purpose bucket with a globally unique name in the chosen AWS Region.
- Create an orders folder in the S3 bucket.
- Download the Superstore Dataset (https://www.kaggle.com/datasets/vivek468/superstore-dataset-final) from Kaggle.
- Select only the rows with order date 2017-01-01 and upload them to the folder orders/snapshot_day=2017-01-01/.
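The filter-and-upload step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the date column is assumed to be named `Order Date` and formatted as month/day/year (check this against the downloaded file), and the object name `orders.csv` and the bucket name are hypothetical. `upload_snapshot` assumes boto3 and configured credentials.

```python
import csv
import io
from datetime import date, datetime


def partition_key(snapshot_day: date, file_name: str = "orders.csv") -> str:
    """Build the S3 key for one day's data, using the snapshot_day=<date> folder layout."""
    return f"orders/snapshot_day={snapshot_day.isoformat()}/{file_name}"


def filter_by_order_date(csv_text: str, snapshot_day: date,
                         column: str = "Order Date", fmt: str = "%m/%d/%Y") -> str:
    """Return CSV text (header included) holding only rows ordered on snapshot_day."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Assumption: dates look like 1/1/2017; adjust fmt if the file differs.
        if datetime.strptime(row[column], fmt).date() == snapshot_day:
            writer.writerow(row)
    return out.getvalue()


def upload_snapshot(bucket: str, csv_text: str, snapshot_day: date) -> None:
    """Upload one day's rows to its partition folder (requires boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3
    body = filter_by_order_date(csv_text, snapshot_day)
    boto3.client("s3").put_object(
        Bucket=bucket,
        Key=partition_key(snapshot_day),
        Body=body.encode("utf-8"),
    )


# Hypothetical usage (bucket and file names are placeholders):
# with open("superstore.csv", encoding="latin-1") as f:  # encoding is an assumption
#     upload_snapshot("my-superstore-bucket", f.read(), date(2017, 1, 1))
```

The `snapshot_day=2017-01-01` folder name follows the key=value convention, which lets a crawler treat the folder as a partition.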
- Navigate to AWS Glue -> Databases -> Add database.
- Navigate to AWS Glue -> Crawlers -> Add crawler.
- Create a crawler on the S3 bucket.
- Add an IAM role for the AWS services to use.
- Run the crawler on the available data to create a table definition without storing the data. This creates the Data Catalog metadata for the data.
- A partition is also created based on the folder name.
- Upload the data for the next few dates and run the crawler again on the new data.
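The Glue steps above can be approximated with boto3 as well. The sketch below builds the crawler settings as a plain dictionary and then creates the database, creates the crawler, and starts it. The database name, crawler name, bucket name, and role ARN are hypothetical placeholders; the role is assumed to already exist and to allow Glue to read the bucket.

```python
from __future__ import annotations


def crawler_config(name: str, role_arn: str, database: str, bucket: str) -> dict:
    """Keyword arguments for glue.create_crawler, pointing at the orders/ folder."""
    return {
        "Name": name,
        "Role": role_arn,          # IAM role that Glue assumes when crawling
        "DatabaseName": database,  # Data Catalog database that receives the table
        "Targets": {"S3Targets": [{"Path": f"s3://{bucket}/orders/"}]},
    }


def create_and_run_crawler(config: dict, client=None) -> None:
    """Create the database and crawler, then start a crawl (requires boto3)."""
    if client is None:
        import boto3  # imported lazily so crawler_config works without boto3
        client = boto3.client("glue")
    client.create_database(DatabaseInput={"Name": config["DatabaseName"]})
    client.create_crawler(**config)
    client.start_crawler(Name=config["Name"])


def rerun_crawler(name: str, client=None) -> None:
    """Start the existing crawler again, for example after uploading new dates."""
    if client is None:
        import boto3
        client = boto3.client("glue")
    client.start_crawler(Name=name)


# Hypothetical usage (all names and the ARN are placeholders):
# cfg = crawler_config(
#     "orders-crawler",
#     "arn:aws:iam::123456789012:role/GlueCrawlerRole",
#     "superstore_db",
#     "my-superstore-bucket",
# )
# create_and_run_crawler(cfg)
```

Pointing the crawler at `orders/` rather than at a single day's folder is what allows each `snapshot_day=...` subfolder to be registered as a partition of one table.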
- Navigate to Athena -> Query editor.
- Select the appropriate Data source and Database.
- Create an S3 folder to save the Athena query results.
- Preview the orders table.
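The table preview can also be submitted through the Athena API. The sketch below builds the request for a preview query, optionally restricted to one partition, and submits it with boto3. The database name and results location are hypothetical placeholders, and the partition column is assumed to be named `snapshot_day` and to hold string values, which is worth confirming in the table schema that the crawler produced.

```python
from __future__ import annotations

from datetime import date


def preview_request(database: str, table: str, output_location: str,
                    limit: int = 10, snapshot_day: str | None = None) -> dict:
    """Keyword arguments for athena.start_query_execution that preview a table."""
    query = f'SELECT * FROM "{table}"'
    if snapshot_day is not None:
        # Validate the value as an ISO date before placing it in the SQL text.
        day = date.fromisoformat(snapshot_day).isoformat()
        query += f" WHERE snapshot_day = '{day}'"
    query += f" LIMIT {int(limit)}"
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # S3 folder where Athena writes the query results.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_preview(request: dict, client=None) -> str:
    """Submit the query and return its execution ID (requires boto3 and credentials)."""
    if client is None:
        import boto3  # imported lazily so preview_request works without boto3
        client = boto3.client("athena")
    return client.start_query_execution(**request)["QueryExecutionId"]


# Hypothetical usage (names and the S3 location are placeholders):
# req = preview_request("superstore_db", "orders",
#                       "s3://my-superstore-bucket/athena-results/",
#                       snapshot_day="2017-01-01")
# query_id = run_preview(req)
```

Filtering on the partition column limits the scan to that day's folder instead of the whole table.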
- Navigate to QuickSight.
- Create the dataset using Athena as the data source.
- Start analysing the data in QuickSight, create the dashboard, and publish it.
