Task 1: Data Cleaning and Preprocessing

Internship Task – Data Analyst Role

This repository contains the solution to Task 1 of the Data Analyst Internship, focusing on data cleaning and preprocessing using Python (Pandas) in Google Colab.

Objective

Clean and prepare a raw dataset by:

Identifying and handling missing values
Removing duplicates
Standardizing column names and text data
Ensuring consistent data types and formats
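
The first two steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the sample rows, the column names, and the helpers `summarize_issues` and `basic_clean` are hypothetical, and dropping incomplete rows is only one of several ways to handle missing values.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed for illustration only.
raw = pd.DataFrame({
    "CustomerID": [1, 2, 2, 3],
    "Gender": ["Male", "female", "female", None],
    "Age": [19, 21, 21, 35],
})


def summarize_issues(df: pd.DataFrame) -> dict:
    """Count missing cells per column and fully duplicated rows."""
    return {
        "missing_per_column": df.isna().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


def basic_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then drop rows that still have missing values."""
    return df.drop_duplicates().dropna().reset_index(drop=True)


report = summarize_issues(raw)
cleaned = basic_clean(raw)
print(report)
print(cleaned)
```

Running the summary before cleaning makes it easy to record what was found, which is the kind of information a cleaning summary reports.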

Dataset

Dataset Name: Mall Customer Segmentation Data
Source: Provided during the internship task
File: Mall_Customers.csv

Cleaning Summary

| Step | Description |
| --- | --- |
| Missing Values Check | No missing values found |
| Duplicate Check | No duplicate rows present |
| Column Renaming | Standardized to lowercase with underscores |
| Text Standardization | Gender values standardized to title case |
| Data Type Check | All data types confirmed appropriate |
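
As a rough sketch of the column renaming and text standardization steps, the snippet below lowercases column names, replaces other characters with underscores, and title-cases a gender column. The helper `to_snake_case`, the sample rows, and the column names are assumptions made for illustration; the notebook may implement these steps differently.

```python
import re

import pandas as pd


def to_snake_case(name: str) -> str:
    """Lowercase a column name and replace runs of non-alphanumerics with '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


# Hypothetical sample rows; column names are assumed for illustration only.
sample = pd.DataFrame({
    "CustomerID": [1, 2],
    "Gender": [" male", "FEMALE"],
    "Annual Income (k$)": [15, 16],
})

# rename() accepts a function and applies it to every column label.
standardized = sample.rename(columns=to_snake_case)

# Trim stray whitespace, then normalize casing to title case.
standardized["gender"] = standardized["gender"].str.strip().str.title()

print(standardized.dtypes)
print(standardized)
```

Printing `dtypes` afterwards is a quick way to confirm that numeric columns were read as numbers and text columns as strings.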

Files in this Repository

| File | Description |
| --- | --- |
| `task1_data_cleaning.ipynb` | Google Colab notebook with the complete cleaning process |
| `Mall_Customers.csv` | Original dataset |
| `cleaned_mall_customers.csv` | Cleaned and processed version of the dataset |
| `README.md` | Summary and documentation of the task |
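
One way to produce and sanity-check a cleaned CSV like the one listed above is to write it out and read it back. The sketch below writes to a temporary directory so it runs without the repository files; the frame contents and the helper `save_and_reload` are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_and_reload(df: pd.DataFrame, path: Path) -> pd.DataFrame:
    """Write the frame without the index column, then read it back."""
    df.to_csv(path, index=False)
    return pd.read_csv(path)


# Hypothetical cleaned rows; column names are assumed for illustration only.
frame = pd.DataFrame({
    "customer_id": [1, 2],
    "gender": ["Male", "Female"],
    "age": [19, 21],
})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "cleaned_mall_customers.csv"
    reloaded = save_and_reload(frame, out_path)

# Confirm that the round trip preserved the shape and the numeric type.
same_shape = reloaded.shape == frame.shape
numeric_age = pd.api.types.is_integer_dtype(reloaded["age"])
print(same_shape, numeric_age)
```

Passing `index=False` keeps the row index out of the file, so the reloaded frame has the same columns as the one that was saved.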

Tools & Technologies

Python 3
Pandas
Google Colab
GitHub

Key Learnings

Hands-on experience with Pandas for cleaning real-world datasets
Techniques to detect and handle common data issues
Understanding the importance of standardization and preprocessing before analysis

Task Completed

This task is submitted as part of the internship program.
To view the solution notebook or the cleaned dataset, explore the files above.
