📊 Data Science Lab Assignments (2nd Year)



📌 Overview

This repository collects my second-year Data Science lab assignments, implemented in Python with Jupyter Notebook.

💡 This repository demonstrates end-to-end data science workflows from raw data preprocessing to model building, evaluation, and visualization.


📸 Preview

📊 Histogram (Fare Distribution)

[Image: histogram of the Titanic fare distribution]

📦 Boxplot (Age vs Gender vs Survival)

[Image: box plot of age by gender and survival]

📚 Objectives

  • Understand and apply data preprocessing techniques
  • Perform statistical analysis and derive meaningful insights
  • Build and evaluate machine learning models
  • Implement time series forecasting methods
  • Visualize data effectively using Python libraries

🛠️ Tools & Technologies

  • Python 🐍
  • Jupyter Notebook
  • Pandas & NumPy
  • Matplotlib & Seaborn
  • Scikit-learn
  • Statsmodels

📂 Repository Structure

Data-Science-Lab-Assignments/
│
├── 01_Data_Wrangling/
├── 02_Data_Wrangling_II/
├── 03_Descriptive_Statistics/
├── 04_Linear_Regression/
├── 05_Logistic_Regression/
├── 06_Naive_Bayes/
├── 07_Time_Series_MA/
├── 08_Auto_Regressive/
├── 09_Moving_Average/
├── 10_ARIMA/
├── 11_Data_Visualization/
│
├── datasets/
└── README.md

📌 Assignments Overview

🔹 Data Wrangling

  • Data collection from platforms like Kaggle
  • Data cleaning and preprocessing
  • Handling missing values
  • Data type conversion and normalization
  • Encoding categorical variables
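
The wrangling steps listed above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the toy data, the column names, and the choices of median imputation, min-max normalization, and one-hot encoding are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are illustrative, not taken from the repository.
raw = pd.DataFrame({
    "age": ["22", "38", None, "35"],          # numbers stored as strings
    "fare": [7.25, 71.28, 8.05, np.nan],      # one missing value
    "sex": ["male", "female", "female", "male"],
})


def wrangle(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Type conversion: strings become numbers, unparseable entries become NaN.
    out["age"] = pd.to_numeric(out["age"], errors="coerce")
    for col in ["age", "fare"]:
        # Missing values: fill with the column median (an assumed strategy).
        out[col] = out[col].fillna(out[col].median())
        # Min-max normalization to the range [0, 1].
        lo, hi = out[col].min(), out[col].max()
        out[col] = (out[col] - lo) / (hi - lo)
    # One-hot encode the categorical column.
    return pd.get_dummies(out, columns=["sex"], dtype=int)


clean = wrangle(raw)
print(clean)
```

Median imputation is used here because it is less sensitive to extreme values than the mean; other strategies may suit a given dataset better.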

🔹 Data Wrangling II

  • Academic dataset creation
  • Missing value and inconsistency handling
  • Outlier detection and treatment
  • Data transformation (scaling, normalization, skewness reduction)
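
One common way to approach outlier handling and transformation is sketched below. The data is hypothetical, and the 1.5 × IQR rule, capping, `log1p`, and z-score scaling are assumed techniques rather than the ones confirmed to be used in the assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical exam scores with one obvious outlier (illustrative data only).
scores = pd.Series([55, 60, 62, 65, 68, 70, 72, 75, 300], dtype=float, name="score")


def iqr_bounds(s: pd.Series, k: float = 1.5):
    """Return the (lower, upper) fences of the k * IQR rule."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Outlier detection: values outside the fences.
low, high = iqr_bounds(scores)
outliers = scores[(scores < low) | (scores > high)]

# Outlier treatment: cap values at the fences instead of dropping rows.
capped = scores.clip(lower=low, upper=high)

# Skewness reduction: log(1 + x) compresses the long right tail.
log_scores = np.log1p(scores)

# Scaling: standardize the capped values to zero mean and unit variance.
z_scores = (capped - capped.mean()) / capped.std()

print("fences:", low, high)
print("outliers:", outliers.tolist())
print("skew before/after log:", scores.skew(), log_scores.skew())
```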

🔹 Descriptive Statistics

  • Measures of central tendency (mean, median, mode)
  • Measures of variability (standard deviation, variance)
  • Grouped statistical analysis
  • Analysis using Iris dataset
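
As a rough illustration of these statistics, the sketch below uses the copy of the Iris dataset bundled with scikit-learn. Loading it this way, the choice of the sepal length column, and the `species` column name are assumptions; the notebook may load and analyze the data differently.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the Iris copy bundled with scikit-learn as a DataFrame (no download needed).
iris = load_iris(as_frame=True)
df = iris.frame.copy()
# Map the integer target to readable species names.
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

col = "sepal length (cm)"

# Central tendency and variability for a single column.
summary = {
    "mean": df[col].mean(),
    "median": df[col].median(),
    "mode": df[col].mode().iloc[0],  # first mode if several values tie
    "variance": df[col].var(),       # sample variance (ddof=1)
    "std": df[col].std(),            # sample standard deviation (ddof=1)
}

# Grouped analysis: the same measures per species.
grouped = df.groupby("species")[col].agg(["mean", "median", "std", "var"])

print(summary)
print(grouped)
```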

🔹 Machine Learning Models

  • 📈 Linear Regression (House Price Prediction)
  • 📊 Logistic Regression (Classification)
  • 🌸 Naïve Bayes (Iris Dataset)
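
The build-and-evaluate pattern shared by these models can be sketched with the Naïve Bayes case. The Gaussian variant, the 70/30 stratified split, and the fixed random seed are assumptions made for illustration; the assignment may use different settings.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Features and integer class labels from the bundled Iris dataset.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for evaluation; stratify keeps class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Gaussian Naive Bayes assumes each feature is normally distributed per class.
model = GaussianNB().fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print(f"accuracy: {accuracy:.3f}")
print(cm)
```

Swapping `GaussianNB` for `LinearRegression` or `LogisticRegression` keeps the same split, fit, predict, and evaluate flow, although regression needs a metric such as mean squared error instead of accuracy.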

🔹 Time Series Analysis

  • Moving Average techniques
  • Auto-Regressive (AR) model
  • Weighted Moving Average
  • ARIMA model implementation
  • Trend and seasonality analysis
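
The simpler techniques in this list can be sketched with pandas and NumPy alone. The series, the window size, the weights, and the `fit_ar1` helper below are hypothetical; the AR fit is a plain least-squares estimate of a first-order model, not the Statsmodels implementation that the assignments may rely on, and ARIMA is not covered here.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly values (illustrative data only).
sales = pd.Series([10.0, 12.0, 13.0, 12.0, 15.0, 16.0, 18.0, 17.0])

window = 3

# Simple moving average: unweighted mean of the last `window` points.
sma = sales.rolling(window).mean()

# Weighted moving average: the most recent point gets the largest weight.
weights = np.array([1.0, 2.0, 3.0])
wma = sales.rolling(window).apply(
    lambda x: np.dot(x, weights) / weights.sum(), raw=True
)

# Naive one-step-ahead forecasts: carry the latest average forward.
sma_forecast = sma.iloc[-1]
wma_forecast = wma.iloc[-1]


def fit_ar1(series: pd.Series):
    """Least-squares fit of y_t = c + phi * y_(t-1); returns (c, phi)."""
    y = series.to_numpy()
    phi, c = np.polyfit(y[:-1], y[1:], deg=1)  # slope first, then intercept
    return float(c), float(phi)


c, phi = fit_ar1(sales)
ar1_forecast = c + phi * sales.iloc[-1]

print(sma_forecast, wma_forecast, ar1_forecast)
```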

🔹 Data Visualization

  • Titanic dataset analysis
  • Histogram for fare distribution
  • Box plots (age vs gender vs survival)
  • Insight generation and pattern discovery
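
The two plots can be sketched with Matplotlib as shown below. The rows are made-up stand-ins rather than the real Titanic data, and the column names, bin count, and layout are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in rows, NOT the real Titanic dataset.
titanic = pd.DataFrame({
    "fare": [7.0, 70.0, 8.0, 50.0, 9.0, 12.0, 30.0, 20.0],
    "age": [22, 38, 26, 35, 40, 29, 54, 19],
    "sex": ["male", "female", "female", "female",
            "male", "male", "male", "female"],
    "survived": [0, 1, 1, 1, 0, 0, 1, 0],
})

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: how fares are distributed across passengers.
ax_hist.hist(titanic["fare"], bins=5, edgecolor="black")
ax_hist.set(title="Fare distribution", xlabel="Fare", ylabel="Passengers")

# Box plot: one box of ages per (sex, survived) combination.
groups = {
    f"{sex}, survived={survived}": part["age"].to_numpy()
    for (sex, survived), part in titanic.groupby(["sex", "survived"])
}
ax_box.boxplot(list(groups.values()))
ax_box.set_xticks(range(1, len(groups) + 1))
ax_box.set_xticklabels(list(groups), rotation=20)
ax_box.set(title="Age by gender and survival", ylabel="Age")

fig.tight_layout()
```

In a notebook the figure renders inline; in a script, `fig.savefig(...)` with a path of your choice writes it to disk.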

📊 Key Learning Outcomes

  • Hands-on experience with real-world datasets
  • Strong understanding of data preprocessing techniques
  • Ability to build and evaluate ML models
  • Knowledge of time series forecasting
  • Improved data visualization and storytelling skills

🚀 How to Run

  1. Clone the repository:
git clone https://github.com/tejasvinifulari5/Data-Science-Lab-Assignments.git
  2. Navigate to the project folder:
cd Data-Science-Lab-Assignments
  3. Open Jupyter Notebook:
jupyter notebook

✨ Future Improvements

  • Add more real-world datasets
  • Convert assignments into full-scale projects
  • Deploy machine learning models
  • Improve visualizations and dashboards

👩‍💻 Author

Tejasvini Fulari


⭐ Support

If you found this useful, consider giving it a ⭐ on GitHub!


🙏 Acknowledgment

Datasets are sourced from open platforms such as Kaggle and from the sample datasets bundled with Python libraries.

