belindbl/ds-data-analysis

Data Science Course Assignments

This repository contains the notebooks for the DS2024 data science course labs.

Contents

  • Lab1.ipynb - Web scraping lab focused on collecting hotel accommodation data from Booking, Trivago, and Agoda. It includes Selenium/BeautifulSoup setup, scraping utilities, data collection, exploratory data analysis, and a small interactive filtering workflow.
  • Lab2.ipynb - Modeling lab focused on exploratory analysis, normalization, transformations, dimensionality reduction, and regression model comparison. It includes PCA, ICA, linear regression, random forest, SVR, evaluation metrics, and visualizations.
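The Lab2 workflow of scaling, reducing dimensionality, and comparing regressors could look roughly like the sketch below. It is not taken from the notebook: the function name `compare_models`, the synthetic data, the component count, and all hyperparameters are assumptions chosen for illustration, and only PCA (not ICA) is shown.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def compare_models(X, y, n_components=3, seed=0):
    """Fit several regressors behind the same scaling + PCA steps.

    Hypothetical helper: returns {model_name: {"mae": ..., "r2": ...}}
    computed on a held-out test split.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
        "svr": SVR(),
    }
    results = {}
    for name, model in models.items():
        # Scaler and PCA are fitted on the training split only,
        # then applied unchanged to the test split.
        pipeline = make_pipeline(
            StandardScaler(), PCA(n_components=n_components), model
        )
        pipeline.fit(X_train, y_train)
        predictions = pipeline.predict(X_test)
        results[name] = {
            "mae": mean_absolute_error(y_test, predictions),
            "r2": r2_score(y_test, predictions),
        }
    return results


# Synthetic stand-in data (assumption): five correlated features
# driven by three latent factors, with a linear target.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 5))
X = latent @ mixing + rng.normal(scale=0.01, size=(200, 5))
y = latent @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

results = compare_models(X, y, n_components=3)
for name, scores in results.items():
    print(f"{name}: MAE={scores['mae']:.3f}  R2={scores['r2']:.3f}")
```

Wrapping the scaler and PCA in one pipeline per model keeps the preprocessing identical across models, so differences in the scores come from the regressors themselves.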

Methods Used

  • Web scraping with Selenium and BeautifulSoup.
  • Data cleaning and tabular processing with pandas.
  • Exploratory data analysis using summary statistics and visualizations.
  • Interactive filtering with ipywidgets.
  • Feature scaling and normalization.
  • Correlation analysis and distribution transformation.
  • Dimensionality reduction with Principal Component Analysis and Independent Component Analysis.
  • Regression modeling with linear regression, random forest, support vector regression, and decision tree-based approaches.
  • Model evaluation using MAE, MSE, RMSE, R², residual plots, and actual-versus-predicted visualizations.
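For reference, the four evaluation metrics listed above can be computed from their standard definitions as in the minimal sketch below. The notebooks may well use library functions instead; the helper name `regression_metrics` and the sample values are assumptions for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and R2 for two equal-length sequences.

    Hypothetical helper written from the textbook definitions.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    mse = sum(r * r for r in residuals) / n    # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error

    # R2 = 1 - (residual sum of squares / total sum of squares)
    mean_actual = sum(actual) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up values for illustration only.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
print(metrics)
```

MAE and RMSE are in the same units as the target, which makes them easier to read than MSE. R² is unitless and compares the model against always predicting the mean.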
