This repository contains the notebooks for the DS2024 data science course labs.
- Lab1.ipynb: Web scraping lab focused on collecting hotel accommodation data from Booking, Trivago, and Agoda. It includes Selenium/BeautifulSoup setup, scraping utilities, data collection, exploratory data analysis, and a small interactive filtering workflow.
- Lab2.ipynb: Modeling lab focused on exploratory analysis, normalization, transformations, dimensionality reduction, and regression model comparison. It includes PCA, ICA, linear regression, random forest, SVR, evaluation metrics, and visualizations.
- Web scraping with Selenium and BeautifulSoup.
- Data cleaning and tabular processing with pandas.
- Exploratory data analysis using summary statistics and visualizations.
- Interactive filtering with ipywidgets.
- Feature scaling and normalization.
- Correlation analysis and distribution transformation.
- Dimensionality reduction with Principal Component Analysis and Independent Component Analysis.
- Regression modeling with linear regression, random forest, support vector regression, and decision tree-based approaches.
- Model evaluation using MAE, MSE, RMSE, R2, residual plots, and actual-versus-predicted visualizations.
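
As a quick reference for the evaluation metrics listed above, here is a minimal plain-Python sketch of how MAE, MSE, RMSE, and R2 relate to each other. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the notebooks, which may compute these metrics with a library instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    mse = sum(r * r for r in residuals) / n    # mean squared error
    rmse = math.sqrt(mse)                      # same units as the target

    # R2 compares the residual error with the variance of the targets.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(r * r for r in residuals)
    # R2 is undefined when every target value is identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical values, for illustration only.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is often the easiest of the four to interpret because it is expressed in the same units as the target, while R2 is unitless and therefore convenient when comparing models trained on the same data.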