Spotify Dataset Analyzer

Overview

This Python project processes and analyzes a Spotify dataset and uses the K-Nearest Neighbors (KNN) algorithm to make predictions from it. The program checks data integrity, preprocesses the dataset, encodes features, and applies KNN to predict outcomes based on provided inputs.

Features

Dataset integrity check and automatic correction.
Data loading and preprocessing with encoding.
Splitting dataset into training and testing sets.
Combining and reshaping data for analysis.
Utilization of K-Nearest Neighbors (KNN) for predictions.
Detailed output of predictions and cosine similarity results.
Advanced data retrieval and plotting of top results.

Requirements

Python 3.x
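Libraries: numpy, matplotlib, seaborn, scikit-learn (imported as sklearn), scipy. The os module is also used, but it is part of the Python standard library and needs no installation.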
A Spotify dataset file named correct_dataset.csv located in a data directory.

Installation

Clone this repository and ensure that all required Python libraries are installed by running:

pip install numpy matplotlib seaborn scikit-learn scipy

Usage

To use this program, follow these steps:

Prepare the Dataset: Ensure the Spotify dataset file named correct_dataset.csv is located in the ../data/ directory relative to the script. If that file is missing or incorrect, the program will attempt to fix it automatically by referencing a file named spotify_dataset.csv.
Run the Script: Execute the script in your Python environment using the command:
```
python spotify_analyzer.py
```
Follow the on-screen prompts to interact with the program.
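
The dataset lookup described in the first step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name resolve_dataset is hypothetical, and the sketch only checks whether the files exist. It does not validate their contents or rebuild correct_dataset.csv.

```python
from pathlib import Path


def resolve_dataset(data_dir):
    """Return the dataset path to use, or None if no dataset file is found.

    Hypothetical helper: prefers correct_dataset.csv and falls back to
    spotify_dataset.csv, the file the program references when fixing the data.
    """
    data_dir = Path(data_dir)
    corrected = data_dir / "correct_dataset.csv"
    raw = data_dir / "spotify_dataset.csv"

    if corrected.is_file():
        return corrected
    if raw.is_file():
        # The real program would regenerate correct_dataset.csv from this file.
        return raw
    return None


if __name__ == "__main__":
    # "../data" matches the location given above, relative to the script.
    print(resolve_dataset("../data"))
```

Running this before the main script is a quick way to confirm that the files are where the program expects them.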

How It Works

Data Integrity Check: Initially, the program checks if the required dataset exists and is correct. If not, it calls a function to correct the dataset.
Data Loading and Encoding: The dataset is loaded and encoded to transform raw data into a format suitable for machine learning.
Training and Testing: The data is split into training and test sets, with 75% of the data used for training.
Prediction and Analysis: KNN is used to predict the outcomes based on the test dataset. Predictions and their accuracy are then printed out.
Results Interpretation: The program allows users to input an ID to find related entries and prints a list of potential related IDs based on the predictions.
Visualization: A bar chart of the ten most successful songs is displayed, showing the success rate of each.
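
The split, prediction, and similarity steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's implementation: the data is synthetic, the label is made up, n_neighbors=5 is an assumed value, and most_similar is a hypothetical helper standing in for the lookup of related IDs.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the encoded dataset: 200 rows, 4 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # made-up binary label

# 75% of the rows go to training, matching the split described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, random_state=0
)

# n_neighbors=5 is an assumed value, not taken from the project.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.3f}")


def most_similar(features, row_index, top_n=10):
    """Return indices of the top_n rows most cosine-similar to row_index."""
    scores = cosine_similarity(features[row_index:row_index + 1], features)[0]
    scores[row_index] = -np.inf  # exclude the query row itself
    return np.argsort(scores)[::-1][:top_n]


related = most_similar(X, row_index=0, top_n=10)
print("rows most similar to row 0:", related.tolist())
```

With the real dataset, X would hold the encoded feature columns and the row index would be derived from the ID the user enters at the prompt.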
