Data exploration and preprocessing on Netflix Dataset
- Pandas
- NumPy
- Matplotlib
- Seaborn
In this notebook we use the 'Netflix titles' dataset, available at this link.
The dataset contains unlabelled text data on around 9000 Netflix shows and movies, with details such as cast, release year, rating, and description.
Columns of the dataset:
- show_id: Unique ID for every Movie / TV Show
- type: Identifier, a Movie or a TV Show
- title: Title of the Movie / TV Show
- director: Director of the Movie
- cast: Actors involved in the movie / show
- country: Country where the movie / show was produced
- date_added: Date it was added on Netflix
- release_year: Actual release year of the movie / show
- rating: TV rating of the movie / show
- duration: Total duration, in minutes or number of seasons
Example queries on the dataset:
- Extract min, max, mean, median, std of a column
- Plot the top 10 genres and top 10 actors by number of appearances
- Rating comparison
- Duration statistics
- ...
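
The queries listed above could be sketched with pandas roughly as follows. This is only a minimal illustration, not the notebook's actual code: it uses a handful of invented placeholder rows instead of the real file, and it assumes that `cast` holds comma-separated names and that `duration` is a string such as "90 min" or "2 Seasons".

```python
import pandas as pd

# Toy rows that mimic the column layout described above. The values are
# invented placeholders, not records from the real dataset.
df = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "TV Show", "Movie"],
        "title": ["Title 1", "Title 2", "Title 3", "Title 4"],
        "cast": ["Actor A, Actor B", "Actor A", None, "Actor B, Actor C, Actor A"],
        "release_year": [2010, 2015, 2018, 2020],
        "rating": ["PG-13", "R", "TV-MA", "PG-13"],
        "duration": ["90 min", "120 min", "2 Seasons", "100 min"],
    }
)

# min, max, mean, median and std of a numeric column
year_stats = df["release_year"].agg(["min", "max", "mean", "median", "std"])

# Top 10 actors by appearance: split the comma-separated cast string
# (assumed format), put one actor per row, then count occurrences.
top_actors = (
    df["cast"]
    .dropna()
    .str.split(",")
    .explode()
    .str.strip()
    .value_counts()
    .head(10)
)

# Rating comparison: number of titles per rating, split by type
rating_by_type = pd.crosstab(df["rating"], df["type"])

# Duration statistics: extract the leading number from the duration string
# and keep movies (minutes) apart from TV shows (seasons).
duration_value = df["duration"].str.extract(r"(\d+)", expand=False).astype(float)
movie_minutes = duration_value[df["type"] == "Movie"]
minutes_stats = movie_minutes.agg(["min", "max", "mean", "median", "std"])

print(year_stats)
print(top_actors)
print(rating_by_type)
print(minutes_stats)
```

With Matplotlib installed, a call such as `top_actors.plot(kind="barh")` would draw the counts as a horizontal bar chart. The same split, explode, and count pattern should apply to the genre column, provided it is also stored as a comma-separated string.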