yellatp/BingeMax-Personalized-Movie-Recommendation-Engine

🎬 BingeMax: Personalized Movie Recommendation Engine

An AI-powered full-stack movie recommendation system that blends content-based filtering, collaborative filtering, and hybrid techniques. Built with FastAPI, Apache Airflow, MLflow, Streamlit, and Docker.

BingeMax Network Graph UI

Watch the Demo on YouTube


📌 Overview

BingeMax is a movie recommendation engine that generates personalized suggestions from a hybrid blend of:

  • 🎯 Content-Based Filtering (TF-IDF + cosine similarity)
  • 🤝 Collaborative Filtering (ALS algorithm on implicit feedback)
  • 🧠 Hybrid Recommendation Model combining both methods

Ideal for:

  • Machine Learning Engineers building recommender systems
  • Data Scientists practicing MLOps and full-stack deployment
  • Developers looking for real-world end-to-end AI/ML pipelines

🧠 Recommendation Engine Workflow

🔹 Content-Based Filtering

  • Converts movie metadata (title, genres, description) into TF-IDF vectors
  • Uses cosine similarity to find and recommend similar movies

$$ \text{similarity}(A, B) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} $$
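As a rough illustration of the two bullets above, here is a minimal sketch that builds TF-IDF vectors and ranks movies by cosine similarity. It uses only the Python standard library instead of Scikit-Learn, and the catalog, the whitespace tokenizer, the smoothed IDF formula, and all function names are assumptions made for the example, not the project's actual code.

```python
import math
from collections import Counter

# Hypothetical toy catalog; the real project derives text from IMDb metadata.
MOVIES = {
    "Space Quest": "sci-fi adventure crew explores deep space",
    "Star Voyage": "sci-fi adventure crew travels between stars in space",
    "Baking Love": "romance comedy set in a small bakery",
}

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Return {title: {term: weight}} using term count times smoothed IDF."""
    tokens = {title: tokenize(text) for title, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    # Smoothed IDF, ln((1 + n) / (1 + df)) + 1, keeps every weight positive.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return {
        title: {t: count * idf[t] for t, count in Counter(words).items()}
        for title, words in tokens.items()
    }

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similar_movies(title, docs, top_n=2):
    """Rank every other movie by similarity to the given title."""
    vectors = tfidf_vectors(docs)
    target = vectors[title]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

print(similar_movies("Space Quest", MOVIES))
```

Movies that share no terms with the query title score exactly 0, which is why richer metadata (genres plus description) tends to give more useful neighbors than titles alone.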

🔹 Collaborative Filtering

  • Uses Alternating Least Squares (ALS) from the implicit library
  • Learns user-item latent factors to recommend movies based on past interactions
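To make the role of latent factors concrete, the sketch below scores unseen movies for a user as the dot product of a user vector and each item vector. The factor values, the `SEEN` history, and the function names are hypothetical; in practice a trained ALS model would supply the factors, and this example does not train one.

```python
# Hypothetical 2-dimensional factors; a trained ALS model would supply real ones.
USER_FACTORS = {"u1": [0.9, 0.1], "u2": [0.2, 0.8]}
ITEM_FACTORS = {
    "Space Quest": [1.0, 0.0],
    "Star Voyage": [0.8, 0.2],
    "Baking Love": [0.1, 0.9],
}
# Hypothetical interaction history, used to avoid recommending seen movies.
SEEN = {"u1": {"Space Quest"}, "u2": set()}

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def recommend_for_user(user_id, top_n=2):
    """Score each unseen item by user-item dot product and return the top N."""
    user_vec = USER_FACTORS[user_id]
    scores = {
        item: dot(user_vec, vec)
        for item, vec in ITEM_FACTORS.items()
        if item not in SEEN[user_id]
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]

print(recommend_for_user("u1"))
print(recommend_for_user("u2"))
```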

🔹 Hybrid Recommendation Model

  • Combines both models using a tunable parameter alpha ∈ [0, 1]
  • Produces recommendations ranked by a weighted scoring system
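One plausible reading of the weighted scoring is a linear blend, `alpha * collaborative + (1 - alpha) * content`. The sketch below assumes that form, assumes alpha weights the collaborative side, and applies min-max normalization so both score sets share a scale; none of these choices is confirmed by the project, and the function names are illustrative.

```python
def min_max_normalize(scores):
    """Rescale a {title: score} mapping to [0, 1] so two models are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {title: 0.0 for title in scores}
    return {title: (s - lo) / (hi - lo) for title, s in scores.items()}

def hybrid_rank(content_scores, collab_scores, alpha=0.5, top_n=3):
    """Blend two score sets; alpha=1 is pure collaborative, alpha=0 pure content."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    content = min_max_normalize(content_scores)
    collab = min_max_normalize(collab_scores)
    titles = set(content) | set(collab)
    blended = {
        t: alpha * collab.get(t, 0.0) + (1 - alpha) * content.get(t, 0.0)
        for t in titles
    }
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))[:top_n]

# Hypothetical scores for three movies from each model.
content = {"A": 0.9, "B": 0.5, "C": 0.1}
collab = {"A": 1.0, "B": 4.0, "C": 2.0}
print(hybrid_rank(content, collab, alpha=0.5))
```

Sweeping alpha from 0 to 1 on held-out interactions is one way to pick a value; a movie missing from one model simply contributes 0 from that side.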

⚙️ Getting Started

📦 Prerequisites

  • Docker + Docker Compose
  • Python 3.10+ (for development environments)

🐳 Run with Docker Compose

docker compose down -v --remove-orphans  # Optional cleanup
docker compose up --build

🔗 Access the Interfaces

| Service | URL |
| --- | --- |
| Streamlit App | http://localhost:8501 |
| FastAPI Docs | http://localhost:8000/docs |
| MLflow UI | http://localhost:5000 |
| Airflow UI | http://localhost:8080 (user: airflow / pw: airflow) |

📂 Data Source

  • This project uses publicly available, non-commercial movie data from IMDb Datasets.
  • The dataset includes metadata such as titles, genres, user ratings, and more.

📁 Project Structure

BingeMax/
├── src/
│   ├── api/               # FastAPI backend
│   ├── models/            # ALS, TF-IDF, hybrid logic
│   ├── data/              # Matrix builders and loaders
│   ├── evaluation/        # Model evaluation scripts
│   └── utils/             # Helper functions and config
├── streamlit_app/         # Streamlit-based UI frontend
├── airflow/               # Apache Airflow DAGs and configs
├── mlruns/                # MLflow experiment tracking
├── data/                  # Raw and processed datasets
├── docker-compose.yml
├── requirements.txt
└── README.md

🚀 REST API Endpoints (FastAPI)

| Method | Endpoint | Purpose |
| --- | --- | --- |
| GET | /titles | Fetch all movie titles |
| GET | /recommend | Recommend movies based on title |
| GET | /recommend_user | Personalized recommendations by user ID |
| GET | /top_rated | Fetch top-rated movies |
| GET | /random | Get random movie suggestions |
| GET | /genre/{genre} | Get movies filtered by genre |
| GET | /model/versions | List registered ALS models |
| POST | /model/promote | Promote selected model version |
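A small client sketch for calling these endpoints from Python with the standard library. The base URL is the FastAPI address listed earlier; the query parameter names `title` and `user_id`, the sample values, and the assumption that responses are JSON are guesses, so check http://localhost:8000/docs for the actual request schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # FastAPI service address

def build_url(path, **params):
    """Join the base URL, an endpoint path, and optional query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"

def get_json(path, **params):
    """Send a GET request and decode the JSON response body."""
    with urlopen(build_url(path, **params), timeout=10) as response:
        return json.load(response)

def demo():
    """Example calls; requires the stack to be running via Docker Compose."""
    print(get_json("/titles"))
    # `title` and `user_id` are assumed parameter names, not confirmed ones.
    print(get_json("/recommend", title="Inception"))
    print(get_json("/recommend_user", user_id=1))
```

Call `demo()` once the containers are up; `build_url` can be used on its own to inspect the request that would be sent.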

💡 Streamlit UI Features

🔍 Smart Search

📊 Recommendations by Title

⭐ Top Picks and Genre Filters


👤 Personalized Recommendations (ALS)


🛰️ Airflow: Model Retraining Workflow

  • DAG: retrain_model_dag.py
  • Loads user-item interaction matrix
  • Trains ALS model
  • Logs experiments to MLflow
  • Promotes the best model to production

Trigger the DAG manually:

airflow dags trigger retrain_als_model

📈 MLflow Experiment Tracking

  • Experiment: Movie-Recommender-ALS
  • Logs hyperparameters: factors, regParam, iterations
  • Tracks metric: sample_score
  • Registered Model: ALSRecommender
  • MLflow UI: http://localhost:5000

📊 Model Evaluation Metrics

| Metric | Purpose |
| --- | --- |
| Sample Score | Proxy for ALS quality |
| RMSE | Collaborative filtering regression error |
| Precision@K | Top-K recommendation accuracy |
| Recall@K | Top-K recall on known positives |
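For reference, the three standard metrics can be computed as in the sketch below. These are the textbook definitions; the function names are illustrative, and the project's evaluation scripts may differ in details such as how short recommendation lists are handled. Sample Score is project-specific and is not reproduced here.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-K recommendations that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-K recommendations."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical ranked list and ground-truth set for one user.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
print(precision_at_k(recommended, relevant, 2))  # 1 hit in top 2 -> 0.5
print(recall_at_k(recommended, relevant, 4))     # 2 of 3 relevant items found
```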

🛠️ Tech Stack

| Component | Tools & Libraries |
| --- | --- |
| Backend API | FastAPI |
| Frontend | Streamlit |
| ML Models | Scikit-Learn, Implicit ALS |
| Scheduling | Apache Airflow |
| Tracking | MLflow |
| Containerization | Docker, Docker Compose |
| Visualization | Plotly, NetworkX |

🤝 How to Contribute

We welcome your ideas and contributions!

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Commit with meaningful messages
  4. Open a Pull Request with a clear description

📜 License

Licensed under the MIT License.


👨‍💻 Author

Pavan Yellathakota
Data Scientist | Machine Learning practitioner

📧 pavan.yellathakota.ds@gmail.com
🌐 LinkedIn

"Because picking your next great movie should be effortless."

Made with ❤️ using FastAPI + Streamlit + MLflow + Docker + Airflow
