An AI-powered full-stack movie recommendation system that blends content-based filtering, collaborative filtering, and hybrid techniques. Built with FastAPI, Apache Airflow, MLflow, Streamlit, and Docker.
BingeMax is a movie recommendation engine that produces personalized suggestions using three approaches:
- Content-Based Filtering (TF-IDF + cosine similarity)
- Collaborative Filtering (ALS algorithm on implicit feedback)
- Hybrid Recommendation Model combining both methods
Ideal for:
- Machine Learning Engineers building recommender systems
- Data Scientists practicing MLOps and full-stack deployment
- Developers looking for real-world end-to-end AI/ML pipelines
- Converts movie metadata (title, genres, description) into TF-IDF vectors
- Uses cosine similarity to find and recommend similar movies
$$ \text{similarity}(A, B) = \frac{A \cdot B}{|A| \cdot |B|} $$
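The content-based step can be illustrated with a small, dependency-free sketch. It is not the project's code: the tokenizer, the smoothed IDF weighting, the toy titles, and the helper names (`tfidf_vectors`, `cosine_similarity`, `most_similar`) are all assumptions made for illustration. A real pipeline would more likely use a library vectorizer such as the one in scikit-learn.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """similarity(A, B) = (A . B) / (|A| * |B|) for sparse dict vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(title, titles, vectors, top_n=2):
    """Rank every other title by cosine similarity to the given title."""
    query = vectors[titles.index(title)]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in zip(titles, vectors)
        if other != title
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy metadata (invented): title, genres, and description merged into one string.
titles = ["Space Quest", "Star Voyage", "Love in Paris"]
metadata = [
    "space quest sci-fi adventure astronauts explore a distant planet",
    "star voyage sci-fi adventure a crew travels to a distant galaxy",
    "love in paris romance comedy two strangers meet in paris",
]
vectors = tfidf_vectors(metadata)
print(most_similar("Space Quest", titles, vectors))
```

With this toy data, "Star Voyage" ranks first for "Space Quest" because the two share several terms, while "Love in Paris" shares none and scores zero.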
- Uses Alternating Least Squares (ALS) from the `implicit` library
- Learns user-item latent factors to recommend movies based on past interactions
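For orientation, ALS on implicit feedback is usually described as minimizing a confidence-weighted squared error over user factors $x_u$ and item factors $y_i$. The formulation below is the standard one and is given as an assumption about what the model optimizes; the exact objective and settings used in BingeMax are not stated in this README.

$$ \min_{x, y} \sum_{u, i} c_{ui} \left( p_{ui} - x_u^\top y_i \right)^2 + \lambda \left( \sum_u \|x_u\|^2 + \sum_i \|y_i\|^2 \right) $$

Here $p_{ui}$ is 1 if user $u$ interacted with item $i$ and 0 otherwise, $c_{ui} = 1 + \gamma \, r_{ui}$ is a confidence weight that grows with the interaction strength $r_{ui}$ (the symbol $\gamma$ is chosen here for illustration), and $\lambda$ is the regularization strength. "Alternating" refers to fixing the item factors and solving a least-squares problem for the user factors, then swapping roles and repeating. Items are ranked for a user by the predicted preference $x_u^\top y_i$.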
- Combines both models using a tunable parameter `alpha ∈ [0, 1]`
- Produces recommendations ranked by a weighted scoring system
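One plausible way to read the weighted scoring is a convex combination of the two models' scores. The sketch below assumes that `alpha` weights the collaborative score and `1 - alpha` weights the content score, and that both score sets are min-max normalized first so they are comparable. The README does not confirm either choice, and the function names are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a dict of item -> score into the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal, so no item is preferred over another.
        return {item: 0.0 for item in scores}
    return {item: (score - low) / (high - low) for item, score in scores.items()}


def hybrid_scores(cf_scores, content_scores, alpha=0.5):
    """Blend collaborative and content scores; return items ranked best first.

    Assumption: alpha weights the collaborative model, (1 - alpha) the content model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    cf = min_max_normalize(cf_scores)
    content = min_max_normalize(content_scores)
    blended = {
        # An item missing from one model contributes 0 from that model.
        item: alpha * cf.get(item, 0.0) + (1.0 - alpha) * content.get(item, 0.0)
        for item in set(cf) | set(content)
    }
    # Sort by score descending, then by item name for a stable order.
    return sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))


# Toy scores (invented) for three items.
cf_scores = {"A": 10.0, "B": 5.0, "C": 0.0}
content_scores = {"A": 0.1, "B": 0.9, "C": 0.5}
print(hybrid_scores(cf_scores, content_scores, alpha=0.5))
```

Setting `alpha=1.0` reduces this to pure collaborative ranking and `alpha=0.0` to pure content ranking, which makes the parameter easy to tune and to sanity-check.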
- Docker + Docker Compose
- Python 3.10+ (for development environments)

docker compose down -v --remove-orphans  # Optional cleanup
docker compose up --build

| Service | URL |
|---|---|
| Streamlit App | http://localhost:8501 |
| FastAPI Docs | http://localhost:8000/docs |
| MLflow UI | http://localhost:5000 |
| Airflow UI | http://localhost:8080 (user: airflow / pw: airflow) |
- This project uses publicly available, non-commercial movie data from IMDb Datasets.
- The dataset includes metadata such as titles, genres, user ratings, and more.
BingeMax/
├── src/
│   ├── api/            # FastAPI backend
│   ├── models/         # ALS, TF-IDF, hybrid logic
│   ├── data/           # Matrix builders and loaders
│   ├── evaluation/     # Model evaluation scripts
│   └── utils/          # Helper functions and config
├── streamlit_app/      # Streamlit-based UI frontend
├── airflow/            # Apache Airflow DAGs and configs
├── mlruns/             # MLflow experiment tracking
├── data/               # Raw and processed datasets
├── docker-compose.yml
├── requirements.txt
└── README.md
| Method | Endpoint | Purpose |
|---|---|---|
| GET | `/titles` | Fetch all movie titles |
| GET | `/recommend` | Recommend movies based on title |
| GET | `/recommend_user` | Personalized recommendations by user ID |
| GET | `/top_rated` | Fetch top-rated movies |
| GET | `/random` | Get random movie suggestions |
| GET | `/genre/{genre}` | Get movies filtered by genre |
| GET | `/model/versions` | List registered ALS models |
| POST | `/model/promote` | Promote selected model version |
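The endpoints above can be called from any HTTP client. The following standard-library sketch assumes the API is running at `http://localhost:8000` and that `/recommend` accepts a `title` query parameter. The parameter name is a guess, so check the FastAPI docs page for the actual request schema. `build_url` and `get_json` are hypothetical helpers, not part of the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # FastAPI address when running via Docker Compose


def build_url(path, **params):
    """Join the base URL, an endpoint path, and optional query parameters."""
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path, **params):
    """Send a GET request and decode the JSON response body."""
    with urlopen(build_url(path, **params), timeout=10) as response:
        return json.load(response)
```

With the stack running, `get_json("/titles")` would fetch the title list, and `get_json("/recommend", title="Inception")` would request recommendations, provided the parameter name matches the real API.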
- DAG: `retrain_model_dag.py`
- Loads user-item interaction matrix
- Trains ALS model
- Logs experiments to MLflow
- Promotes the best model to production

airflow dags trigger retrain_als_model

- Experiment: `Movie-Recommender-ALS`
- Logs hyperparameters: `factors`, `regParam`, `iterations`
- Tracks metric: `sample_score`
- Registered Model: `ALSRecommender`
- MLflow UI: http://localhost:5000
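The "promote the best model" step amounts to comparing candidate runs on the tracked metric. The sketch below shows that decision in plain Python, without calling MLflow or Airflow. It assumes a higher `sample_score` is better and that a candidate is promoted only if it beats the current production version. Neither rule is confirmed by this README, and `CandidateRun` and `pick_version_to_promote` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateRun:
    """One trained ALS model version with its logged parameters and metric."""
    version: int
    params: dict
    sample_score: float


def pick_version_to_promote(candidates, production=None):
    """Return the best candidate if it beats production, otherwise None.

    Assumption: a higher sample_score indicates a better model.
    """
    if not candidates:
        return None
    best = max(candidates, key=lambda run: run.sample_score)
    if production is not None and best.sample_score <= production.sample_score:
        return None  # keep the current production model
    return best


# Toy runs (invented values) using the hyperparameter names logged to MLflow.
runs = [
    CandidateRun(1, {"factors": 32, "regParam": 0.1, "iterations": 10}, 0.61),
    CandidateRun(2, {"factors": 64, "regParam": 0.05, "iterations": 15}, 0.68),
]
current = CandidateRun(0, {"factors": 16, "regParam": 0.1, "iterations": 10}, 0.65)
print(pick_version_to_promote(runs, production=current))
```

In the actual pipeline, the candidate list would come from the MLflow experiment and the chosen version would be promoted through the model registry.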
| Metric | Purpose |
|---|---|
| Sample Score | Proxy for ALS quality |
| RMSE | Collaborative filtering regression error |
| Precision@K | Top-K recommendation accuracy |
| Recall@K | Top-K recall on known positives |
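The ranking and error metrics in the table have standard definitions, sketched below for a single user. This is an illustrative implementation and not the project's evaluation script. Precision@K divides by K here, which is one common convention, and the function names are chosen for this example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences of ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


# Toy example: a ranked list for one user and the items they actually liked.
recommended = ["A", "B", "C", "D"]
relevant = {"A", "C", "E"}
print(precision_at_k(recommended, relevant, k=3))  # 2 of the top 3 are relevant
print(recall_at_k(recommended, relevant, k=3))     # 2 of the 3 relevant items are found
```

Averaging these per-user values over all evaluated users gives the dataset-level Precision@K and Recall@K.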
| Component | Tools & Libraries |
|---|---|
| Backend API | FastAPI |
| Frontend | Streamlit |
| ML Models | Scikit-Learn, Implicit ALS |
| Scheduling | Apache Airflow |
| Tracking | MLflow |
| Containerization | Docker, Docker Compose |
| Visualization | Plotly, NetworkX |
We welcome your ideas and contributions!
- Fork the repository
- Create a feature branch: `git checkout -b feature/your-feature`
- Commit with meaningful messages
- Open a Pull Request with a clear description
Licensed under the MIT License.
Pavan Yellathakota
Data Scientist | Machine Learning practitioner
pavan.yellathakota.ds@gmail.com
LinkedIn
"Because picking your next great movie should be effortless."
Made with ❤️ using FastAPI + Streamlit + MLflow + Docker + Airflow





