This is the final project for the University of San Diego’s ADS 509 Applied Text Mining course.
The objective of this project is to develop a drug classification system that combines data collection, data integration, and machine learning to give healthcare professionals, researchers, and patients a platform for obtaining pharmaceutical insights. We do this by extracting detailed information about pharmaceutical drugs across multiple medical categories, including psychiatric and neurological. This data is used to develop a predictive model that categorizes drug-related text and recommends the top drugs in a given category, for anyone seeking information on drug classifications and treatment options.
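As a rough illustration of what "categorizing text data" can look like, the sketch below groups a few short drug descriptions with a bare-bones K-Means over normalized term-frequency vectors. It is not the project's implementation: the function names (`tokenize`, `tf_vectors`, `kmeans`), the example sentences, and the fixed initialization from the first `k` documents are all assumptions made for illustration, and it uses only the Python standard library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tf_vectors(docs):
    """Build L2-normalized term-frequency vectors over a shared vocabulary."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    vectors = []
    for count in counts:
        vec = [float(count[term]) for term in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=20):
    """Plain K-Means; centroids start at the first k vectors (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up example sentences, not taken from the project's dataset.
docs = [
    "sertraline treats depression and anxiety",
    "levetiracetam controls seizures in epilepsy",
    "fluoxetine is prescribed for depression and anxiety",
    "lamotrigine reduces seizures in epilepsy patients",
]

vocab, vectors = tf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)
```

Documents that share vocabulary end up with the same label, which is the basic idea behind grouping drug text by category. A real pipeline would add stop-word removal, TF-IDF weighting, and a more careful centroid initialization.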
- Data Ingestion
- Data Preprocessing
- Exploratory Data Analysis
- Topic Modeling
  - Latent Dirichlet Allocation (LDA)
- Classification Modeling
  - Fuzzy C-Means
  - K-Means
  - DBSCAN
- Python
- Mackenzie Carter
- Vannesa Salazar
- Christine Vu