Mental Health Predictor

Background

Defining Mental Illness
- Disturbance of thought, experience, and emotion that causes functional impairment.
- Interpersonal difficulty, limited ability to work, and self-destructive behavior.
In the US, nearly 1 in 5 adults aged 18 years or older (18.5%) have experienced mental illness (US Burden of Disease Collaborators, 2013).
Data science can help us better understand and more effectively implement treatments for mental health problems.
Contributors to mental illness include anxiety, depression, and biological, psychological, and sociological (environmental) factors.

Project Focus

Our analysis project focuses on identifying factors associated with the prevalence of poor mental health in the US.
- Factors considered: population density, income, and water and land features.
We built a machine learning model that can predict mental health risk for an individual based on designated factors.
Research question: What are the most and least significant factors (features) in predicting the prevalence of poor mental health in the US?

Description of Data Sources

500 Cities: Local Data for Better Health, 2019. 500 Cities: Mental health not good for >=14 days among adults aged >=18 years. Centers for Disease Control and Prevention (CDC), Division of Population Health, Epidemiology and Surveillance Branch.
- Behavioral Risk Factor Surveillance System (BRFSS) data (2017, 2016)
  - Mental Health Severity: Respondents aged >=18 years who report that their mental health was not good on 14 or more of the past 30 days.
US Household Income Statistics. Golden Oak Research Group LLC, “U.S. Income Database Kaggle”. Publication: 5, August 2017.
United States Cities Database. SimpleMaps.com, Pareto Software LLC, compiled from U.S. Geological Survey and U.S. Census Bureau data.

Tools/Resources

Creating ERD
- QuickDBD
Creating Database
- SQLite
Analyzing Data
- Pandas
Connecting to Database
Machine Learning
- Imbalanced-learn
- scikit-learn
- TensorFlow
Dashboard
- Tableau
- Flask
- HTML/CSS

Machine Learning Model

We used a binary outcome based on each city's prevalence (%) of poor mental health.
The binary outcome was calculated by median split:
- The median prevalence of poor mental health across the 500 cities was 13.89%, so:
  - If a city < 13.89% poor mental health → “Good Mental Health”
  - If a city >= 13.89% poor mental health → “Bad Mental Health”
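The median split described above can be sketched with pandas as follows. This is a minimal illustration, not the project's notebook code: the DataFrame, the column names `pct_poor_mental_health` and `bad_mental_health`, and the five toy values are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical city-level data; names and values are illustrative only.
cities = pd.DataFrame({
    "city": ["A", "B", "C", "D", "E"],
    "pct_poor_mental_health": [10.2, 12.5, 13.89, 15.1, 17.4],
})

# Median prevalence across all cities in the toy table.
median_pct = cities["pct_poor_mental_health"].median()

# 1 = "Bad Mental Health" (at or above the median),
# 0 = "Good Mental Health" (below the median).
cities["bad_mental_health"] = (
    cities["pct_poor_mental_health"] >= median_pct
).astype(int)

print(cities)
```

A median split tends to give roughly balanced classes, which keeps plain accuracy a reasonable first metric for comparing classifiers.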
Features were log-transformed to reduce skew and bring them closer to a normal distribution, then scaled.
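One way to apply such a transformation is sketched below. The exact transform used in the project is not stated, so `np.log1p` (log of 1 + x) and scikit-learn's `StandardScaler` are assumptions here, and the feature matrix is randomly generated rather than taken from the project data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical right-skewed features (stand-ins for e.g. density or income).
X = rng.lognormal(mean=3.0, sigma=1.0, size=(200, 2))

# Log transform compresses the long right tail; log1p also tolerates zeros.
X_log = np.log1p(X)

# Standardize each column to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X_log)

print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

Note that scaling alone does not change the shape of a distribution; it is the log step that reduces skew, while standardization puts the features on a common scale.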
We tried logistic regression, support vector machines (1-3 kernels), decision trees, gradient-boosted trees (learning rates 0.05 to 1), random forests, and 1- to 2-layer deep learning networks.
We used 10-fold cross-validation: the data were split into 10 folds, and each model was trained on 9 folds (90% of the data) and tested on the remaining fold (10%), rotating until every fold had served as the test set once. We averaged the performance across the 10 runs.
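The comparison could look roughly like the sketch below, which runs 10-fold cross-validation over a few of the model families named above with scikit-learn. The data come from `make_classification` as a synthetic stand-in for the merged city dataset, and the hyperparameters (for example `n_estimators=50`) are illustrative assumptions, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the city-level features and binary outcome.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Illustrative model settings (assumptions, not the project's values).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm_rbf": SVC(kernel="rbf"),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(
        learning_rate=0.05, n_estimators=50, random_state=0
    ),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# 10 folds: each fold is the test set once, the other 9 form the training set.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

mean_accuracy = {}
for name, model in models.items():
    # Scaling inside the pipeline is refit on each training split only.
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy over {len(scores)} folds = {scores.mean():.3f}")
```

Putting the scaler inside the pipeline keeps test-fold statistics out of the training step, which avoids a subtle form of data leakage.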

Summary of Findings

We tried multiple machine learning models, and most of them achieved about 80% accuracy in predicting mental health risk.
In the random forest model, the strongest feature for predicting poor mental health was Standard Deviation of Income.
This suggests that income inequality within a city was the strongest predictor of the prevalence of poor mental health.
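A feature ranking of this kind can be read from a fitted random forest via its `feature_importances_` attribute. The sketch below uses synthetic data, and the feature names are hypothetical labels attached for readability, so the ordering it prints says nothing about the real dataset.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; the data below are synthetic.
feature_names = [
    "income_mean", "income_stdev", "population_density",
    "land_area", "water_area",
]
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0,
    random_state=0,
)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort from strongest to weakest.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)

print(importances)
```

Impurity-based importances can favor high-cardinality or correlated features, so a check with permutation importance (`sklearn.inspection.permutation_importance`) may be worthwhile before drawing conclusions.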

Limitations

Mental health data
- Small sample
- Limited availability
- Subjective self-rating
Differences in time frame of the datasets

Recommendations for Further Analysis

Exploring data related to:
- Impact of COVID-19
- Climate
- Affordable healthcare
- Availability of mental healthcare providers
- Data from wearable devices can be used to identify physiological markers associated with mental illness
- Social media and internet activity can provide insight into a behavioral side of mental health

Presentation Slides

The project presentation is available as a Google Slides presentation, here.

Dashboards

Tableau Dashboard
- Data Exploration Visuals and Machine Learning Summary
  - https://nhafer88.github.io/Mental_Health_Predictor/
