Source code for the data analysis and models, including the FewSOC prompting framework for O*NET-SOC classification, used in the paper "Leveraging Large Language Models for Career Mobility Analysis: A Study of Gender, Race, and Job Change Using U.S. Online Resume Profiles."
A Machine Learning project that analyzes how socioeconomic factors influence student performance and absenteeism in the Brazilian ENEM exam (2019–2023), using predictive models to identify educational risk profiles and patterns of inequality.
Exploring the Impact of Crime and Income on Sydney House Prices. This project integrates clustering for structural pattern discovery with predictive modelling, demonstrating the limitations of linear regression and the advantages of tree-based methods in capturing non-linear dynamics.
Thesis Title: A Data-Driven Study of Urban Livability Through Network Modeling - Methodological Analysis of Accessibility, Network Structure, and Socioeconomic Correlation
Hierarchical clustering and cultural index analysis on World Values Survey data (97k+ observations) with MDS, SOM, and socioeconomic correlation study.
An EDA and Unsupervised Learning pipeline using KMeans clustering to segment household demographics into Rural and Urban categories based on transaction behavior and spending patterns.
Analyzed the socio-economic impact of French train stations using SNCF/INSEE data, presenting findings via an interactive Power BI dashboard for a national dataviz challenge.
Data science and Machine Learning pipeline to analyze the socioeconomic impact on ENEM 2023 performance and predict the user's average score. Includes large-scale data processing, EDA, and an interactive simulator.