Data Analyst • Applied ML Builder • AI Systems Explorer
I build practical analytics and machine learning systems across healthcare, NLP, geospatial risk, SaaS intelligence, and agentic data workflows.
Repositories • Current Build • Healthcare ML • NLP Project
I’m a data and analytics professional focused on turning messy real-world data into usable systems, dashboards, and predictive products.
Right now, I’m especially interested in:
- agentic AI for analytics
- applied machine learning with clear business value
- NLP and decision-support systems
- explainable, production-minded data products
An LLM-powered analytics system that separates schema understanding, query planning, SQL generation, and evaluation into modular layers.
Why it matters: most “chat with your database” demos break on schema ambiguity, business metric confusion, and bad joins. This project is designed to reduce those failure modes with explicit grounding and validation.
Focus areas:
- schema-aware prompting
- canonical metric definitions
- tool-using agents
- evaluation-driven iteration
- cost-aware design
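A minimal sketch of the layering idea, not the project's actual code: the schema layer reads table and column names from the database, a canonical metric dictionary fixes one SQL expression per business metric, a template stands in for the LLM generation step, and a validation layer asks SQLite to plan the query before anything runs. All names here (`CANONICAL_METRICS`, `describe_schema`, `generate_sql`, `validate_sql`, the demo tables) are hypothetical and chosen only for illustration.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical canonical metric definitions: one agreed SQL expression per metric.
CANONICAL_METRICS = {"revenue": "SUM(orders.amount)"}


@dataclass
class ValidationResult:
    ok: bool
    error: str = ""


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database (illustrative tables only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "east"), (2, "west")])
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)", [(1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5)]
    )
    return conn


def describe_schema(conn: sqlite3.Connection) -> dict:
    """Schema layer: read table and column names from the database itself."""
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
    ]
    return {
        table: [col[1] for col in conn.execute(f"PRAGMA table_info({table})")]
        for table in tables
    }


def generate_sql(metric: str, group_by: str) -> str:
    """Generation layer stand-in: a fixed template instead of an LLM call."""
    expression = CANONICAL_METRICS[metric]
    return (
        f"SELECT {group_by}, {expression} AS {metric} "
        "FROM orders JOIN customers ON orders.customer_id = customers.id "
        f"GROUP BY {group_by}"
    )


def validate_sql(conn: sqlite3.Connection, sql: str) -> ValidationResult:
    """Validation layer: have SQLite plan the query without executing it."""
    try:
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
    except sqlite3.Error as exc:
        return ValidationResult(False, str(exc))
    return ValidationResult(True)
```

In this sketch a query that references a column missing from the schema is rejected by `validate_sql` before execution, which is the kind of cheap guardrail that can catch hallucinated columns early.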
A modular AI analytics system for grounded SQL generation and layered evaluation.
Highlights
- layer-based architecture for schema understanding, planning, generation, and evals
- grounded data dictionary generation with canonical metrics and gotcha handling
- cost-conscious agent workflow design
Clinical data science pipeline for predicting 30-day hospital readmission using MIMIC-III ICU data.
Highlights
- ICD-9 comorbidity engineering and biomarker feature extraction
- SHAP-based interpretability and fairness audit
- Streamlit dashboard for decision support
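As a rough illustration of what ICD-9 comorbidity engineering can look like, the sketch below turns per-admission diagnosis codes into binary flags by code prefix. The three-category mapping is a deliberately simplified assumption, not the project's feature set and not a full Charlson or Elixhauser definition; `COMORBIDITY_PREFIXES` and `comorbidity_flags` are hypothetical names.

```python
from collections import defaultdict

# Simplified, illustrative mapping from ICD-9 code prefixes to comorbidity flags.
# Codes are matched with the decimal point removed (e.g. "250.00" -> "25000").
COMORBIDITY_PREFIXES = {
    "diabetes": ("250",),
    "heart_failure": ("428",),
    "chronic_kidney_disease": ("585",),
}


def comorbidity_flags(diagnoses):
    """Map (admission_id, icd9_code) pairs to {admission_id: {flag: 0 or 1}}."""
    flags = defaultdict(lambda: {name: 0 for name in COMORBIDITY_PREFIXES})
    for admission_id, code in diagnoses:
        normalized = code.replace(".", "").strip()
        row = flags[admission_id]  # creates an all-zero row on first sight
        for name, prefixes in COMORBIDITY_PREFIXES.items():
            if normalized.startswith(prefixes):
                row[name] = 1
    return dict(flags)


# Example rows; admission 102 has a code that is not in the mapping.
example = comorbidity_flags(
    [(100, "25000"), (100, "4280"), (101, "585.9"), (102, "4019")]
)
```

Admissions with no matching code still get a row of zeros, so the output can be joined onto an admissions table without introducing missing values.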
NLP system for fan sentiment and churn-risk scoring using airline tweets.
Highlights
- HuggingFace RoBERTa + XGBoost + MLflow
- FastAPI serving layer
- explainability-first workflow for classification decisions
Machine learning pipeline for hotspot detection and high-severity crash prediction using 1M US accident records.
Highlights
- binary and multiclass severity modeling
- threshold tuning and cross-validation
- geospatial and temporal feature engineering
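Threshold tuning matters when the positive class (here, high-severity crashes) is rare, because the default 0.5 cutoff is rarely the best operating point. The sketch below picks the cutoff that maximizes F1 on held-out scores. It is a dependency-free illustration with made-up labels and scores, not the project's code; `f1_at_threshold` and `best_threshold` are hypothetical helpers.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1 score when predicting positive for scores >= threshold."""
    preds = [1 if score >= threshold else 0 for score in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores, candidates):
    """Return the candidate threshold with the highest F1."""
    return max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t))


# Made-up validation labels and model scores, for illustration only.
y_val = [0, 0, 0, 1, 0, 1, 1, 1]
val_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]
chosen = best_threshold(y_val, val_scores, [0.25, 0.5, 0.75])
```

In a cross-validated setup, the threshold would be chosen on each validation fold (or on out-of-fold predictions) rather than on the test set, so the reported metric is not biased by the tuning step.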
End-to-end customer intelligence project covering churn prediction, LTV modeling, segmentation, retention analysis, and dashboarding.
Regression-based analysis of psychological and behavioral factors associated with student stress.
Languages & Analytics
Python SQL R Pandas NumPy scikit-learn XGBoost
AI / ML
Transformers Hugging Face SHAP MLflow FastAPI
Data Apps / BI
Streamlit Tableau Power BI
Workflow
Git GitHub Jupyter VS Code
Note: these cards are generated by third-party GitHub README stat services, so they can occasionally lag or fail to load.
- machine learning projects tied to real decision problems
- analytics work with practical business framing
- end-to-end apps, not just notebooks
- experiments in LLM systems for data workflows
- GitHub: @pavankalmanoor
- Email: kalmanoorpavan@gmail.com
Building practical AI and analytics systems that are useful, explainable, and grounded in real data.