```python
class AnshikaMishra:
    name = "Anshika Mishra"
    pronouns = "she/her"
    location = "Pune, Maharashtra"
    university = "Savitribai Phule Pune University (SPPU)"
    degree = "B.Sc. Data Science | CGPA: 8.61 / 10"
    email = "anshikamishra.25000@gmail.com"
    experience = [
        "Data Science Intern @ Ernst & Young LLP",
        "🤖 AI Virtual Intern @ Infosys Springboard",
        "🔬 Project Contributor @ Dept. of Tech, SPPU",
    ]
    strengths = [
        "End-to-end ML pipeline design",
        "Time-series forecasting (ARIMA, Prophet)",
        "NLP & LLM integration (FinBERT, Gemini, LLaMA)",
        "Multi-agent AI systems (LangGraph, ReAct)",
        "Reproducible, production-grade Python workflows",
    ]
    achievements = [
        "🏅 Elite Certificate - NPTEL Python for Data Science",
        "Selected - Infosys Springboard Pragati Cohort 3",
        "SheFi Scholar - Cohort 1",
        "🏛️ Workshop @ IIT Bombay - AI for Professionals",
    ]
    motto = "Explore patterns → Build systems → Create impact"
```
- Ernst & Young LLP · Business Consulting & Supply Chain Ops
- Infosys Springboard · Autonomous Cognitive Engine
- Automated processing and validation of 10,000+ records for a Guinness World Record attempt, achieving 99% data accuracy with Python data-cleaning and EDA scripts under a tight deadline.
FinBERT · XGBoost · MLflow · SHAP · Streamlit · SEC EDGAR
Scored 264 earnings transcripts across 40 S&P 500 tickers with a 3-layer weighted NLP pipeline that reached 75% classification accuracy and a 2.17 Signal Sharpe Ratio.
- 3-layer pipeline: 40% hedging detection · 35% FinBERT sentiment · 25% vocabulary signals
- ⚡ XGBoost ensemble (WEIGHTS_V2): +2.83pp accuracy improvement, validated across 10 ablation experiments
- Identified hedging as a critical signal, with an 18.5pp source-quality gap (Motley Fool vs SEC EDGAR)
- Streamlit dashboard with MLflow tracking + SHAP explainability + percentile-based signals
Python · FinBERT · XGBoost · MLflow · SHAP · Streamlit · yfinance · SEC EDGAR
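The 40 / 35 / 25 weighting above can be read as a simple weighted sum. The snippet below is a minimal sketch of that idea, not the project's actual code: the function name, the dictionary keys, and the assumption that every layer score lies in [0, 1] are all hypothetical.

```python
# Layer weights come from the bullet list above; names and score ranges are assumptions.
WEIGHTS = {"hedging": 0.40, "sentiment": 0.35, "vocabulary": 0.25}


def composite_score(layer_scores: dict[str, float]) -> float:
    """Combine per-layer scores (each assumed to be in [0, 1]) into one value."""
    missing = WEIGHTS.keys() - layer_scores.keys()
    if missing:
        raise ValueError(f"missing layer scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * layer_scores[name] for name in WEIGHTS)


# Illustrative input only; these are not real transcript scores.
example = {"hedging": 0.2, "sentiment": 0.8, "vocabulary": 0.5}
print(round(composite_score(example), 3))
```

Keeping the weights in one dictionary makes it easy to swap in a different weight set for ablation runs.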
LangGraph · FastAPI · Groq/LLaMA · Pinecone · Redis · SSE
Production-grade 5-agent LangGraph pipeline (Task Planner → Code Generator → Tester → Debugger → Reviewer) that autonomously plans, generates, tests, and reviews code end-to-end.
- ⚡ Real-time SSE streaming + Pinecone vector cache for code-pattern reuse
- Redis pub/sub message queue with in-memory fallback + retry with exponential backoff (3 attempts)
- Full FastAPI + Swagger UI REST API (5 endpoints incl. SSE stream + session management)
- 🛡️ Code review scoring: security · performance · maintainability
LangGraph · FastAPI · Groq · LLaMA-70B · Pinecone · Redis · SSE · Python
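Retry with exponential backoff, as mentioned above, can be sketched in a few lines of standard-library Python. This is an illustration only: `retry_with_backoff`, `base_delay`, and the injectable `sleep` parameter are hypothetical names, and the 0.5-second starting delay is an assumption; only the 3-attempt limit comes from the description.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,    # matches the "3 attempts" above
    base_delay: float = 0.5,  # assumed starting delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, doubling the wait after each failed attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter lets a test record the delays instead of actually waiting.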
👩‍💼 Recruit IQ: Autonomous Hiring AI Platform
Flask · LangGraph · SQLAlchemy · TF-IDF · Gmail API · Celery
Full-stack autonomous recruiting system with a 4-agent LangGraph architecture: it ingests candidates, scores and tiers them, and sends Gmail-based recruiter loops via OAuth2 automation.
- Weighted NLP scoring: TF-IDF similarity + answer quality + GitHub scoring → FastTrack / Standard / Review / Reject tiers
- Self-learning scoring loop that updates dynamic weights from hiring outcomes
- 🛡️ Anti-cheat layer: AI-answer detection + copy-ring clustering (TF-IDF cosine) + suspicious timing flags
- Flask REST API (12 endpoints) + Celery/Redis async queue + full pytest coverage
Flask · LangGraph · SQLAlchemy · TF-IDF · Gmail API · Celery · Redis · Groq
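The TF-IDF cosine check behind the copy-ring detection can be illustrated with a small standard-library sketch. It only flags pairs of near-identical answers, which would be the input to a clustering step; it is not the project's implementation. The whitespace tokenizer, the smoothed IDF formula, the 0.9 threshold, and all function names are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build sparse TF-IDF vectors using a smoothed IDF (an assumed variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def suspicious_pairs(answers: list[str], threshold: float = 0.9) -> list[tuple[int, int]]:
    """Return index pairs whose answers are at least `threshold` similar."""
    vectors = tfidf_vectors(answers)
    return [
        (i, j)
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
        if cosine(vectors[i], vectors[j]) >= threshold
    ]
```

Calling `suspicious_pairs` on a list of candidate answers returns the index pairs worth a closer look; grouping connected pairs would then yield the rings.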
XGBoost · Random Forest · SHAP · Google Earth Engine · GeoPandas · K-Means
End-to-end geospatial ML platform that predicts Urban Heat Island intensity from satellite and terrain data, deployable for any global city in 5–10 minutes.
- 🛰️ GEE data ingestion: MODIS/Landsat LST, NDVI, NDBI, ESA WorldCover, SRTM DEM
- K-Means UHI zone classification + Getis-Ord Gi* hotspot analysis + SHAP explainability
- 🗺️ CLI outputs: publication-quality infographics + interactive multi-layer HTML map
- 🎯 XGBoost predicts Land Surface Temperature with Heat Risk Scores (Low / Medium / High)
Python · XGBoost · Random Forest · SHAP · Google Earth Engine · GeoPandas · K-Means
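One way to turn predicted Land Surface Temperatures into Low / Medium / High labels is to bucket them by tercile. The README does not state the actual scoring rule, so treat this as a stand-in: the function name, the Celsius unit, and the tercile cut-offs are assumptions.

```python
from statistics import quantiles


def heat_risk_labels(predicted_lst_c: list[float]) -> list[str]:
    """Label each predicted temperature Low / Medium / High by tercile.

    Needs at least two values; statistics.quantiles raises otherwise.
    """
    low_cut, high_cut = quantiles(predicted_lst_c, n=3)
    labels = []
    for temp in predicted_lst_c:
        if temp <= low_cut:
            labels.append("Low")
        elif temp <= high_cut:
            labels.append("Medium")
        else:
            labels.append("High")
    return labels


# Illustrative temperatures only, not model output.
print(heat_risk_labels([28.0, 29.0, 30.0, 34.0, 35.0, 36.0, 40.0, 41.0, 42.0]))
```

Tercile cut-offs are relative to each city, so the same absolute temperature can land in different buckets for different cities.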
ARIMA · Prophet · Random Forest · Scikit-learn · Streamlit · Plotly
Dual-model ML health platform: ARIMA + Prophet forecasting (2.3-day MAE) + Random Forest PCOS classifier (87.3% accuracy, 0.91 AUC-ROC).
- 🩺 15+ clinical features with cross-validation, stratified sampling, automated imputation
- 5-module Streamlit platform: period logging · water tracking · risk assessment · analytics · nutrition insights
- Interactive Plotly dashboards + CSV export
ARIMA · Prophet · Random Forest · Scikit-learn · Streamlit · Plotly
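For readers unfamiliar with the metric, the "2.3-day MAE" above is a mean absolute error measured in days. A minimal sketch of the calculation follows; the function name and the sample numbers are illustrative and are not taken from the project's data.

```python
def mean_absolute_error(actual: list[float], predicted: list[float]) -> float:
    """Average of the absolute differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up cycle lengths in days, for illustration only.
actual_days = [28, 30, 27, 31]
forecast_days = [29, 28, 30, 30]
print(mean_absolute_error(actual_days, forecast_days))
```

Because the error is in the same unit as the target, an MAE of 2.3 means forecasts were off by about 2.3 days on average.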
Pandas · NumPy · Scikit-learn · Pearson Correlation · Papermill
Born at EY, open-sourced for the community. Automated price segmentation across 157 retail segments × 3 channels, replacing 2+ weeks of manual Excel work with a sub-15-minute pipeline.
Python · Pandas · Pearson · Clustering · Papermill · Excel · Scikit-learn
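The stack above lists Pearson correlation. How the pipeline applies it is not described here, so the following is only a sketch of the statistic itself on a hypothetical price-versus-units series; the function name and the sample values are assumptions.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical segment: units sold fall linearly as price rises.
prices = [10.0, 12.0, 14.0, 16.0]
units = [100.0, 90.0, 80.0, 70.0]
print(pearson_r(prices, units))
```

A coefficient near -1 or +1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.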
Languages & Databases
ML / DL / Forecasting
NLP & LLMs
Data & Visualization
APIs, Deployment & DevOps
Tools
| | Detail |
|---|---|
| 🥇 | Elite Certificate: Python for Data Science, NPTEL IIT Madras (2025) |
| 🏛️ | Workshop: Fundamentals of AI for Entrepreneurs & Professionals, IIT Bombay (2025) |
| 🎯 | Selected: Infosys Springboard Pragati Cohort 3 |
| | SheFi Scholar: Cohort 1 |
| | What is Data Science? (IBM / Coursera) |
| | Foundations: Data, Data, Everywhere (Google Data Analytics Certificate) |
| | German Language (A1), Dept. of Foreign Languages, SPPU (2024–25) |
| Status | Area | Details |
|---|---|---|
| ✅ | Core ML & Feature Engineering | Scikit-learn, XGBoost, cross-validation, EDA |
| ✅ | Time-Series Forecasting | ARIMA, Prophet, walk-forward backtesting |
| ✅ | NLP & Transformers | FinBERT, TF-IDF, sentiment analysis, NER |
| ✅ | Multi-Agent AI Systems | LangGraph, ReAct, LangSmith, Groq/LLaMA |
| ✅ | REST APIs & Dashboards | FastAPI, Flask, Streamlit, Plotly |
| ✅ | Experiment Tracking | MLflow, SHAP explainability |
| | Geospatial ML | GeoPandas, Google Earth Engine, spatial analysis |
| | LLM Fine-tuning & RAG | Hugging Face, vector DBs, Pinecone |
| | MLOps & Deployment | Docker, CI/CD, model serving at scale |
| | Deep Learning | LSTM, CNN, PyTorch workflows |
| 🎯 | AI Research Contributions | Open-source, papers, production AI systems |
