10anshika/README.md


🧠 About Me

class AnshikaMishra:

    name        = "Anshika Mishra"
    pronouns    = "she/her"
    location    = "Pune, Maharashtra"
    university  = "Savitribai Phule Pune University (SPPU)"
    degree      = "B.Sc. Data Science  |  CGPA: 8.61 / 10"
    email       = "anshikamishra.25000@gmail.com"

    experience  = [
        "📊 Data Science Intern @ Ernst & Young LLP",
        "🤖 AI Virtual Intern @ Infosys Springboard",
        "🔬 Project Contributor @ Dept. of Tech, SPPU",
    ]

    strengths   = [
        "End-to-end ML pipeline design",
        "Time-series forecasting (ARIMA, Prophet)",
        "NLP & LLM integration (FinBERT, Gemini, LLaMA)",
        "Multi-agent AI systems (LangGraph, ReAct)",
        "Reproducible, production-grade Python workflows",
    ]

    achievements = [
        "🏅 Elite Certificate: NPTEL Python for Data Science",
        "🎓 Selected: Infosys Springboard Pragati Cohort 3",
        "💎 SheFi Scholar: Cohort 1",
        "🏛️ Workshop @ IIT Bombay: AI for Professionals",
    ]

    motto = "Explore patterns → Build systems → Create impact"


💼 Professional Experience

📊 Data Science Intern

Ernst & Young LLP · Business Consulting & Supply Chain Ops · Dec 2025 – Mar 2026 · EY Asterisk, Pune

  • Built the Archetype Engine (157 market segments × 3 retail channels); replaced 2+ weeks of manual analyst work with a sub-15-min reproducible pipeline at 85–95% accuracy
  • Designed Pearson-correlation clustering on time-series trend vectors for automated price archetype discovery (greedy Auto-K, 70% dissimilarity threshold)
  • Delivered demand forecasting pipelines + analyst-ready Excel pivot reports with YoY comparisons via Papermill 10-notebook orchestration
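
A minimal sketch of how greedy Pearson-correlation clustering with an automatically chosen K could work. It assumes dissimilarity is defined as `1 - r` and that a series opens a new cluster when it is at least 0.70 dissimilar from every existing seed; both are interpretations of the bullet above, and the function names are illustrative rather than taken from the actual engine.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def greedy_auto_k(trends, threshold=0.70):
    """Assign each trend vector to the nearest seed, or open a new cluster.

    `threshold` is the assumed dissimilarity cut-off (1 - Pearson r).
    Returns (labels, k), where k is the number of clusters discovered.
    """
    seeds = []   # index of the first series in each cluster
    labels = []
    for i, series in enumerate(trends):
        dists = [1.0 - pearson(series, trends[s]) for s in seeds]
        if dists and min(dists) < threshold:
            labels.append(dists.index(min(dists)))  # join closest cluster
        else:
            seeds.append(i)                         # start a new cluster
            labels.append(len(seeds) - 1)
    return labels, len(seeds)


trends = [
    [1, 2, 3, 4, 5],    # rising
    [2, 4, 6, 8, 11],   # rising
    [5, 4, 3, 2, 1],    # falling
    [10, 8, 7, 4, 2],   # falling
]
labels, k = greedy_auto_k(trends)
```

Because K emerges from the threshold rather than being fixed up front, the result depends on the order in which series are visited; a production version would likely need a deterministic ordering.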

Python Pandas NumPy Scikit-learn Papermill Excel

🤖 AI Virtual Intern

Infosys Springboard · Autonomous Cognitive Engine · Dec 2025 – Feb 2026

  • Built a multi-agent autonomous framework using LangGraph (Python 3.11) with a ReAct reasoning loop, dynamic TODO-based task planner, and virtual file system for long-horizon tasks
  • Implemented supervisor–subagent delegation (Summarisation, Web Search agents) with full multi-step run tracing via LangSmith

LangGraph LangSmith Python 3.11 ReAct Multi-Agent

🔬 Project Contributor · Dept. of Technology, SPPU · Dec 2024

Automated processing and validation of 10,000+ records for a Guinness World Record attempt, achieving 99% data accuracy with Python data-cleaning and EDA scripts under a tight deadline.


🚀 Featured Projects

FinBERT · XGBoost · MLflow · SHAP · Streamlit · SEC EDGAR

Scored 264 earnings transcripts across 40 S&P 500 tickers with a 3-layer weighted NLP pipeline that achieved 75% classification accuracy and a 2.17 Signal Sharpe Ratio.

  • 🧠 3-layer pipeline: 40% hedging detection · 35% FinBERT sentiment · 25% vocabulary signals
  • ⚡ XGBoost ensemble (WEIGHTS_V2) → +2.83pp accuracy improvement; validated across 10 ablation experiments
  • 🔍 Identified hedging as critical signal with 18.5pp source-quality gap (Motley Fool vs SEC EDGAR)
  • 📊 Streamlit dashboard with MLflow tracking + SHAP explainability + percentile-based signals
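
As a rough illustration of the 40/35/25 weighting above, the three layer scores can be blended into one signal. This sketch assumes each layer already produces a score on a common scale (for example 0 to 1); the dictionary keys and `composite_score` are hypothetical names, not the project's actual API.

```python
# Layer weights as stated in the pipeline description.
WEIGHTS = {"hedging": 0.40, "sentiment": 0.35, "vocabulary": 0.25}


def composite_score(layers):
    """Weighted sum of per-layer scores (assumed to share one scale)."""
    missing = set(WEIGHTS) - set(layers)
    if missing:
        raise ValueError(f"missing layer scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * layers[name] for name in WEIGHTS)


# Example transcript: heavy hedging, mildly positive sentiment.
signal = composite_score({"hedging": 0.9, "sentiment": 0.6, "vocabulary": 0.2})
```

Keeping the weights in one mapping makes ablation runs straightforward: each experiment swaps in a different weight set and re-scores the same transcripts.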

Python FinBERT XGBoost MLflow SHAP Streamlit yfinance SEC EDGAR


LangGraph · FastAPI · Groq/LLaMA · Pinecone · Redis · SSE

Production-grade 5-agent LangGraph pipeline: Task Planner → Code Generator → Tester → Debugger → Reviewer. It autonomously plans, generates, tests, and reviews code end-to-end.

  • ⚡ Real-time SSE streaming + Pinecone vector cache for code-pattern reuse
  • 🔄 Redis pub/sub message queue with in-memory fallback + retry with exponential backoff (3 attempts)
  • 🌐 Full FastAPI + Swagger UI REST API (5 endpoints incl. SSE stream + session management)
  • 🛡️ Code review scoring: security · performance · maintainability
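
The retry behaviour mentioned above (exponential backoff, 3 attempts) might look roughly like the sketch below. The base delay of 0.5 s and the helper name `retry_with_backoff` are assumptions for illustration; the `sleep` parameter is injectable only so the delays can be observed without waiting.

```python
import time


def retry_with_backoff(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `func` up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise                            # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)     # 0.5 s, then 1.0 s, ...


# Example: an operation that fails twice before succeeding.
calls = {"n": 0}


def flaky_publish():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("queue unavailable")
    return "published"


delays = []
result = retry_with_backoff(flaky_publish, sleep=delays.append)
```

In a setup with an in-memory fallback, the final `raise` would be the point where the caller switches from Redis to the local queue.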

LangGraph FastAPI Groq LLaMA-70B Pinecone Redis SSE Python


Flask · LangGraph · SQLAlchemy · TF-IDF · Gmail API · Celery

Full-stack autonomous recruiting system with a 4-agent LangGraph architecture: it ingests candidates, scores and tiers them, and runs Gmail-based recruiter email loops via OAuth2 automation.

  • 📊 Weighted NLP scoring: TF-IDF similarity + answer quality + GitHub scoring → FastTrack / Standard / Review / Reject tiers
  • 🔁 Self-learning scoring loop that updates dynamic weights from hiring outcomes
  • 🛡️ Anti-cheat layer: AI-answer detection + copy-ring clustering (TF-IDF cosine) + suspicious timing flags
  • 🔧 Flask REST API (12 endpoints) + Celery/Redis async queue + full pytest coverage
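
To make the copy-ring idea concrete, here is a dependency-free sketch of TF-IDF cosine similarity between candidate answers. The whitespace tokenizer, the smoothed IDF formula, the 0.8 threshold, and the function names are all assumptions chosen for illustration; the real system may use a library vectorizer and different cut-offs.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) using a smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def suspicious_pairs(answers, threshold=0.8):
    """Return index pairs whose answers are near-duplicates."""
    vectors = tfidf_vectors(answers)
    return [
        (i, j)
        for i, j in combinations(range(len(answers)), 2)
        if cosine(vectors[i], vectors[j]) >= threshold
    ]


answers = [
    "I built a REST API with Flask and Celery",
    "I built a REST API with Flask and Celery",
    "My favourite hobby is hiking in the mountains",
]
pairs = suspicious_pairs(answers)
```

Grouping the flagged pairs into connected components would then yield the "rings" of candidates who share an answer.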

Flask LangGraph SQLAlchemy TF-IDF Gmail API Celery Redis Groq


XGBoost · Random Forest · SHAP · Google Earth Engine · GeoPandas · K-Means

End-to-end geospatial ML platform predicting Urban Heat Island intensity using satellite + terrain data, deployable for any global city in 5–10 minutes.

  • 🛰️ GEE data ingestion: MODIS/Landsat LST, NDVI, NDBI, ESA WorldCover, SRTM DEM
  • 📍 K-Means UHI zone classification + Getis-Ord Gi* hotspot analysis + SHAP explainability
  • 🗺️ CLI outputs: publication-quality infographics + interactive multi-layer HTML map
  • 🎯 XGBoost predicts Land Surface Temperature with Heat Risk Scores (Low / Medium / High)
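
For readers unfamiliar with the spectral indices listed above, NDVI and NDBI are both normalized differences of two band reflectances. The sketch below shows the standard formulas on scalar values; the function names are illustrative, and the platform itself presumably computes these per pixel inside Google Earth Engine.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when both bands are zero."""
    total = a + b
    if total == 0:
        return 0.0
    return (a - b) / total


def ndvi(nir, red):
    """Vegetation index: high when near-infrared reflectance exceeds red."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Built-up index: high when shortwave-infrared exceeds near-infrared."""
    return normalized_difference(swir, nir)


# Example reflectances for a vegetated pixel (made-up values).
vegetation = ndvi(nir=0.5, red=0.1)
built_up = ndbi(swir=0.3, nir=0.5)
```

Dense vegetation pushes NDVI toward 1 and NDBI below 0, which is why the two indices are useful opposing features for a land surface temperature model.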

Python XGBoost Random Forest SHAP Google Earth Engine GeoPandas K-Means


ARIMA · Prophet · Random Forest · Scikit-learn · Streamlit · Plotly

Dual-model ML health platform: ARIMA + Prophet forecasting (2.3-day MAE) + Random Forest PCOS classifier (87.3% accuracy, 0.91 AUC-ROC).

  • 🩺 15+ clinical features with cross-validation, stratified sampling, automated imputation
  • 📊 5-module Streamlit platform: period logging · water tracking · risk assessment · analytics · nutrition insights
  • 📈 Interactive Plotly dashboards + CSV export
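
The MAE figure quoted above is a mean absolute error, here measured in days. A minimal sketch of the metric follows; the sample numbers are invented for illustration and are not the project's data.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and forecast values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Example: observed vs forecast cycle lengths in days (made-up values).
observed = [28, 30, 27]
forecast = [29, 30, 25]
error_days = mean_absolute_error(observed, forecast)
```

An MAE in days reads directly as "the forecast is off by this many days on average", which makes it easier to communicate than squared-error metrics.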

ARIMA Prophet Random Forest Scikit-learn Streamlit Plotly


Pandas · NumPy · Scikit-learn · Pearson Correlation · Papermill

Born at EY, open-sourced for the community. Automates price segmentation across 157 retail segments × 3 channels, replacing 2+ weeks of manual Excel work with a sub-15-minute pipeline.

Python Pandas Pearson Clustering Papermill Excel Scikit-learn


🛠️ Full Tech Stack

Languages & Databases

Python SQL C

ML / DL / Forecasting

Scikit-Learn XGBoost TensorFlow HuggingFace Prophet

NLP & LLMs

LangChain Gemini OpenAI FinBERT

Data & Visualization

Pandas NumPy Plotly Streamlit MLflow SHAP

APIs, Deployment & DevOps

FastAPI Flask Redis Pinecone GitHub Actions Docker

Tools

Jupyter VS Code Git Colab


🏅 Certifications & Achievements

| 🏆 | Detail |
|---|---|
| 🥇 | Elite Certificate: Python for Data Science, NPTEL IIT Madras (2025) |
| 🏛️ | Workshop: Fundamentals of AI for Entrepreneurs & Professionals, IIT Bombay (2025) |
| 🎯 | Selected: Infosys Springboard Pragati Cohort 3 |
| 💎 | SheFi Scholar: Cohort 1 |
| 📜 | What is Data Science? (IBM / Coursera) |
| 📊 | Foundations: Data, Data, Everywhere (Google Data Analytics Certificate) |
| 🌍 | German Language (A1), Dept. of Foreign Languages, SPPU (2024–25) |

📊 GitHub Analytics

GitHub Streak


🏆 GitHub Trophies

trophy


📈 Contribution Activity

Activity Graph


🐍 Contribution Snake

contribution snake

🗺️ 2025–26 Learning Roadmap

| Status | Area | Details |
|---|---|---|
| ✅ | Core ML & Feature Engineering | Scikit-learn, XGBoost, cross-validation, EDA |
| ✅ | Time-Series Forecasting | ARIMA, Prophet, walk-forward backtesting |
| ✅ | NLP & Transformers | FinBERT, TF-IDF, sentiment analysis, NER |
| ✅ | Multi-Agent AI Systems | LangGraph, ReAct, LangSmith, Groq/LLaMA |
| ✅ | REST APIs & Dashboards | FastAPI, Flask, Streamlit, Plotly |
| ✅ | Experiment Tracking | MLflow, SHAP explainability |
| 🔄 | Geospatial ML | GeoPandas, Google Earth Engine, spatial analysis |
| 🔄 | LLM Fine-tuning & RAG | Hugging Face, vector DBs, Pinecone |
| 📅 | MLOps & Deployment | Docker, CI/CD, model serving at scale |
| 📅 | Deep Learning | LSTM, CNN, PyTorch workflows |
| 🎯 | AI Research Contributions | Open-source, papers, production AI systems |

💬 Connect With Me

GitHub LinkedIn Email Kaggle


"Exploring patterns in data and building intelligent systems that create real-world impact."

– Anshika Mishra


⭐ If any of my projects helped or inspired you, a star goes a long way! ⭐

Pinned

  1. AdVary-AI-Creative-Automation-Pipeline (Public)

    Python

  2. AI-Powered-Urban-Heat-Intelligence-and-Forecasting-Platform (Public)

    An autonomous recruiting intelligence system that ingests candidates, scores applications, detects suspicious responses, manages recruiter email loops, and learns from hiring outcomes.

    Python

  3. archetype-automation-engine (Public)

    Jupyter Notebook

  4. EarningsEcho (Public)

    🚨 They said "cautiously optimistic" 📉 The stock dropped 8%. Coincidence? We think not. 🎯 NLP pipeline decoding executive hedging, sentiment & doublespeak across 264 real earnings calls

    Python