Skip to content

mlnjsh/ask-my-ai-clone

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

2 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐Ÿค– Ask My AI Clone

Chat with an AI trained on 15+ research publications

Live Demo Papers Powered By License


Ever wanted to ask a researcher about their work without scheduling a meeting?

This is an AI clone of Prof. Milan Amrut Joshi โ€” trained on all 15+ published research papers, book chapters, and conference publications. Ask it anything about Topological Data Analysis, Multi-Objective Optimization, Machine Learning, or Functional Analysis.

Try It Live ยท How It Works ยท Add Your Papers ยท API


๐ŸŽฏ What Can You Ask?

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    ASK ME ANYTHING ABOUT                         โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿ“ TDA         โ”‚  โš™๏ธ Optimization  โ”‚  ๐Ÿค– Machine Learning     โ”‚
โ”‚                 โ”‚                   โ”‚                           โ”‚
โ”‚  โ€ข Persistent   โ”‚  โ€ข NSGA-II vs     โ”‚  โ€ข SVM with noisy data   โ”‚
โ”‚    homology     โ”‚    MOALO vs MODA  โ”‚  โ€ข PSO hyperparameter    โ”‚
โ”‚  โ€ข Betti        โ”‚  โ€ข Response       โ”‚    tuning                โ”‚
โ”‚    numbers      โ”‚    surface        โ”‚  โ€ข Ensemble methods      โ”‚
โ”‚  โ€ข Simplicial   โ”‚    modeling       โ”‚  โ€ข Feature extraction    โ”‚
โ”‚    complexes    โ”‚  โ€ข Pareto         โ”‚  โ€ข Breast cancer         โ”‚
โ”‚  โ€ข Feature      โ”‚    optimal        โ”‚    classification        โ”‚
โ”‚    extraction   โ”‚    solutions      โ”‚  โ€ข Human activity        โ”‚
โ”‚  โ€ข Healthcare   โ”‚  โ€ข Composite      โ”‚    recognition           โ”‚
โ”‚    big data     โ”‚    plate design   โ”‚  โ€ข Metamodeling          โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿงฎ Mathematics โ”‚  ๐Ÿ”ฌ Engineering   โ”‚  ๐Ÿ“š General Research     โ”‚
โ”‚                 โ”‚                   โ”‚                           โ”‚
โ”‚  โ€ข Almost       โ”‚  โ€ข Micro-         โ”‚  โ€ข "Explain your best    โ”‚
โ”‚    periodic     โ”‚    machining      โ”‚    paper award"          โ”‚
โ”‚    functions    โ”‚  โ€ข Turbulent      โ”‚  โ€ข "What is TDA?"        โ”‚
โ”‚  โ€ข Banach       โ”‚    flows          โ”‚  โ€ข "Compare DTM vs DJM"  โ”‚
โ”‚    algebras     โ”‚  โ€ข Fatigue        โ”‚  โ€ข "Summarize your       โ”‚
โ”‚  โ€ข Fixed point  โ”‚    strength       โ”‚    Springer chapters"    โ”‚
โ”‚    theorems     โ”‚  โ€ข Plasma PDEs    โ”‚  โ€ข "Research timeline?"  โ”‚
โ”‚  โ€ข Compact      โ”‚  โ€ข Steel          โ”‚  โ€ข "What should I cite?" โ”‚
โ”‚    operators    โ”‚    prediction     โ”‚                           โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

๐Ÿ’ฌ Example Conversations

๐Ÿ”ฌ "What is persistent homology and how did you use it?"

AI Clone: Persistent homology is a central tool in Topological Data Analysis that captures the "shape" of data across multiple scales. In my work, I've applied it in several ways:

  1. Feature Extraction (ICIVC-2021, Best Paper Award): I developed a novel framework that uses persistent homology to generate topological features โ€” like persistence diagrams and Betti curves โ€” as input features for downstream ML models. These topological features capture shape-based patterns that traditional statistical features miss.

  2. Healthcare Big Data (IJAER, 2019): I surveyed how TDA methods including persistent homology can extract clinically meaningful patterns from high-dimensional medical datasets.

  3. Fatigue Strength Prediction (IOP, 2020): I showed that topologically-informed features outperform standard statistical methods when predicting steel fatigue strength.

๐Ÿ“„ Key Papers: #5, #6, #8, #14, #15 in my publication list

โš™๏ธ "Which optimization algorithm should I use for multi-objective problems?"

AI Clone: Based on my comparative study in Materials (2021), here's my guidance:

  • NSGA-II: Best overall performer โ€” robust, well-documented, excellent for 2-3 objective problems. Use this as your default choice.
  • MOALO: Good for problems with complex, non-convex Pareto fronts. The ant lion metaphor provides unique exploration dynamics.
  • MODA: Effective for certain problem structures but less consistent across different benchmark functions.

For engineering optimization specifically (like composite plate design), combining response surface modeling with multi-objective optimization (as we showed in Steel & Composite Structures, 2019) can dramatically reduce computational cost while maintaining solution quality.

๐Ÿ“„ Key Papers: #1, #2

๐Ÿงฎ "Explain your work on almost periodic functions"

AI Clone: My earliest work (2010-2011) was in pure functional analysis:

In my 2010 paper with Prof. Padhye, we proved that several seemingly different definitions of almost periodic functions on locally compact groups are actually equivalent. This is important because different authors use different definitions, and our work showed they all describe the same mathematical object.

In 2011, I extended this to characterize collectively compact sets of operators on the Banach algebra of almost periodic functions. This bridges operator theory with harmonic analysis.

These pure mathematical foundations later informed my applied work โ€” the topological thinking from functional analysis directly influenced my approach to TDA and persistent homology.

๐Ÿ“„ Key Papers: #9, #10


๐Ÿ—๏ธ Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    ASK MY AI CLONE                            โ”‚
โ”‚                                                              โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚  โ”‚  User     โ”‚โ”€โ”€โ”€>โ”‚  Streamlit   โ”‚โ”€โ”€โ”€>โ”‚  Query Engine    โ”‚   โ”‚
โ”‚  โ”‚  Question โ”‚    โ”‚  Frontend    โ”‚    โ”‚                  โ”‚   โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ”‚  1. Embed query  โ”‚   โ”‚
โ”‚                                      โ”‚  2. Vector search โ”‚   โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”‚  3. Retrieve top  โ”‚   โ”‚
โ”‚  โ”‚  Knowledge Base              โ”‚    โ”‚     chunks        โ”‚   โ”‚
โ”‚  โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚    โ”‚  4. Build prompt  โ”‚   โ”‚
โ”‚  โ”‚  โ”‚ 15 Research Papers     โ”‚  โ”‚<โ”€โ”€โ”€โ”‚  5. Call GPT-4o   โ”‚   โ”‚
โ”‚  โ”‚  โ”‚ (PDF โ†’ Chunks โ†’ Embed) โ”‚  โ”‚    โ”‚  6. Stream answer โ”‚   โ”‚
โ”‚  โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ”‚  โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚                            โ”‚
โ”‚  โ”‚  โ”‚ ChromaDB / FAISS       โ”‚  โ”‚    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚  โ”‚  โ”‚ Vector Store           โ”‚  โ”‚    โ”‚  Response with    โ”‚   โ”‚
โ”‚  โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚    โ”‚  ๐Ÿ“„ Citations     โ”‚   โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ”‚  ๐Ÿ“Š Confidence    โ”‚   โ”‚
โ”‚                                      โ”‚  ๐Ÿ”— Paper Links   โ”‚   โ”‚
โ”‚                                      โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Tech Stack

Component Technology
Frontend Streamlit
Embeddings OpenAI text-embedding-3-small
Vector Store ChromaDB (local) / FAISS
LLM GPT-4o via OpenAI API
PDF Processing PyMuPDF + LangChain
Framework LangChain + LlamaIndex
Hosting Hugging Face Spaces / Streamlit Cloud

๐Ÿš€ Quick Start

1. Clone & Install

git clone https://github.com/mlnjsh/ask-my-ai-clone.git
cd ask-my-ai-clone
pip install -r requirements.txt

2. Add Your OpenAI Key

cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

3. Index the Papers

python scripts/index_papers.py

4. Run the Chatbot

streamlit run app/main.py

Visit http://localhost:8501 and start chatting! ๐ŸŽ‰


๐Ÿ“„ Papers Indexed

# Paper Year Topics
1 Comparison of NSGA-II, MOALO and MODA 2021 Multi-Objective Optimization, Micro-Machining
2 Response Surface for Composite Plates 2019 Response Surface, Structural Optimization
3 PSO-Tuned SVM for Turbulent Flows 2020 PSO, SVM, Metamodeling, CFD
4 SVM with Imperfect Training Data 2014 SVM Robustness, Noisy Data
5 TDA for Healthcare Big Data 2019 TDA, Healthcare, Persistent Homology
6 Persistent Homology Survey (Springer) 2020 Persistent Homology, Machine Intelligence
7 Hamiltonian PDEs in Plasma (Springer) 2020 Hamiltonian Systems, Plasma, PDEs
8 TDA Feature Selection for Steel 2020 TDA, Fatigue Strength, Feature Selection
9 Almost Periodic Functions on Groups 2010 Harmonic Analysis, Locally Compact Groups
10 Compact Operators on Banach Algebra 2011 Operator Theory, Banach Algebras
11 Fixed Point Theorems 2017 Fixed Point Theory, Best Approximation
12 DTM vs DJM for Differential Equations 2017 Numerical Methods, Functional DEs
13 Breast Cancer Classification with Ensembles 2017 Ensemble Learning, Medical AI
14 TDA in Human Activity Recognition 2021 TDA, HAR, Wearables
15 TDA for Feature Extraction ๐Ÿ† 2021 TDA, Feature Engineering, Best Paper

๐Ÿ“ Project Structure

ask-my-ai-clone/
โ”œโ”€โ”€ README.md
โ”œโ”€โ”€ requirements.txt
โ”œโ”€โ”€ .env.example
โ”œโ”€โ”€ app/
โ”‚   โ”œโ”€โ”€ main.py              # Streamlit chatbot UI
โ”‚   โ”œโ”€โ”€ config.py             # Configuration settings
โ”‚   โ””โ”€โ”€ components/
โ”‚       โ”œโ”€โ”€ chat.py           # Chat interface
โ”‚       โ”œโ”€โ”€ sidebar.py        # Paper browser sidebar
โ”‚       โ””โ”€โ”€ citations.py      # Citation display component
โ”œโ”€โ”€ scripts/
โ”‚   โ”œโ”€โ”€ index_papers.py       # PDF โ†’ chunks โ†’ embeddings
โ”‚   โ”œโ”€โ”€ build_knowledge.py    # Build vector store
โ”‚   โ””โ”€โ”€ evaluate.py           # Test question accuracy
โ”œโ”€โ”€ papers/                   # PDF files of publications
โ”‚   โ”œโ”€โ”€ 01_nsga_moalo_moda.pdf
โ”‚   โ”œโ”€โ”€ 02_composite_plates.pdf
โ”‚   โ””โ”€โ”€ ...
โ”œโ”€โ”€ data/
โ”‚   โ”œโ”€โ”€ chroma_db/            # Vector store
โ”‚   โ”œโ”€โ”€ metadata.json         # Paper metadata
โ”‚   โ””โ”€โ”€ sample_questions.json # Test questions
โ””โ”€โ”€ .github/
    โ””โ”€โ”€ workflows/
        โ””โ”€โ”€ deploy.yml        # Auto-deploy to HF Spaces

๐Ÿ”ง Add Your Own Papers

Want to create your own AI clone? Fork this repo and:

  1. Replace papers: Drop your PDFs in papers/
  2. Update metadata: Edit data/metadata.json with your paper details
  3. Re-index: Run python scripts/index_papers.py
  4. Customize persona: Edit the system prompt in app/config.py
  5. Deploy: Push to your Hugging Face Space
# app/config.py โ€” Customize the AI persona
SYSTEM_PROMPT = """
You are an AI research assistant trained on the publications of
Prof. Milan Amrut Joshi. You answer questions about:
- Topological Data Analysis & Persistent Homology
- Multi-Objective Optimization (NSGA-II, MOALO, MODA)
- Machine Learning (SVM, PSO, Ensembles)
- Functional Analysis & Pure Mathematics
- Healthcare AI & Big Data

Always cite specific papers when answering. Be accurate,
helpful, and acknowledge when a question is outside your
knowledge base.
"""

๐Ÿ”Œ API Endpoint

import requests

response = requests.post(
    "https://mlnjsh-ask-my-ai-clone.hf.space/api/ask",
    json={"question": "What is persistent homology?"}
)
print(response.json()["answer"])

๐Ÿงช Evaluation

We test the clone on 50+ curated questions across all research areas:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚          ACCURACY BY RESEARCH AREA           โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  TDA & Persistent Homology    โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ 95%
โ”‚  Multi-Objective Optimization โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ  92%
โ”‚  SVM & Machine Learning       โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ   88%
โ”‚  Pure Mathematics             โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ    85%
โ”‚  Healthcare AI                โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ   90%
โ”‚  Engineering Applications     โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ    87%
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  Overall Accuracy             โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ   90%
โ”‚  Citation Accuracy            โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ    85%
โ”‚  Hallucination Rate           โ–ˆ            < 5%
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

๐Ÿค Contributing

  • ๐Ÿ› Found an inaccurate answer? Open an issue
  • โ“ Suggest a test question by opening a PR to data/sample_questions.json
  • โญ Star this repo if you find it useful!

๐Ÿ“œ Citation

@software{joshi2026askclone,
  title   = {Ask My AI Clone: RAG-Powered Research Q\&A},
  author  = {Joshi, Milan Amrut},
  year    = {2026},
  url     = {https://github.com/mlnjsh/ask-my-ai-clone}
}

Built by Prof. Milan Amrut Joshi

"Making research accessible, one question at a time."

Star History


Contributors & Domain Experts

Milan Amrut Joshi
Milan Amrut Joshi

Project Author
LangChain
LangChain

RAG & LLM application framework
LlamaIndex
LlamaIndex

Data framework for LLM applications

About

๐Ÿค– Chat with an AI trained on my 15+ research publications โ€” Ask questions about TDA, optimization, machine learning & more. Powered by RAG + GPT-4o.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages