AdhithyaVar/Multi-Agent-AI-Research-Assistant

🔬 Multi-Agent AI Research Assistant

Automatically searches 200M+ academic papers, summarizes findings, detects research gaps, and generates a structured literature review — typically in 60–120 seconds (instant on cache hit).

Ranked #1 · Score 58/60 · Days 205–215


📌 What It Does

You type a research topic. The system:

  1. Plans the research strategy using LLaMA-3.3-70B
  2. Searches Semantic Scholar across 200M+ papers with 3 parallel queries
  3. Reranks results using semantic similarity × citation impact scoring
  4. Summarizes all papers concurrently (3× faster than sequential)
  5. Analyses gaps, contradictions, and emerging trends across the corpus
  6. Writes a fully structured 7-section academic literature review
  7. Visualizes a knowledge graph of papers, authors, and keywords

🏗️ Architecture

User Query
     │
     ▼
┌─────────────────────────────────────────────────────────┐
│  Manager Agent  (LLaMA-3.3-70B · temp=0.0)              │
│  Generates 3 diverse search queries + focus points       │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│  Searcher Agent  (No LLM — pure retrieval)               │
│  3 parallel Semantic Scholar API calls (staggered)       │
│  Deduplication → Reranking (70% cosine + 30% citations) │
│  Stores results in ChromaDB vector memory                │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│  Summarizer Agent  (LLaMA-3.1-8B · temp=0.1)            │
│  Concurrent processing — 3 papers at once                │
│  Extracts: Contribution / Methodology / Results / Limits │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│  Critic Agent  (LLaMA-3.3-70B · temp=0.1)               │
│  Cross-paper analytical reasoning                        │
│  Detects: Contradictions / Gaps / Weaknesses / Trends    │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│  Writer Agent  (LLaMA-3.3-70B · temp=0.4)               │
│  7-section structured literature review                  │
│  Citation validation — removes hallucinated references   │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
       Streamlit UI + PyVis Knowledge Graph + FastAPI

All agents share a ChromaDB vector memory powered by allenai-specter (runs locally, no API cost, trained on scientific papers).
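The Summarizer stage's concurrency can be pictured with a thread pool. This is a minimal sketch under stated assumptions, not the repository's code: `summarize_paper` is a hypothetical stand-in for the Groq call, and the `title` and `abstract` fields are assumed names.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_paper(paper: dict) -> dict:
    """Hypothetical stand-in for the LLM call; returns a stub summary."""
    return {"title": paper["title"], "summary": paper["abstract"][:80]}


def summarize_all(papers: list[dict], workers: int = 3) -> list[dict]:
    """Summarize papers concurrently while preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order even if calls finish out of order
        return list(pool.map(summarize_paper, papers))


papers = [{"title": f"Paper {i}", "abstract": f"Abstract text {i}"} for i in range(5)]
summaries = summarize_all(papers)
```

Threads suit this step because each call mostly waits on network I/O; the worker count caps how many requests are in flight at once.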


✨ Key Features

| Feature | Detail |
| --- | --- |
| 5 Specialized Agents | Manager, Searcher, Summarizer, Critic, Writer; each has a distinct role |
| Custom Relevance Scoring | 0.65 × cosine_similarity + 0.25 × citation_score + 0.10 × recency_score (not off-the-shelf search) |
| 200M+ Papers | Semantic Scholar API; no key needed for basic usage |
| Concurrent Summarization | 3 papers processed in parallel → ~3× faster than sequential |
| Query Cache | Disk-persisted, 24h TTL; repeated queries return instantly |
| Citation Validation | Hallucinated references (e.g. [15] when only 12 papers exist) are auto-removed |
| GPU / CPU Auto-Detection | Uses CUDA GPU if available, falls back to CPU; zero config |
| Knowledge Graph | Interactive PyVis graph of papers, authors, and keyword clusters |
| REST API | FastAPI backend for programmatic access |
| 100% Free | Groq free tier + local embeddings + local ChromaDB |
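The weighted relevance score in the table can be sketched as follows. Only the three weights come from this README; the log-scaled citation normalization, the 10-year linear recency decay, and every function and parameter name are assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Plain cosine similarity; returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance_score(query_vec, paper_vec, citations, year,
                    max_citations, current_year,
                    weights=(0.65, 0.25, 0.10)) -> float:
    """Combine semantic similarity, citation impact, and recency."""
    cos = cosine_similarity(query_vec, paper_vec)
    # Assumed normalization: log-scaled count relative to the most-cited paper
    cite = math.log1p(citations) / math.log1p(max_citations) if max_citations > 0 else 0.0
    # Assumed recency: linear decay to zero over 10 years, clamped to [0, 1]
    recency = min(1.0, max(0.0, 1.0 - (current_year - year) / 10))
    w_cos, w_cite, w_rec = weights
    return w_cos * cos + w_cite * cite + w_rec * recency


best = relevance_score([1.0, 0.0], [1.0, 0.0], citations=100, year=2025,
                       max_citations=100, current_year=2025)
worst = relevance_score([1.0, 0.0], [0.0, 1.0], citations=0, year=2000,
                        max_citations=100, current_year=2025)
```

Passing the weights as a parameter makes it easy to compare this three-term weighting with a two-term variant.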

🤖 Model Stack

| Agent | Model | Temp | Role |
| --- | --- | --- | --- |
| Manager | llama-3.3-70b-versatile | 0.0 | Deterministic JSON planning |
| Summarizer | llama-3.1-8b-instant | 0.1 | High-volume factual extraction |
| Critic | llama-3.3-70b-versatile | 0.1 | Deep cross-paper reasoning |
| Writer | llama-3.3-70b-versatile | 0.4 | Structured academic prose |
| Embeddings | allenai-specter | n/a | Scientific paper embeddings (768-dim) |

🛠️ Tech Stack

LLM Inference    : Groq API (LLaMA-3.3-70B + LLaMA-3.1-8B)
Embeddings       : sentence-transformers (allenai-specter) — local, scientific domain
Vector Store     : ChromaDB (persistent, cosine similarity, pre-computed embeddings)
Paper Search     : Semantic Scholar API (200M+ papers)
Web UI           : Streamlit
REST API         : FastAPI + Uvicorn
Knowledge Graph  : PyVis
Deep Learning    : PyTorch (GPU/CPU auto-detected)

📁 Project Structure

Multi-Agent AI Research Assistant/
│
├── .env                        ← API keys (edit this first)
├── config.py                   ← All configuration in one place
├── main.py                     ← Pipeline entry point + CLI
├── requirements.txt
├── setup_check.py              ← Validates your environment
│
├── agents/
│   ├── base_agent.py           ← Groq LLM wrapper (rate-limit aware retry)
│   ├── manager_agent.py        ← Query planning + search strategy
│   ├── searcher_agent.py       ← Multi-query paper retrieval
│   ├── summarizer_agent.py     ← Concurrent paper summarization
│   ├── critic_agent.py         ← Cross-paper gap & contradiction analysis
│   └── writer_agent.py         ← Literature review generation
│
├── tools/
│   └── semantic_scholar.py     ← API fetch + improved reranking
│
├── memory/
│   └── chroma_store.py         ← Shared ChromaDB vector memory
│
├── utils/
│   ├── helpers.py              ← Logger, retry, text utilities
│   ├── cache.py                ← Disk query cache (24h TTL)
│   ├── device_optimizer.py     ← GPU/CPU auto-detection
│   └── model_registry.py       ← Singleton model loader (no duplicate loads)
│
├── app/
│   ├── streamlit_app.py        ← Full Streamlit UI
│   └── knowledge_graph.py      ← PyVis graph builder
│
└── api/
    └── server.py               ← FastAPI REST backend

🚀 Quick Start

1. Clone the repository

git clone https://github.com/AdhithyaVar/Multi-Agent-AI-Research-Assistant.git
cd "Multi-Agent-AI-Research-Assistant"

2. Create a virtual environment

python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Mac/Linux

3. Install dependencies

# Install numpy first (avoids compiler issues on Python 3.12+)
pip install "numpy>=2.0.0"
pip install -r requirements.txt

For NVIDIA GPU acceleration (optional):

pip install torch --index-url https://download.pytorch.org/whl/cu121

4. Get a free Groq API key at console.groq.com

5. Configure .env

GROQ_API_KEY=gsk_your_key_here
PRIMARY_MODEL=llama-3.3-70b-versatile
FAST_MODEL=llama-3.1-8b-instant

6. Validate your setup

python setup_check.py

7. Launch

streamlit run app/streamlit_app.py

Opens at http://localhost:8501


💡 Usage Examples

Streamlit UI — type any research topic:

attention mechanisms in vision transformers
federated learning for privacy-preserving AI
CRISPR gene editing for cancer therapy
reinforcement learning from human feedback
graph neural networks for drug discovery

CLI mode:

python main.py "transformer models for natural language processing"

REST API:

# Start the API server
uvicorn api.server:app --reload --port 8000

# Make a research request
curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"query": "BERT fine-tuning for sentiment analysis", "top_k": 10}'

API docs available at http://localhost:8000/docs
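The same request can be issued from Python with only the standard library. This is a sketch that mirrors the curl call above; the helper names are hypothetical, and it assumes the server replies with a JSON body, whose exact shape this README does not specify.

```python
import json
import urllib.request


def build_research_request(query: str, top_k: int = 10,
                           base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /research request without sending it."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/research",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_research(query: str, top_k: int = 10, timeout: float = 300.0):
    """Send the request and decode the body (assumed to be JSON)."""
    request = build_research_request(query, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_research_request("BERT fine-tuning for sentiment analysis", top_k=10)
```

Calling `run_research(...)` requires the API server to be running; the generous timeout reflects the 60–120 second pipeline time on an uncached query.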


📊 Performance

| Metric | Value |
| --- | --- |
| Total pipeline time | ~60–120 seconds (depends on API rate limits) |
| Cached query time | < 1 second |
| Papers fetched | Up to 90 (30 × 3 queries) |
| Papers analysed | 10–20 (configurable) |
| Literature review length | ~7,000–10,000 characters |
| Embedding model size | ~440 MB (downloaded once, cached) |

⚙️ Configuration

All settings are in .env — no code changes needed:

# Model
PRIMARY_MODEL=llama-3.3-70b-versatile
FAST_MODEL=llama-3.1-8b-instant

# Paper retrieval
MAX_PAPERS=30           # Papers fetched per query
TOP_K_RESULTS=12        # Papers sent to analysis pipeline

# Cache
CACHE_ENABLED=true
CACHE_TTL_HOURS=24

# Concurrency
SUMMARIZER_WORKERS=2    # Parallel summarization threads (2 avoids Groq 429 cascade)

# Groq rate limits
GROQ_MAX_RETRIES=4
GROQ_RETRY_DELAY=3.0
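One way to read `GROQ_MAX_RETRIES` and `GROQ_RETRY_DELAY` is as the parameters of a retry loop around each LLM call. The sketch below assumes exponential backoff, which this README does not confirm; `RateLimitError` and `call_with_retry` are hypothetical names, not part of the Groq client.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's HTTP 429 error."""


def call_with_retry(fn, max_retries: int = 4, base_delay: float = 3.0, sleep=time.sleep):
    """Call fn(), retrying on rate limits with a doubling delay (assumed policy)."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error
            sleep(base_delay * (2 ** attempt))


attempts = {"count": 0}


def flaky_call() -> str:
    """Fails twice with a rate-limit error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429")
    return "ok"


delays: list[float] = []
result = call_with_retry(flaky_call, sleep=delays.append)  # record delays instead of sleeping
```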

🔌 API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /research | Run the full pipeline |
| GET | /memory/search?q=query | Query vector memory |
| DELETE | /memory/clear | Clear ChromaDB |
| GET | /health | Health check |

🧩 What Makes This Novel

  1. Inter-agent communication — 5 specialized agents orchestrated in sequence, each with a distinct memory, role, and tool access. Most student projects don't implement true multi-agent pipelines.

  2. Custom relevance scoring: 70% semantic cosine similarity + 30% normalized citation score across 200M+ papers, rather than keyword search or simple vector retrieval.

  3. Critic agent with analytical reasoning — goes beyond summarization to detect contradictions between papers, identify research gaps, and flag methodological weaknesses. This cross-document reasoning capability is rare in student-level projects.

  4. Citation validation — post-generation check removes hallucinated reference numbers, a common LLM failure mode in academic writing tasks.
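The citation check in item 4 can be sketched with a regular expression. This is an illustrative version, not the Writer agent's actual code: the function name is hypothetical, and it handles only single-number markers such as [15], not grouped ones such as [3, 15].

```python
import re


def strip_invalid_citations(text: str, num_papers: int) -> str:
    """Remove [n] markers whose n falls outside 1..num_papers."""

    def keep_or_drop(match: re.Match) -> str:
        number = int(match.group(1))
        return match.group(0) if 1 <= number <= num_papers else ""

    cleaned = re.sub(r"\[(\d+)\]", keep_or_drop, text)
    # Tidy the gaps left behind: no space before punctuation, no double spaces
    cleaned = re.sub(r" +([.,;])", r"\1", cleaned)
    return re.sub(r" {2,}", " ", cleaned)


draft = "Attention improves accuracy [3] but costs memory [15]."
clean = strip_invalid_citations(draft, num_papers=12)
```

With 12 papers in the corpus, the marker [3] survives and [15] is dropped.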


🐛 Troubleshooting

| Problem | Fix |
| --- | --- |
| 401 Invalid API Key | Get a new key at console.groq.com and update .env |
| 429 Rate Limited | Built-in staggered retry handles this automatically |
| No papers found | Use a more specific academic phrase as the query |
| Slow first run | allenai-specter downloads once (~440MB), cached locally after |
| ChromaDB error | Run python -c "from memory.chroma_store import clear_memory; clear_memory()" |
| Pydantic warning | Cosmetic warning from groq library on Python 3.14; does not affect functionality |

📄 License

MIT License — free to use, modify, and distribute.


🙏 Acknowledgements
