Automatically searches 200M+ academic papers, summarizes findings, detects research gaps, and generates a structured literature review — typically in 60–120 seconds (instant on cache hit).
Ranked #1 · Score 58/60 · Days 205–215
You type a research topic. The system:
- Plans the research strategy using LLaMA-3.3-70B
- Searches Semantic Scholar across 200M+ papers with 3 parallel queries
- Reranks results with a weighted blend of semantic similarity and citation impact
- Summarizes all papers concurrently (3× faster than sequential)
- Analyses gaps, contradictions, and emerging trends across the corpus
- Writes a fully structured 7-section academic literature review
- Visualizes a knowledge graph of papers, authors, and keywords
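The concurrent summarization step in the list above can be sketched with a thread pool. This is a minimal illustration, not the project's actual `summarizer_agent.py`: `summarize_paper` is a hypothetical stand-in for the LLM call, and the default of 3 workers simply mirrors the "3 papers at once" figure.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_paper(paper: dict) -> dict:
    """Hypothetical stand-in for the LLM call: truncates the abstract."""
    abstract = paper.get("abstract") or ""
    return {"title": paper["title"], "summary": abstract[:80]}


def summarize_all(papers: list[dict], workers: int = 3) -> list[dict]:
    """Summarize papers concurrently while preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, even if calls finish out of order.
        return list(pool.map(summarize_paper, papers))


papers = [
    {"title": "Paper A", "abstract": "Abstract A"},
    {"title": "Paper B", "abstract": None},
    {"title": "Paper C", "abstract": "Abstract C"},
]
summaries = summarize_all(papers)
```

Threads suit this step because each call mostly waits on network I/O; the achievable speed-up depends on the API's rate limits.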
User Query
│
▼
┌─────────────────────────────────────────────────────────┐
│ Manager Agent (LLaMA-3.3-70B · temp=0.0) │
│ Generates 3 diverse search queries + focus points │
└────────────────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Searcher Agent (No LLM — pure retrieval) │
│ 3 parallel Semantic Scholar API calls (staggered) │
│ Deduplication → Reranking (70% cosine + 30% citations) │
│ Stores results in ChromaDB vector memory │
└────────────────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Summarizer Agent (LLaMA-3.1-8B · temp=0.1) │
│ Concurrent processing — 3 papers at once │
│ Extracts: Contribution / Methodology / Results / Limits │
└────────────────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Critic Agent (LLaMA-3.3-70B · temp=0.1) │
│ Cross-paper analytical reasoning │
│ Detects: Contradictions / Gaps / Weaknesses / Trends │
└────────────────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Writer Agent (LLaMA-3.3-70B · temp=0.4) │
│ 7-section structured literature review │
│ Citation validation — removes hallucinated references │
└────────────────────────┬────────────────────────────────┘
│
▼
Streamlit UI + PyVis Knowledge Graph + FastAPI
All agents share a ChromaDB vector memory powered by allenai-specter (runs locally, no API cost, trained on scientific papers).
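The deduplication step shown in the Searcher Agent box could look roughly like the sketch below. It assumes each result is a dict carrying the `paperId` field returned by Semantic Scholar, with a normalized title as a fallback key; the function name and the fallback rule are illustrative assumptions.

```python
def deduplicate(batches: list[list[dict]]) -> list[dict]:
    """Merge result batches from several queries, keeping the first copy of each paper."""
    seen: set[str] = set()
    unique: list[dict] = []
    for batch in batches:
        for paper in batch:
            # Prefer the stable ID; fall back to a case-insensitive title match.
            key = paper.get("paperId") or (paper.get("title") or "").strip().lower()
            if not key or key in seen:
                continue
            seen.add(key)
            unique.append(paper)
    return unique


merged = deduplicate([
    [{"paperId": "a1", "title": "Attention Is All You Need"}],
    [{"paperId": "a1", "title": "Attention is all you need"},
     {"paperId": "b2", "title": "BERT"}],
    [{"paperId": None, "title": " bert-like models "}],
])
```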
| Feature | Detail |
|---|---|
| 5 Specialized Agents | Manager, Searcher, Summarizer, Critic, Writer — each with a distinct role |
| Custom Relevance Scoring | 0.65 × cosine_similarity + 0.25 × citation_score + 0.10 × recency_score — not off-the-shelf search |
| 200M+ Papers | Semantic Scholar API — no key needed for basic usage |
| Concurrent Summarization | 3 papers processed in parallel → ~3× faster than sequential |
| Query Cache | Disk-persisted, 24h TTL — repeated queries return instantly |
| Citation Validation | Hallucinated references (e.g. [15] when only 12 papers exist) are auto-removed |
| GPU / CPU Auto-Detection | Uses CUDA GPU if available, falls back to CPU — zero config |
| Knowledge Graph | Interactive PyVis graph of papers, authors, and keyword clusters |
| REST API | FastAPI backend for programmatic access |
| 100% Free | Groq free tier + local embeddings + local ChromaDB |
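As an illustration of the Custom Relevance Scoring row, the sketch below combines the three terms with the weights from the table. The normalizations are assumptions made for the example (log-scaled citations, linear decay over a 10-year window); the project's `tools/semantic_scholar.py` may normalize differently.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def citation_score(citations: int, max_citations: int) -> float:
    """Assumed normalization: log-scaled citations relative to the batch maximum."""
    if max_citations <= 0:
        return 0.0
    return math.log1p(citations) / math.log1p(max_citations)


def recency_score(year: int, current_year: int, window: int = 10) -> float:
    """Assumed normalization: linear decay to 0 over `window` years."""
    age = max(0, current_year - year)
    return max(0.0, 1.0 - age / window)


def relevance(query_vec, paper_vec, citations, max_citations, year, current_year):
    """Weighted sum using the 0.65 / 0.25 / 0.10 weights from the table."""
    return (0.65 * cosine_similarity(query_vec, paper_vec)
            + 0.25 * citation_score(citations, max_citations)
            + 0.10 * recency_score(year, current_year))


best = relevance([1.0, 0.0], [1.0, 0.0], 100, 100, 2024, 2024)
worst = relevance([1.0, 0.0], [0.0, 1.0], 0, 100, 2000, 2024)
```

Log scaling keeps a handful of very highly cited papers from dominating the ranking.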
| Agent | Model | Temp | Role |
|---|---|---|---|
| Manager | `llama-3.3-70b-versatile` | 0.0 | Deterministic JSON planning |
| Summarizer | `llama-3.1-8b-instant` | 0.1 | High-volume factual extraction |
| Critic | `llama-3.3-70b-versatile` | 0.1 | Deep cross-paper reasoning |
| Writer | `llama-3.3-70b-versatile` | 0.4 | Structured academic prose |
| Embeddings | `allenai-specter` | n/a | Scientific paper embeddings (768-dim) |
LLM Inference : Groq API (LLaMA-3.3-70B + LLaMA-3.1-8B)
Embeddings : sentence-transformers (allenai-specter) — local, scientific domain
Vector Store : ChromaDB (persistent, cosine similarity, pre-computed embeddings)
Paper Search : Semantic Scholar API (200M+ papers)
Web UI : Streamlit
REST API : FastAPI + Uvicorn
Knowledge Graph : PyVis
Deep Learning : PyTorch (GPU/CPU auto-detected)
Multi-Agent AI Research Assistant/
│
├── .env ← API keys (edit this first)
├── config.py ← All configuration in one place
├── main.py ← Pipeline entry point + CLI
├── requirements.txt
├── setup_check.py ← Validates your environment
│
├── agents/
│ ├── base_agent.py ← Groq LLM wrapper (rate-limit aware retry)
│ ├── manager_agent.py ← Query planning + search strategy
│ ├── searcher_agent.py ← Multi-query paper retrieval
│ ├── summarizer_agent.py ← Concurrent paper summarization
│ ├── critic_agent.py ← Cross-paper gap & contradiction analysis
│ └── writer_agent.py ← Literature review generation
│
├── tools/
│ └── semantic_scholar.py ← API fetch + improved reranking
│
├── memory/
│ └── chroma_store.py ← Shared ChromaDB vector memory
│
├── utils/
│ ├── helpers.py ← Logger, retry, text utilities
│ ├── cache.py ← Disk query cache (24h TTL)
│ ├── device_optimizer.py ← GPU/CPU auto-detection
│ └── model_registry.py ← Singleton model loader (no duplicate loads)
│
├── app/
│ ├── streamlit_app.py ← Full Streamlit UI
│ └── knowledge_graph.py ← PyVis graph builder
│
└── api/
    └── server.py ← FastAPI REST backend
git clone https://github.com/yourusername/multi-agent-research-assistant.git
cd "multi-agent-research-assistant"

python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # Mac/Linux

# Install numpy first (avoids compiler issues on Python 3.12+)
pip install "numpy>=2.0.0"
pip install -r requirements.txt

For NVIDIA GPU acceleration (optional):
pip install torch --index-url https://download.pytorch.org/whl/cu121

Get a free Groq API key:
- Go to console.groq.com
- Sign up → API Keys → Create API Key → Copy it

Add the key to .env:
GROQ_API_KEY=gsk_your_key_here
PRIMARY_MODEL=llama-3.3-70b-versatile
FAST_MODEL=llama-3.1-8b-instant

Validate the environment, then launch the UI:
python setup_check.py
streamlit run app/streamlit_app.py

Opens at http://localhost:8501
Streamlit UI — type any research topic:
attention mechanisms in vision transformers
federated learning for privacy-preserving AI
CRISPR gene editing for cancer therapy
reinforcement learning from human feedback
graph neural networks for drug discovery
CLI mode:
python main.py "transformer models for natural language processing"

REST API:
# Start the API server
uvicorn api.server:app --reload --port 8000
# Make a research request
curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"query": "BERT fine-tuning for sentiment analysis", "top_k": 10}'

API docs available at http://localhost:8000/docs
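The same request can be issued from Python using only the standard library. The sketch below builds the POST request from the curl example; the helper name `build_research_request` is hypothetical. The response schema is not documented here, so the example stops at decoding the JSON body.

```python
import json
import urllib.request


def build_research_request(query: str, top_k: int = 10,
                           base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /research request with the same body as the curl example."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/research",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_research_request("BERT fine-tuning for sentiment analysis")

# To send it (requires the API server to be running):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```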
| Metric | Value |
|---|---|
| Total pipeline time | ~60–120 seconds (depends on API rate limits) |
| Cached query time | < 1 second |
| Papers fetched | Up to 90 (30 × 3 queries) |
| Papers analysed | 10–20 (configurable) |
| Literature review length | ~7,000–10,000 characters |
| Embedding model size | ~440 MB (downloaded once, cached) |
All settings are in .env — no code changes needed:
# Model
PRIMARY_MODEL=llama-3.3-70b-versatile
FAST_MODEL=llama-3.1-8b-instant
# Paper retrieval
MAX_PAPERS=30 # Papers fetched per query
TOP_K_RESULTS=12 # Papers sent to analysis pipeline
# Cache
CACHE_ENABLED=true
CACHE_TTL_HOURS=24
# Concurrency
SUMMARIZER_WORKERS=2 # Parallel summarization threads (2 avoids Groq 429 cascade)
# Groq rate limits
GROQ_MAX_RETRIES=4
GROQ_RETRY_DELAY=3.0

| Method | Endpoint | Description |
|---|---|---|
| POST | `/research` | Run the full pipeline |
| GET | `/memory/search?q=query` | Query vector memory |
| DELETE | `/memory/clear` | Clear ChromaDB |
| GET | `/health` | Health check |
- Inter-agent communication: 5 specialized agents orchestrated in sequence, each with a distinct memory, role, and tool access. Most student projects don't implement true multi-agent pipelines.
- Custom relevance scoring: `70% semantic cosine similarity + 30% normalized citation score` across 200M+ papers, rather than keyword search or simple vector retrieval.
- Critic agent with analytical reasoning: goes beyond summarization to detect contradictions between papers, identify research gaps, and flag methodological weaknesses. This cross-document reasoning capability is rare in student-level projects.
- Citation validation: a post-generation check removes hallucinated reference numbers, a common LLM failure mode in academic writing tasks.
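The citation validation idea can be sketched as a post-processing pass over the generated review. This version assumes references appear as single bracketed numbers such as `[7]` and that valid numbers run from 1 to the number of analysed papers; grouped forms like `[3, 15]` are not handled, and the function name is hypothetical.

```python
import re


def remove_invalid_citations(text: str, paper_count: int) -> str:
    """Drop bracketed reference numbers outside the range 1..paper_count."""

    def keep_if_valid(match: re.Match) -> str:
        number = int(match.group(1))
        return match.group(0) if 1 <= number <= paper_count else ""

    cleaned = re.sub(r"\[(\d+)\]", keep_if_valid, text)
    # Tidy up spacing left behind by removed markers.
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned)
    return re.sub(r"\s+([.,;:])", r"\1", cleaned)


review = "Transformers scale well [3] [15]. Claim [0] holds [12]."
validated = remove_invalid_citations(review, paper_count=12)
```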
| Problem | Fix |
|---|---|
| `401 Invalid API Key` | Get a new key at console.groq.com and update `.env` |
| `429 Rate Limited` | Built-in staggered retry handles this automatically |
| `No papers found` | Use a more specific academic phrase as the query |
| Slow first run | allenai-specter downloads once (~440 MB), cached locally after |
| `ChromaDB error` | Run `python -c "from memory.chroma_store import clear_memory; clear_memory()"` |
| Pydantic warning | Cosmetic warning from the groq library on Python 3.14; does not affect functionality |
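The automatic handling of 429 responses can be pictured as a retry loop around each LLM call. The sketch below uses exponential backoff as an assumed strategy; the project's actual delay schedule is not specified. `RateLimitError` is a hypothetical stand-in for the client's rate-limit exception, and the defaults mirror `GROQ_MAX_RETRIES=4` and `GROQ_RETRY_DELAY=3.0`.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by the API client."""


def call_with_retry(func, max_retries: int = 4, base_delay: float = 3.0,
                    sleep=time.sleep):
    """Call func(), retrying on rate limits with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            sleep(base_delay * (2 ** attempt))  # 3 s, 6 s, 12 s, 24 s with the defaults
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting.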
MIT License — free to use, modify, and distribute.
- Groq — Free LLaMA inference API
- Semantic Scholar — Open academic paper database
- sentence-transformers — Local embedding models
- ChromaDB — Open-source vector database
- Streamlit — Python web UI framework