PDF Chatbot 🔮

A full-stack PDF Q&A chatbot with conversational memory, built with FastAPI, LangChain, ChromaDB, and React.

Tech Stack

Backend

FastAPI — REST API framework
LangChain — agent orchestration and memory
ChromaDB — persistent vector store
HuggingFace Embeddings — sentence-transformers/all-MiniLM-L6-v2
Groq (llama-3.3-70b-versatile) — LLM inference

Frontend

React.js — UI framework
Axios — HTTP client

Project Structure

pdf-chatbot/
├── backend/
│   ├── main.py            # FastAPI app and routes
│   ├── ingest.py          # PDF → chunks → ChromaDB pipeline
│   ├── agent.py           # LangChain agent with memory
│   ├── schemas.py         # Pydantic request models
│   ├── config.py          # Environment config
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   ├── App.css
│   │   ├── api.js
│   │   └── components/
│   │       ├── UploadFile.jsx
│   │       └── QueryFile.jsx
│   └── package.json
├── chroma_db/             # Auto-generated, gitignored
├── .env.example
└── README.md

How It Works

User uploads PDF
      ↓
PyPDFLoader → RecursiveCharacterTextSplitter → HuggingFace Embeddings → ChromaDB
      ↓
User asks a question
      ↓
LangChain Agent → search_chroma tool → ChromaDB similarity search
      ↓
Retrieved chunks + chat history → Groq LLM → Answer
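
The flow above can be illustrated with a small dependency-free sketch. This is not the project's code: the real pipeline relies on PyPDFLoader, RecursiveCharacterTextSplitter, HuggingFace embeddings, and ChromaDB, whereas here a fixed-size splitter and word-count vectors stand in for them. The names split_text, embed, cosine, search, and build_prompt, and the prompt layout, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap (toy splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (the app itself uses all-MiniLM-L6-v2)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (stand-in for a vector store search)."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str], history: list[tuple[str, str]]) -> str:
    """Combine retrieved chunks and earlier turns into one prompt (hypothetical layout)."""
    context = "\n".join(f"- {c}" for c in chunks)
    past = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in history)
    return f"Context:\n{context}\n\nHistory:\n{past}\n\nQuestion: {question}"
```

To try it, build the index once with index = [(c, embed(c)) for c in split_text(document)], then pass the output of search(index, question) and the running list of (question, answer) pairs to build_prompt. In the real app the resulting prompt would go to the Groq LLM.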

Getting Started

Prerequisites

Python 3.10+
Node.js 18+
Groq API key → https://console.groq.com

Backend Setup

# Clone the repo
git clone https://github.com/your-username/pdf-oracle.git
cd pdf-oracle

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate        # Windows: venv\Scripts\activate

# Install dependencies
pip install -r backend/requirements.txt

# Set up environment variables
cp .env.example .env
# Add your GROQ_API_KEY to .env

# Run the server
uvicorn backend.main:app --reload

Backend runs at http://localhost:8000
API docs at http://localhost:8000/docs

Frontend Setup

cd frontend

# Install dependencies
npm install

# Set up environment variables
echo "REACT_APP_API_URL=http://localhost:8000" > .env

# Start the dev server
npm start

Frontend runs at http://localhost:3000

Environment Variables

Create a .env file in the project root:

GROQ_API_KEY=your_groq_api_key_here
GROQ_MODEL_NAME=llama-3.3-70b-versatile
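
As a rough idea of how these two variables might be consumed, here is a minimal standard-library sketch. The project lists pydantic-settings and python-dotenv, so its actual config.py probably looks different. The Settings class, the load_settings function, and the fallback to llama-3.3-70b-versatile when GROQ_MODEL_NAME is unset are assumptions made for this example. Reading the .env file itself is left out and would be handled by one of those libraries.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed fallback, taken from the model named in this README.
DEFAULT_MODEL = "llama-3.3-70b-versatile"


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two variables described above."""
    groq_api_key: str
    groq_model_name: str = DEFAULT_MODEL


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping of variables (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        # Fail early so a missing key surfaces at startup, not on the first request.
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or the environment")
    model = env.get("GROQ_MODEL_NAME", "").strip() or DEFAULT_MODEL
    return Settings(groq_api_key=api_key, groq_model_name=model)
```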

API Endpoints

Method  Endpoint  Description
GET     /         Welcome message
GET     /health   Health check
POST    /upload   Upload and ingest a PDF file
POST    /chat     Ask a question about the PDF

POST /upload

Content-Type: multipart/form-data
Body: file (PDF)
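
For callers outside the React frontend, an upload could be sketched in Python with only the standard library. The form field name file comes from the description above. The helper name build_upload_request and the base URL are assumptions, and the response format is not documented here, so the sketch stops at building the request.

```python
import urllib.request
import uuid


def build_upload_request(base_url: str, filename: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST to /upload with one part named "file"."""
    boundary = uuid.uuid4().hex  # random marker that separates the form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/upload",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the backend running, urllib.request.urlopen(build_upload_request("http://localhost:8000", "paper.pdf", data)) sends the file, where data holds the PDF's bytes.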

POST /chat

{
  "question": "What are the main findings of this paper?"
}
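
A question can be sent from Python in much the same way. The sketch below uses only the standard library, and the JSON body matches the example above. The function names build_chat_request and ask are hypothetical. The shape of the response is not specified in this README, so ask assumes the server replies with JSON and returns the decoded value unchanged.

```python
import json
import urllib.request


def build_chat_request(base_url: str, question: str) -> urllib.request.Request:
    """Build a JSON POST to /chat carrying the question."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def ask(base_url: str, question: str, timeout: float = 60.0):
    """Send the question and return the decoded JSON reply (assumes a JSON response)."""
    request = build_chat_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, ask("http://localhost:8000", "What are the main findings of this paper?") can be called once a PDF has been uploaded.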

Backend Requirements

backend/requirements.txt:

fastapi
uvicorn
python-multipart
langchain
langchain-community
langchain-groq
chromadb
sentence-transformers
pypdf
python-dotenv
pydantic-settings

.gitignore

venv/
chroma_db/
.env
__pycache__/
*.pyc
node_modules/
frontend/build/
uploads/

Interview Notes

"I built a RAG pipeline using LangChain agents, ChromaDB as the vector store, HuggingFace embeddings, and Groq's llama-3.3-70b for inference — all wired to a React frontend. The agent uses a custom search_chroma tool to retrieve relevant PDF chunks and maintains conversational memory across turns."

Author

Built by Roshan — GitHub
