Upload any PDF. Ask questions in plain English. Get answers grounded in the document with exact page citations.
In the screenshot above, a resume PDF was uploaded and indexed into 6 chunks inside ChromaDB. The question "Tell me about the candidate's focused domain and relevant experience?" was asked — DocChat retrieved the 4 most relevant chunks via cosine similarity and passed them as context to Llama 3.3 70B, which generated a structured, grounded answer with bullet points pulled directly from the document. No hallucination — every fact comes from the PDF.
👉 Try the live app — bring your own OpenRouter API key (free tier available)
┌─────────────────────────────────────────────────────────┐
│ PDF Upload │
└───────────────────────┬─────────────────────────────────┘
│
▼
pypdf — extract text per page
│
▼
RecursiveCharacterTextSplitter
chunk_size=1000, overlap=200
│
▼
text-embedding-3-small (OpenRouter)
→ dense vector per chunk
│
▼
ChromaDB
cosine similarity index
│
(on each query)
│
▼
top-4 chunks retrieved by similarity
│
▼
Llama 3.3 70B via OpenRouter
grounded answer + page citations
| Layer | Tool | Purpose |
|---|---|---|
| UI | Streamlit | Chat interface + PDF upload |
| PDF Parsing | pypdf | Extract text per page with metadata |
| Chunking | LangChain RecursiveCharacterTextSplitter |
1000-char chunks, 200-char overlap |
| Embeddings | text-embedding-3-small via OpenRouter |
Dense vector representation |
| Vector Store | ChromaDB | Cosine similarity retrieval |
| LLM | Llama 3.3 70B via OpenRouter | Answer generation with grounding |
git clone https://github.com/dabhiram13/DocChat.git
cd DocChat
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txtcp .env.example .env
# Paste your OpenRouter key into .envGet a free key at openrouter.ai/keys.
streamlit run app.pyOpens at http://localhost:8501
- Enter your OpenRouter API key in the sidebar
- Upload a PDF (up to 200 MB)
- Wait a few seconds — the document is parsed, chunked, embedded, and indexed into ChromaDB
- Ask any question about the document in the chat
- Expand 📚 Source passages under each answer to see the exact chunks ChromaDB retrieved and the page numbers they came from
DocChat/
├── app.py # Streamlit UI — sidebar, upload, chat, citations
├── rag_pipeline.py # RAG core — parse → chunk → embed → store → retrieve → generate
├── requirements.txt # Pinned dependencies
├── runtime.txt # Python 3.11 for Streamlit Cloud
├── .env.example # API key template
├── docs/
│ └── screenshot.png # UI preview (used in README)
└── .gitignore
ChromaDB for vector storage — cosine similarity retrieval gives semantically relevant chunks rather than keyword matches. Each session creates a fresh in-memory collection that's populated when the PDF is uploaded.
Chunk overlap (200 chars) — prevents answers from being cut off at chunk boundaries when context spans two adjacent chunks.
Page-level citations — every answer surfaces the exact page number and text excerpt it was drawn from, so responses are fully verifiable against the source document.
OpenRouter for both LLM and embeddings — single API key, single provider, consistent latency. text-embedding-3-small is fast and cost-efficient; Llama 3.3 70B gives strong reading comprehension for document Q&A.
This app uses the OpenRouter API for both embeddings and LLM inference.
- Sign up at openrouter.ai — free credits on registration
- Models used:
openai/text-embedding-3-small+meta-llama/llama-3.3-70b-instruct - No data is stored — documents are processed in memory and discarded when the session ends
