A full-stack RAG (Retrieval-Augmented Generation) application that lets you upload PDFs and ask questions about them. The assistant cites the exact page numbers where it found each piece of information.
Stack: Python · FastAPI · LangChain · PostgreSQL + pgvector · OpenAI · React · Tailwind CSS
- PDF Upload — Drag-and-drop upload with automatic text extraction and embedding
- Conversational Q&A — Ask questions with streaming responses in real-time
- Page Citations — Every answer cites the exact page(s) from the source document
- Chat History — Persistent conversations with full message history
- Multi-Document — Upload multiple PDFs and query across all of them
- Python 3.11+
- Node.js 18+
- Docker (for PostgreSQL + pgvector)
- An OpenAI API key
docker compose up -d

This starts PostgreSQL 16 with the pgvector extension on port 5433.

cp .env.example .env

Edit .env and add your OpenAI API key:
OPENAI_API_KEY=sk-your-actual-key-here
DATABASE_URL=postgresql+asyncpg://rag_user:rag_password@localhost:5433/rag_101
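
To illustrate how these two variables might be consumed, here is a minimal, standard-library-only sketch. It is not the project's `config.py`; the `Settings` class and `load_settings` function are hypothetical names, and the sketch assumes the variables are already present in the process environment (reading `os.environ` does not parse a `.env` file by itself).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Hypothetical container for the two variables shown in .env.example.
    openai_api_key: str
    database_url: str


def load_settings(env=None) -> Settings:
    """Read the required variables and fail fast if either is missing or empty."""
    env = os.environ if env is None else env
    required = ("OPENAI_API_KEY", "DATABASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        database_url=env["DATABASE_URL"],
    )
```

Failing at startup on a missing key tends to be easier to debug than a failed OpenAI or database call later on.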
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run the server (from the backend/ directory):

uvicorn app.main:app --reload --port 8000

The API starts at http://localhost:8000. Tables are created automatically on first startup.
cd frontend
npm install
npm run dev

The frontend starts at http://localhost:5173 and proxies API calls to the backend.
- Open http://localhost:5173
- Click the upload icon in the sidebar to upload a PDF
- Start a new chat and ask questions about your document
- Answers will include clickable citation badges showing the source page and a text snippet
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI app entry point
│ │ ├── config.py # Settings from environment
│ │ ├── database.py # Async SQLAlchemy setup
│ │ ├── models.py # ORM models (documents, chunks, conversations, messages)
│ │ ├── schemas.py # Pydantic request/response schemas
│ │ ├── routers/ # API endpoint handlers
│ │ └── services/ # PDF processing, embeddings, RAG chain
│ └── requirements.txt
├── frontend/
│ ├── src/
│ │ ├── App.tsx # Root component with state management
│ │ ├── api/client.ts # API client with SSE streaming
│ │ └── components/ # React UI components
│ └── package.json
├── docker-compose.yml # PostgreSQL + pgvector
└── .env.example
| Method | Path | Description |
|---|---|---|
| POST | /api/documents | Upload and process a PDF |
| GET | /api/documents | List uploaded documents |
| DELETE | /api/documents/{id} | Delete a document |
| POST | /api/chat | Send a message (SSE streaming response) |
| GET | /api/conversations | List conversations |
| GET | /api/conversations/{id} | Get conversation with messages |
| POST | /api/conversations | Create new conversation |
| DELETE | /api/conversations/{id} | Delete conversation |
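
Because `POST /api/chat` responds with Server-Sent Events, a client has to reassemble events from individual text lines. The sketch below shows one way to do that in Python using only the standard SSE framing rules (`data:` fields, blank-line terminators, `:` comments). The helper name `iter_sse_data` is hypothetical, and the shape of each payload (plain text, JSON, and so on) is not documented here, so it is treated as an opaque string.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete Server-Sent Event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive; skip it.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event not terminated by a blank line is discarded, per the SSE format.
```

Any iterable of decoded lines works as input, for example the line iterator of a streaming HTTP response.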
- Upload: PDF text is extracted page by page using PyMuPDF, split into ~500-token chunks (preserving page numbers), and embedded via OpenAI's text-embedding-3-small model
- Store: Chunks and their vector embeddings are stored in PostgreSQL using the pgvector extension
- Query: When you ask a question, it's embedded and the top-5 most similar chunks are retrieved via cosine similarity
- Generate: The retrieved context (with page attributions) is sent to GPT-4o-mini along with the chat history, producing a grounded answer with page citations
- Stream: The response streams token-by-token via Server-Sent Events for a real-time chat experience
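
The chunking, retrieval, and context-building steps above can be sketched in plain Python. This is an in-memory illustration under stated assumptions, not the project's code: whitespace-separated words stand in for tokens, chunks are assumed never to cross a page boundary, the embedding vectors are supplied by the caller rather than by OpenAI, and similarity is computed in Python instead of by pgvector. All names (`Chunk`, `chunk_pages`, `top_k`, `build_context`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    page: int  # 1-based page number the text came from
    text: str


def chunk_pages(pages, max_tokens=500):
    """Split each page into chunks of at most max_tokens words, keeping the page number."""
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        words = text.split()
        for start in range(0, len(words), max_tokens):
            chunks.append(Chunk(page_number, " ".join(words[start:start + max_tokens])))
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, chunks, k=5):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


def build_context(chunks):
    """Label each retrieved chunk with its page so the model can cite it."""
    return "\n\n".join(f"[Page {c.page}] {c.text}" for c in chunks)
```

For example, `build_context(top_k(query_vec, chunk_vecs, chunk_pages(pages)))` yields a block of page-labelled passages that could be placed in the prompt alongside the chat history.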