An intelligent, full-stack document analysis tool powered by Retrieval-Augmented Generation (RAG). This project allows users to upload PDF or DOCX files and have a context-aware conversation with them. It uses a Hybrid AI approach (Local Embeddings + Cloud LLM) to ensure speed, stability, and zero cost.
---
- 🗣️ Conversational Memory: The AI remembers previous context in the chat, allowing for natural follow-up questions (e.g., "Who is the author?" -> "How old is he?").
- 🔍 Source Citations: Every answer comes with expandable source references, showing exactly which file and page the information was pulled from.
- 📚 Smart Parsing: Uses `pdfplumber` to accurately read complex PDF layouts, tables, and unconventional fonts.
- ⚡ Hybrid AI Engine:
  - Embeddings: Run locally using HuggingFace (`all-MiniLM-L6-v2`) — Unlimited & Private.
  - LLM: Powered by Google Gemini (Flash/Pro) — Fast & High Quality.
- 🐳 Dockerized: Fully containerized with Docker and Docker Compose for easy deployment.
- 🧹 Auto-Reset Memory: Automatically clears the vector database when a new file is uploaded to prevent data contamination.
- 📝 Grammar Analysis: A dedicated tool to check and correct grammar in texts.
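At query time the RAG flow is: embed the question, fetch the most similar chunks from the vector store, then let the LLM answer using those chunks as context. The toy sketch below illustrates only the retrieval step with plain-Python cosine similarity; the hand-written vectors and the `retrieve` helper are stand-ins for what ChromaDB and `all-MiniLM-L6-v2` actually do, not the project's code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, store, k=2):
    """Return the k chunk texts whose embeddings are closest to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return [item["text"] for item in ranked[:k]]

# Toy "vector store": in the real app these vectors come from the local embedder.
store = [
    {"text": "The author is Jane Doe.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Chapter 3 covers pricing.", "vec": [0.1, 0.8, 0.3]},
    {"text": "Jane Doe was born in 1980.", "vec": [0.8, 0.2, 0.1]},
]

print(retrieve([1.0, 0.0, 0.0], store, k=2))
```

The retrieved chunks (with their file and page metadata) are what the UI surfaces as source citations.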
| Component | Technology | Description |
|---|---|---|
| Language | Python 3.11 | Stable version for AI/ML libraries. |
| Backend | FastAPI | REST API to handle file uploads & logic. |
| Frontend | Streamlit | Interactive Chat UI with history state. |
| Orchestration | LangChain | Framework for RAG & Contextual Rephrasing. |
| Vector DB | ChromaDB | Local database to store text embeddings. |
| Parsing | PDFPlumber | Advanced PDF text extraction. |
| Deployment | Docker | Containerization for reproducible builds. |
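Between parsing and the vector DB, extracted text is split into overlapping chunks so each embedding covers a coherent span. A minimal stand-in for LangChain's text splitter (the character-based strategy, chunk size, and overlap here are illustrative, not the project's actual settings):

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50):
    """Split text into fixed-size character chunks that overlap, so a
    sentence cut at one boundary still appears whole in the next chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

sample = "word " * 100  # 500 characters of dummy text
chunks = split_text(sample, chunk_size=200, overlap=50)
print(len(chunks), len(chunks[0]))
```

The overlap is what keeps answers intact when the relevant sentence straddles a chunk boundary.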
You can run this project either Manually (Local Environment) or using Docker.
- **Clone the Repository**

  ```bash
  git clone https://github.com/YOUR_USERNAME/rag-doc-assistant.git
  cd rag-doc-assistant
  ```

- **Create Virtual Environment (Python 3.11 Recommended)**

  ```bash
  # Windows
  py -3.11 -m venv venv
  .\venv\Scripts\activate
  ```

- **Install Dependencies**

  ```bash
  pip install -r backend/requirements.txt
  pip install -r frontend/requirements.txt
  ```

- **Configure API Key**: create a `.env` file in the root directory:

  ```
  GOOGLE_API_KEY=AIzaSyDxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  ```

- **Run the App (Two Terminals Needed)**
  - Terminal 1 (Backend): `uvicorn backend.app.main:app --reload`
  - Terminal 2 (Frontend): `streamlit run frontend/app.py`
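For reference, reading the key out of a `.env` file can be sketched in a few lines of plain Python. This is a minimal stand-in for whatever loader the backend actually uses (e.g. python-dotenv); the `load_env` helper name is illustrative.

```python
import os

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines from a .env file, skipping blanks and comments."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Demo: write a throwaway .env and read the key back.
with open("demo.env", "w") as fh:
    fh.write("# local secrets\nGOOGLE_API_KEY=dummy-key-for-testing\n")
print(load_env("demo.env")["GOOGLE_API_KEY"])
os.remove("demo.env")
```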
Ensure you have Docker Desktop installed and running.
- **Configure API Key**: create a `.env` file in the root directory with your `GOOGLE_API_KEY`.

- **Build and Run**

  ```bash
  docker-compose up --build
  ```

- **Access the App**: open your browser at `http://localhost:8501`.
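The repo ships its own compose file; for orientation, a `docker-compose.yml` for this two-service layout typically looks like the sketch below. The service names, build paths, and backend port 8000 (uvicorn's default) are assumptions; only the frontend port 8501 comes from this README.

```yaml
services:
  backend:
    build: ./backend          # assumed build context
    ports:
      - "8000:8000"           # uvicorn default port (assumption)
    env_file:
      - .env                  # supplies GOOGLE_API_KEY
  frontend:
    build: ./frontend         # assumed build context
    ports:
      - "8501:8501"           # Streamlit UI port from this README
    depends_on:
      - backend
```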
| Feature | Status | Description |
|---|---|---|
| RAG Core | ✅ Completed | PDF Upload + Vector DB + LLM Answer. |
| Grammar Check | ✅ Completed | Separate tab for linguistic analysis. |
| Conversational Memory | ✅ Completed | Backend rephrases queries based on history. |
| Source Citations | ✅ Completed | UI displays page numbers for verification. |
| Docker Support | ✅ Completed | Dockerfile and docker-compose added. |
| Smart PDF Parsing | ✅ Completed | Integrated pdfplumber for layout-aware text extraction. |
| Multi-File Support | 🚧 Planned | Allow querying multiple PDFs simultaneously. |
| Table Extraction | 🚧 Planned | Extract data tables to CSV/Excel. |
| Cloud Deployment | 🚧 Planned | Deploy to Streamlit Cloud / Railway / AWS. |
Contributions are welcome! Please fork the repository and submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.