Anirvan-07/rag-doc-assistant

📄 RAG Document Assistant

A Retrieval-Augmented Generation (RAG) system that answers questions about documents by combining semantic search with LLMs.

🚀 Features
- 📚 Load and process PDF documents
- 🔍 Semantic search using embeddings
- 🧠 Context-aware answer generation
- 📝 Automatic summarization of documents

🛠 Tech Stack
- Python
- LangChain
- ChromaDB (Vector Database)
- HuggingFace Transformers
- Sentence Transformers

⚙️ How it Works
1. Load PDF documents
2. Split into chunks
3. Convert text into embeddings
4. Store in vector database
5. Retrieve relevant chunks
6. Generate answer using LLM
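The chunking, embedding, storage, and retrieval steps can be sketched in plain Python. This is a minimal illustration only, not the code in `rag_pipeline.ipynb`. The word-count `embed` function stands in for a Sentence Transformers model, and the hypothetical `InMemoryVectorStore` class stands in for ChromaDB. All function names, parameters, and the sample text are assumptions made for this example.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database such as ChromaDB."""

    def __init__(self):
        self._entries = []  # list of (chunk_text, vector) pairs

    def add(self, chunks):
        for chunk in chunks:
            self._entries.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k stored chunks most similar to the query."""
        query_vec = embed(query)
        scored = [
            (cosine_similarity(query_vec, vec), chunk)
            for chunk, vec in self._entries
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]


# Placeholder text for illustration; a real run would use text extracted from a PDF.
sample_text = (
    "Vector databases store embeddings for fast similarity search. "
    "Chunking keeps each passage small enough to fit in a prompt. "
    "Retrieval selects the passages most related to the question."
)
store = InMemoryVectorStore()
store.add(split_into_chunks(sample_text, chunk_size=80, overlap=20))
for hit in store.search("How are embeddings stored?", k=1):
    print(hit)
```

Word counts only match exact tokens, so this retriever misses paraphrases. Dense sentence embeddings are what make the search semantic, but the control flow stays the same.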

📂 Project Structure
rag-doc-assistant/
│
├── sample_docs/
│   └── rpa_blueprism.pdf
│
├── rag_pipeline.ipynb
├── README.md

💡 Example Queries
- What is RPA?
- Explain Blue Prism
- Summarize the document
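As a rough illustration of how a query like these might reach the LLM, the sketch below assembles retrieved chunks and a question into a single prompt. The `build_prompt` helper, the prompt wording, and the placeholder chunks are assumptions made for this example and are not taken from the notebook.

```python
def build_prompt(question, context_chunks):
    """Assemble retrieved chunks and the user's question into one prompt."""
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}"
        for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks; in the real pipeline these would come from the vector store.
retrieved = [
    "(first retrieved chunk would appear here)",
    "(second retrieved chunk would appear here)",
]
print(build_prompt("What is RPA?", retrieved))
```

Numbering the chunks makes it easier to ask the model to cite which passage supports its answer. A summarization query would likely need more of the document, or all of it, in the context rather than the top few chunks.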

📌 Future Improvements
- Add web interface (Streamlit)
- Support multiple documents
- Use advanced LLM APIs

👨‍💻 Author
Anirvan Mohapatra

About

RAG-based AI document assistant using LangChain and Transformers
