HimabinduPyata/ai-document-qa

📚 AI Document Intelligence (RAG System)

A production-style Retrieval-Augmented Generation (RAG) system that lets users upload PDF documents and ask questions about them. Answers are generated by an LLM and grounded in the passages retrieved from the uploaded document.

Built using FAISS vector search, OpenAI embeddings, and GPT-4o-mini, this project demonstrates how modern AI applications retrieve and reason over private documents.


🚀 Live Demo

👉 https://your-streamlit-app-link.streamlit.app


✨ Key Features

  • 📄 Upload and analyze PDF documents
  • 🧠 Semantic search using embeddings
  • ⚡ FAISS vector index for fast similarity search
  • 💬 Context-aware AI question answering
  • 📚 Multi-chunk retrieval (top-k context)
  • 🔍 Retrieval-Augmented Generation (RAG pipeline)
  • 📊 Clean Streamlit UI
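
The "top-k context" idea in the list above can be illustrated with a brute-force sketch. This is not the project's code: it uses plain Python lists in place of OpenAI embeddings and an exhaustive cosine-similarity scan in place of a FAISS index, so that it runs without an API key or extra packages. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return the indices of the k chunk vectors most similar to the query.

    Stand-in for a vector index lookup: every chunk is scored, which is
    fine for a handful of chunks but does not scale the way FAISS does.
    """
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 2-dimensional "embeddings" (assumed values, for illustration only).
chunks = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
print(top_k([1.0, 0.0], chunks, k=2))
```

The returned indices point back into the list of text chunks, and those chunks become the context passed to the LLM.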

🧠 Architecture

PDF Upload
   ↓
Text Extraction
   ↓
Chunking
   ↓
Embeddings (OpenAI)
   ↓
FAISS Vector Search
   ↓
Top-K Relevant Context
   ↓
LLM (GPT-4o-mini)
   ↓
Answer Generation
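
As a rough illustration of the Chunking and Answer Generation ends of this pipeline, the sketch below splits text into overlapping character windows and assembles retrieved chunks into a prompt. It is a minimal sketch under assumptions, not the repository's implementation: the function names `chunk_text` and `build_prompt`, the chunk size, the overlap, and the prompt wording are all hypothetical, and the embedding, FAISS, and GPT-4o-mini calls are left out.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character windows that overlap.

    Overlap keeps a sentence that straddles a boundary visible in both
    neighbouring chunks. The sizes here are assumed defaults.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the user question into one prompt."""
    context = "\n\n".join(
        f"[{n}] {chunk}" for n, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
print(build_prompt("What comes after c?", chunks[:2]))
```

In a full pipeline, each chunk would be embedded and indexed between these two steps, and the resulting prompt would be sent to the chat model.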

🛠️ Tech Stack

  • Python
  • Streamlit
  • OpenAI API
  • FAISS (vector similarity search library)
  • NumPy
  • PyPDF

🧠 What This Project Demonstrates

  • Retrieval-Augmented Generation (RAG)
  • Vector similarity search
  • Embedding-based AI systems
  • LLM orchestration
  • Real-world AI application design

🚀 Example Use Cases

  • Resume Q&A assistant
  • Legal document analysis
  • Study material assistant
  • Company knowledge base chatbot

🔥 Future Improvements

  • Page-level citations
  • Multi-document chat memory
  • Streaming responses
  • Authentication system
  • Cloud deployment (SaaS version)

🧠 What I Learned

  • How RAG systems work in production
  • Vector indexing and similarity search (FAISS)
  • Embedding-based semantic search
  • Designing LLM-powered applications beyond simple prompting
  • Turning AI models into real product workflows

⭐ Why This Project Matters

This project demonstrates how modern AI applications move beyond simple prompting into retrieval-based reasoning systems, similar to production tools like ChatGPT’s file upload, Notion AI, and enterprise knowledge assistants.

About

Production-style RAG system with FAISS vector search, OpenAI embeddings, and GPT-powered contextual question answering over PDF documents.
