This project implements a Medical Retrieval-Augmented Generation (RAG) system that uses Large Language Models (LLMs) to answer medical questions from a provided corpus of PDF documents. It retrieves relevant passages from a FAISS vector store and passes them to a Groq-hosted LLM, which generates answers grounded in that context. The system is designed with MLOps principles in mind, focusing on maintainability and scalability.
- PDF Document Ingestion: Load and process medical documents from a specified directory.
- Text Chunking: Efficiently split large documents into smaller, manageable text chunks.
- FAISS Vector Store: Create and manage a local FAISS vector store for semantic search and retrieval of relevant document chunks.
- Groq LLM Integration: Utilize Groq's fast inference capabilities with models like LLaMA-3.1 for answer generation.
- Custom Prompting: Employ a custom prompt template for focused and concise medical answers.
- Robust Error Handling: Incorporates custom exceptions and logging for better debugging and system stability.
- Streamlined Workflow: A clear process for data loading, vector store creation, and QA chain setup.
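The error-handling feature listed above can be pictured with a minimal sketch. It assumes nothing about the project's real code: the class name `CustomException`, the logger name, and the `risky_step` helper are hypothetical, and the actual app/common/custom_exception.py may be structured differently.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("medical_rag")  # hypothetical logger name


class CustomException(Exception):
    """Hypothetical exception that records where the original error occurred."""

    def __init__(self, message, cause=None):
        location = "unknown location"
        tb = cause.__traceback__ if cause is not None else None
        if tb is not None:
            # Walk to the innermost frame, where the error was actually raised.
            while tb.tb_next is not None:
                tb = tb.tb_next
            location = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}"
        self.detail = f"{message} | cause: {cause!r} | at: {location}"
        super().__init__(self.detail)


def risky_step(path):
    """Stand-in for a pipeline step such as PDF loading (hypothetical)."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError as exc:
        # Log first, then re-raise with file and line context attached.
        logger.error("Failed to read %s", path)
        raise CustomException(f"Could not load {path}", cause=exc) from exc
```

Wrapping low-level errors this way keeps the original cause attached (via `from exc`) while giving the log reader a single message that names the failing step and its location.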
The project follows a typical RAG pipeline to answer medical questions:
- Data Loading: PDF documents from the data/ directory are loaded.
- Text Chunking: The loaded documents are split into smaller, overlapping chunks to preserve context.
- Vector Store Creation/Loading: These text chunks are then embedded and stored in a FAISS vector database. If a vector store already exists, it's loaded instead.
- Query Processing: A user's question is received.
- Retrieval: The most relevant document chunks are retrieved from the vector store based on the user's query.
- Augmentation: The retrieved chunks are used as context to augment the user's question.
- Generation: The augmented query is fed into the LLM (Groq) to generate a precise answer.
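The retrieval and augmentation steps above can be illustrated with a small, dependency-free sketch. It deliberately swaps in a toy bag-of-words similarity for the real embedding model and FAISS index, and the prompt wording and function names are assumptions made for illustration, not the project's actual template or API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words term-count vector (not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical prompt template; the project's real wording is not shown in this README.
PROMPT_TEMPLATE = (
    "Answer the medical question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(question, chunks, k=2):
    """Augment the question with the retrieved chunks before sending it to an LLM."""
    context = "\n".join(retrieve(question, chunks, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


chunks = [
    "Insulin lowers blood glucose in patients with diabetes.",
    "Aspirin is commonly used to reduce fever and mild pain.",
    "Hypertension is persistently elevated blood pressure.",
]
prompt = build_prompt("What is hypertension?", chunks, k=1)
```

In the real pipeline the string in `prompt` would be sent to the Groq-hosted model; here it only shows how retrieved context and the question are combined.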
A diagram of this workflow is included in the repository as Workflow - Medical+RAG+Workflow.png.
- Python 3.8+
- LangChain: For building LLM applications, managing chains, and integrations.
- LangChain Community: Specific integrations like langchain_community.vectorstores.faiss and langchain_community.embeddings for HuggingFace.
- LangChain Groq: Integration for Groq LLMs.
- FAISS: For efficient similarity search and vector storage.
- Sentence Transformers: For creating embeddings.
- PyPDFLoader: For loading PDF documents.
- Flask: Web framework for the server.
- Dotenv: For managing environment variables.
- Git: Version control.
Follow these steps to set up and run the project locally.
- Python 3.8+ installed
- Git installed
First, clone the project repository to your local machine:
git clone https://github.com/Maimuzamilhu/MEDICAL-RAG-LLOMPS-.git
cd MEDICAL-RAG-LLOMPS-
It's highly recommended to use a virtual environment to manage dependencies:
python -m venv .venv
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate
Install the required Python packages:
pip install -r requirements.txt
Create a .env file in the root of your project directory and add your API keys:
- GROQ_API_KEY: Obtain this from the Groq Console.
- HF_TOKEN: Obtain this from HuggingFace Settings -> Access Tokens. This is crucial for downloading the embedding model.
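A quick way to catch a missing key before the application starts is sketched below. It assumes the .env file has already been loaded into the process environment (the project lists Dotenv for this); the function name `missing_keys` is hypothetical.

```python
import os

# The two variables named above.
REQUIRED_KEYS = ("GROQ_API_KEY", "HF_TOKEN")


def missing_keys(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("All required API keys are set.")
```

Failing early with the names of the missing variables is usually easier to debug than an authentication error raised later by the LLM or embedding client.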
You can modify other configurations like DB_FAISS_PATH, DATA_PATH, CHUNK_SIZE, and CHUNK_OVERLAP in app/config/config.py.
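To see how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal character-based splitter. The numeric values are assumed for illustration, and the project itself may rely on a LangChain text splitter with different boundary rules; this sketch only shows the sliding-window idea.

```python
CHUNK_SIZE = 500     # assumed value, not taken from the project's config
CHUNK_OVERLAP = 50   # assumed value, not taken from the project's config


def chunk_text(text, chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed and store.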
Place your medical PDF documents inside the data/ directory.
You must build the FAISS vector store before running the application. This process loads your PDFs, chunks them, creates embeddings, and saves the vector store.
python app/components/data_loader.py
You should see log messages indicating the progress of document loading, chunking, and vector store generation. This will create the vectorstore/db_faiss directory.
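Because the build step persists the index to disk, later runs can load it instead of rebuilding. The load-or-build pattern is sketched below with a JSON file standing in for the FAISS index; `build_index`, `load_or_build`, and the `index.json` file name are hypothetical and do not reflect the files FAISS actually writes.

```python
import json
import tempfile
from pathlib import Path


def build_index(chunks):
    """Stand-in for embedding the chunks and building a FAISS index."""
    return {"chunks": list(chunks)}


def load_or_build(index_dir, chunks):
    """Load a previously saved index if present; otherwise build and save one."""
    index_dir = Path(index_dir)
    index_file = index_dir / "index.json"  # hypothetical file name
    if index_file.exists():
        return json.loads(index_file.read_text(encoding="utf-8")), "loaded"
    index = build_index(chunks)
    index_dir.mkdir(parents=True, exist_ok=True)
    index_file.write_text(json.dumps(index), encoding="utf-8")
    return index, "built"


# Usage: the first call builds and saves, the second call loads from disk.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "vectorstore" / "db_faiss"
    _, first_status = load_or_build(target, ["chunk one", "chunk two"])
    _, second_status = load_or_build(target, ["chunk one", "chunk two"])
    print(first_status, second_status)
```

One consequence of this pattern is that adding new PDFs to data/ has no effect until the saved store is rebuilt.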
After setting up and building the vector store, you can run the Flask application:
python app/application.py
The application will typically run on http://127.0.0.1:5000. Open this URL in your web browser to interact with the Medical RAG system.
app/
├── common/
│ ├── custom_exception.py # Custom exception handling
│ └── logger.py # Logging configuration
├── components/
│ ├── data_loader.py # Loads data, chunks, and saves vector store
│ ├── embeddings.py # Handles embedding model loading
│ ├── llm.py # Loads and configures the LLM (Groq)
│ ├── pdf_loader.py # Loads and processes PDF files
│ ├── retriever.py # Sets up the RAG QA chain
│ └── vector_store.py # Manages FAISS vector store loading/saving
├── config/
│ └── config.py # Application configuration (API keys, paths)
├── templates/
│ └── index.html # HTML template for the web interface
└── application.py # Main Flask application entry point
data/ # Directory for input PDF documents
logs/ # Runtime logs (ignored by Git)
vectorstore/ # FAISS vector store (ignored by Git)
.gitignore # Specifies intentionally untracked files
requirements.txt # Project dependencies
setup.py # Project setup file
Workflow - Medical+RAG+Workflow.png # Project workflow diagram
LICENSE # Project license file
This project is licensed under the MIT License.
