Maimuzamilhu/MediBot

Medical RAG LLMOPS

Overview

This project implements a Medical Retrieval-Augmented Generation (RAG) system that uses a Large Language Model (LLM) to answer medical questions from a provided corpus of PDF documents. A FAISS vector store retrieves the document chunks most relevant to each question, and a Groq-hosted LLM generates an answer from those chunks. The system is designed with MLOps principles in mind, focusing on maintainability and scalability.

Features

  • PDF Document Ingestion: Load and process medical documents from a specified directory.
  • Text Chunking: Split large documents into smaller, overlapping text chunks.
  • FAISS Vector Store: Create and manage a local FAISS vector store for semantic search and retrieval of relevant document chunks.
  • Groq LLM Integration: Use Groq's fast inference with models such as LLaMA-3.1 for answer generation.
  • Custom Prompting: Use a custom prompt template that keeps medical answers focused and concise.
  • Robust Error Handling: Custom exceptions and logging for easier debugging and greater stability.
  • Streamlined Workflow: A clear process for data loading, vector store creation, and QA chain setup.
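
The error-handling feature listed above could look roughly like the following sketch. It is an illustration only: the class name CustomException and the helpers risky_load and safe_load are hypothetical, and the project's actual custom_exception.py and logger.py may differ. The idea is to wrap a low-level error with the file and line where it occurred, then log it in one place.

```python
import logging
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("medical_rag")  # hypothetical logger name


class CustomException(Exception):
    """Hypothetical exception that records where the original error occurred."""

    def __init__(self, message, original=None):
        # Use the traceback of the exception currently being handled, if any.
        _, _, tb = sys.exc_info()
        self.file_name = tb.tb_frame.f_code.co_filename if tb else "unknown"
        self.line_number = tb.tb_lineno if tb else -1
        self.original = original
        super().__init__(
            f"{message} | error: {original} | file: {self.file_name} | line: {self.line_number}"
        )


def risky_load(path):
    """Hypothetical loader: read a file and wrap any OS error."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError as exc:
        raise CustomException(f"Failed to load {path}", exc) from exc


def safe_load(path):
    """Log the wrapped error and return None instead of crashing."""
    try:
        return risky_load(path)
    except CustomException as exc:
        logger.error(str(exc))
        return None
```

Calling safe_load on a missing file logs a single line containing the message, the underlying OSError, and the source location, which is usually enough to find the failing step.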

Workflow

The project follows a typical RAG pipeline to answer medical questions:

  1. Data Loading: PDF documents from the data/ directory are loaded.
  2. Text Chunking: The loaded documents are split into smaller, overlapping chunks to preserve context.
  3. Vector Store Creation/Loading: These text chunks are then embedded and stored in a FAISS vector database. If a vector store already exists, it's loaded instead.
  4. Query Processing: A user's question is received.
  5. Retrieval: The most relevant document chunks are retrieved from the vector store based on the user's query.
  6. Augmentation: The retrieved chunks are used as context to augment the user's question.
  7. Generation: The augmented query is sent to the LLM (Groq), which generates an answer based on the retrieved context.
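
The retrieval and augmentation steps above can be sketched with the standard library alone. This is a toy stand-in, not the project's code: word counts replace the real embedding model, a sorted list replaces the FAISS index, and the prompt wording, function names, and sample sentences are all made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Augment the question with retrieved context (hypothetical template)."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer the medical question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up sample chunks standing in for text extracted from PDFs.
chunks = [
    "Hypertension is persistently elevated blood pressure in the arteries.",
    "Insulin is a hormone that regulates blood glucose levels.",
    "A fracture is a break in the continuity of a bone.",
]
question = "What hormone regulates blood glucose?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In the real pipeline the prompt would then be passed to the Groq-hosted model. The sketch stops at prompt construction so that it runs without network access or API keys.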

Here's a visual representation of the workflow:

[Figure: Medical RAG Workflow]

Technologies Used

  • Python 3.8+
  • LangChain: For building LLM applications, managing chains, and integrations.
  • LangChain Community: Integrations such as langchain_community.vectorstores.faiss (FAISS) and langchain_community.embeddings (HuggingFace embeddings).
  • LangChain Groq: Integration for Groq LLMs.
  • FAISS: For efficient similarity search and vector storage.
  • Sentence Transformers: For creating embeddings.
  • PyPDFLoader: For loading PDF documents.
  • Flask: Web framework for the server.
  • Dotenv: For managing environment variables.
  • Git: Version control.

Setup and Installation

Follow these steps to set up and run the project locally.

Prerequisites

  • Python 3.8+ installed
  • Git installed

1. Clone the Repository

First, clone the project repository to your local machine:

git clone https://github.com/Maimuzamilhu/MEDICAL-RAG-LLOMPS-.git
cd MEDICAL-RAG-LLOMPS-

2. Create a Virtual Environment (Recommended)

Use a virtual environment to keep the project's dependencies isolated:

python -m venv .venv
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate

3. Install Dependencies

Install the required Python packages:

pip install -r requirements.txt

4. API Keys and Configuration

Create a .env file in the root of your project directory and add your API keys, such as the key for the Groq API.
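
The exact variable names are not listed here. As a sketch, assuming the Groq key is stored as GROQ_API_KEY (the name the langchain-groq integration reads by default; the project's config.py may expect something different), a startup check like the one below can fail early with a clear message. The function names are hypothetical.

```python
import os

# Assumed variable names; the project's config.py may expect different ones.
# A matching .env line would look like: GROQ_API_KEY=your_key_here
REQUIRED_ENV_VARS = ("GROQ_API_KEY",)


def missing_env_vars(environ=os.environ, required=REQUIRED_ENV_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name, "").strip()]


def check_environment(environ=os.environ):
    """Raise a clear error at startup instead of failing on the first LLM call."""
    missing = missing_env_vars(environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

With python-dotenv installed, calling load_dotenv() before check_environment() would populate os.environ from the .env file first.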

Other settings, such as DB_FAISS_PATH, DATA_PATH, CHUNK_SIZE, and CHUNK_OVERLAP, can be changed in app/config/config.py.
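
To see how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal character-window sketch. It is not the project's splitter (a real text splitter usually prefers paragraph or sentence boundaries), chunk_text is a hypothetical helper, and the values are deliberately tiny so the overlap is visible.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Illustrative values only; the project's defaults live in app/config/config.py.
CHUNK_SIZE = 10
CHUNK_OVERLAP = 3

pieces = chunk_text("abcdefghijklmnopqrstuvwxyz", CHUNK_SIZE, CHUNK_OVERLAP)
print(pieces)
```

Each window starts CHUNK_SIZE - CHUNK_OVERLAP characters after the previous one, so the last CHUNK_OVERLAP characters of one chunk reappear at the start of the next. A larger overlap preserves more context across chunk boundaries at the cost of storing more chunks.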

5. Prepare Your Data

Place your medical PDF documents inside the data/ directory.

6. Build the Vector Store

You must build the FAISS vector store before running the application. This process loads your PDFs, chunks them, creates embeddings, and saves the vector store.

python app/components/data_loader.py

You should see log messages indicating the progress of document loading, chunking, and vector store generation. This will create the vectorstore/db_faiss directory.
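
Step 3 of the workflow says an existing vector store is loaded rather than rebuilt. That load-or-build pattern can be sketched as follows, with a JSON file standing in for the FAISS index. The names load_or_build and build_toy_store are hypothetical, and the real builder would load PDFs, chunk them, and embed the chunks.

```python
import json
import tempfile
from pathlib import Path


def load_or_build(store_path, build_fn):
    """Load a saved store if present; otherwise build it, save it, and return it."""
    store_path = Path(store_path)
    if store_path.exists():
        return json.loads(store_path.read_text(encoding="utf-8")), "loaded"
    store = build_fn()
    store_path.parent.mkdir(parents=True, exist_ok=True)
    store_path.write_text(json.dumps(store), encoding="utf-8")
    return store, "built"


def build_toy_store():
    """Hypothetical builder returning a tiny stand-in for an index."""
    return {"chunks": ["chunk one", "chunk two"]}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "vectorstore" / "db.json"
    first = load_or_build(path, build_toy_store)   # nothing on disk yet: builds
    second = load_or_build(path, build_toy_store)  # file now exists: loads
```

One consequence of this pattern is that adding new PDFs has no effect until the saved store is deleted or rebuilt.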

Running the Application

After setting up and building the vector store, you can run the Flask application:

python app/application.py

The application will typically run on http://127.0.0.1:5000. Open this URL in your web browser to interact with the Medical RAG system.

Project Structure

app/
├── common/
│   ├── custom_exception.py     # Custom exception handling
│   └── logger.py               # Logging configuration
├── components/
│   ├── data_loader.py          # Loads data, chunks, and saves vector store
│   ├── embeddings.py           # Handles embedding model loading
│   ├── llm.py                  # Loads and configures the LLM (Groq)
│   ├── pdf_loader.py           # Loads and processes PDF files
│   ├── retriever.py            # Sets up the RAG QA chain
│   └── vector_store.py         # Manages FAISS vector store loading/saving
├── config/
│   └── config.py               # Application configuration (API keys, paths)
├── templates/
│   └── index.html              # HTML template for the web interface
└── application.py              # Main Flask application entry point
data/                            # Directory for input PDF documents
logs/                            # Runtime logs (ignored by Git)
vectorstore/                     # FAISS vector store (ignored by Git)
.gitignore                       # Specifies intentionally untracked files
requirements.txt                 # Project dependencies
setup.py                         # Project setup file
Workflow - Medical+RAG+Workflow.png # Project workflow diagram
LICENSE                          # Project license file

This project is licensed under the MIT License.

About

A Medical Retrieval-Augmented Generation (RAG) system using LLMs to answer medical questions from PDFs. Features include FAISS vector store, Groq's LLaMA-3.1, and a Flask web interface, all managed with Docker and Jenkins for MLOps
