💊 MedSimplify: AI-Powered Medication Instructions Simplifier

Python LangGraph OpenAI Gradio License: MIT Live Demo

MedSimplify transforms complex pharmaceutical drug information into clear, patient-friendly medication instructions using an agentic AI pipeline with hybrid retrieval-augmented generation (RAG), multi-layer guardrails, and automated evaluation.


🎯 Problem

Prescription labels in the United States are written at a 9th-grade reading level on average, far above the 6th-grade literacy of most adults. This gap leads to:

  • 50% of patients misunderstanding their medication instructions
  • $500B+ in annual preventable healthcare costs from non-adherence
  • Disproportionate harm to elderly, immigrant, and low-literacy populations

MedSimplify addresses this gap by combining generative AI with RAG to turn a drug name into structured, plain-language patient instructions grounded in retrieved drug records.


🏗️ System Architecture

User Query
    │
    ▼
┌─────────────┐     ┌──────────────┐     ┌───────────────┐     ┌──────────────┐     ┌──────────┐
│ Input Guard │────▶│ Hybrid RAG   │────▶│   Generate    │────▶│ Output Guard │────▶│ Evaluate │
│  (Node 1)   │     │  (Node 2)    │     │  GPT-4o-mini  │     │  (Node 4)    │     │ (Node 5) │
│             │     │              │     │   (Node 3)    │     │              │     │          │
│ • Injection │     │ • FAISS Dense│     │ • Clinical    │     │ • Hallucin.  │     │ • ROUGE  │
│   detection │     │ • BM25 Sparse│     │   pharmacist  │     │   detection  │     │ • FK     │
│ • Sanitize  │     │ • Cross-Enc. │     │   prompt      │     │ • Disclaimer │     │   Grade  │
│             │     │   Reranking  │     │ • 6 sections  │     │   injection  │     │ • Complet│
└─────────────┘     └──────────────┘     └───────────────┘     └──────────────┘     └──────────┘
                           │
                    ┌──────┴──────┐
                    │  248K Drug  │
                    │  Knowledge  │
                    │    Base     │
                    └─────────────┘
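
The five-node flow above can be sketched in plain Python as a chain of functions that each read and update a shared state dictionary. This is only an illustration of the control flow, not the notebook's code: the project wires these stages together with LangGraph's StateGraph, and every function name, state key, and stub body below is a hypothetical stand-in.

```python
from typing import Callable, Dict, List

State = Dict[str, object]

def input_guard(state: State) -> State:
    # Node 1: reject empty queries; the real node also checks for injection.
    query = str(state["query"]).strip()
    return {**state, "query": query, "blocked": not query}

def hybrid_rag(state: State) -> State:
    # Node 2: stub retrieval; a real node would query FAISS and BM25.
    return {**state, "context": ["(retrieved drug record placeholder)"]}

def generate(state: State) -> State:
    # Node 3: stub generation; a real node would call the language model.
    return {**state, "answer": f"Plain-language notes for: {state['query']}"}

def output_guard(state: State) -> State:
    # Node 4: append a mandatory disclaimer to whatever was generated.
    disclaimer = "Medical Disclaimer: For educational purposes only."
    return {**state, "answer": f"{state['answer']}\n\n{disclaimer}"}

def evaluate(state: State) -> State:
    # Node 5: stub metric; a real node would compute ROUGE and readability.
    return {**state, "metrics": {"answer_words": len(str(state["answer"]).split())}}

PIPELINE: List[Callable[[State], State]] = [
    input_guard, hybrid_rag, generate, output_guard, evaluate,
]

def run_pipeline(query: str) -> State:
    state: State = {"query": query}
    for node in PIPELINE:
        state = node(state)
        if state.get("blocked"):
            # Stop early when the input guard rejects the query.
            break
    return state
```

Keeping each stage a pure function of the state makes it straightforward to test nodes in isolation before connecting them in a graph framework.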

Key Technologies

| Component | Technology |
|---|---|
| Agentic Orchestration | LangGraph (StateGraph) |
| Dense Retrieval | FAISS + Sentence-BERT (all-MiniLM-L6-v2) |
| Sparse Retrieval | BM25 Okapi |
| Reranking | Cross-Encoder (ms-marco-MiniLM-L-6-v2) |
| Language Model | GPT-4o-mini via LangChain |
| Safety | Custom GuardrailsEngine (4 layers) |
| Evaluation | ROUGE-1/2/L, Flesch-Kincaid, Faithfulness |
| UI | Gradio Blocks |
| TTS | gTTS |
| Translation | Google Translate API |
| PDF Export | ReportLab |
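
To make the sparse-retrieval entry concrete, here is a minimal pure-Python sketch of BM25 scoring. It uses the textbook formula with a smoothed, non-negative IDF, which is an assumption on my part and may differ in detail from what the rank-bm25 package computes; the toy documents and the function name are illustrative only.

```python
import math
from collections import Counter
from typing import List

def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each tokenized document against the tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (not taken from the project dataset).
docs = [
    "amoxicillin clavulanate antibiotic bacterial infection".split(),
    "paracetamol pain fever relief".split(),
    "ibuprofen pain inflammation".split(),
]
scores = bm25_scores("antibiotic infection".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Only the first toy document contains either query term, so it receives the only non-zero score.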

✨ Features

  • 🤖 5-Node LangGraph Pipeline: input guard → hybrid RAG → generate → output guard → evaluate
  • 🔍 Hybrid RAG: FAISS dense + BM25 sparse + cross-encoder reranking for maximum precision
  • 🛡️ 4-Layer Guardrails: prompt injection detection, faithfulness scoring, dosage override blocking, mandatory disclaimers
  • 📊 Automated Evaluation: ROUGE scores, Flesch-Kincaid Grade, section completeness, faithfulness
  • 🌍 10 Languages: EN, ES, HI, FR, PT, AR, ZH, DE, JA, TL
  • 🔊 Text-to-Speech: voice responses in native language via gTTS
  • 📄 PDF Export: downloadable medication summary reports
  • 💬 Multi-turn Memory: conversation history preserved across turns
  • ⚡ Streaming Output: word-by-word response streaming
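
One common way to combine a dense ranking and a sparse ranking before reranking is reciprocal rank fusion (RRF). The README does not say how MedSimplify merges its FAISS and BM25 results, so the sketch below is an assumed approach rather than the project's method; the document IDs and the `score_fn` reranker are hypothetical placeholders for real search hits and a cross-encoder.

```python
from typing import Callable, Dict, List, Sequence

def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranked list."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list receive a larger share.
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)

def rerank(candidates: List[str], score_fn: Callable[[str], float],
           top_n: int = 3) -> List[str]:
    """Re-order fused candidates with a relevance scorer and keep the top_n."""
    return sorted(candidates, key=score_fn, reverse=True)[:top_n]

dense_hits = ["doc_a", "doc_b", "doc_c"]    # e.g. from a FAISS search
sparse_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. from a BM25 search
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

RRF only needs ranks, not raw scores, which sidesteps the problem that cosine similarities and BM25 scores live on different scales.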

📊 Evaluation Results

| Metric | Score | Target | Status |
|---|---|---|---|
| Flesch-Kincaid Grade Level | 7.8 | ≤ 8.0 | ✅ Pass |
| Flesch Reading Ease | 58.6 | ≥ 55.0 | ✅ Pass |
| ROUGE-1 F1 | 0.41 | n/a | n/a |
| ROUGE-L F1 | 0.38 | n/a | n/a |
| Section Completeness | 96% | ≥ 80% | ✅ Pass |
| Faithfulness Score | 0.21 | > 0.05 | ✅ Pass |
| Guardrail Pass Rate | 100% | 100% | ✅ Pass |

Evaluated on 20 representative medication queries across multiple drug classes.
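
For readers unfamiliar with the readability and overlap metrics in the table, the sketch below computes them from their standard definitions. It is a simplified stand-in for the textstat and rouge-score packages listed in the install step: the syllable counter is a rough vowel-group heuristic and the ROUGE-1 function does no stemming, so its numbers will not match those libraries exactly.

```python
import re
from collections import Counter

def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels (at least one)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> dict:
    """Return Flesch-Kincaid Grade Level and Flesch Reading Ease."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # words per sentence
    spw = syllables / len(words)        # syllables per word
    return {
        "fk_grade": 0.39 * wps + 11.8 * spw - 15.59,
        "reading_ease": 206.835 - 1.015 * wps - 84.6 * spw,
    }

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference text."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Short sentences built from short words push the grade level down and the reading ease up, which is the direction the targets in the table reward.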


🚀 Quick Start

1. Clone the Repository

git clone https://github.com/ShanmukhaSaiPrakashJeelakarra/MedSimplify.git
cd MedSimplify

2. Install Dependencies

pip install pandas numpy faiss-cpu sentence-transformers rank-bm25
pip install langchain langchain-openai langgraph openai langchain-community
pip install gradio gtts deep-translator langdetect
pip install reportlab rouge-score textstat langchain-text-splitters nltk

3. Set Your OpenAI API Key

import os
# Replace the placeholder with your own key; avoid committing a real key to version control.
os.environ["OPENAI_API_KEY"] = "your-api-key-here"

4. Upload the Dataset

Download the Medicine Dataset from Kaggle and upload it to your Google Colab session or place it in the project root.

5. Run the Notebook

Open MedSimplify.ipynb in Google Colab and run all cells. The Gradio interface will launch with a public shareable link.


📁 Project Structure

MedSimplify/
├── MedSimplify.ipynb              # Main notebook (run in Google Colab)
├── README.md                      # This file
├── requirements.txt               # All dependencies
├── data/
│   └── medicine_dataset.csv       # Kaggle medicine dataset (download separately)
└── outputs/
    └── medication_summary.pdf     # Sample exported PDF report

🗃️ Dataset

| Property | Value |
|---|---|
| Source | Kaggle – 11000 Medicine Details |
| Total Records | 248,218 |
| Columns | 58 |
| Side Effect Columns | 42 |
| Use Case Columns | 5 |
| Indexing Sample | 15,000 (configurable) |
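
The configurable indexing sample can be illustrated with a small loader that draws a reproducible subset of rows. This is a sketch under assumptions: the notebook most likely uses pandas, and the README does not state whether it samples randomly or takes the first rows, so the function name, the fixed seed, and the random sampling are all hypothetical choices.

```python
import csv
import random
from typing import Dict, List

def sample_records(csv_path: str, sample_size: int = 15000,
                   seed: int = 42) -> List[Dict[str, str]]:
    """Load a CSV and return a reproducible random sample of its rows."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    if len(rows) <= sample_size:
        # Small files are returned whole rather than sampled.
        return rows
    # A fixed seed keeps the indexed subset stable between runs.
    return random.Random(seed).sample(rows, sample_size)
```

Raising `sample_size` trades longer embedding and indexing time for broader drug coverage.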

🧪 Sample Output

Query: "What is augmentin 625 duo tablet used for?"

💊 MEDICATION OVERVIEW
Augmentin 625 is an antibiotic used to treat bacterial infections.
It helps your body fight off germs that can make you sick.

📋 HOW TO USE THIS MEDICATION
Take as prescribed by your doctor, usually every 8–12 hours with food.
Always complete the full course even if you feel better.

⚠️ POSSIBLE SIDE EFFECTS
• Nausea  • Vomiting  • Diarrhea  • Skin rash

🚫 IMPORTANT WARNINGS
Do not take if allergic to penicillin. Inform your doctor of all
current medications before starting this treatment.

🔄 ALTERNATIVE OPTIONS
Moxikind-CV 625, Clavam 625, Augpen 625mg Tablet

🧠 SIMPLE EXPLANATION
Augmentin 625 combines amoxicillin and clavulanate to fight bacterial
infections. It works by stopping bacteria from growing. Take it with
food to reduce stomach upset and always finish the full course your
doctor prescribed, even if you start feeling better sooner.

⚕️ Medical Disclaimer: For educational purposes only. Always consult
a licensed healthcare provider before making any medication decisions.

🛡️ Guardrails System

Layer 1 – Input Validation
  └── Length & format checks
  └── Regex prompt injection detection (6 patterns)
  └── High-risk keyword flagging

Layer 2 – Context Grounding
  └── RAG forces responses grounded in retrieved drug records
  └── Low-overlap hallucination flagging

Layer 3 – Output Safety
  └── Dosage override pattern detection
  └── Context faithfulness scoring (threshold > 0.05)

Layer 4 – Disclaimer Injection
  └── Mandatory medical disclaimer on EVERY response
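
The layers above could look roughly like the following in code. This is a minimal sketch, not the project's GuardrailsEngine: the README does not list its six injection patterns, its dosage-override patterns, or its exact faithfulness formula, so the regexes, the 500-character limit, and the token-overlap definition of faithfulness are all assumptions. Only the 0.05 threshold and the disclaimer wording come from this document.

```python
import re
from typing import List, Tuple

# Hypothetical patterns; the project's own regexes are not shown in this README.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"you are now\b", re.I),
    re.compile(r"system prompt", re.I),
]
DOSAGE_OVERRIDE = re.compile(r"\b(double|triple|skip) (your|the) dose\b", re.I)
DISCLAIMER = ("Medical Disclaimer: For educational purposes only. Always consult "
              "a licensed healthcare provider before making any medication decisions.")

def check_input(query: str, max_len: int = 500) -> Tuple[bool, str]:
    """Layer 1: length check plus regex-based injection detection."""
    if not query.strip() or len(query) > max_len:
        return False, "invalid length"
    if any(p.search(query) for p in INJECTION_PATTERNS):
        return False, "possible prompt injection"
    return True, "ok"

def faithfulness(answer: str, context: str) -> float:
    """Assumed metric: fraction of answer tokens that also appear in the context."""
    answer_tokens = re.findall(r"[a-z0-9]+", answer.lower())
    context_tokens = set(re.findall(r"[a-z0-9]+", context.lower()))
    if not answer_tokens:
        return 0.0
    return sum(t in context_tokens for t in answer_tokens) / len(answer_tokens)

def check_output(answer: str, context: str,
                 threshold: float = 0.05) -> Tuple[str, List[str]]:
    """Layers 3 and 4: flag risky output, then always append the disclaimer."""
    flags = []
    if DOSAGE_OVERRIDE.search(answer):
        flags.append("dosage override")
    # The score must exceed the threshold to pass, so equality is flagged too.
    if faithfulness(answer, context) <= threshold:
        flags.append("low context overlap")
    return f"{answer}\n\n{DISCLAIMER}", flags
```

Returning flags instead of raising lets the caller decide whether to block the response, regenerate it, or show it with a warning.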

📚 Research Paper

This project is accompanied by a research paper submitted to the International Journal of Artificial Intelligence in Healthcare (IJAIH), Inderscience Publishers:

Jeelakarra, S.S.P. (2026). MedSimplify: An Agentic AI System for Automated Generation of Patient-Friendly Medication Instructions Using Hybrid Retrieval-Augmented Generation and Multi-Layer Guardrails. Int. J. Artificial Intelligence in Healthcare.


🔮 Future Work

  • Expand knowledge base to full DrugBank + FDA drug labels
  • Implement BERTScore and FactScore for rigorous faithfulness evaluation
  • Human evaluation study measuring patient comprehension improvement
  • Fine-tuned open-source clinical LLM (BioMistral, Llama-3-Med)
  • Drug-drug interaction warnings in guardrails
  • Clinical pilot study with patient volunteers

📖 Key References

  • Lewis et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020.
  • Karpukhin et al. (2020). Dense Passage Retrieval for Open-Domain Question Answering. EMNLP 2020.
  • Nogueira & Cho (2019). Passage Re-ranking with BERT. arXiv:1901.04085.
  • Wolf et al. (2010). Improving prescription drug labels. JAMA Internal Medicine.
  • Robertson & Zaragoza (2009). The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval.

⚠️ Disclaimer

MedSimplify is an educational and research prototype. It is not a medical device and should not be used for clinical decision-making. All generated content is for informational purposes only. Always consult a licensed healthcare provider before making any medication decisions.


📄 License

MIT License. See LICENSE for details.


👤 Author

Shanmukha Sai Prakash Jeelakarra
Department of Health Informatics, School of Healthcare Professions
Rutgers University
📧 sj1398@shp.rutgers.edu
🔗 LinkedIn 🔗 LinkedIn Post


Built as the final project for BINF5550 – Generative AI for Healthcare, Rutgers University, April 2026.
