DrugReasoner: Interpretable Drug Approval Prediction with a Reasoning-augmented Language Model

DrugReasoner is an AI-powered system for predicting drug approval outcomes using reasoning-augmented Large Language Models (LLMs) and molecular feature analysis. By combining advanced machine learning with interpretable reasoning, DrugReasoner provides transparent predictions that can accelerate pharmaceutical research and development.

✨ Key Features

🤖 LLM-Powered Predictions: Uses a fine-tuned Llama-3.1-8B-Instruct model for drug approval prediction
🧬 Molecular Analysis: Advanced SMILES-based molecular structure analysis
🔍 Interpretable Results: Clear reasoning behind predictions for better decision-making
📊 Similarity Analysis: Identifies similar approved/non-approved compounds for context
⚡ Flexible Inference: Support for both single molecule and batch predictions

🛠️ Installation

To use DrugReasoner, you must first request access to the base model Llama-3.1-8B-Instruct on Hugging Face by providing your contact information. Once access is granted, you can run DrugReasoner either through the command-line interface (CLI) or integrate it directly into your Python workflows.

Prerequisites

Python 3.8 or higher
CUDA-compatible GPU (recommended for training and inference)
Git
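
The prerequisites above can be checked before setup with a short script. This is a minimal sketch, not part of the repository: `check_prerequisites` is a hypothetical helper, and the CUDA check assumes the project depends on PyTorch, which is only importable after the dependencies are installed.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def check_prerequisites(version_info=sys.version_info):
    """Return a dict mapping each prerequisite to True, False, or None (unknown)."""
    report = {
        "python": tuple(version_info[:2]) >= MIN_PYTHON,
        "git": shutil.which("git") is not None,
    }
    try:
        # Assumption: PyTorch is among the installed dependencies.
        import torch
        report["cuda"] = torch.cuda.is_available()
    except ImportError:
        report["cuda"] = None  # cannot tell until PyTorch is installed
    return report


if __name__ == "__main__":
    for name, status in check_prerequisites().items():
        print(f"{name}: {status}")
```

A `None` for `cuda` only means the check could not run yet; rerun the script after installing the dependencies.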

Setup Instructions

Clone the repository

```
git clone https://github.com/mohammad-gh009/DrugReasoner.git
cd DrugReasoner
```

Create and activate virtual environment

Windows:

```
cd src
python -m venv myenv
myenv\Scripts\activate
```

Mac/Linux:

```
cd src
python -m venv myenv
source myenv/bin/activate
```

Install dependencies
```
pip install -r requirements.txt
```
Log in to your Hugging Face account. You need a Hugging Face account and an access token; replace YOUR_TOKEN_HERE with your token.
```
huggingface-cli login --token YOUR_TOKEN_HERE
```
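
If you prefer not to type the token on the command line, the login can also be done from Python. The sketch below is an assumption-laden alternative, not the repository's own code: `get_hf_token` and `login_from_env` are hypothetical helpers, and the token is assumed to be stored in the `HF_TOKEN` environment variable. The `login` function comes from the `huggingface_hub` package.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Read the access token from an environment variable, or raise if unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token first.")
    return token


def login_from_env():
    """Log in using the token from the environment (requires huggingface_hub)."""
    # Imported lazily so get_hf_token works even before the package is installed.
    from huggingface_hub import login
    login(token=get_hf_token())
```

Call `login_from_env()` once in the activated environment before running inference.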

🚀 How to use

Note: A GPU is required for inference. If one is unavailable, use our Kaggle Notebook.

CLI Inference

```
python inference.py \
    --smiles "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O" "CC1=CC=C(C=C1)C(=O)O" \
    --output results.csv \
    --top-k 9 \
    --top-p 0.9 \
    --max-length 4096 \
    --temperature 1.0
```
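
The same command can be assembled from Python, which avoids shell quoting problems with SMILES characters such as parentheses and `=`. This is a minimal sketch: `build_inference_command` and `run_inference` are hypothetical helpers, and they use only the flags shown in the command above, with the same values as defaults.

```python
import subprocess
import sys


def build_inference_command(smiles, output="results.csv", top_k=9,
                            top_p=0.9, max_length=4096, temperature=1.0):
    """Build the argument list for inference.py from the flags shown above."""
    if not smiles:
        raise ValueError("at least one SMILES string is required")
    return [
        sys.executable, "inference.py",
        "--smiles", *smiles,
        "--output", output,
        "--top-k", str(top_k),
        "--top-p", str(top_p),
        "--max-length", str(max_length),
        "--temperature", str(temperature),
    ]


def run_inference(smiles, **options):
    """Run inference.py in the current directory and raise if it fails."""
    subprocess.run(build_inference_command(smiles, **options), check=True)
```

Run it from the `src` directory, for example `run_inference(["CC1=CC=C(C=C1)C(=O)O"], output="results.csv")`. Because the arguments are passed as a list, no backslash line continuations are needed, which also helps on Windows shells that do not support them.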

Python API Usage

```
from inference import DrugReasoner

predictor = DrugReasoner()

results = predictor.predict_molecules(
    smiles_list=["CC(C)CC1=CC=C(C=C1)C(C)C(=O)O"],
    save_path="results.csv",
    print_results=True,
    top_k=9,
    top_p=0.9,
    max_length=4096,
    temperature=1.0
)
```
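
For batch predictions, `smiles_list` can be filled from a text file. The helper below is a sketch under stated assumptions: `load_smiles` is hypothetical, and the input file (here called `molecules.txt`) is assumed to hold one SMILES string per line.

```python
def load_smiles(path):
    """Read one SMILES string per line, skipping blank lines and duplicates."""
    seen = set()
    smiles = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            entry = line.strip()
            if entry and entry not in seen:
                seen.add(entry)
                smiles.append(entry)  # keep first-seen order
    return smiles


# Hypothetical usage, run from the src directory:
# from inference import DrugReasoner
# predictor = DrugReasoner()
# results = predictor.predict_molecules(
#     smiles_list=load_smiles("molecules.txt"),
#     save_path="results.csv",
# )
```

Removing duplicates before inference avoids spending GPU time on the same molecule twice.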

📊 Dataset & Model

Dataset:
Model:

📈 Performance

DrugReasoner outperforms traditional baseline models across multiple evaluation metrics. Detailed performance comparisons are available in our paper.

📝 Citation

If you use DrugReasoner in your research, please cite our work:

```
@misc{ghaffarzadehesfahani2025drugreasonerinterpretabledrugapproval,
      title={DrugReasoner: Interpretable Drug Approval Prediction with a Reasoning-augmented Language Model},
      author={Mohammadreza Ghaffarzadeh-Esfahani and Ali Motahharynia* and Nahid Yousefian and Navid Mazrouei and Jafar Ghaisari and Yousof Gheisari},
      year={2025},
      eprint={2508.18579},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2508.18579},
}
```

📜 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Accelerating drug discovery through AI-powered predictions
