CVML-CFU/MMAF

MMAF: Multimodal Attention Fusion for Molecular Toxicity Prediction

This repository contains the code for the paper "MMAF: Multimodal Attention Fusion for Molecular Toxicity Prediction", accepted at ICPR'26. The code follows a standard research-release layout: a single code/ directory with small entry-point scripts, configuration files, a setup check, and separate output folders.

Repository layout

code/
  train.py              # main entry point for all datasets
  train_tox21.py        # Tox21 entry point
  train_bace.py         # BACE entry point
  train_bbbp.py         # BBBP entry point
  train_hiv.py          # HIV entry point
  train_clintox.py      # ClinTox entry point
  train_sider.py        # SIDER entry point
  run_all.py            # run all non-Tox21 datasets
  check_setup.py        # verify dataset files
  dataset.py            # dataset metadata
  graph.py              # graph/fingerprint metadata
  model.py              # model-family metadata
  metrics.py            # reported metric names
  utility_functions.py  # command helpers
  pipelines/            # experiment implementations
config/
  tox21.yaml
  molnet_datasets.yaml
data/
  README.md
results/
output/
script/
  train.sh
  run_all.sh
  check_data.sh

Installation

pip install -r requirements.txt

On CUDA machines, if needed, first install the PyTorch and PyTorch Geometric builds that match the target CUDA version, then install the remaining requirements.

Data

Create data/ if it does not exist and place the dataset CSV files in it, using lowercase file names:

data/tox21.csv
data/bace.csv
data/bbbp.csv
data/hiv.csv
data/clintox.csv
data/sider.csv

Check setup:

python code/check_setup.py
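
The kind of check this step performs can be sketched as follows. This is a minimal illustration, not the contents of code/check_setup.py: the function name `missing_datasets` is hypothetical, and the only thing taken from this README is the list of expected file names.

```python
from pathlib import Path

# File names listed in the Data section of this README.
EXPECTED_FILES = [
    "tox21.csv", "bace.csv", "bbbp.csv", "hiv.csv", "clintox.csv", "sider.csv",
]


def missing_datasets(data_dir="data"):
    """Return the expected CSV files that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("All dataset files found.")
```

An empty list means every expected CSV is in place.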

Training

Tox21:

python code/train_tox21.py --config config/tox21.yaml

Other datasets:

python code/train_bbbp.py --split random
python code/train_bbbp.py --split scaffold
python code/train_bace.py --split random
python code/train_bace.py --split scaffold
python code/train_hiv.py --split scaffold
python code/train_hiv.py --split random
python code/pipelines/multi_random.py --dataset clintox
python code/pipelines/multi_scaffold.py --dataset clintox
python code/pipelines/multi_random.py --dataset sider
python code/pipelines/multi_scaffold.py --dataset sider

Single unified command:

python code/train.py --dataset bbbp --split random
python code/train.py --dataset hiv --split scaffold

Run all non-Tox21 experiments:

python code/run_all.py
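
As a rough picture of what a batch run covers, the sketch below builds one unified command per dataset and split type. It is an illustration under stated assumptions, not the logic of code/run_all.py: it assumes every non-Tox21 dataset can be run with both the `random` and `scaffold` splits through code/train.py, and the helper name `build_commands` is hypothetical.

```python
import itertools
import sys

# Non-Tox21 datasets and split types named in this README.
DATASETS = ["bace", "bbbp", "hiv", "clintox", "sider"]
SPLITS = ["random", "scaffold"]


def build_commands(datasets=DATASETS, splits=SPLITS):
    """Return one argument list per (dataset, split) pair."""
    return [
        [sys.executable, "code/train.py", "--dataset", d, "--split", s]
        for d, s in itertools.product(datasets, splits)
    ]


if __name__ == "__main__":
    for cmd in build_commands():
        # Print only; pass cmd to subprocess.run(cmd, check=True) to execute it.
        print(" ".join(cmd))
```

Printing the commands first makes it easy to confirm the dataset and split coverage before starting long runs.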

Quick smoke test:

python code/train.py --dataset bbbp --split random --quick

Outputs

Results are written to:

results/
output/checkpoints/
output/splits/
output/invalid_smiles/

Reproducibility notes

Use the same dataset CSV files, the same split type, and the same environment/GPU type when comparing with the paper table. Small changes in PyTorch/CUDA/RDKit versions can slightly change results.
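
Since library versions can shift results, it may help to store a version snapshot next to each run. The sketch below is one possible way to do that and is not part of this repository: the function name `environment_report` is hypothetical, and the package names are assumptions that should be matched against requirements.txt.

```python
import json
import platform
from importlib import metadata

# Distribution names are assumptions; adjust them to match requirements.txt.
PACKAGES = ["torch", "torch-geometric", "rdkit"]


def environment_report(packages=PACKAGES):
    """Collect Python and package versions; packages that are not installed map to None."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    print(json.dumps(environment_report(), indent=2))
```

Comparing two such snapshots is a quick first step when numbers differ between machines.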
