MMAF: Multimodal Attention Fusion for Molecular Toxicity Prediction

This repository contains the code for the paper "MMAF: Multimodal Attention Fusion for Molecular Toxicity Prediction", accepted at ICPR'26. It follows a standard research-code layout: a single code/ directory with small entry-point scripts, configuration files in config/, setup checks, and separate folders for results and other outputs.

Repository layout

code/
  train.py              # main entry point for all datasets
  train_tox21.py        # Tox21 entry point
  train_bace.py         # BACE entry point
  train_bbbp.py         # BBBP entry point
  train_hiv.py          # HIV entry point
  train_clintox.py      # ClinTox entry point
  train_sider.py        # SIDER entry point
  run_all.py            # run all non-Tox21 datasets
  check_setup.py        # verify dataset files
  dataset.py            # dataset metadata
  graph.py              # graph/fingerprint metadata
  model.py              # model-family metadata
  metrics.py            # reported metric names
  utility_functions.py  # command helpers
  pipelines/            # experiment implementations
config/
  tox21.yaml
  molnet_datasets.yaml
data/
  README.md
results/
output/
script/
  train.sh
  run_all.sh
  check_data.sh

Installation

pip install -r requirements.txt

On CUDA machines, first install the PyTorch and PyTorch Geometric builds that match your CUDA version if the default packages do not.

Data

Create the data/ directory if it does not exist and place the dataset CSV files in it, using exactly these lowercase file names:

data/tox21.csv
data/bace.csv
data/bbbp.csv
data/hiv.csv
data/clintox.csv
data/sider.csv

Check setup:

python code/check_setup.py
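
For readers who want to see what such a check involves, here is a minimal sketch. It is not the code in check_setup.py: the helper name `find_missing_datasets` is hypothetical, and the only thing taken from this README is the list of expected file names. It compares exact names, so a file such as `BBBP.csv` is reported as missing even on a case-insensitive file system.

```python
from pathlib import Path

# File names copied from the list above; everything else here is illustrative.
EXPECTED_FILES = [
    "tox21.csv",
    "bace.csv",
    "bbbp.csv",
    "hiv.csv",
    "clintox.csv",
    "sider.csv",
]


def find_missing_datasets(data_dir="data"):
    """Return the expected CSV names that are not present in data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        # No data directory at all: every file counts as missing.
        return list(EXPECTED_FILES)
    # Compare against the names as stored on disk so that case matters.
    present = {p.name for p in root.iterdir() if p.is_file()}
    return [name for name in EXPECTED_FILES if name not in present]


if __name__ == "__main__":
    missing = find_missing_datasets()
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("All dataset files found.")
```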

Training

Tox21:

python code/train_tox21.py --config config/tox21.yaml

Other datasets:

python code/train_bbbp.py --split random
python code/train_bbbp.py --split scaffold
python code/train_bace.py --split random
python code/train_bace.py --split scaffold
python code/train_hiv.py --split scaffold
python code/train_hiv.py --split random
python code/pipelines/multi_random.py --dataset clintox
python code/pipelines/multi_scaffold.py --dataset clintox
python code/pipelines/multi_random.py --dataset sider
python code/pipelines/multi_scaffold.py --dataset sider

Single unified command:

```bash
python code/train.py --dataset bbbp --split random
python code/train.py --dataset hiv --split scaffold
```

Run all non-Tox21 experiments:

python code/run_all.py

Quick smoke test:

python code/train.py --dataset bbbp --split random --quick
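
To preview which runs a full sweep would cover, the sketch below enumerates one train.py command per dataset and split type. It only builds and prints command lines and launches nothing. It does not reproduce the logic of run_all.py: the helper name `build_commands` is hypothetical, and it assumes that `--dataset` accepts the same names as the CSV files listed under Data.

```python
import itertools
import shlex

# Assumed dataset names, taken from the CSV file names under Data.
# Tox21 is left out because it has its own config-driven entry point.
DATASETS = ["bace", "bbbp", "hiv", "clintox", "sider"]
SPLITS = ["random", "scaffold"]


def build_commands(datasets=DATASETS, splits=SPLITS, quick=False):
    """Build one train.py argument list per (dataset, split) pair."""
    commands = []
    for dataset, split in itertools.product(datasets, splits):
        cmd = ["python", "code/train.py", "--dataset", dataset, "--split", split]
        if quick:
            # Same flag as the smoke-test command above.
            cmd.append("--quick")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Passing `quick=True` appends `--quick` to every command, which gives a smoke-test version of the same sweep.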

Outputs

Results are written to:

results/
output/checkpoints/
output/splits/
output/invalid_smiles/

Reproducibility notes

Use the same dataset CSV files, the same split type, and the same environment/GPU type when comparing with the paper table. Small changes in PyTorch/CUDA/RDKit versions can slightly change results.
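
One way to make version differences visible is to store a small environment snapshot next to each run. The sketch below is an illustration using only the Python standard library and is not part of this repository. The function name `snapshot_environment` is hypothetical, and the distribution names in `PACKAGES` are assumptions that should be adjusted to match requirements.txt. It does not record the CUDA or GPU type, which would need to be logged separately.

```python
import json
import platform
from importlib import metadata

# Assumed distribution names; adjust them to match requirements.txt.
PACKAGES = ["torch", "torch-geometric", "rdkit"]


def snapshot_environment(packages=PACKAGES):
    """Collect Python, OS, and package versions for a run log."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    print(json.dumps(snapshot_environment(), indent=2))
```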

