GPU-accelerated Self-Organizing Maps in PyTorch with a scikit-learn API, rich visualization, and clustering --- from dimensionality reduction to Just-In-Time Learning.
⭐ If you find torchsom valuable, please consider starring this repository ⭐
Self-Organizing Maps (SOMs) remain highly relevant in modern machine learning due to their interpretability, topology preservation, and computational efficiency. They are widely used in energy systems, biology, IoT, environmental science, and industrial applications.
Despite their utility, the Python SOM ecosystem is fragmented: existing implementations are often outdated or unmaintained, and they lack GPU acceleration or integration with modern deep learning frameworks.
torchsom addresses these gaps as a reference PyTorch library for SOMs, providing:
- GPU-accelerated training via the PyTorch CUDA backend
- Advanced clustering (K-Means, GMM, HDBSCAN) on the SOM latent space
- A scikit-learn-style API for ease of use and extensibility
- Rich visualization tools for both rectangular and hexagonal topologies
- Just-In-Time Learning (JITL) for supervised regression and classification
This library accompanies the paper: torchsom: The Reference PyTorch Library for Self-Organizing Maps (Berthier et al., 2025). If you use torchsom in academic or industrial work, please cite both the paper and the software (see Citation).
Benchmarked against MiniSom on synthetic datasets (240–16,000 samples, 4–300 features) with identical hyperparameters:
| Metric | Improvement |
|---|---|
| Training speed | Up to 99% faster (GPU) and 77–98% faster (CPU) |
| Topographic Error | 34–81% lower — better topology preservation |
| Quantization Error | Comparable fidelity across all configurations |
Hardware: Intel Xeon Platinum 8370C (CPU), NVIDIA Tesla T4 (GPU). See the paper for full benchmark tables.
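To put the percentages in perspective, the short sketch below converts between a relative reduction in training time and an "N times faster" factor. It assumes that "X% faster" means an X% reduction in wall-clock time relative to the baseline; the paper's exact definition may differ, and the function names and timings are hypothetical.

```python
def percent_faster(baseline_seconds: float, candidate_seconds: float) -> float:
    """Relative reduction in wall-clock time, in percent (assumed definition)."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return 100.0 * (baseline_seconds - candidate_seconds) / baseline_seconds


def speedup_factor(percent: float) -> float:
    """Convert a percent time reduction into an 'N times faster' factor."""
    if not 0.0 <= percent < 100.0:
        raise ValueError("percent must be in [0, 100)")
    return 100.0 / (100.0 - percent)


# Hypothetical timings, chosen only to illustrate the arithmetic.
print(percent_faster(200.0, 2.0))  # time reduction in percent
print(speedup_factor(99.0))        # equivalent speedup factor
print(speedup_factor(77.0))
```

Under this reading, a 99% reduction corresponds to a run that is roughly two orders of magnitude shorter, while a 77% reduction is a bit more than a fourfold speedup.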
Reproducing the JMLR benchmarks. All scripts, configurations, and the exact MiniSom pin (`v2.3.5/65b6ba6`) used to produce these numbers are released under `benchmark/`; see `benchmark/README.md` for a step-by-step walkthrough. Two annotated tags pin the version of record: `jmlr-submission-v1` (original October 2025 submission) and `jmlr-revision-v1` (accepted revised version). Checking out a tag with `git checkout <tag>` reproduces the corresponding Table 2.
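A SOM is an unsupervised neural network that maps high-dimensional data onto a low-dimensional grid (typically 2D) while preserving topological relationships. At each training step, the Best Matching Unit (BMU), the neuron closest to the input, is identified, and its weights, along with those of its neighbors, are updated. In the standard Kohonen formulation, the update is:

$$w_{ij}(t+1) = w_{ij}(t) + \alpha(t)\, h_{ij}(t)\, \big(x(t) - w_{ij}(t)\big)$$

where $w_{ij}(t)$ is the weight vector of neuron $(i, j)$ at step $t$, $x(t)$ is the input sample, $\alpha(t)$ is the learning rate, and $h_{ij}(t)$ is the neighborhood function centered on the BMU.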
Training quality is assessed via Quantization Error (representation fidelity) and Topographic Error (topology preservation). See the documentation for the full mathematical background.
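The training step and the two quality metrics described above can be sketched in a few lines of dependency-free Python. This is an illustration of the standard Kohonen update `w <- w + alpha * h * (x - w)`, not torchsom's implementation: the Gaussian neighborhood, the linear decay schedules, the 8-neighborhood used for the Topographic Error, and all function names are assumptions made for the example.

```python
import math
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def find_bmu(weights, x):
    """Return the (row, col) of the neuron whose weight vector is closest to x."""
    best, best_dist = None, math.inf
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            d = squared_distance(w, x)
            if d < best_dist:
                best, best_dist = (i, j), d
    return best


def update_step(weights, x, alpha, sigma):
    """In-place update w <- w + alpha * h * (x - w) with a Gaussian h (assumed)."""
    bi, bj = find_bmu(weights, x)
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            grid_d2 = (i - bi) ** 2 + (j - bj) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            for k in range(len(w)):
                w[k] += alpha * h * (x[k] - w[k])
    return bi, bj


def quantization_error(weights, data):
    """Mean Euclidean distance between each sample and its BMU's weights."""
    total = 0.0
    for x in data:
        i, j = find_bmu(weights, x)
        total += math.sqrt(squared_distance(weights[i][j], x))
    return total / len(data)


def topographic_error(weights, data):
    """Fraction of samples whose two closest neurons are not grid neighbors."""
    errors = 0
    for x in data:
        ranked = sorted(
            (squared_distance(w, x), i, j)
            for i, row in enumerate(weights)
            for j, w in enumerate(row)
        )
        (_, i1, j1), (_, i2, j2) = ranked[0], ranked[1]
        if max(abs(i1 - i2), abs(j1 - j2)) > 1:  # 8-neighborhood (assumed)
            errors += 1
    return errors / len(data)


random.seed(0)
data = [[random.random() for _ in range(2)] for _ in range(200)]
# Deliberately poor start: every neuron sits in one corner of the data range.
weights = [[[random.uniform(0.0, 0.1) for _ in range(2)] for _ in range(5)]
           for _ in range(5)]

qe_before = quantization_error(weights, data)
n_epochs = 20
for epoch in range(n_epochs):
    alpha = 0.5 * (1 - epoch / n_epochs)        # decaying learning rate
    sigma = 2.0 * (1 - epoch / n_epochs) + 0.5  # shrinking neighborhood radius
    for x in data:
        update_step(weights, x, alpha, sigma)
qe_after = quantization_error(weights, data)

print(f"Quantization Error: {qe_before:.3f} -> {qe_after:.3f}")
print(f"Topographic Error after training: {topographic_error(weights, data):.3f}")
```

The loop over every neuron is written out for readability; a tensor-based library would express the same arithmetic as batched operations.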
| | torchsom | MiniSom | SimpSOM | SOMPY | somoclu | som-pbc |
|---|---|---|---|---|---|---|
| Framework | PyTorch | NumPy | NumPy | NumPy | C++/CUDA | NumPy |
| GPU Acceleration | ✅ CUDA | ❌ | ✅ CuPy/CUML | ❌ | ✅ CUDA | ❌ |
| API Design | scikit-learn | Custom | Custom | MATLAB | Custom | Custom |
| Maintenance | ✅ Active | ✅ Active | ❌ | |||
| Documentation | ✅ Rich | ❌ | ❌ | |||
| Test Coverage | ✅ 90% | ❌ | ~53% | ❌ | Minimal | ❌ |
| Visualization | ✅ Advanced | ❌ | Moderate | Moderate | Basic | Basic |
| Clustering | ✅ Advanced | ❌ | ❌ | ❌ | ❌ | ❌ |
| JITL Support | ✅ Built-in | ❌ | ❌ | ❌ | ❌ | ❌ |
| SOM Variants | PBC, Growing*, Hierarchical* | ❌ | PBC | ❌ | PBC | PBC |
* Work in progress
Just-In-Time Learning (JITL): Given an online query, JITL collects relevant samples by topology and distance to form a local buffer. A lightweight local model is then trained on this buffer, enabling efficient supervised learning (regression or classification).
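The JITL idea can be illustrated with a toy regression problem in plain Python. The sketch below is not torchsom's API: it assumes a 1-D chain of SOM nodes over scalar inputs, selects the buffer first by topology (nodes within `radius` of the query's BMU) and then by distance (the `max_size` closest samples), and uses an ordinary least-squares line as the local model. All names (`bmu_index`, `build_buffer`, `fit_line`, `jitl_predict`) and the evenly spaced `codebook` standing in for a trained map are hypothetical.

```python
import math


def bmu_index(codebook, x):
    """Index of the 1-D SOM node whose weight is closest to the scalar x."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))


def build_buffer(codebook, xs, ys, query, radius=1, max_size=30):
    """Select training pairs near the query, first by topology, then by distance."""
    q_node = bmu_index(codebook, query)
    # Topology: keep samples whose BMU is within `radius` nodes of the query's BMU.
    candidates = [
        (x, y) for x, y in zip(xs, ys)
        if abs(bmu_index(codebook, x) - q_node) <= radius
    ]
    # Distance: among those, keep the `max_size` samples closest to the query.
    candidates.sort(key=lambda pair: abs(pair[0] - query))
    return candidates[:max_size]


def fit_line(buffer):
    """Ordinary least squares for y = a * x + b on the buffer (the local model)."""
    n = len(buffer)
    mean_x = sum(x for x, _ in buffer) / n
    mean_y = sum(y for _, y in buffer) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in buffer)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in buffer)
    a = sxy / sxx if sxx > 0 else 0.0
    return a, mean_y - a * mean_x


def jitl_predict(codebook, xs, ys, query, radius=1, max_size=30):
    """Build the local buffer for one query, fit the local model, and predict."""
    buffer = build_buffer(codebook, xs, ys, query, radius, max_size)
    a, b = fit_line(buffer)
    return a * query + b


# Toy data: a nonlinear target sampled on a regular grid.
xs = [i * 0.01 for i in range(629)]
ys = [math.sin(x) for x in xs]
# Stand-in for a trained 1 x 20 SOM: evenly spaced node weights (hypothetical).
codebook = [0.157 + i * 0.314 for i in range(20)]

query = 2.0
prediction = jitl_predict(codebook, xs, ys, query)
print(f"JITL prediction: {prediction:.4f}, true value: {math.sin(query):.4f}")
```

Because the model is refit for each query on a small neighborhood, a simple linear fit can track a globally nonlinear target; for classification, the local model could be swapped for a lightweight classifier trained on the same buffer.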
```python
import torch

from torchsom.core import SOM
from torchsom.visualization import SOMVisualizer

# Create a 10x10 map for 3-dimensional inputs
som = SOM(x=10, y=10, num_features=3, epochs=50)

# Random data standing in for a real dataset
X = torch.randn(1000, 3)

# Initialize the weights from the data (PCA), then train
som.initialize_weights(data=X, mode="pca")
q_errors, t_errors = som.fit(data=X)

# Plot the training curves, the hit map, and the distance map
visualizer = SOMVisualizer(som=som)
visualizer.plot_training_errors(
    quantization_errors=q_errors, topographic_errors=t_errors
)
visualizer.plot_hit_map(data=X, batch_size=256)
visualizer.plot_distance_map(
    distance_metric=som.distance_fn_name,
    neighborhood_order=som.neighborhood_order,
    scaling="sum",
)
```

Explore our collection of Jupyter notebooks:
| Notebook | Task | Dataset |
|---|---|---|
| `iris.ipynb` | Multiclass classification | Iris |
| `wine.ipynb` | Multiclass classification | Wine |
| `boston_housing.ipynb` | Regression | Boston Housing |
| `energy_efficiency.ipynb` | Multi-output regression | Energy Efficiency |
| `clustering.ipynb` | Clustering analysis | Synthetic blobs |
This project uses uv for fast, reproducible dependency management.

```bash
uv add torchsom
```

With optional FAISS acceleration for BMU search:

```bash
uv add "torchsom[faiss]"
```

To set up a development environment from source:

```bash
git clone https://github.com/michelin/TorchSOM.git
cd TorchSOM
uv sync --all-extras  # creates .venv and installs everything
```

All Make targets use `uv run`, so the correct environment is always activated:

```bash
make help   # see all available commands
make cov    # run tests with coverage
make check  # lint / type-check
make fix    # auto-format
make docs   # build documentation
```

Comprehensive documentation is available at opensource.michelin.io/TorchSOM, including:
- Getting Started: installation, quick start, SOM concepts
- User Guide: visualization, architecture, benchmarks
- API Reference: core, utils, visualization, configs
- Additional Resources: FAQ, troubleshooting, changelog
If you use torchsom in your academic, research, or industrial work, please cite both the paper and the software:
```bibtex
@misc{berthier2025torchsom,
    title={torchsom: The Reference PyTorch Library for Self-Organizing Maps},
    author={Berthier, Louis and Shokry, Ahmed and Moreaud, Maxime
            and Ramelet, Guillaume and Moulines, Eric},
    year={2025},
    eprint={2510.11147},
    archivePrefix={arXiv},
    primaryClass={stat.ML},
    note={Preprint submitted to Journal of Machine Learning Research},
    url={https://arxiv.org/abs/2510.11147}
}

@software{berthier2025torchsom_software,
    author={Berthier, Louis},
    title={torchsom: The Reference PyTorch Library for Self-Organizing Maps},
    year={2025},
    version={1.1.1},
    url={https://github.com/michelin/TorchSOM},
    note={Documentation available at \url{https://opensource.michelin.io/TorchSOM/}}
}
```

For more details, see the CITATION file.
We welcome contributions from the community! See our Contributing Guide and Code of Conduct for details.
For support:
- GitHub Issues: Report bugs or request features

Acknowledgments:
- Centre de Mathématiques Appliquées (CMAP) at École Polytechnique
- Manufacture Française des Pneumatiques Michelin for collaboration
- Giuseppe Vettigli for MiniSom inspiration
- The PyTorch team for the amazing framework
torchsom is licensed under the Apache License 2.0. See the LICENSE file for details.
- Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43(1), 59–69.
- Kohonen, T. (1990). The self-organizing map. Proceedings of the IEEE, 78(9), 1464–1480.
- Kohonen, T. (2001). Self-Organizing Maps. Springer.
- MiniSom: Minimalistic Python SOM
- SimpSOM: Simple Self-Organizing Maps
- SOMPY: Python SOM library
- somoclu: Massively Parallel Self-Organizing Maps
- som-pbc: SOM with periodic boundary conditions
- SOM Toolbox: MATLAB implementation









