PANDA is a Python-based workflow for the automated analysis of 4D Scanning Transmission Electron Microscopy (4D-STEM) data. It was specifically developed for analyzing Perovskite Quantum Dots and Superlattices (e.g., CsPbBr3), focusing on particle detection, structural clustering, and internal strain mapping.
This repository contains Jupyter Notebooks that perform end-to-end analysis of 4D-STEM datasets. The code allows researchers to move from raw datacubes to quantified strain maps and domain classification.
The workflow includes:
- py4DSTEM + skimage Blob Analysis: Automatically identifies nanoparticle locations in spatial images using Laplacian of Gaussian (LoG) blob detection from scikit-image, integrated with py4DSTEM data structures.
- Diffraction Feature Extraction: Extracts and processes diffraction patterns from the 4D datacube for every detected particle.
- Unsupervised Clustering: Classifies particles based on crystallographic orientation and structural similarity using PCA (Principal Component Analysis), K-Means clustering, and UMAP.
- Virtual Imaging: Generates Bright Field (BF) and Dark Field (DF) images from the 4D data.
- Automated Crystal Orientation Mapping (ACOM): Matches experimental diffraction patterns to simulated crystal structures to determine orientation at every pixel.
- Strain & Rotation Analysis: Quantifies relative lattice rotation and strain components ($\epsilon_{xx}, \epsilon_{yy}, \epsilon_{xy}$) to identify internal twist domains within superlattices.
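To make the strain and rotation quantities in the last step concrete, here is a minimal sketch of how they can be derived from a pair of fitted lattice vectors. It is not taken from the notebooks and is not py4DSTEM's API: the function name `strain_from_lattice`, the column-vector layout, the real-space convention, the rotation sign convention, and the small-deformation (infinitesimal strain) approximation are all assumptions made for illustration.

```python
import numpy as np

def strain_from_lattice(ref_vectors, meas_vectors):
    """Return (e_xx, e_yy, e_xy, theta) for a measured lattice relative to a reference.

    Both inputs are 2x2 arrays whose COLUMNS are two real-space lattice
    vectors (assumed layout). theta is the small-angle rotation in radians,
    counterclockwise positive (assumed sign convention).
    """
    ref = np.asarray(ref_vectors, dtype=float)
    meas = np.asarray(meas_vectors, dtype=float)

    # Deformation gradient F maps reference vectors onto measured ones: meas = F @ ref
    F = meas @ np.linalg.inv(ref)

    # Displacement gradient; its symmetric part is the infinitesimal strain,
    # its antisymmetric part is the rigid rotation.
    H = F - np.eye(2)
    strain = 0.5 * (H + H.T)
    theta = 0.5 * (H[1, 0] - H[0, 1])

    return strain[0, 0], strain[1, 1], strain[0, 1], theta

# Hypothetical example: a square lattice stretched by 1% along x,
# then rotated by 0.01 rad.
a = 5.9  # illustrative lattice spacing, arbitrary units
reference = a * np.eye(2)
angle = 0.01
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])
stretch = np.diag([1.01, 1.0])
measured = rotation @ stretch @ reference

e_xx, e_yy, e_xy, theta = strain_from_lattice(reference, measured)
print(f"e_xx={e_xx:.4f}  e_yy={e_yy:.4f}  e_xy={e_xy:.4f}  theta={theta:.4f} rad")
```

If the lattice vectors are measured in reciprocal space (as Bragg peak positions), the deformation transforms inversely, so the vectors would need to be converted to real space first or the strain sign interpreted accordingly.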
This code relies heavily on the py4DSTEM open-source toolkit for processing 4D-STEM data and scikit-image for image analysis.
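As an illustration of the scikit-image side of the particle detection step, the sketch below runs Laplacian of Gaussian blob detection on a synthetic image. The image, the helper `synthetic_particles`, and all parameter values are assumptions chosen for the example; in the actual workflow the input would be a spatial (virtual) image derived from the 4D datacube, and the sigma range and threshold would need tuning to the particle size and contrast.

```python
import numpy as np
from skimage.feature import blob_log

def synthetic_particles(shape, centers, sigma):
    """Build a test image with one Gaussian spot per (row, col) center."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    image = np.zeros(shape, dtype=float)
    for cy, cx in centers:
        image += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
    return image

# Hypothetical particle positions and size, for illustration only
centers = [(30, 30), (64, 90), (100, 40)]
image = synthetic_particles((128, 128), centers, sigma=4.0)

# blob_log returns one row per detected blob: (row, col, sigma)
blobs = blob_log(image, min_sigma=2, max_sigma=8, num_sigma=7, threshold=0.05)

# For a 2D image, the blob radius is approximately sqrt(2) * sigma
radii = blobs[:, 2] * np.sqrt(2)

for (row, col, _), radius in zip(blobs, radii):
    print(f"particle at (row={row:.0f}, col={col:.0f}), radius ~ {radius:.1f} px")
```

The detected `(row, col)` coordinates can then serve as the scan positions at which diffraction patterns are pulled from the datacube.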
To run the notebooks, you will need the following Python packages:
- py4DSTEM (Core analysis library)
- scikit-image (Image processing and blob detection)
- numpy & matplotlib (Data manipulation and plotting)
- scikit-learn (PCA and K-Means clustering)
- umap-learn (Dimensionality reduction for visualization)
- h5py (Handling HDF5 data files)
- pandas (Data organization)
- scipy (Statistical fitting)
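A quick way to confirm that the packages above are available before opening the notebooks is sketched below. The helper `missing_packages` is hypothetical and not part of this repository; it only checks whether each module can be located and does not verify versions.

```python
from importlib.util import find_spec

# Map of install name -> import name for the packages listed above
REQUIRED = {
    "py4DSTEM": "py4DSTEM",
    "scikit-image": "skimage",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "umap-learn": "umap",
    "h5py": "h5py",
    "pandas": "pandas",
    "scipy": "scipy",
}

def missing_packages(required=REQUIRED):
    """Return the install names of packages whose module cannot be found."""
    return [name for name, module in required.items() if find_spec(module) is None]

missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Note that several install names differ from their import names (for example, `scikit-image` is imported as `skimage` and `umap-learn` as `umap`).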
This code was originally written to analyze experimental data for lead halide perovskite superlattices.
Note: The raw experimental datasets (.h5 and .prz files) referenced in the notebooks are not included in this repository due to size constraints and proprietary restrictions.
The notebooks are provided as a comprehensive reference for the analysis workflow. To use this code, you will need to adapt the file paths to point to your own 4D-STEM datasets, stored in a format that py4DSTEM can read.
- Clone this repository.
- Ensure your environment has the required dependencies installed.
- Open `PANDA.ipynb` in Jupyter Notebook or Jupyter Lab.
- Update the `file_path` and `dirpath` variables to point to your local data.
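Before running the notebook, it can help to verify that the two path variables point at something real. The sketch below is an assumption-laden illustration: the example paths are placeholders, the helper `check_inputs` is hypothetical, and it assumes that `file_path` is a single `.h5` or `.prz` dataset and that `dirpath` is a directory.

```python
from pathlib import Path

def check_inputs(file_path, dirpath):
    """Return a list of human-readable problems with the two path variables."""
    file_path, dirpath = Path(file_path), Path(dirpath)
    problems = []
    if not file_path.is_file():
        problems.append(f"file_path does not exist: {file_path}")
    elif file_path.suffix.lower() not in {".h5", ".prz"}:
        problems.append(f"unexpected file type: {file_path.suffix}")
    if not dirpath.is_dir():
        problems.append(f"dirpath is not a directory: {dirpath}")
    return problems

# Placeholder values; replace with the location of your own data
file_path = Path("data/my_scan.h5")
dirpath = Path("data")

for problem in check_inputs(file_path, dirpath):
    print("Warning:", problem)
```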
